Translations: decode Crop Production Percentile crop names #614

Closed
emlys opened this issue Aug 6, 2021 · 4 comments · Fixed by #1734
@emlys
Member

emlys commented Aug 6, 2021

Some of the crop names that you can choose from in the Crop Production Percentile model are suffixed with "nes" or "for": beetfor, tropicalnes, carrotfor. What's a carrotfor? 🤣 I'm not sure what the suffixes mean, and they will be problematic to translate. Ideally we could decode these and use translatable names for every crop.

@emlys emlys added enhancement New feature or request question Further information is requested labels Aug 6, 2021
@emlys
Member Author

emlys commented Dec 17, 2021

I asked Becky about this on Slack and got no response (she probably isn't on our Slack much anymore). Need to follow up by email.

@emlys emlys removed the question Further information is requested label Dec 17, 2021
@emlys emlys self-assigned this Dec 17, 2021
@emlys emlys added the icebox Low priority task that we might look more into later on label Sep 7, 2023
@emlys
Member Author

emlys commented Sep 7, 2023

Moving to icebox since we don't have real definitions for the original names to begin with.

@emilyanndavis
Member

While updating the Crop Production models, I stumbled across a code comment referencing this issue and embarked on a brief side quest to see if it might be feasible to pull this out of the icebox.

Here is what I found:

  1. The "Global Harvested Area and Yield for 175 Crops" Metadata and Technical Documentation PDF includes a table of EarthStat and FAO crop names, which makes it evident that the short crop names we use in the models are the EarthStat crop names. The FAO crop names are more readable, but still unclear in several cases, e.g., "berrynes" maps simply to "Berries Nes" (which I'm guessing is either a special-edition Nintendo console, or something along the lines of "berries not elsewhere specified"), and indeed, "carrotfor" (along with other "-for" names) remains untranslated.
  2. Table 1 in "Farming the planet: 2. Geographic distribution of crop areas, yields, physiological types, and net primary production in the year 2000" (cited by the User Guide) lists all 175 crops by their human-readable/translatable name. Although the table does not include references to the EarthStat crop names, there are patterns that make it reasonable to conclude that the "nes" suffix (probably an acronym) essentially means "other", while the "for" suffix means either "for fodder" or "for forage and silage", depending on the crop.

By cross-referencing the "Farming the planet..." table with the EarthStat table, I think we can confidently map most, if not all, of the EarthStat crop names to human-readable/translatable descriptions. And if there are any we're still unsure about, there's no harm in leaving them untranslated.
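To make the suffix pattern concrete, here is a minimal Python sketch of how one might flag EarthStat names that carry a "nes" or "for" suffix before looking them up in the two tables. The function name `suffix_hint` and the hint wording are my own assumptions for illustration; this is a heuristic for spotting candidates, not the mapping itself, and each result would still need to be checked by hand against the tables.

```python
# Suffix readings inferred in this thread; they are guesses, not official definitions.
SUFFIX_HINTS = {
    "nes": "other (possibly 'not elsewhere specified')",
    "for": "for fodder, or for forage and silage",
}


def suffix_hint(earthstat_name):
    """Return (stem, hint) if the name ends in a known suffix, else (name, None).

    Purely string-based: a crop whose real name happens to end in "nes" or
    "for" would be flagged too, so treat the result as a prompt for manual review.
    """
    for suffix, hint in SUFFIX_HINTS.items():
        if earthstat_name.endswith(suffix) and len(earthstat_name) > len(suffix):
            return earthstat_name[: -len(suffix)], hint
    return earthstat_name, None


if __name__ == "__main__":
    for name in ["beetfor", "tropicalnes", "carrotfor", "berrynes", "greenbroadbean"]:
        print(name, "->", suffix_hint(name))
```

Names with no recognized suffix pass through unchanged, which matches the idea that anything we're unsure about can simply stay untranslated.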

And, just to be clear, I'm also volunteering to take this on, as long as there are no objections!

@emilyanndavis
Member

In PR #1734, each crop supported by the Percentile model has a human-readable/translatable name. These names were determined by cross-referencing Table 1 in "Farming the planet: 2. Geographic distribution of crop areas, yields, physiological types, and net primary production in the year 2000" by Monfreda et al. ("Monfreda") with the EarthStat and FAO crop names and crop groups table ("FAO") that is distributed alongside the datasets.
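For readers who want a picture of what "a human-readable/translatable name" per crop could look like, here is a rough Python sketch using the standard-library `gettext` module. The table `CROP_DISPLAY_NAMES` and the function `display_name` are hypothetical names invented for this example and are not taken from the PR; the single entry shown is the one worked example discussed in this thread.

```python
import gettext

# Hypothetical lookup table: EarthStat key -> English display name.
# Only the example from this thread is filled in.
CROP_DISPLAY_NAMES = {
    "greenbroadbean": "Broad beans, green",
}


def display_name(earthstat_name, translate=gettext.gettext):
    """Return a translated display name, or the raw key if none is known.

    `translate` defaults to gettext.gettext, which returns the message
    unchanged when no translation catalog is available.
    """
    english = CROP_DISPLAY_NAMES.get(earthstat_name)
    if english is None:
        # No decoded name available: leave the EarthStat key untranslated.
        return earthstat_name
    return translate(english)


if __name__ == "__main__":
    print(display_name("greenbroadbean"))
    print(display_name("carrotfor"))  # not in the sketch table, so returned as-is
```

Keeping the English name as the lookup value means translators see a real phrase rather than an opaque key like "carrotfor".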

In many cases, the Monfreda crop name is equivalent to the FAO crop name. In cases where they differ, I defaulted to choosing the Monfreda crop name, since Monfreda names tend to be more descriptive, particularly where crop names are suffixed with "nes" or "for".

However, I opted for the FAO crop name in cases where it was more descriptive. For example, some FAO crop names include additional list items (such as coriander in addition to anise, badian, and fennel; clementines in addition to tangerines and mandarins; broccoli in addition to cauliflower; etc.). Other FAO crop names include qualifiers (such as "with shell"), alternate/British English names (such as "(aubergines)"), or disambiguations (such as "(Piper spp.)") that I judged likely to be helpful. In a very small number of cases, I altered the word order to yield a more natural phrase (e.g., "Fresh tropical fruit, other" instead of Monfreda's "Fruit tropical fresh, other") for easier reading and translation.

When in doubt about how to resolve discrepancies, I sought additional context from the annexes of the FAO's World Programme for the Census of Agriculture 2020 ("WPCA"). For example, uncertain whether "greenbroadbean" should be "Broad beans, green" (Monfreda) or "Leguminous vegetables, other" (FAO), I turned to the WPCA for a tiebreaker, found "Broad bean, harvested green", and therefore decided to use the Monfreda name.
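The selection rules described above can be summarized in a short Python sketch. This is an illustration of the decision order only (hand-reworded name first, then FAO where it was judged more descriptive, otherwise Monfreda), with the judgment calls recorded as explicit flags. The `Candidates` structure, the `resolve_name` function, and the placeholder key `"example_key"` are assumptions made for this example, not code from the PR.

```python
from typing import NamedTuple, Optional


class Candidates(NamedTuple):
    """Candidate names for one EarthStat crop key (illustrative structure)."""
    monfreda: Optional[str] = None
    fao: Optional[str] = None
    prefer_fao: bool = False        # set by hand when the FAO name is more descriptive
    reworded: Optional[str] = None  # set by hand when word order was adjusted


def resolve_name(earthstat_name: str, c: Candidates) -> str:
    """Pick a display name following the order described in the comment above."""
    if c.reworded:
        return c.reworded
    if c.prefer_fao and c.fao:
        return c.fao
    # Default to Monfreda; fall back to FAO, then to the untranslated key.
    return c.monfreda or c.fao or earthstat_name


if __name__ == "__main__":
    # The WPCA tiebreaker favored the Monfreda name, so prefer_fao stays False.
    broad_bean = Candidates(
        monfreda="Broad beans, green",
        fao="Leguminous vegetables, other",
    )
    print(resolve_name("greenbroadbean", broad_bean))

    # "example_key" is a placeholder; the thread does not say which key this is.
    reordered = Candidates(
        monfreda="Fruit tropical fresh, other",
        reworded="Fresh tropical fruit, other",
    )
    print(resolve_name("example_key", reordered))
```

Recording the manual decisions as data rather than burying them in the final strings would make it easier to revisit a choice later if a better source turns up.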

@emilyanndavis emilyanndavis added this to the 3.14.4 milestone Jan 14, 2025