reducing complexity #26

Open
paulgirard opened this issue Jun 13, 2018 · 1 comment

paulgirard commented Jun 13, 2018

Objective: reduce the number of entities by reducing complexity

city/part_of

The city/part_of entities are aggregated under their country_part_of by summing flows by year and reporting.
We create a virtual entity "country aggregated".
Then we delete the flows with the city/part_of entities we aggregated.
We merge the "country aggregated" entity with the country.
We keep the flows to the country but use the aggregated flows when they are missing.

When aggregated flows are used, we flag them as city/part_of_aggregated.
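A minimal sketch of the steps above, assuming flows are plain dicts with the hypothetical keys `reporting`, `partner`, `year` and `flow`, and that a hypothetical `part_of_country` mapping gives the country of each city/part_of partner. It is only an illustration, not the project's actual code:

```python
from collections import defaultdict

def aggregate_city_flows(flows, part_of_country):
    """Aggregate city/part_of flows under their country.

    flows: list of dicts with keys reporting, partner, year, flow (assumed layout).
    part_of_country: hypothetical mapping city/part_of partner -> country.
    """
    direct = {}                      # (reporting, partner, year) -> declared flow
    aggregated = defaultdict(float)  # (reporting, country, year) -> summed city flows
    for f in flows:
        country = part_of_country.get(f["partner"])
        if country is None:
            # Not a city/part_of entity: keep the declared flow as is.
            direct[(f["reporting"], f["partner"], f["year"])] = f["flow"]
        else:
            # City/part_of entity: sum into the virtual "country aggregated" entity.
            aggregated[(f["reporting"], country, f["year"])] += f["flow"]

    result = [
        {"reporting": r, "partner": p, "year": y, "flow": v, "tag": None}
        for (r, p, y), v in direct.items()
    ]
    for (r, p, y), v in aggregated.items():
        # Use the aggregated flow only when the direct flow to the country is missing.
        if (r, p, y) not in direct:
            result.append({"reporting": r, "partner": p, "year": y, "flow": v,
                           "tag": "city/part_of_aggregated"})
    return result
```

The city/part_of flows themselves never reach the result, which corresponds to the deletion step.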

colonial areas

Colonial areas are aggregated by (country_part_of, continent) by summing flows by year and reporting.
We create a virtual entity 'colonial empire of country x in continent y'.
For a reporting R which has flows with a virtual colonial empire:
for the same year and by continent, we sum the flows from reporting countries which are colonies and whose partner is reporting R, provided they are not themselves partners of reporting R in that year.
We then subtract this amount from the colonial area virtual entities for reporting R.
A negative result means the flow is deleted.
Flows to colonial continental areas are flagged as colonial_areas_aggregated.
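One possible reading of the subtraction step, sketched for a single reporting R, year and continent. The function name, its arguments and the data layout are assumptions made for illustration:

```python
def adjust_colonial_flow(empire_flow, colony_mirror_flows, direct_partners):
    """Subtract colony mirror flows from a virtual colonial empire flow.

    empire_flow: flow between R and the virtual 'colonial empire' entity.
    colony_mirror_flows: colony -> flow declared by that colony with R as partner
        (same year, same continent).
    direct_partners: colonies that R already lists as partners in that year.
    Returns the adjusted flow, or None when the result is negative (flow deleted).
    """
    mirrored = sum(
        flow for colony, flow in colony_mirror_flows.items()
        if colony not in direct_partners  # only colonies missing from R's own partners
    )
    remaining = empire_flow - mirrored
    if remaining < 0:
        return None  # negative result: the flow is deleted
    return remaining
```

For example, with an empire flow of 100, mirror flows of 30 (colony A) and 20 (colony B), and B already a direct partner of R, only A is subtracted and the adjusted flow is 70.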

group

Groups are composed of countries Cn and appear as a partner of a reporting R in year Y:

  • if all Cn are reportings in year Y and declared a flow to R, then we delete the group from R's flows
    alternative: we calculate the import ratios among the Cn from R, then apply those ratios to R's exports to the group to create export flows from R to each Cn.
  • if only some Cn are reportings in year Y and declared a flow to R, then we subtract from the group flow to R the amount of the flows from those Cn to R AND we remove those Cn from the group 'label'
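The two cases can be sketched as follows, assuming a group is a list of member names and that `member_flows_to_r` (a hypothetical name) maps each member that is a reporting in year Y to the flow it declared with R:

```python
def subtract_known_members(group_members, group_flow, member_flows_to_r):
    """Reduce a group flow using the flows declared by its members.

    Returns None when every member declared a flow (the group is deleted),
    otherwise (remaining_members, remaining_flow).
    """
    reported = [m for m in group_members if m in member_flows_to_r]
    if len(reported) == len(group_members):
        return None  # all Cn reported: the group is removed from R's flows
    # Some Cn reported: subtract their flows and drop them from the group label.
    remaining_flow = group_flow - sum(member_flows_to_r[m] for m in reported)
    remaining_members = [m for m in group_members if m not in member_flows_to_r]
    return remaining_members, remaining_flow
```

The sketch does not guard against a negative remaining flow, which mirror flow discrepancies can produce.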

Deprecated group explanation
For partners which are groups of countries: if those countries are reportings for this year, we calculate the percentages of imports declared by those countries. We then create export flows from the reporting to the countries by applying the percentages to the export to the group entity.
TAG : group_divided
If we only have some of the countries of the group, we subtract the imports of the available countries from the export to the group entity. We remove the available countries from the group.
TAG: group_substract
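A small sketch of the group_divided case, assuming `member_imports` (a hypothetical name) holds the imports from the reporting declared by each country of the group for that year:

```python
def split_group_export(group_export, member_imports):
    """Split an export to a group entity according to members' declared imports.

    group_export: export flow from the reporting to the group entity.
    member_imports: country -> imports from the reporting declared by that country.
    Returns country -> export flow created for that country.
    """
    total = sum(member_imports.values())
    if total == 0:
        return {}  # no basis for computing percentages
    return {
        country: group_export * imports / total
        for country, imports in member_imports.items()
    }
```

With an export of 100 to the group and declared imports of 30 and 10, the created flows are 75 and 25; each of them would carry the group_divided tag.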

We have the part_of_country column which indicates the country.

@paulgirard

Change of plan.

Because of mirror flow discrepancies, our first plan is not reliable enough (lots of negative flows).
Thus we decided to test a new plan:

  • We translate colonial areas into groups using COW data (as adapted into the RICentity links table)

  • for the group :

    • for each (reporting R, group) pair :
      • we look for all flows from R to all group members, for all years
        => for years in which we have all the member flows, we calculate the ratio for the nearest year
        we call this ratio r_direct and the distance in years d_direct
        if d_direct <= 10 :
        we apply this ratio
        else :
      • we look for all mirror flows from the entities composing the group, for all years
        => for years in which we have all the mirror flows, we calculate the ratio for the nearest year
        we call this ratio r_mirror and the distance in years d_mirror
        if d_mirror <= 10 :
        we apply this ratio
        else :
        we keep the group flow unchanged
  • apply the city/part_of algorithm
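The ratio selection for one group flow could look like the sketch below. It assumes the ratios have already been computed for every year in which all member flows (direct or mirror) are available; `direct_by_year` and `mirror_by_year` are hypothetical names, and breaking ties toward the earlier year is an arbitrary choice:

```python
MAX_DISTANCE = 10  # maximum distance in years, as in the plan above

def nearest_ratios(target_year, ratios_by_year):
    """Return (ratios, distance) for the year closest to target_year."""
    if not ratios_by_year:
        return None, None
    # Ties between two equally distant years go to the earlier one (assumption).
    year = min(ratios_by_year, key=lambda y: (abs(y - target_year), y))
    return ratios_by_year[year], abs(year - target_year)

def choose_ratios(target_year, direct_by_year, mirror_by_year):
    """Pick r_direct, else r_mirror, else keep the group flow unchanged."""
    r_direct, d_direct = nearest_ratios(target_year, direct_by_year)
    if r_direct is not None and d_direct <= MAX_DISTANCE:
        return "direct", r_direct
    r_mirror, d_mirror = nearest_ratios(target_year, mirror_by_year)
    if r_mirror is not None and d_mirror <= MAX_DISTANCE:
        return "mirror", r_mirror
    return "keep", None
```

For instance, with direct ratios only for 1800 and mirror ratios only for 1845, a group flow in 1805 uses the direct ratios, one in 1850 falls back to the mirror ratios, and one in 1900 is kept as is.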
