-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathhackpad.txt
162 lines (104 loc) · 17.9 KB
/
hackpad.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
HACK FOR WESTERN MASS: Mapping Safety Net Service Needs
UPDATE as of our meeting at Owl&Raven on Saturday, June 8th
Many of us will meet again at Owl&Raven on Saturday, June 15th from 2-5! Yay!
Here are some of the tasks we talked about and worked on at the June 8th meeting:
2 Crucial tasks with no one working on them currently, both javascript, A and B (not ordered):
A) Figure out how to make leaflet make the colors and hatching Lissie specified.-- need someone to implement Lissie's suggested colors and hatching in leaflet-- Could this be Andy? Jack? Jason?
An interesting thing I discovered that may or may not quicken implementation of any desired design: http://mapbox.github.io/tilemill/ - It's a desktop app that uses something called MSS that's similar to CSS to style maps. It can be fed several different kinds of shape and point data, and provides lots of cool functionality and control. Works with leaflet. In use here: http://www.npr.org/censusmap/
Also, if there's still any struggling going on with deciding on cartographic color schemes, here's a tool I ran across that appears to be for EXACTLY THAT PURPOSE. What are the odds?: http://colorbrewer2.org/
B) write the JS behind the website UI. What we have is Matt's basic html and css setup (on this https://github.com/hackforwesternmass/community-action-maps GitHub Repo) and it needs javascript to drive the buttons, connect to the map stuff via leaflet, etc. (Matthew, needs a Javascript person to do this-- Jack can you do it? Or Andy, can you? Matt might try to teach himself javascript if no one is avail.)
Note: The static site can be viewed at http://hackforwesternmass.github.io/community-action-maps And to be clear: I'll be working on polishing my js skills regardless, but one hopes that they get an adequate shine in time to be useful to this project!
C)--anything else belong here?
All To Dos:
-submitting the project http://hackforchange.org/submit ASAP -- can submit un-finished and update it multiple times, projects accepted and selected on a rolling basis, so do it now?
-Decide data cuts (based on the census data cuts (Carrie)) and which colors and hatching represent that (designer decision-- (Lissie)).
-Figure out how to make leaflet make those paticular colors and hatching.-- need someone to implement Lissie's suggested colors and hatching in leaflet, it needs to loop over the aggregated data originally from community action (colors) and separately on another layer loop over the census data (hatching). (then Lissie can see if it looks like it works OK from a designer's view and tweak it if it's too illegible, it needs help now, so she can see it). Jason tells me some of this is complete (a script is written for interpolation between two colors). What I think is left to do with it: figuring out how to do hatching for the census data set, scripting the looping (?), uniting the script with the geographies, having it do it for the data sets we have.
-Descide hosting and platform (Matthew and Carrie started thinking on this but did not resolve it fully). For right now we are serving the webpage straight from GitHub [ Could someone please add the link? ]. (http://hackforwesternmass.github.io/community-action-maps). Andrew started working in Django, created an admin interface for data entry, and would like to serve from Heroku. Andrew was planning to commit something that does this to GitHub early this week.
-Draw Census data via API (Carrie and Caleb, see far below) Moderate success, now we have poverty levels by tract for all three counties.
-Choose which Census data to use (Carrie and Claire). Income for sure, maybe vet status later. They have 100% FPL but do not do a cut at 125% FPL. Likely we will use number below poverty line per census tract.We could calculate the % using a call for the total in the universe and a call for the level. Right now we are just calling the level (at or below 100% of the Federal Poverty Line by census tract, per each of the 3 counties: need to exclude by tract in Hampden County and is currently acheived in three calls instead of one).
-JS behind the website. This is a key need with no one working on it right now. What we have is Matt's basic html and css setup and it needs javascript to drive the buttons, connect to the map stuff via leaflet, etc. (Matt, and needs a Javascript person to do this)
-Get transport data onto base layer (Jason, Alex). Jason and Alex made good headway on this, it will add to the finished prototype some nice aspects.
-Website design for displaying the map with 2 layers: overlaying an anonymized-client-data-layer and a census-data-layer on the map (Lissie). More to do, with help from data people (potentially Carrie, Claire, Caleb, and/or Matt/UI) to help determine what to turn off/on contextually based on what makes sense in the data
-Add Community Action service points by address (Andy, & Claire provided the address data)
-Client data input tool for anonymizing the data by aggregating it into tracts. There are two ways to do this: put all the code on the website, including the tool that takes the client data and anonymizes it by aggregating it into census tracts (already scripted in Python by Caleb, with improvements/tweaks by Jack)(Andrew developed something this does this on the server side, an admin interface that will allow them to upload data in Django)
Conceptual data refinements:
-Lissie is trying to make the 2-layer map approach work visually, so the idea is: 1 layer in colors by census tract for the aggregated Community Action data, and 1 layer in hatching for the census data that the user compares by seeing them overlaid and uses their mind to 'see'.
- We will focus on representing income by census tract on the map to narrow what we are working on (conceptualized as number below the poverty line per census tract. We could do percent if we wanted to take some extra steps). If we have time to add more things, I think the next data concept we could add would be veteran status.
Challenge
To make the best use of its resources, Community Action uses data to ensure their services are reaching areas that have the most need. For example, do they have Parent-Child Development Centers located in areas with the highest rates of child poverty?
Their data includes keeping internal records of their clients in Franklin, Hampshire, and Hampden counties, including addresses and the programs used. They also use county-level Census data (anything from poverty rates to commuting times) to assess local need.
Community Action would like to improve insight into its reach by doing the following:
Using more granular level Census data to track need (e.g., use Census tracts instead of county level)
Assigning the locations of their clients to a census tract
Visually comparing the two items, most likely on a map
Constraints: Because Community Action’s client data is private, they need an easy way to upload/input the data internally and have it feed into the map. In other words, hackathon participants won’t have access to the actual addresses of people being served. However, their data expert will be on hand to advise on the best format.
Even though the client information is secure, the mechanism for adding it to the map should follow the Hack for Western Mass open source guidelines.
Solution
Community Action is looking for a way to compare need and persons served. They want to answer questions like, “do we have Head Start sites in locations with high rates of child poverty?” and “what percentage of poor kids in the region are our programs reaching.?”
They currently use tables for this kind of analysis but feel that a map would provide more powerful insight and make it easier for them to tell their story. Are there other kinds of easy-to-understand visualizations that would help?
Ideally, the solution would:
cover Community Action’s service area of Franklin County, Hampshire County, and Western Hampden County
show census and client data not only for the current year but for previous years
export data to a spreadsheet
Depending on how it’s built, the solution might also qualify for the Census Bureau’s national American Community challenge (i.e., an opportunity to submit it to theirapplication showcase).
Some steps:
Geocode confidential address data to produce lat/long [AA: check privacy policy of geocoder – if none, then data may no longer be confidential. ] Here's a site offering free geocodes of simple tables.
Identify census tract containing address lat/long;
Aggregate client characteristics to census tract level, supress or aggregate tracts to preserve confidentiality [AA: unlikely once at tract level, Census Bureau gives sensitive information down to block group level — maybe also a possibility here instead of using tracts] {Code that is an example for these first three steps is on github}
Client characteristics can be expressed as an average or total over the census tract, depending on the type of data. Probably we will overlay the aggregated client data in tracts under a census data layer to show both at once. Possibly, to get into one image, it could be expressed as total number of clients or as a percent of population in that census tract- but a separate census data overlay might be a better way to go conceptually. (Update: I think we are going forward with the two-layer method). Fake client data is on github here and Census data on poverty levels by tracts is currently in three pieces for 3 counties: Hampshire numerator, Franklin numerator, and Hampden numerator (which we only want half of the tracts in) divided respectively by Hampshire denominator, Franklin denominator, and Hampden denominator. Table S1701 would have done this percent calculation for us, but we can't get to it.
Bring this data into a geographically-based visualization tool. OpenStreetMap will work for this; example: http://www3.amherst.edu/~aanderson/hack/communityaction.html This map uses Leaflet [ http://leafletjs.com/ ], "a modern open-source JavaScript library for mobile-friendly interactive maps. Weighing just about 28 KB of JS code, it has all the features most developers ever need for online maps." [AA: I'm already starting to doubt this :-]
Website wireframe/design (concurrent)
Website actualization (concurrent)
Useful Leaflet plugins
Leaftlet plugins page - http://leafletjs.com/plugins.html
Heatmaps
HeatCanvas - Simple heatmp api based on HTML5 canvas https://github.com/sunng87/heatcanvas
heatmap.js - JavaScript Library for HTML5 canvas based heatmaps. Its Leaflet layer implementation supports large datasets because it is tile based and uses a quadtree index to store the data http://www.patrick-wied.at/static/heatmapjs/example-heatmap-leaflet.html
Labels
Leaflet.label - Add text labels to map markers and vector layers https://github.com/Leaflet/Leaflet.label
Leaflet Data Visualization Framework - New markers, layers, and utility classes for easy thematic mapping and data visualization http://humangeo.github.com/leaflet-dvf/
These would be useful for tying stories to certain locations
Leaflet tutorials:
Choropleth map: http://leafletjs.com/examples/choropleth.html
Layers: http://leafletjs.com/examples/layers-control.html
API Key
If you need a Census API key, sign up here: http://www.census.gov/developers/tos/key_request.html
Other Maps/Data Displays
Some other maps/data that may be of interest:
Social Explorer: http://www.socialexplorer.com/pub/home/home.aspx (flash-based, shows whole country, choropleth, can use different filters to display)
West Coast Poverty Center: http://depts.washington.edu/wcpc/maps_interactives (collects a variety of similar maps, may be useful for pinpointing details of a type of map or ideas of things to display on the map)
Mapping America: http://projects.nytimes.com/census/2010/explorer (flash-based, a variety of different sets of data, dots and choropleth, views by census tract)
National Center for Children in Poverty: http://www.nccp.org/tools/demographics/ (Not maps, but generates tables based on the data. Would this be an adjunct to consider _way_ down the line?)
County maps of Massachusetts (pictorial):
http://quickfacts.census.gov/qfd/maps/massachusetts_map.html
http://www.doe.mass.edu/resources/countymap.pdf
Census tracts covering Franklin, Hampshire, and Western Hampden County:
25011040100 25011040200 25011040300 25011040400 25011040501 25011040502 25011040600 25011040701 25011040702 25011040800 25011040900 25011041000 25011041100 25011041200 25011041300 25011041400 25011041501 25011041502 25013812201 25013812202 25013812300 25013812401 25013812403 25013812404 25013812500 25013812600 25013812701 25013812702 25013812800 25013812901 25013812902 25013812903 25013813000 25013813101 25013813102 25013813204 25013813205 25013813206 25013813207 25013813208 25013813209 25015820101 25015820102 25015820202 25015820203 25015820204 25015820300 25015820400 25015820500 25015820600 25015820700 25015820801 25015820802 25015820900 25015821000 25015821100 25015821200 25015821300 25015821400 25015821500 25015821601 25015821602 25015821700 25015821901 25015821903 25015821904 25015822000 25015822200 25015822300 25015822401 25015822402 25015822500 25015822601 25015822603 25015822605 25015822606 25015822700
GeoJSON of Community Action Service Area: Centered on 42.369373, -72.639241
Lissie and I stayed late on Saturday of the hackathon talking about design and I realized the data to be presented on the map has not been fully conceptualized, especially when considering design constraints and what is most crucial to present when considering the perspective of what should be clear to the user, and what CA most would like to show. There is a challenge in doing a separate overly of the CA data and then the census data for the same tract transparently over that in is own layer, to compare. Example: By census tract, percent in poverty in the last 12 months, compared to aggregate number of persons below the poverty line served by Community Action in the census tract (data from Community action).
issue: data is not currently unified across programs in the Community Action data. solution: people must be deduplicated or uniquely IDed when data from multiple programs is pooled.
In the short term, it might be possible to create maps by each program, but perhaps we should design should be for long-term use, since the agency is working to unify their data system, and this provides more clear, readable data concepts.
Census API provides
URL
Census API example: Note that this example uses Caleb's key
http://api.census.gov/data/2010/acs5?key=d5d4b40c4adddb44bdc56cdf1d75df32c83eab14&get=B17001_001E&for=tract:*&in=state:02+county:170
The variable list: (XML): http://www.census.gov/developers/data/acs_5yr_2011_var.xml
ACS5/2010
The variable list: (XML): http://www.census.gov/developers/data/sf1.xml
Summary Table/2010
Neither one of these things has variables defined in such a way that we get out what we want....
The following links ARE WORKING! They return levels [AA: do you mean raw numbers? If so, I would not use them without scaling them – by the population for which the poverty status is known (not total population).][CB: ok, we can do that. I'll generate the links for totals that will become the denominator of the fraction that is the percentage, it just doubles the steps and requires a calculation on our end because S1701, which provide poverty %s, is not available through the API], not percentages. See notes below, we could hack something together to get percentages if those are preferable. Census tract population sizes do vary around 4,000, but not a huge amount, so levels might be OK to compare, especially since the comparator layer of Community Action data is also levels and is based on identical geography. [Eh, I checked this claim, and tract pops are +/- 1k so Andy I think you are right]
The following data is from the 2011 ACS 5-year estimates, table B17001, "POVERTY STATUS IN THE PAST 12 MONTHS BY SEX BY AGE". Field 002E is "Estimate; Income in the past 12 months below poverty level", while field 001E is "Estimate; Total". Data links below are in the JSON-compatible format
["B17001_001E","B17001_002E","state","county","tract"]
which can be compiled into a dictionary object. To get percentage for each tract, divide by the total people in the tract (if greater than 0), i.e.:
poverty["state"+"county"+"tract"] =
B17001_001E > 0 ? B17001_002E / B17001_001E : 0
All census tracts in Franklin County:
<api.census.gov/data/2011/acs5?key=45eb040aa8066098e3844a5e2fa9bf100de52774&get=B17001_001E,B17001_002E&for=tract:*&in=state:25+county:011> link (uses Carrie's key)
All census tracts in Hampshire County:
<api.census.gov/data/2011/acs5?key=45eb040aa8066098e3844a5e2fa9bf100de52774&get=B17001_001E,B17001_002E&for=tract:*&in=state:25+county:015> link uses Carrie's key)
All census tracts in Hampden County:
<api.census.gov/data/2011/acs5?key=45eb040aa8066098e3844a5e2fa9bf100de52774&get=B17001_001E,B17001_002E&for=tract:*&in=state:25+county:013> link (uses Carrie's key)
We have a problem still to solve: we need to NOT represent half of the tracts in Hampden county (=013). We may have to write a script at our end that loops over the tract numbers we wants and then only represents THOSE out of the data drawn down via the Census API. Still searching syntax to see if I can get all three counties to return at once. If I can do that I may also be able to write a really long URL that specifies the exact tracts we want from Hampden County only...
Previously, I did not get table S1701 to work properly. I believe it is because S1701 is a special tabulation, so it's not available in the way we want. (I am looking into using table B17001 instead, see above). We will need to use (B17001_002E) for the level of those below the poverty line. If we want a percent we will have to write a calculation that takes, for each tract, the level of the total (B17001_001E) divided by the level of those below poverty (variable first named).
American Community Survey documentation (pretty light): http://www2.census.gov/acs2011_5yr/summaryfile/ACS_2007_2011_SF_Tech_Doc.pdf