-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathwelfarecap.Rmd
501 lines (369 loc) · 21 KB
/
welfarecap.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
---
title: "Welfare cap"
output: html_notebook
---
# Finding out which areas are likely to be most affected by the welfare cap
In January the benefit cap (which is applied to those who have been claiming Universal Credit for 9 months) will hit those who first began claiming Universal Credit (UC) at the start of the first lockdown.
We want to find out:
* Which areas have the largest proportion of people likely to be affected by that cap
* Which areas have 'struggled most to recover', in terms of having the highest proportion of claimants not to have come off UC since April (and conversely those which have recovered best)
* Which areas have been 'hardest hit', in terms of having the highest proportion of their population and/or claimants claiming UC since April
* What is the average amount of benefit cap in each area (and therefore what might be expected)
Additionally, we might also look at the local housing allowance (LHA) rates and:
* How changes in housing prices have already affected claimants whose benefits were [based on "the 30th percentile of local rents from April 2020."](https://www.gov.uk/government/publications/local-housing-allowance-lha-rates-applicable-from-april-2020-to-march-2021)
A methodological note on comparing UC claimants before and after April:
> "A temporary £1,040 a year increase to the standard allowance for 1 year from April 2020 was introduced as part of a support package to support people through the coronavirus pandemic.
> "Additionally, local housing allowance rates were increased to the 30th percentile of local rents from April 2020.
> "These changes had the effect of bringing more people with higher earnings on to Universal Credit. This is because the more Universal Credit entitlement a household has, the more earnings the household can have before their award reduces to zero by the taper rate. Consequently, there are more people in employment who are eligible for Universal Credit than without the temporary increase. The background information and methodology document has more information on the calculation of a Universal Credit award.
## Which areas have the largest proportion of people likely to be affected by that cap
How many people began claiming UC in April 2020 and are now hitting six months? (Which would mean hitting 9 months in Jan)
This data is in StatXplore in the Universal Credit folder, the People on Universal Credit database, Table 4: Local Authority by Employment Status.
Exploring this data in Excel you can see the 'hump' of new claimants moving from the 0-3 months count to 3-6 months and then, in October, 6-12 months.
We've downloaded data for October 2020 by local authority (rows) and duration (columns). This can be exported as CSV or XLSX - the CSV has annotation columns but these are all empty so we import the XLSX version as it needs less cleaning:
```{r import data on duration Oct 20}
ucbydurationLAoct20 <- read.csv("ucbydurationLAoct20.csv", stringsAsFactors = F, skip = 10)
#import the XLSX instead
ucbydurationLAoct20 <- rio::import("ucbydurationLAoct20.xlsx", skip = 10)
```
Note that the data isn't all imported as numbers:
```{r summary}
summary(ucbydurationLAoct20)
```
So we need to specify data type:
```{r import oct data specifying data types}
ucbydurationLAoct20 <- rio::import("ucbydurationLAoct20.xlsx", skip = 10, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
summary(ucbydurationLAoct20)
```
Let's clean the colnames
```{r clean colnames}
colnames(ucbydurationLAoct20)
colnames(ucbydurationLAoct20)[2] <- "LA"
colnames(ucbydurationLAoct20)
```
Some annotations to note and store:
```{r store annotations}
ucbydurationLAoct20$LA[379:382]
annotations.ucbydur <- paste(ucbydurationLAoct20$LA[379:382])
```
Now we fetch the September data:
```{r import sep data and clean}
ucbydurationLAsep20 <- rio::import("ucbydurationLAsep20.xlsx", skip = 10, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
colnames(ucbydurationLAsep20)[2] <- "LA"
```
And check the annotations:
```{r store sep annotations}
ucbydurationLAsep20$LA[380:383]
annotations.ucbydurSEP <- paste(ucbydurationLAoct20$LA[380:3823])
```
Comparing the annotations and the data, the 'p' for provisional appears in the date heading 'October' so those figures are provisional. The 'r' for revised appears in the date heading 'September' so those are revised.
Let's clean these up a bit:
```{r remove empty rows/cols}
#remove first row
ucbydurationLAsep20 <- ucbydurationLAsep20[-1,]
ucbydurationLAoct20 <- ucbydurationLAoct20[-1,]
#remove first col
ucbydurationLAsep20 <- ucbydurationLAsep20[,-1]
ucbydurationLAoct20 <- ucbydurationLAoct20[,-1]
#remove last rows
ucbydurationLAsep20 <- ucbydurationLAsep20[-c(374:383),]
ucbydurationLAoct20 <- ucbydurationLAoct20[-c(373:383),]
```
And reshape:
```{r reshape wide to long}
#We need the tidyr library to use gather
library(tidyr)
#gather all the category columns into a single one
UCbyLAsep20 <- tidyr::gather(ucbydurationLAsep20, category, claimants, colnames(ucbydurationLAsep20)[2:10])
#Check it's worked
table(UCbyLAsep20$category)
```
Now repeat for Oct:
```{r reshape oct}
#gather all the category columns into a single one
UCbyLAoct20 <- tidyr::gather(ucbydurationLAoct20, category, claimants, colnames(ucbydurationLAoct20)[2:10])
#Check it's worked
table(UCbyLAoct20$category)
```
Create a data frame in case we need it.
```{r categories df}
categories <- as.data.frame(table(UCbyLAoct20$category))
```
We want to merge them to calculate change. To do that we need to combine the two fields that so they are unique:
```{r merge datasets}
#Create a new column in each data frame that combines two others
UCbyLAsep20$lacat <- paste(UCbyLAsep20$LA, UCbyLAsep20$category, sep=":")
UCbyLAoct20$lacat <- paste(UCbyLAoct20$LA, UCbyLAoct20$category, sep=":")
#We need distinctive column names for the counts
colnames(UCbyLAsep20)[3] <- "claimantsSep20"
colnames(UCbyLAoct20)[3] <- "claimantsOct20"
#Now merge
UCbyLAsepoct20 <- merge(UCbyLAsep20, UCbyLAoct20, by = "lacat")
```
And calculate change
```{r calc change}
#October figure minus September figure = change
UCbyLAsepoct20$changeSepOct <- UCbyLAsepoct20$claimantsOct20 - UCbyLAsepoct20$claimantsSep20
#change divided by Sep figure = % change
UCbyLAsepoct20$changeSepOctPerc <- UCbyLAsepoct20$changeSepOct / UCbyLAsepoct20$claimantsSep20
```
Then export:
```{r export}
write.csv(UCbyLAsepoct20, "UCbyLAsepoct20.csv")
```
Some further analysis can now take place in Excel. There, we do the following:
1. Create a lookup table showing the 'next level' for each level (e.g. for 3-6 months the next level is 6-12 months)
2. Create a new column which uses VLOOKUP to fetch the next level for each row of data
3. That column is used to create a VLOOKUP to fetch the change for the next level (e.g. for a row showing the numbers of 3-6 month claimants in one area, we now also know the change in the 6-12 month category above)
3. An `IF` function is used to add that change in the next level to the change in the current level IF the change in the current level AND the change in the level above are both above 0 (IF not, then the original figure is left unchanged): `=IF(AND(I2>0,L2>0),I2+L2,I2)`
The logic for this runs as follows:
If an area has 1000 claimants in the 6-12 month category in September, and then 1200 in October, it has gained *at least* 200 people who have moved up from the 3-6 month category (there is no other way to get into the 6-12 month category).
However, the real number is likely to be higher. For example, if 100 people left the 6-12 month category in October (because they found work, or because they entered the 13th month of claiming UC) then it would take 300 people moving up from the 3-6 month category to make the overall number rise by 200 (300 people entering, minus 100 people leaving).
So IF the number of people in the 1-2 year category also rose in October, this tells us the lowest possible number of people who left the 6-12 month category, and therefore the starting point for any rise. E.g. if the 6-12 month category lost 100 people before rising by 200 overall we can add the 100 to the over 200 increase to get a better minimum figure of people entering that category from the 3-6 month category.
This is still an underestimate, as we don't know how many people left that category because they found work/higher earnings/stoped claiming.
Let's bring that analysis back into R:
```{r re-import}
UCbyLAsepoct20analysed <- rio::import("UCbyLAsepoct20ANALYSIS.xlsx", sheet = 1)
```
## How many people entered the system in April and May?
Now we know how many (at least) passed the 6 month mark in October, let's track back to April and May to see how many entered the system then. Although the lockdown began in the third week of March the March figures [would not capture them](https://stat-xplore.dwp.gov.uk/webapi/metadata/UC_Starts/Starts%20on%20Universal%20Credit.html):
> "The reporting month in relation to starts to Universal Credit relates to a period from the Friday following the second Thursday in the previous month to the second Thursday in the reporting month."
We are using the 'Starts to UC' dataset inside the UC folder on StatXplore
This data is only provided by postcode district or JobCentre Plus.
```{r import uc starts}
startsonUCbyPOaprmay1920 <- rio::import("startsonUCbyPOaprmay1920.xlsx", skip = 9, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
head(startsonUCbyPOaprmay1920[-1,])
#Remove first row
startsonUCbyPOaprmay1920 <- startsonUCbyPOaprmay1920[-1,]
#Remove first col
startsonUCbyPOaprmay1920 <- startsonUCbyPOaprmay1920[,-1]
#rename first col
colnames(startsonUCbyPOaprmay1920)[1] <- "postcodedistrict"
#remove total column
startsonUCbyPOaprmay1920 <- startsonUCbyPOaprmay1920[,-7]
```
Let's add some calculations:
```{r uc start calcs}
#calculate year on year change
startsonUCbyPOaprmay1920$yoychangeapr <- startsonUCbyPOaprmay1920$`April 2020` - startsonUCbyPOaprmay1920$`April 2019`
#And as a %
startsonUCbyPOaprmay1920$yoychangePercApr <- startsonUCbyPOaprmay1920$yoychangeapr / startsonUCbyPOaprmay1920$`April 2019`
#calculate month on month change
startsonUCbyPOaprmay1920$momchangeapr <- startsonUCbyPOaprmay1920$`April 2020` - startsonUCbyPOaprmay1920$`March 2020`
#And as a %
startsonUCbyPOaprmay1920$momchangePercApr <- startsonUCbyPOaprmay1920$momchangeapr / startsonUCbyPOaprmay1920$`March 2020`
#calculate month on month change for May
startsonUCbyPOaprmay1920$momchangemay <- startsonUCbyPOaprmay1920$`May 2020` - startsonUCbyPOaprmay1920$`April 2020`
#And as a %
startsonUCbyPOaprmay1920$momchangePercMay <- startsonUCbyPOaprmay1920$momchangemay / startsonUCbyPOaprmay1920$`April 2020`
```
And export
```{r export uc starts}
write.csv(startsonUCbyPOaprmay1920, "startsonUCbyPOaprmay1920calcs.csv")
```
## Matching up to the April 6 month point in October
In order to match up we need to use the same geo unit: postcode districts.
This data is in StatXplore in the Universal Credit folder, the People on Universal Credit database, Table 6: Postcode Area
This time we've exported from September to November in one file:
```{r import oct data for postcode districts}
ucbydurationPOsepNov20 <- rio::import("ucbydurationPOsepNov20.xlsx", skip = 6)
head(ucbydurationPOsepNov20, 20)
```
Columns 3-11 cover September, 12 to 20 for October.
Let's import the data properly, specifying data type and skipping the first 10 rows:
```{r import PO with data types}
ucbydurationPOsepNov20 <- rio::import("ucbydurationPOsepNov20.xlsx", skip = 10, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
summary(ucbydurationPOsepNov20)
```
Then clean the rows and cols:
```{r clean PO colnames}
colnames(ucbydurationPOsepNov20)
colnames(ucbydurationPOsepNov20)[2] <- "postcodedistrict"
colnames(ucbydurationPOsepNov20)
```
```{r remove empty rows/cols of PO}
#remove first row
ucbydurationPOsepNov20 <- ucbydurationPOsepNov20[-1,]
#remove first col
ucbydurationPOsepNov20 <- ucbydurationPOsepNov20[,-1]
#remove last rows
ucbydurationPOsepNov20 <- ucbydurationPOsepNov20[-c(123:135),]
```
Now to reshape - first for September:
```{r reshape wide to long PO}
#gather all the category columns into a single one
UCbyPOsep20 <- tidyr::gather(ucbydurationPOsepNov20[,c(1:10)], category, claimants, colnames(ucbydurationPOsepNov20)[2:10])
#Check it's worked
table(UCbyPOsep20$category)
```
These headings need cleaning to get rid of the digits:
```{r rename cols with gsub}
colnames(ucbydurationPOsepNov20) <- gsub("...[0-9]+$","",colnames(ucbydurationPOsepNov20))
```
Then repeat the reshape:
```{r reshape wide to long PO with clean colnames}
#gather all the category columns into a single one
UCbyPOsep20 <- tidyr::gather(ucbydurationPOsepNov20[,c(1:10)], category, claimants, colnames(ucbydurationPOsepNov20)[2:10])
#Check it's worked
table(UCbyPOsep20$category)
```
Then for Oct:
```{r reshape wide to long PO oct}
#this time we select the first col and the 12th-19th
UCbyPOoct20 <- tidyr::gather(ucbydurationPOsepNov20[,c(1,12:19)], category, claimants, colnames(ucbydurationPOsepNov20)[12:19])
#Check it's worked
table(UCbyPOoct20$category)
```
And for Nov:
```{r reshape wide to long PO nov}
#this time we select the first col and the 12th-19th
UCbyPOnov20 <- tidyr::gather(ucbydurationPOsepNov20[,c(1,20:28)], category, claimants, colnames(ucbydurationPOsepNov20)[20:28])
#Check it's worked
table(UCbyPOnov20$category)
```
We again want to merge them to calculate change. To do that we need to combine the two fields that so they are unique:
```{r merge datasets PO}
#Create a new column in each data frame that combines two others
UCbyPOsep20$lacat <- paste(UCbyPOsep20$postcodedistrict, UCbyPOsep20$category, sep=":")
UCbyPOoct20$lacat <- paste(UCbyPOoct20$postcodedistrict, UCbyPOoct20$category, sep=":")
UCbyPOnov20$lacat <- paste(UCbyPOnov20$postcodedistrict, UCbyPOnov20$category, sep=":")
#We need distinctive column names for the counts
colnames(UCbyPOsep20)[3] <- "claimantsSep20"
colnames(UCbyPOoct20)[3] <- "claimantsOct20"
colnames(UCbyPOnov20)[3] <- "claimantsNov20"
#Now merge sep and oct
UCbyPOsepoct20 <- merge(UCbyPOsep20, UCbyPOoct20, by = "lacat")
#Now merge oct and nov
UCbyPOoctnov20 <- merge(UCbyPOoct20, UCbyPOnov20, by = "lacat")
```
Now add the calculations:
```{r calc change PO}
#October figure minus September figure = change
UCbyPOsepoct20$changeSepOct <- UCbyPOsepoct20$claimantsOct20 - UCbyPOsepoct20$claimantsSep20
#change divided by Sep figure = % change
UCbyPOsepoct20$changeSepOctPerc <- UCbyPOsepoct20$changeSepOct / UCbyPOsepoct20$claimantsSep20
#Nov figure minus Oct figure = change
UCbyPOoctnov20$changeOctNov <- UCbyPOoctnov20$claimantsNov20 - UCbyPOoctnov20$claimantsOct20
#change divided by Sep figure = % change
UCbyPOoctnov20$changeOctNovPerc <- UCbyPOoctnov20$changeOctNov / UCbyPOoctnov20$claimantsOct20
```
```{r colnames}
colnames(UCbyPOsepoct20)
colnames(UCbyPOoctnov20)
```
Let's try to remove those duplicate columns:
```{r remove duplicate cols}
UCbyPOsepoct20 <- UCbyPOsepoct20[,c(-5,-6)]
head(UCbyPOsepoct20)
UCbyPOoctnov20 <- UCbyPOoctnov20[,c(-5,-6)]
head(UCbyPOoctnov20)
```
And rename the columns
```{r rename cols}
colnames(UCbyPOsepoct20)[2] = "postcodedistrict"
colnames(UCbyPOsepoct20)[3] = "category"
colnames(UCbyPOsepoct20)
colnames(UCbyPOoctnov20)[2] = "postcodedistrict"
colnames(UCbyPOoctnov20)[3] = "category"
colnames(UCbyPOoctnov20)
```
And export
```{r export PO calcs}
write.csv(UCbyPOsepoct20, "UCbyPOsepoct20.csv")
write.csv(UCbyPOoctnov20, "UCbyPOoctnov20.csv")
```
## Repeat for 2019
For some historical context, let's repeat the above process for the 2019 data:
```{r import 2019 data for postcode districts}
ucbydurationPOsepNov19 <- rio::import("ucbydurationPOsepOctNov19.xlsx", skip = 10, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
summary(ucbydurationPOsepNov19)
colnames(ucbydurationPOsepNov19)
colnames(ucbydurationPOsepNov19)[2] <- "postcodedistrict"
colnames(ucbydurationPOsepNov19)
#remove first row
ucbydurationPOsepNov19 <- ucbydurationPOsepNov19[-1,]
#remove first col
ucbydurationPOsepNov19 <- ucbydurationPOsepNov19[,-1]
#remove last rows
ucbydurationPOsepNov19 <- ucbydurationPOsepNov19[-c(123:135),]
#gather all the category columns into a single one
UCbyPOsep19 <- tidyr::gather(ucbydurationPOsepNov19[,c(1:10)], category, claimants, colnames(ucbydurationPOsepNov19)[2:10])
#Check it's worked
table(UCbyPOsep19$category)
#clean up colnames
colnames(ucbydurationPOsepNov19) <- gsub("...[0-9]+$","",colnames(ucbydurationPOsepNov19))
#Then repeat the reshape:
#gather all the category columns into a single one
UCbyPOsep19 <- tidyr::gather(ucbydurationPOsepNov19[,c(1:10)], category, claimants, colnames(ucbydurationPOsepNov19)[2:10])
#Check it's worked
table(UCbyPOsep19$category)
##Then for Oct:
#this time we select the first col and the 12th-19th
UCbyPOoct19 <- tidyr::gather(ucbydurationPOsepNov19[,c(1,12:19)], category, claimants, colnames(ucbydurationPOsepNov19)[12:19])
#Check it's worked
table(UCbyPOoct19$category)
#And for Nov:
#this time we select the first col and the 12th-19th
UCbyPOnov19 <- tidyr::gather(ucbydurationPOsepNov19[,c(1,20:28)], category, claimants, colnames(ucbydurationPOsepNov19)[20:28])
#Check it's worked
table(UCbyPOnov19$category)
```
We again want to merge them to calculate change. To do that we need to combine the fields that so they are unique:
```{r merge datasets 2019 PO}
#Create a new column in each data frame that combines two others
UCbyPOsep19$lacat <- paste(UCbyPOsep19$postcodedistrict, UCbyPOsep19$category, sep=":")
UCbyPOoct19$lacat <- paste(UCbyPOoct19$postcodedistrict, UCbyPOoct19$category, sep=":")
UCbyPOnov19$lacat <- paste(UCbyPOnov19$postcodedistrict, UCbyPOnov19$category, sep=":")
#We need distinctive column names for the counts
colnames(UCbyPOsep19)[3] <- "claimantsSep19"
colnames(UCbyPOoct19)[3] <- "claimantsOct19"
colnames(UCbyPOnov19)[3] <- "claimantsNov19"
#Now merge sep and oct
UCbyPOsepoct19 <- merge(UCbyPOsep19, UCbyPOoct19, by = "lacat")
#Now merge oct and nov
UCbyPOoctnov19 <- merge(UCbyPOoct19, UCbyPOnov19, by = "lacat")
```
Now add the calculations:
```{r calc change PO 2019}
#October figure minus September figure = change
UCbyPOsepoct19$changeSepOct <- UCbyPOsepoct19$claimantsOct19 - UCbyPOsepoct19$claimantsSep19
#change divided by Sep figure = % change
UCbyPOsepoct19$changeSepOctPerc <- UCbyPOsepoct19$changeSepOct / UCbyPOsepoct19$claimantsSep19
#Nov figure minus Oct figure = change
UCbyPOoctnov19$changeOctNov <- UCbyPOoctnov19$claimantsNov19 - UCbyPOoctnov19$claimantsOct19
#change divided by Sep figure = % change
UCbyPOoctnov19$changeOctNovPerc <- UCbyPOoctnov19$changeOctNov / UCbyPOoctnov19$claimantsOct19
```
Let's try to remove those duplicate columns:
```{r colnames clean 2019}
colnames(UCbyPOsepoct19)
colnames(UCbyPOoctnov19)
UCbyPOsepoct19 <- UCbyPOsepoct19[,c(-5,-6)]
head(UCbyPOsepoct19)
UCbyPOoctnov19 <- UCbyPOoctnov19[,c(-5,-6)]
head(UCbyPOoctnov19)
colnames(UCbyPOsepoct19)[2] = "postcodedistrict"
colnames(UCbyPOsepoct19)[3] = "category"
colnames(UCbyPOsepoct19)
colnames(UCbyPOoctnov19)[2] = "postcodedistrict"
colnames(UCbyPOoctnov19)[3] = "category"
colnames(UCbyPOoctnov19)
```
And export
```{r export PO calcs 2019}
write.csv(UCbyPOsepoct19, "UCbyPOsepoct19.csv")
write.csv(UCbyPOoctnov19, "UCbyPOoctnov19.csv")
```
## In-work poverty analysis
We want to see how much in-work poverty has changed in different areas. At a national level for the last year around 35% of those claiming UC have been in employment - but in the last few months this has steadily increased to around 39% in October.
We've exported the data from StatXplore (UC > People on UC > Postcode area then used Employment indicator as another row evel and expanded the months to cover Jan 19 onwards), and cleaned it a little so that the categories are repeated in the second column:
```{r import iwp data}
inworkPovertybyLA_jan19dec20 <- rio::import("inworkPovertybyLA_jan19dec20.xlsx", sheet = 2, skip = 9, col_types = c("guess", "guess", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
```
We want to calculate the percentages of the totals for each LA.
This is completed in Excel.
## Use the API - TBC
You can download the data as an API query - details below and at https://stat-xplore.dwp.gov.uk/webapi/online-help/Open-Data-API.html
```{r}
empdata <- jsonlite::fromJSON("employmentbyLAjan19_oct20.json")
empdata$database
```