Charter 29: Explore QA local database with deleted items to discover if fill in gaps method is working as anticipated
mwalker-scottlogic edited this page Aug 30, 2024 · 3 revisions
- Explore QA local database with deleted items to discover if fill in gaps method is working as anticipated
- Mike Walker-Rose
- 30/08/24
- 45 - 90 mins
- Local QA database
- MongoDB Compass
- run main and establish how many documents are in the last 7 days
- delete specific days and repeat step 1
- explore different days and establish if the data is being filled correctly
- look for re-uploads
- look for things not being overwritten that shouldn't
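The first two tasks can be sketched as MongoDB query documents built in Python. This is a minimal sketch, not the project's code: the field name measurement_date is taken from the filter used later in these notes, while the helper names day_filter and daily_count_pipeline are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def day_filter(day: date) -> dict:
    """Filter matching every document measured on one UTC calendar day."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return {"measurement_date": {"$gte": start, "$lt": start + timedelta(days=1)}}


def daily_count_pipeline(now: datetime, days: int = 7) -> list:
    """Aggregation stages counting documents per day over the last `days` days."""
    start = now - timedelta(days=days)
    return [
        # Keep only measurements inside the window [now - days, now)
        {"$match": {"measurement_date": {"$gte": start, "$lt": now}}},
        # Bucket by calendar day and count the documents in each bucket
        {"$group": {
            "_id": {"$dateToString": {"format": "%Y-%m-%d", "date": "$measurement_date"}},
            "count": {"$sum": 1},
        }},
        # Oldest day first, so a missing day is easy to spot
        {"$sort": {"_id": 1}},
    ]
```

With pymongo, the stages could be passed to Collection.aggregate and the filter to Collection.count_documents or Collection.delete_many. The collection to run them against depends on the local setup.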
- deleted everything from in situ qa db
- at 07:25 UTC ran the test test__in_situ_etl__calling_actual_api_returns_values_and_stores from open_aq_etl_test.py, which calls main without mocking the OpenAQ response; removed the London-only limit to get all results: 27453 documents resulted
- exported to excel, range of documents had measurement dates 2024-08-29T07:26:00.000Z to 2024-08-30T07:14:00.000Z
- expected this to populate 7 days of data
- re-run 08:49
- 27810 documents resulted
- all within range 29/08 - 30/08
- changed approach, running in situ etl script in case test setup is limiting results - 08:57
- 216480 documents added
2024-08-30 09:26:20,439 - INFO - 27866 documents upserted, 0 modified
- date range: 2024-08-23T07:57:09.000Z to 2024-08-30T07:57:00.000Z
- re-run at 10:33
2024-08-30 10:37:30,178 - INFO - 1804 documents upserted, 25368 modified
- Able to find first and last item from the first dataset and confirmed they still existed and had not been updated
- later measurements were recorded with a different created_time to the first batch
- delete day from the middle, should be upserted
- { measurement_date: { $gte: ISODate("2024-08-27T00:00:00.000Z"), $lt: ISODate("2024-08-28T00:00:00.000Z") } }
- 31300 documents
- re-run etl, now 220222 documents
- confirmed first and last documents are present
- found evidence of data being re-uploaded but with a different created_time
- from the data observed this appears to be working as expected
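The manual checks above (first and last documents still present, re-uploaded documents carrying a new created_time) could be automated by comparing two exports of the collection, one taken before the deletion and one after the re-run. This is a minimal sketch, assuming each document is identified by a location plus its measurement_date. The field name location_name is hypothetical, and compare_snapshots is illustrative rather than part of the ETL.

```python
def doc_key(doc: dict) -> tuple:
    """Identity of a measurement (assumed: location plus measurement time)."""
    return (doc["location_name"], doc["measurement_date"])


def compare_snapshots(before: list, after: list) -> dict:
    """Report documents that were lost or re-created between two exports."""
    before_by_key = {doc_key(d): d for d in before}
    after_by_key = {doc_key(d): d for d in after}

    # Present before the deletion but absent after the re-run: a gap not filled
    missing = sorted(k for k in before_by_key if k not in after_by_key)

    # Present in both but with a new created_time: deleted and uploaded again
    recreated = sorted(
        k for k, d in before_by_key.items()
        if k in after_by_key and after_by_key[k]["created_time"] != d["created_time"]
    )
    return {"missing": missing, "recreated": recreated}
```

If the gaps are filled correctly, missing should be empty and recreated should contain only documents from the deleted day.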