We have a single train station that can have an arbitrary number of different train lines running through it (e.g. the Fulton Street stop of the NYC MTA, which hosts the 2, 3, 4, 5, A, C, J, and Z lines). We would like a service to manage the schedules of the different trains that run through this specific station.
Write a small web service (with API) that provides endpoints that track and manage the train schedules for this specific station.
● A means for clients to post a schedule for a new train line that runs through this station. This post should accept the following information:
○ The name of the train line (a string that contains up to four alphanumeric characters, e.g. ‘EWR0’, ‘ALP5’, ‘TOMO’, etc)
○ The list of times when this particular train arrives at this station. These are specific to the minute the train arrives (e.g. ‘9:53 PM’)
● A means for clients to get the next time multiple trains are going to be arriving at this station in the same minute. This request should accept a time value as an argument, and return a timestamp that reflects the next time two or more trains will arrive at this station simultaneously after the submitted time value.
Capability Restrictions
- Train line name: 4-character limit (e.g. ‘EWR0’, ‘ALP5’, ‘TOMO’)
- Arrival times are precise to the minute (e.g. 9:53 PM)
- Request for multiple trains arriving in the same minute
- Input: a time value
- Return: the next time two or more trains arrive at the same time after the submitted time value
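The input restrictions above could be checked along these lines. This is a minimal sketch, not the project's actual validation code: the function names `is_valid_train_name` and `parse_arrival_time` are hypothetical, and it assumes times arrive as 12-hour strings with a space before AM/PM.

```python
import re
from datetime import datetime, time

# Up to four alphanumeric characters, as stated in the requirements.
TRAIN_NAME_PATTERN = re.compile(r"[A-Za-z0-9]{1,4}")


def is_valid_train_name(name: str) -> bool:
    """Return True if the name is 1 to 4 ASCII letters or digits."""
    return TRAIN_NAME_PATTERN.fullmatch(name) is not None


def parse_arrival_time(text: str) -> time:
    """Parse a 12-hour string such as '9:53 PM' into a minute-precision time.

    Raises ValueError if the text does not match the assumed format.
    """
    return datetime.strptime(text.strip().upper(), "%I:%M %p").time()
```

A name such as ‘EWR0’ passes the check, while an empty string or a five-character name is rejected before anything is written to the database.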
For all web server requests, I use the built-in JavaScript Fetch API.
React manages state (useState) and offers efficient state-based rendering and a component-based architecture.
Used as a component library of UI elements to speed up development.
Used to host web applications in their own contained window, separate from a web browser environment. Useful for creating web apps. Popular web applications that use this specific framework include Discord, Slack, and Visual Studio Code.
Used for fast development of backend web application services. It usually doesn't offer type safety like C++, but this can be mitigated with type hints from the typing package (used here specifically for the List type, as in List[str] for a list of strings).
Used for async communication between the client and the server, allowing multiple long-lived connections at once, and for hot reloading.
Used for scripting and automation: scripts start the frontend and backend on command.
Key-value pairs (KVP) are required, and I use them specifically for caching, since I wanted to dive deeper into how databases work, especially relational databases. KVP.keys() is implemented but not used by the current implementation. It would be useful for operations that run over the whole data set.
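A key-value store of this kind can be sketched as a thin wrapper around a dictionary. The class below is an illustration under assumptions: only `keys()` is named in the text above, while the class name `KeyValueStore` and the `set` and `get` methods are hypothetical.

```python
from typing import Dict, Generic, List, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class KeyValueStore(Generic[K, V]):
    """In-memory key-value store backed by a plain dict (hypothetical API)."""

    def __init__(self) -> None:
        self._data: Dict[K, V] = {}

    def set(self, key: K, value: V) -> None:
        """Insert or overwrite the value stored under key."""
        self._data[key] = value

    def get(self, key: K) -> Optional[V]:
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)

    def keys(self) -> List[K]:
        """Return all keys, e.g. for an operation over the whole data set."""
        return list(self._data)
```

For example, `store.set("TRN1", ["06:30 AM"])` followed by `store.get("TRN1")` returns the list, and `store.keys()` lists every train name seen so far.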
Docker container with TimescaleDB
- For my relational database, I chose TimescaleDB. It extends PostgreSQL, is designed for large datasets of timestamped data, and includes optimizations for this kind of data through hypertables. It can be used like plain SQL for basic queries, and its additional features can be learned through further project research. It runs in a Docker container for isolation.
Why Relational?
- For handling structured data such as train name and time, and relating one set or column of data to another.
- SQL for complex queries
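# The alias deliberately differs from the column name: in PostgreSQL, GROUP BY
# resolves an ambiguous name to the input column, which would group by the
# untruncated timestamp if the column were also called arrival_time.
sql_query = f"""
SELECT DATE_TRUNC('minute', {ARRIVAL_TIME_COLUMN}) AS arrival_minute,
       ARRAY_AGG({TRAIN_NAME_COLUMN})
FROM {DB_NAME}
WHERE {ARRIVAL_TIME_COLUMN} > %s
GROUP BY arrival_minute
HAVING COUNT({TRAIN_NAME_COLUMN}) > 1
ORDER BY arrival_minute
LIMIT 1
"""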
- Round the arrival time down to the minute
- Aggregate the train names into a single array
- Keep only arrivals after the submitted timestamp %s
- Keep only the minutes in which more than one train arrives
- Order the groups by arrival time
- Return only the first group
- Indexing, or composite indexes, to speed up queries (mentioned in the TimescaleDB documentation)
- Ex: Tuple of train name and arrival time
cursor.execute(f"CREATE INDEX {NAME_IDX} ON {DB_NAME} ({TRAIN_NAME_COLUMN}, {ARRIVAL_TIME_COLUMN} DESC);")
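The steps of the query described above can be mirrored in plain Python, which makes the logic easy to check without a database. This is a sketch under assumptions: the function name `next_simultaneous_arrival` and the `(train_name, arrival)` row shape are hypothetical, not the project's actual code.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Tuple


def next_simultaneous_arrival(
    rows: Iterable[Tuple[str, datetime]], after: datetime
) -> Optional[Tuple[datetime, List[str]]]:
    """Return the first minute after `after` with two or more arrivals."""
    by_minute: Dict[datetime, List[str]] = defaultdict(list)
    for train_name, arrival in rows:
        if arrival > after:  # WHERE arrival > %s
            # DATE_TRUNC('minute', ...): drop seconds and microseconds.
            minute = arrival.replace(second=0, microsecond=0)
            by_minute[minute].append(train_name)  # GROUP BY + ARRAY_AGG
    # HAVING COUNT(...) > 1: keep only minutes shared by several arrivals.
    shared = [minute for minute, names in by_minute.items() if len(names) > 1]
    if not shared:
        return None
    first = min(shared)  # ORDER BY ... LIMIT 1
    return first, by_minute[first]
```

Like the SQL version, the filter is applied to the raw timestamp before truncation, so an arrival earlier in the same minute than the submitted time is not counted.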
Indexing
FastAPI | 🍓 Strawberry GraphQL
GraphQL - FastAPI (tiangolo.com)
For designing the [[GraphQL]] (with Strawberry to define type schema) and [[REST]] [[API]]
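Note: over HTTP, a GraphQL server typically returns 200 for any request it was able to process, even when the operation fails. Errors are reported in the body of the response rather than through the status code.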
Deciding between GraphQL and REST.
- GraphQL is good for many complex queries. It is more complex than REST, handles under-fetching and over-fetching better, and puts everything behind a single endpoint that parses each query.
- REST is simple and straightforward, and a good, scalable API choice
- I compare the server response timings between both in this project
- On GET requests, from either REST or GraphQL, a train name or time set that is not yet in a key-value pair dictionary is added to one for caching purposes
- 3 caches are used to minimize expensive database calls
- Instance 1 - Key - Arrival Time, Value - List of Train Names
- Instance 2 - Key - Train Name, Value - List of Arrival Times
- Instance 3 - Key - Arrival Time, Value - Closest Next Multiple Trains
- The APIs can be more generalized
- Flask can help with smaller applications and minimizing component usage
- Django can help with load balancing if it becomes big and complex enough
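The cache-on-read behaviour described in the list above (check the key-value dictionary first, fall back to the database, then store the result) can be sketched as follows. The class name `CachedLookup`, the `db_calls` counter, and the `invalidate` method are assumptions for illustration. In particular, the original notes do not say how caches are refreshed after a new schedule is posted.

```python
from typing import Callable, Dict, List


class CachedLookup:
    """Read-through cache in front of an expensive lookup (hypothetical)."""

    def __init__(self, load_from_db: Callable[[str], List[str]]) -> None:
        self._load = load_from_db
        self._cache: Dict[str, List[str]] = {}
        self.db_calls = 0  # counts how often the database was actually hit

    def get(self, key: str) -> List[str]:
        """Return the cached value, loading it from the database on a miss."""
        if key not in self._cache:
            self.db_calls += 1
            self._cache[key] = self._load(key)
        return self._cache[key]

    def invalidate(self, key: str) -> None:
        """Drop a cached entry (assumed to be needed after a write)."""
        self._cache.pop(key, None)
```

One instance per cache would cover the three cases listed: arrival time to train names, train name to arrival times, and arrival time to the next shared arrival.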
Q: GraphQL and security?
REST performs better on push.
On pull, the first request hits the database and the second is served from the cache. REST performs better on GET.
- The second request is always served from the cache
- [1-4] GraphQL outputs: 2 for querying trains at a time and 2 for querying multiple trains after a time
- [5-8] REST outputs: 2 for querying trains at a time and 2 for querying multiple trains after a time
REST is better for smaller, less complex queries and is preferred over GraphQL here because REST showed a 50% reduction in server response time in this project. When more database queries and entries are needed, this reduction is crucial.
The gap could be due to GraphQL needing to parse the query statement. Once the data requirements become more complex and over-fetching or under-fetching becomes a risk, GraphQL would be preferred over REST.
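A comparison like this can be reproduced with a small timing helper. The sketch below is not the measurement code used in the project: `time_calls` and `percent_reduction` are hypothetical names, and the callable passed in stands for whatever issues the REST or GraphQL request.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], runs: int = 5) -> List[float]:
    """Call fn `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def percent_reduction(baseline: float, candidate: float) -> float:
    """Return how much faster candidate is than baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0
```

Since the first request hits the database and later ones hit the cache, it is worth comparing the first duration and the remaining durations separately rather than averaging them together.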
{
trainsAtTime(arrivalTimestamp: "2024-06-09 20:00:00")
}

{
timesForTrain(trainName: "FEWA")
}
mutation {
addTrain1: addTrain(trainName: "TRN1", arrivalTime: ["06:30 AM", "12:45 PM", "05:15 PM", "10:00 PM"]) {
trainName
arrivalTime
}
addTrain2: addTrain(trainName: "TRN2", arrivalTime: ["07:00 AM", "12:45 PM", "04:00 PM", "09:00 PM"]) {
trainName
arrivalTime
}
addTrain3: addTrain(trainName: "TRN3", arrivalTime: ["06:30 AM", "01:00 PM", "05:15 PM", "11:00 PM"]) {
trainName
arrivalTime
}
addTrain4: addTrain(trainName: "TRN4", arrivalTime: ["07:00 AM", "02:00 PM", "04:00 PM", "10:00 PM"]) {
trainName
arrivalTime
}
addTrain5: addTrain(trainName: "TRN5", arrivalTime: ["08:00 AM", "12:45 PM", "06:00 PM", "09:00 PM"]) {
trainName
arrivalTime
}
addTrain6: addTrain(trainName: "TRN6", arrivalTime: ["06:30 AM", "02:00 PM", "05:15 PM", "11:00 PM"]) {
trainName
arrivalTime
}
addTrain7: addTrain(trainName: "TRN7", arrivalTime: ["07:00 AM", "01:00 PM", "04:00 PM", "10:00 PM"]) {
trainName
arrivalTime
}
}
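A batch mutation like the one above can be generated from a dictionary of schedules instead of being written by hand. This is a sketch under assumptions: the helper names `build_add_train_mutation` and `build_request_body` are hypothetical, and the body shape follows the common GraphQL-over-HTTP convention of posting a JSON object with a `query` field.

```python
import json
from typing import Dict, List


def build_add_train_mutation(schedules: Dict[str, List[str]]) -> str:
    """Build one mutation with an aliased addTrain field per train."""
    fields = []
    for index, (train_name, times) in enumerate(schedules.items(), start=1):
        # json.dumps yields double-quoted strings and bracketed lists,
        # which are also valid GraphQL string and list literals.
        fields.append(
            f"  addTrain{index}: addTrain(trainName: {json.dumps(train_name)}, "
            f"arrivalTime: {json.dumps(times)}) {{\n"
            f"    trainName\n"
            f"    arrivalTime\n"
            f"  }}"
        )
    return "mutation {\n" + "\n".join(fields) + "\n}"


def build_request_body(query: str) -> str:
    """Wrap a GraphQL document in the JSON body sent via HTTP POST."""
    return json.dumps({"query": query})
```

The aliases (`addTrain1`, `addTrain2`, and so on) are what allow the same field to appear several times in a single mutation.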