[coepi-backend-aws]: Enable CDN (Cloudfront) caching of reports #10

Open
ramnanib2 opened this issue Apr 13, 2020 · 1 comment

@ramnanib2
Collaborator

ramnanib2 commented Apr 13, 2020

The API supports retrieving reports generated during a fixed time interval. This means that the report batch for any interval that has already ended can be cached in a CDN, reducing both latency and cost.
CloudFront is the CDN product available in AWS and can be used for this purpose. Some tasks that would accomplish this:

  1. Make the API Gateway endpoint edge-optimized
  2. Enable caching using query string parameters as cache keys (intervalNumber is passed as a query string parameter)
  3. Automate setting up the CloudFront infrastructure using Terraform

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/QueryStringParameters.html
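
To illustrate the "only past intervals are cacheable" idea, here is a minimal Python sketch of how the origin might choose a `Cache-Control` value per request. It assumes that `intervalNumber` equals `floor(unix_time_ms / intervalLengthMs)`, which is not stated in this issue. The function name and the TTL value are hypothetical, chosen only for illustration.

```python
import time

# Hypothetical TTL for a closed interval; not a recommended production value.
PAST_INTERVAL_MAX_AGE_S = 86400


def cache_control_for(interval_number, interval_length_ms, now_ms=None):
    """Return a Cache-Control header value for a report batch request.

    Assumption: interval_number == floor(unix_time_ms / interval_length_ms),
    so the interval covers [n * length, (n + 1) * length) in milliseconds.
    """
    if interval_length_ms <= 0:
        raise ValueError("interval_length_ms must be positive")
    if now_ms is None:
        now_ms = int(time.time() * 1000)

    interval_end_ms = (interval_number + 1) * interval_length_ms
    if interval_end_ms <= now_ms:
        # The interval has ended, so its batch should no longer change.
        return f"public, max-age={PAST_INTERVAL_MAX_AGE_S}"
    # The interval is still open (or in the future): new reports may arrive.
    return "no-cache"
```

With a header like this, CloudFront could cache closed intervals while still revalidating the open one, subject to the minimum and maximum TTLs configured on the cache behavior.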

@ramnanib2 ramnanib2 added the help wanted Extra attention is needed label Apr 22, 2020
@mansinahar mansinahar self-assigned this Apr 22, 2020
oidebrett pushed a commit to oidebrett/coepi-backend-aws that referenced this issue May 1, 2020
oidebrett added a commit to oidebrett/coepi-backend-aws that referenced this issue May 1, 2020
@oidebrett oidebrett self-assigned this May 1, 2020
@oidebrett

oidebrett commented May 7, 2020

A possible implementation is available; however, this issue requires some consideration before it is rolled into the project.

Discussion document is here:
https://docs.google.com/document/d/1c1bwQLWLqSGvcJnrhWPwmOgXTgnvwdG5mPNymSZGps0/edit?usp=sharing

Questions arising from investigation:

  • Terraform deployment of behaviours across the CDN distribution takes a long time due to the rollout to edge locations. Can it be sped up?
  • Will POSTing of reports be required via the CloudFront URL?
  • Will the domain api.coepi.org or cdn.coepi.org need to be pointed at the CloudFront URL?
  • What are the requirements for invalidations (i.e. when to forcibly clear the cache)?
    • In development: when testing
    • In production: manual invalidations of the cache
  • What are the best settings for TTL?
  • What report request query strings can be used for caching (if any)? E.g. could we cache a report request for a particular date with an interval number equivalent to 6 hours? This would potentially mean that no new reports are included in a report response within the 6 hour cache time.
    What would a 6 hour window correspond to in terms of the intervalNumber and intervalLengthMs query strings?
  • What can we estimate for “on-demand” capacity planning? This is needed to assess the costs and benefits.
    • What will be the size (in bytes) of a typical report request (this will impact AWS CloudFront pricing)?
    • How many report requests per hour, day, week, or month?
  • How is it best to test this?
  • I had a look at Bhushan's proposal, and I wonder: is caching compatible with “Implementing geo-hashing so the devices only need to download keys for the device location, for maximizing scalability”?
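
As a starting point for the 6 hour window and capacity questions above, here is a rough Python sketch of the arithmetic. It assumes `intervalNumber` is `floor(unix_time_ms / intervalLengthMs)`, which would need to be confirmed against the API. The helper names are hypothetical, and the traffic figures in the usage lines are placeholders rather than measurements.

```python
SIX_HOURS_MS = 6 * 60 * 60 * 1000  # 21,600,000 ms


def interval_number_for(timestamp_ms, interval_length_ms=SIX_HOURS_MS):
    """Map a Unix timestamp (ms) to its interval number (assumed definition)."""
    return timestamp_ms // interval_length_ms


def interval_bounds_ms(interval_number, interval_length_ms=SIX_HOURS_MS):
    """Return the [start, end) bounds of an interval in Unix milliseconds."""
    start = interval_number * interval_length_ms
    return start, start + interval_length_ms


def monthly_transfer_gb(requests_per_day, bytes_per_response, days=30):
    """Estimate data transferred per month in GB (1 GB = 1e9 bytes)."""
    return requests_per_day * bytes_per_response * days / 1e9


# 2020-05-07T12:00:00Z expressed in Unix milliseconds.
n = interval_number_for(1588852800000)
start_ms, end_ms = interval_bounds_ms(n)

# Placeholder traffic: 1,000,000 requests/day at 50,000 bytes each.
estimated_gb = monthly_transfer_gb(1_000_000, 50_000)
```

Under that assumption, a 6 hour window would be a request with intervalLengthMs=21600000 and intervalNumber set to the timestamp divided by that length. The transfer estimate only becomes meaningful once real request counts and response sizes are known.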


3 participants