No more F5!

I procrastinate a lot by reloading webpages, looking for new content. However, I don't like being a Skinner box rat, so I wrote this digest generator to tame my FOMO.

⚠️ no-more-f5 works only with Java 8! Java 9 is not supported.

Installation

Cloud stack

Rough idea of the required cloud stack:

Emails are sent through AWS SES over SMTP.
The function itself is deployed to AWS Lambda and is triggered by a scheduled CloudWatch event (cron).

If you want to know, here's the motivation for this stack:

Why SMTP?

Since we need to scrape RSS feeds, we need Internet access. This can be configured in two ways:

Place the Lambda function outside a VPC and connect to SES via SMTP.

Place the Lambda function inside a VPC and route Internet traffic through a NAT Gateway. In this case we can talk to SES directly.

I use the first option. SMTP emails cost slightly more, but configuring a VPC and a NAT Gateway is tedious, and a NAT Gateway costs far more than the extra charge for SMTP emails. If you already have a NAT Gateway, the second option may be worth trying. YMMV.

Building and packaging your function

You will need Leiningen to build your uberjar. But first, create a list of your Atom/RSS feeds and save it in a file, e.g. my_feeds:

$ cat > my_feeds <<EOF
https://github.com/BurntSushi/ripgrep/releases.atom
https://github.com/atom/atom/releases.atom
EOF
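A typo in the feeds file only shows up once the function actually runs, so a quick sanity check before packaging can help. Here is a minimal sketch: `check_feeds` is a hypothetical helper, not part of this repo, and it assumes one URL per line as in the example above. It flags every line that doesn't start with http:// or https://.

```shell
# Hypothetical helper (not part of no-more-f5): report lines in a feeds
# file that do not look like http(s) URLs. Assumes one URL per line.
check_feeds() {
    feeds_file=$1
    bad=0
    line_no=0
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        line_no=$((line_no + 1))
        case $line in
            http://?*|https://?*) ;;   # looks like a feed URL
            *)
                echo "line $line_no: not an http(s) URL: $line" >&2
                bad=1
                ;;
        esac
    done < "$feeds_file"
    # Exit status 0 if every line passed, 1 otherwise.
    return "$bad"
}
```

Something like `check_feeds my_feeds && ./prepare_package.sh my_feeds` would then skip packaging when a line looks wrong.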

Now we build a standalone uberjar and add my_feeds to it (remember, jars are just zip archives). This process is automated in prepare_package.sh (specify your feeds file as a call parameter):

$ ./prepare_package.sh my_feeds

Preparing your SES

Verify your email address in SES.
Create SMTP credentials and save them -- we'll need them later.

Important: Creating SMTP credentials also creates an IAM user. Do not use this IAM user's own AWS credentials for the SMTP server! Use the SMTP credentials generated in the previous step.

Creating and configuring the Lambda function

Create a new Lambda function.
Use a standard IAM role, just enough to store CloudWatch logs.
Select Java 8 as runtime.
Add a CloudWatch event as a trigger. Schedule it to something like cron(0 6 * * ? *), i.e. every day at 6:00 UTC.
Choose something around 384 MB memory and 90 seconds timeout (depends heavily on the number of feeds you want to digest).
Set handler to no_more_f5.core::handler
Now we need to set up environment variables. Add the following:

| Variable | Note | Example |
|---|---|---|
| `FEEDS` | Filename of the file with your feed URLs | `my_feeds` |
| `USER_AGENT` | See below | `Mozilla/5.0 ...` |
| `SMTP_SERVER` | Address of your AWS SES SMTP server | `email-smtp.eu-west-1.amazonaws.com` |
| `SMTP_PORT` | SMTP server port; check your SES docs | `587` |
| `SMTP_USER` | Use your SES SMTP credentials here | |
| `SMTP_PASS` | Use your SES SMTP credentials here | |
| `EMAIL_FROM` | Must be verified in AWS SES | `sender@example.com` |
| `EMAIL_TO` | All of them must be verified in AWS SES | `alice@example.com, bob@example.com` |
| `SINGLE_SITE_TIMEOUT` | Timeout for each fetching connection | `2000` |

You need to specify USER_AGENT since some sites block scrapers without it. Just use something similar to your main browser.

EMAIL_TO can contain multiple addresses, separated by commas. Make sure you use only verified addresses if your account is still in the SES sandbox.
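To see which addresses a given EMAIL_TO value contains (and therefore which ones need verifying), the value can be split on commas. This is only a sketch: `split_recipients` is a hypothetical helper, and it assumes whitespace around the commas is not significant. How no-more-f5 itself parses the value isn't shown here.

```shell
# Hypothetical helper: print one recipient per line from a comma-separated
# list, trimming surrounding whitespace and dropping empty entries.
split_recipients() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}
```

For example, `split_recipients "$EMAIL_TO"` lists the addresses to check against the verified identities in SES.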

SINGLE_SITE_TIMEOUT is helpful if some feed is unresponsive. Instead of timing out the whole Lambda function, you'll just get an exception message for the unresponsive feed.
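The idea behind a per-feed timeout can be sketched in shell. This is not how no-more-f5 is implemented (the app is written in Clojure); `fetch_feed` and `digest_feeds` are hypothetical names, and the limit is given in seconds because that is what curl's `--max-time` expects. The unit of SINGLE_SITE_TIMEOUT itself isn't stated above.

```shell
# Hypothetical sketch: fetch one feed with its own time limit (in seconds).
fetch_feed() {
    curl --fail --silent --show-error --max-time "$1" --output /dev/null "$2"
}

# Try every feed in turn; a slow or broken feed is reported on stderr and
# skipped, so it cannot stall the whole run.
digest_feeds() {
    limit=$1
    shift
    for url in "$@"; do
        if fetch_feed "$limit" "$url"; then
            echo "ok: $url"
        else
            echo "skipped: $url" >&2
        fi
    done
}
```

The point is that each fetch has its own limit and its own failure handling, so one unresponsive site costs at most `limit` seconds instead of the whole function timeout.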

Ok, you should be ready to go! Create a dummy test event (an empty JSON object {} is enough) and check whether a digest arrives in your inbox!

Configuring CloudWatch logs retention

One more thing: Go to CloudWatch and configure log retention for your no-more-f5 log group. Set it to something reasonable, e.g. 7 days. Storing a lot of logs (several GBs) might be expensive and it's just not worth it in this case.

Local dev environment

For local testing, create a profiles.clj file in the root repo folder. Add the following map to it:

{:dev
 {:env
  {:feeds "dev_feeds"
   :single-site-timeout "2000"
   :smtp-user "..."
   :smtp-pass "..."
   :smtp-server "email-smtp.eu-west-1.amazonaws.com"
   :smtp-port "587"
   :user-agent "..."
   :email-from "..."
   :email-to "..."}}}

Then just use lein run to run the app. Alternatively, you can set all required environment variables and call

$ java -cp <path_to_your_uberjar> no_more_f5.core
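When running with plain environment variables, it's easy to forget one. Here is a small sketch that lists whatever is still missing before you start the jar. The variable names come from the table above; `missing_vars` is a hypothetical helper, and treating all nine variables as required is an assumption.

```shell
# Names taken from the environment variable table in this README.
REQUIRED_VARS="FEEDS USER_AGENT SMTP_SERVER SMTP_PORT SMTP_USER SMTP_PASS EMAIL_FROM EMAIL_TO SINGLE_SITE_TIMEOUT"

# Hypothetical helper: print the name of every required variable that is
# unset or empty, one per line. Prints nothing if all are set.
missing_vars() {
    for name in $REQUIRED_VARS; do
        # Indirect lookup: read the value of the variable called $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

If `missing_vars` prints nothing, the environment should be complete enough to try the java call above.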

If you have your own server running 24/7, you can schedule local execution with cron. And of course you can use your own email account, just make sure to get an app token for SMTP instead of using your password.

How much is the fish?

No idea, I'll update this when I get my first monthly bill. But probably not much.


mp4096/no-more-f5
