
Centralized CMS Caching #1823

Closed · wants to merge 9 commits
Conversation

anthonyshull (Contributor):

This PR moves us to a centralized Redis cache, using the Nebulex library with its Redis adapter. We also use Telemetry to get cache stats directly from the adapter.

We have two endpoints, protected by basic auth, that allow us to expire cache keys.
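For context, here is a minimal sketch of how endpoints like these are typically protected with basic auth in a Phoenix router. This is an illustration, not this PR's actual router change (which lives in apps/site/lib/site_web/router.ex): the pipeline name, route path/verb, and environment variable names below are hypothetical.

# Hedged sketch; pipeline name, route, and env var names are hypothetical.
pipeline :cms_cache_auth do
  plug :cache_basic_auth
end

scope "/cms", SiteWeb do
  pipe_through :cms_cache_auth

  patch "/:object/:id", CMSController, :reset_cache_key
end

# Plug.BasicAuth ships with Plug >= 1.10, which the plug ~> 1.14.2 dependency satisfies.
defp cache_basic_auth(conn, _opts) do
  Plug.BasicAuth.basic_auth(conn,
    username: System.fetch_env!("CMS_BASIC_AUTH_USERNAME"),
    password: System.fetch_env!("CMS_BASIC_AUTH_PASSWORD")
  )
end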

Comment on lines +197 to +199
def generate(_, _, [path, %{}]) do
"/cms/#{String.trim(path, "/")}"
end
anthonyshull (Contributor Author):
Creates a cache key from the path by prefixing it with /cms.
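For example (hypothetical arguments, just to show the shape of the generated key):

# Hypothetical call illustrating the clause above:
generate(nil, nil, ["/news/winter-weather/", %{}])
#=> "/cms/news/winter-weather"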

Comment on lines +202 to +204
def generate(_, _, [path, params]) do
"/cms/#{String.trim(path, "/")}" <> params_to_string(params)
end
anthonyshull (Contributor Author):
Creates a key like the one above, but also handles parameters, so you can end up with something like /cms/foo?bar=baz&bop=bam.
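The params_to_string/1 helper isn't shown in this hunk; here is a minimal sketch of what it might look like, assuming the params are sorted (for stable keys) and URI-encoded:

# Hedged sketch of params_to_string/1; the real implementation isn't in this diff.
defp params_to_string(params) when map_size(params) == 0, do: ""

defp params_to_string(params) do
  "?" <> (params |> Enum.sort() |> URI.encode_query())
end

# params_to_string(%{"bar" => "baz", "bop" => "bam"})
# #=> "?bar=baz&bop=bam"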

Comment on lines -50 to +65
{:httpoison, ">= 0.0.0"},
{:poison, ">= 0.0.0", override: true},
{:timex, ">= 0.0.0"},
{:plug, "~> 1.14.2"},
{:html_sanitize_ex, "1.3.0"},
{:bypass, "~> 1.0", only: :test},
{:quixir, "~> 0.9", only: :test},
{:decorator, "1.4.0"},
{:html_sanitize_ex, "1.3.0"},
{:httpoison, ">= 0.0.0"},
{:mock, "~> 0.3.3", only: :test},
{:nebulex, "2.5.2"},
{:nebulex_redis_adapter, "2.3.1"},
{:phoenix_html, "~> 3.0"},
{:repo_cache, in_umbrella: true},
{:plug, "~> 1.14.2"},
{:poison, ">= 0.0.0", override: true},
{:quixir, "~> 0.9", only: :test},
{:telemetry, "0.4.3"},
{:telemetry_metrics, "0.6.1"},
{:telemetry_metrics_statsd, "0.7.0"},
{:telemetry_poller, "0.5.1"},
{:timex, ">= 0.0.0"},
anthonyshull (Contributor Author):
Alphabetized these.

Comment on lines -64 to -78
path = "/news/2018/news-entry"
params = %{}
cache_key = {:view_or_preview, path: path, params: params}

# ensure cache is empty
case ConCache.get(Repo, cache_key) do
nil ->
:ok

{:ok, %{"type" => [%{"target_id" => "news_entry"}]}} ->
ConCache.dirty_delete(Repo, cache_key)
end

assert %NewsEntry{} = Repo.get_page(path, params)
assert {:ok, %{"type" => [%{"target_id" => "news_entry"}]}} = ConCache.get(Repo, cache_key)
anthonyshull (Contributor Author):
I just removed this for now; the final PR will add the tests back.

Comment on lines -82 to -87
path = "/basic_page_no_sidebar"
params = %{"preview" => "", "vid" => "112", "nid" => "6"}
cache_key = {:view_or_preview, path: path, params: params}
assert ConCache.get(Repo, cache_key) == nil
assert %Basic{} = Repo.get_page(path, params)
assert ConCache.get(Repo, cache_key) == nil
anthonyshull (Contributor Author):
Same here. This test will be back later.

@@ -29,6 +33,8 @@ defmodule CMS.Repo do

@cms_api Application.get_env(:cms, :cms_api)

@ttl :timer.hours(1)
anthonyshull (Contributor Author):
We set a TTL as a safety net because we don't have a sophisticated mechanism to handle failed cache-invalidation requests from Drupal to Dotcom.
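For context, a hedged sketch of how a module-level TTL like this is typically applied with Nebulex's caching decorators; only @ttl comes from the diff above, and the module, function, and backing API here are illustrative:

# Hedged sketch; not the actual CMS.Repo code.
defmodule Example.CachedPages do
  use Nebulex.Caching

  @ttl :timer.hours(1)

  # Entries expire after an hour even if nobody resets the key explicitly.
  @decorate cacheable(cache: CMS.Repo, key: path, opts: [ttl: @ttl])
  def get_page(path, params) do
    Example.CMSApi.view(path, params)
  end
end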

paulswartz (Member):
question: If you're increasing the TTL to an hour, do you need any of the Nebulex changes? It seems like that alone would make a big difference in the number of cache hits.

anthonyshull (Contributor Author):

It will, but the content creators need the ability to reset a key so that content gets pushed out more quickly. The TTL has been one minute so that they see content changes quickly. This lets us actually cache the content but also update it quickly, which is something we can't do without significant work, since our ETS caches live on Elixir nodes that don't currently talk to each other.

anthonyshull (Contributor Author):

The ideal state would actually be to use a message queue instead of sending PATCH requests. Now that I know Terraform... look out. Ultimately, there is no way around the Two Generals' Problem, so setting some TTL is a safety net.

paulswartz (Member) left a comment:

general question: how does Nebulex handle things like Redis going down? That isn't something we've needed to address with the in-memory cache.
general question: what's the performance like? we're adding both a network round trip, as well as a serialize/deserialize step.


groups = Enum.group_by(metrics, & &1.event_name)

for {event, metrics} <- groups do
  :telemetry.attach({__MODULE__, event, self()}, event, &__MODULE__.handle_event/4, metrics)
end
paulswartz (Member):

question: do you need a GenServer for this? I don't see it maintaining any state, so the metric source could do the logging directly?

anthonyshull (Contributor Author):

AFAIK, this is how you're supposed to write custom Telemetry reporters. It's how the other reporter we're using, telemetry_metrics_statsd, works: https://github.com/beam-telemetry/telemetry_metrics_statsd/blob/main/lib/telemetry_metrics_statsd.ex. It's also how the console reporter that telemetry_metrics ships with works: https://github.com/beam-telemetry/telemetry_metrics/blob/main/lib/telemetry_metrics/console_reporter.ex. The idea behind Telemetry is to separate the metric emitters from the reporters.
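For illustration, a minimal sketch of a reporter in that style; the module name and the logging it does are illustrative, not this PR's actual reporter:

# Hedged sketch of a custom Telemetry reporter: attach one handler per event group
# in init/1, then log measurements as events arrive.
defmodule Example.CacheStatsReporter do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, Keyword.fetch!(opts, :metrics))
  end

  @impl true
  def init(metrics) do
    groups = Enum.group_by(metrics, & &1.event_name)

    for {event, metrics} <- groups do
      :telemetry.attach({__MODULE__, event, self()}, event, &__MODULE__.handle_event/4, metrics)
    end

    {:ok, Map.keys(groups)}
  end

  # Invoked by :telemetry whenever one of the attached events fires.
  def handle_event(_event_name, measurements, _metadata, metrics) do
    for metric <- metrics do
      IO.inspect({metric.name, measurements})
    end
  end
end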

@@ -43,6 +43,18 @@ defmodule SiteWeb.CMSController do
|> handle_page_response(conn)
end

def reset_cache_key(conn, %{"object" => object, "id" => id}) do
CMS.Repo.delete("/cms/#{object}/#{id}")
paulswartz (Member):

question: is there a CMS PR for this, or is this built-in Drupal functionality that we're using?

apps/site/lib/site_web/router.ex (resolved)
anthonyshull (Contributor Author):

> general question: how does Nebulex handle things like Redis going down? That isn't something we've needed to address with the in-memory cache.
> general question: what's the performance like? we're adding both a network round trip, as well as a serialize/deserialize step.

Redix handles reconnections: https://github.com/whatyouhide/redix?tab=readme-ov-file#features. If there were some kind of serious error, the site wouldn't function, but the same could be said about EC2 going down, or Postgres, or Route 53, etc.

We won't have performance numbers until we're actually pushing data through the system, but I highly suspect it's going to be much faster given that the current cache hit rate is 50%.

Comment on lines +40 to +46
config :cms, CMS.Repo,
conn_opts: [
host: "127.0.0.1",
port: 6379
],
stats: true,
telemetry: true
anthonyshull (Contributor Author):

This will be different for prod; I just don't know how until we can get something running on AWS.
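A hedged sketch of what the prod configuration might eventually look like, e.g. in runtime.exs with the Redis host read from the environment; the variable names are hypothetical, since the AWS setup hadn't been settled at the time of this comment:

# Hedged sketch; env var names are hypothetical.
import Config

config :cms, CMS.Repo,
  conn_opts: [
    host: System.get_env("REDIS_HOST", "127.0.0.1"),
    port: String.to_integer(System.get_env("REDIS_PORT", "6379"))
  ],
  stats: true,
  telemetry: true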

paulswartz (Member):

> if there is some kind of serious error, the site wouldn't function. but, the same could be said about ec2 going down or postgres going down, or route 53, etc.

That's a great point, and why it's good to avoid introducing new external dependencies to the critical path. Something to think about!
