erlang · rickard-green · Dec 21, 2024 · Maria-12648430 · Jan 8, 2025 · rickard-green
diff --git a/eeps/eep-0076-1.png b/eeps/eep-0076-1.png
diff --git a/eeps/eep-0076-2.png b/eeps/eep-0076-2.png
diff --git a/eeps/eep-0076.md b/eeps/eep-0076.md
@@ -0,0 +1,395 @@
+    Author: Rickard Green <rickard(at)erlang(dot)org>
+    Status: Draft
+    Type: Standards Track
+    Created: 07-Jan-2025
+    Post-History: https://erlangforums.com/t/eep-76-priority-messages
+    Erlang-Version: OTP-28.0
+****
+EEP 76: Priority Messages
+----
+
+Abstract
+========
+
+In some scenarios it is important to propagate certain information to a process
+quickly without the receiving process having to search the whole message queue,
+which can become very inefficient if the message queue is long. This EEP
+introduces the concept of priority messages to the language, which
+aims to solve this issue.
+
+Motivation
+==========
+
+Asynchronous signaling is *the Erlang way* of communicating between processes.
+The message signal is the most common type of signal. When a message signal is
+received, it is added to the end of the message queue of the receiving process.
+As a result of this, the messages in the message queue will be ordered in
+reception order. When the receiving process fetches a message from the message
+queue, using the `receive` expression, it begins searching at the start of the
+message queue. Searching for a matching message is an `O(N)` operation where
+`N` equals the number of messages preceding the matching message.
+
+![Message Reception][]
+
+Figure 1.
+
+This works great in most cases, but in certain scenarios it does not work at
+all. At least not without paying a huge performance penalty.
+
+A Couple of Problematic Scenarios
+---------------------------------
+
+### Long Message Queue Notification
+
+As of Erlang/OTP 27.0 it is possible to set up a system monitor monitoring the
+message queue lengths of processes in the system. When a message queue length
+exceeds a certain limit, you might want to change the strategy for handling
+incoming messages. In order to do that, you typically need to inform the
+process with a long message queue about this.
+
+Sending it a message informing about the long message queue will not work,
+since this message will end up at the end of the long message queue. If the
+receiver handles messages one at a time in message queue order, it will take a
+long time until the receiver fetches this message. The situation will at this
+point very likely have become even worse.
+
+If the receiver instead periodically tries to search for such messages using
+a selective receive, it will periodically have to do a lot of work, especially
+when the message queue is long. Polling the message queue length
+using `process_info/2` will in this case be a better workaround. That is,
+communicating this information between processes using asynchronous signaling
+does not work in this scenario, or at least works very poorly.
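+
+A minimal sketch of the polling workaround mentioned above. The function name
+`queue_too_long/1` and the `Limit` parameter are made up for illustration;
+only `process_info/2` with the `message_queue_len` item is an existing BIF
+call:
+
+```erlang
+%% Returns true if the calling process currently has more than
+%% Limit messages in its message queue.
+queue_too_long(Limit) ->
+    {message_queue_len, Len} = process_info(self(), message_queue_len),
+    Len > Limit.
+```
+
+The process has to call such a function regularly between handling ordinary
+messages in order to notice that its message queue has grown.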
+
+### Prioritized Termination
+
+Prioritized termination is another scenario that has similar issues. A worker
+process that handles large jobs is supervised in a supervision tree. It is easy
+to envision that such a worker could get a large number of requests in its
+message queue. If the supervisor dies or wants the worker to terminate, the
+worker will receive an exit signal from its supervisor. If the worker traps
+exits, the corresponding `'EXIT'` message will end up at the end of the message
+queue.
+
+If one wants to be able to terminate the worker prior to having to handle all
+other requests in the message queue, one either has to stop trapping exits or
+periodically do selective receives searching for such `'EXIT'` messages. Not
+trapping exits might not be an option, and doing periodic selective receives
+will be very expensive if the message queue is long. [Pull request 8371][]
+aimed to solve this scenario.
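+
+As a sketch, such a periodic selective receive could look like the following.
+The function name `check_parent_exit/1` is hypothetical; the worker is assumed
+to trap exits and to be linked to `Parent`:
+
+```erlang
+%% Search the message queue for an exit message from Parent without
+%% blocking. If no such message exists, the whole message queue is
+%% scanned before the 'after 0' clause is taken.
+check_parent_exit(Parent) ->
+    receive
+        {'EXIT', Parent, Reason} ->
+            exit(Reason)
+    after 0 ->
+        ok
+    end.
+```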
+
+A workaround in this scenario could be to poll the supervisor using the
+`is_process_alive/1` BIF in combination with polling of an ETS table where the
+supervisor can order it to terminate. That is, this is another scenario in
+which communicating information between processes using asynchronous signaling
+either does not work or performs very poorly.
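+
+A rough sketch of this polling workaround is shown below. The function name
+`should_terminate/2`, the table `Tab`, and the `{stop, Pid}` key layout are
+assumptions made for this example and not part of any existing protocol:
+
+```erlang
+%% True if the supervisor is dead or has asked this worker to stop
+%% by inserting {{stop, WorkerPid}, Reason} into the ETS table Tab.
+%% Note that is_process_alive/1 only accepts local pids.
+should_terminate(Supervisor, Tab) ->
+    not is_process_alive(Supervisor)
+        orelse ets:member(Tab, {stop, self()}).
+```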
+
+Polling Workarounds
+-------------------
+
+In order to be able to solve scenarios like these without the risk of having to
+do a lot of work in the receiving process, one has to resort to passing the
+information in other ways and let the receiver poll for that information. For
+example, write something into an ETS table and let the receiver poll that ETS
+table for information. This will prevent potentially very large costs of
+having to repeatedly do selective receives, but the polling is not free
+either.
+
+In order to be able to handle scenarios like the ones above using asynchronous
+signaling, which is *the Erlang way* to communicate between processes, the
+following mechanism for sending and receiving priority messages between
+processes is proposed.
+
+Rationale
+=========
+
+By letting certain messages get priority status and, upon reception of such
+messages, inserting them before ordinary messages in the message queue, we can
+handle scenarios like the above with very little overhead. Besides getting a
+solution that most likely will have less overhead than any workaround for
+communicating information like this, we also get a solution where asynchronous
+signaling between processes can still be used.
+
+The proposed handling of priority messages in the message queue:
+
+![Priority Message Reception][]
+
+Figure 2.
+
+There will be no way for Erlang code to distinguish a priority message
+from an ordinary message when fetching a message from the message queue. Such
+knowledge needs to be part of the message protocol that the process should
+adhere to.
+
+The total message queue length in figure 2 equals `P+M`. The lengths `P` and
+`M` will not be visible. The only visible length is the total message queue
+length.
+
+A `receive` expression will select the first message, from the start, in the
+message queue that matches, just as before.
+
+How to Insert Priority Messages in the Message Queue?
+-----------------------------------------------------
+
+By letting priority messages overtake ordinary messages that already exist in
+the message queue we get priority messages ordered in reception order among
+priority messages followed by ordinary messages ordered in reception order
+among ordinary messages. Instead of just overtaking ordinary messages, one
+could choose to let a priority message overtake all messages in the message
+queue regardless of whether they are priority messages or not, but then
+multiple priority messages would accumulate in reverse order. Having these two
+sets of messages ordered internally by reception order at least to me feels the
+most useful. Just as in the case of ordinary messages we will probably want to
+handle priority messages in reception order.
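+
+The chosen insertion rule can be illustrated with a small model of the message
+queue. This is only a model written for this text, not how the runtime system
+implements the message queue; the module name `mq_model` and its functions are
+hypothetical. The queue is modeled as a pair of lists corresponding to the `P`
+priority messages and the `M` ordinary messages in figure 2:
+
+```erlang
+-module(mq_model).
+-export([new/0, insert/3, to_list/1]).
+
+%% {PrioMsgs, OrdMsgs}, both kept in reception order.
+new() ->
+    {[], []}.
+
+%% An accepted priority message goes after the last priority message
+%% but before all ordinary messages.
+insert(priority, Msg, {Prio, Ord}) ->
+    {Prio ++ [Msg], Ord};
+%% An ordinary message goes to the end of the message queue.
+insert(ordinary, Msg, {Prio, Ord}) ->
+    {Prio, Ord ++ [Msg]}.
+
+%% The message queue as seen by a receive expression.
+to_list({Prio, Ord}) ->
+    Prio ++ Ord.
+```
+
+Inserting the ordinary messages `a` and `b` followed by the priority messages
+`x` and `y` gives the queue `[x, y, a, b]` in this model. With the alternative
+rule, where a priority message overtakes all messages, the result would
+instead be `[y, x, a, b]`.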
+
+Note that the reception order of signals is not changed. If a process sends an
+ordinary message and then a priority message to another process, the ordinary
+message will be received first and then the priority message will be received.
+The only difference is that when the priority message is received, it will be
+inserted earlier in the message queue than the ordinary message. That is,
+[the signal ordering guarantee][] of the language will still be respected. This
+just modifies how the message queue is managed.
+
+How to Determine What Should be a Priority Message?
+---------------------------------------------------
+
+By introducing priority messages, the messages in the queue will not
+necessarily be in the order the corresponding signals were received. There will
+be a lot of code that assumes that the order of messages in the message queue
+is in reception order, so it is reasonable that one should need to opt in
+to be able to receive priority messages.
+
+This EEP proposes that selected priority marked messages, selected exit
+messages, and selected monitor messages should be treated as priority messages.
+Perhaps one would want other types of messages to be treated as priority
+messages as well, but the set of allowed priority messages can easily be
+extended in the future if that should be the case. The following list
+describes how the different types of messages will be enabled as priority
+messages:
+
+* *Priority Marked Messages* - A message is marked as a priority message by the
+  sender by passing the option `priority` in the option list that is passed as
+  third argument to the `erlang:send/3` BIF. The receiver opts in for reception
+  of priority marked messages from a specific sender by calling the
+  `process_flag/2` BIF like this:
+  `process_flag({priority_marked_message, SenderPid}, true)`.
+* *Exit Messages* - The receiver opts in for reception of priority exit
+  messages from a specific process or port by calling the `process_flag/2` BIF
+  like this:
+  `process_flag({priority_exit_message, SenderPidOrPort}, true)`.
+* *Monitor Messages* - The receiver opts in for reception of priority monitor
+  messages due to a specific monitor being triggered by calling the
+  `process_flag/2` BIF like this:
+  `process_flag({priority_monitor_message, MonitorRef}, true)`.
+  The receiver can also opt in for reception of priority monitor messages by
+  passing the option `priority` in the option list that is passed as third
+  argument to the `monitor/3` BIF when creating the monitor.
+
+The receiver process can at any time disable reception of certain priority
+messages by passing `false` as second argument to any of the above listed
+`process_flag/2` BIF calls.
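+
+The calls listed above could be combined as in the following sketch. It uses
+the API as proposed in this EEP, so it can only be expected to work on a
+runtime system that implements this proposal. The module name `prio_sketch`,
+its function names, and the `{queue_alarm, Len}` message are made up for this
+example:
+
+```erlang
+-module(prio_sketch).
+-export([init_worker/2, notify_long_queue/2]).
+
+%% Receiver side: called by the worker process during start-up.
+%% Parent is the supervisor that the worker is linked to and Watcher
+%% is the process that reports long message queues.
+init_worker(Parent, Watcher) ->
+    process_flag(trap_exit, true),
+    %% Exit messages from Parent overtake ordinary messages.
+    process_flag({priority_exit_message, Parent}, true),
+    %% Messages that Watcher marks with the priority option do as well.
+    process_flag({priority_marked_message, Watcher}, true),
+    ok.
+
+%% Sender side: mark the message as a priority message.
+notify_long_queue(Worker, Len) ->
+    erlang:send(Worker, {queue_alarm, Len}, [priority]).
+```
+
+A message sent with the `priority` option to a receiver that has not opted in
+for that sender is treated as an ordinary message.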
+
+The reason for not having options for accepting all priority marked messages,
+all exit messages, or all monitor messages as priority messages is the risk of
+introducing bugs when code in other modules is called from the process
+accepting priority messages. For example, if a process enables all monitor
+messages as priority messages and then makes a call into a module that makes
+a `gen_server` call, a `'DOWN'` message due to the call could be selected even
+though a reply message due to the call had been delivered before the `'DOWN'`
+message. In this case, the call would fail even though it actually succeeded.
+The reply message would then also be left as garbage in the message queue
+without any code picking it up.
+
+When a potential priority message is received, the receiver will check if it
+has enabled priority message reception for this message. If it has been
+enabled, the priority message will overtake all ordinary messages in the
+message queue and will be inserted after the last accepted priority message in
+the queue. If it has not been enabled, the message will be treated as any
+ordinary message and will be added to the end of the message queue. See figure
+2.
+
+The Selective Receive Optimization
+----------------------------------
+
+The current Erlang runtime system has a selective receive optimization that
+can prevent the need to search large parts of the message queue for a matching
+message. It is triggered when a reference is created and then matched against
+in all clauses of a `receive` expression. Messages present in the message queue
+when the reference is created do not have to be inspected, since they cannot
+contain the reference.
+
+When the optimization is triggered a marker is inserted into the message queue
+and only messages after the marker are searched. This optimization can make a
+huge impact on performance if the process has a long message queue. This
+optimization is frequently used in OTP code such as, for example, in a
+`gen_server` call.
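+
+The pattern that triggers the optimization looks roughly like the sketch
+below, which is similar in spirit to, but much simpler than, what a
+`gen_server` call does. The function name `call/2` and the `request` and
+`reply` message formats are made up for this example:
+
+```erlang
+%% MRef is created before the request is sent, and every clause of
+%% the receive expression matches on MRef. Messages that were already
+%% in the message queue when MRef was created can thus be skipped.
+call(Pid, Request) ->
+    MRef = monitor(process, Pid),
+    Pid ! {request, self(), MRef, Request},
+    receive
+        {reply, MRef, Reply} ->
+            demonitor(MRef, [flush]),
+            {ok, Reply};
+        {'DOWN', MRef, _, _, Reason} ->
+            {error, Reason}
+    end.
+```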
+
+The insertion of a priority message in the message queue clashes with the
+receive optimization since a reference now can appear earlier in the message
+queue than where the receive marker was inserted. One solution to this problem
+could be to disable the selective receive optimization on processes that
+enable priority messages. The user of priority messages would in that case
+have to be very careful not to call into modules that might rely on the
+selective receive optimization. This would more or less make it impossible to
+safely call modules that you don't have full control over yourself, since they
+might in the future be modified in a way so that they rely on the selective
+receive optimization taking effect. Therefore I find it unacceptable to disable
+the selective receive optimization. The priority message implementation must
+preserve the selective receive optimization.
+
+Distributed Erlang
+------------------
+
+Handling of priority messages should be completely distribution transparent.
+You should be able to send and receive priority messages between nodes the
+same way as done locally.
+
+Alternative Solutions Considered
+--------------------------------
+
+A separate priority message queue per process exposed to the Erlang program
+could be an alternative solution. You would need a way similar to this
+proposal to choose which messages should be accepted as priority messages.
+There would also need to be some new syntax in order to multiplex matching of
+messages from the different message queues. This would be a larger change of
+the language without providing any extra benefits as I see it.
+
+There have been suggestions for multiple priority levels similar to the process
+priority levels. This could be viewed as an extension to this proposal. The
+implementation could relatively easily be extended with multiple priority
+levels even though it would complicate the implementation. A `low` priority
+level similar to the process priority level `low` which is mixed with the
+`normal` process priority level would be very strange to introduce, though,
+since there would not be any easy way of understanding which message will
+be fetched from the message queue at a specific message queue state. I think
+multiple priority levels should be left for the future if a good enough use
+case is presented.
+
+Backwards Compatibility
+=======================
+
+Since the receiver process needs to opt in to get any special handling
+of priority messages, this will be completely backwards compatible.
+
+Summary
+=======
+
+The proposed solution for priority messages enables users to solve problems
+using asynchronous signaling, which is *the Erlang way* of communicating,
+where they previously had to resort to workarounds using polling of some sort.
+It is likely to reduce the performance impact in most, if not all, scenarios
+where one otherwise needs to resort to polling of some sort. Since you need to
+opt in to this new behavior it is completely backwards compatible. The changes
+to the language are very small, just "a light touch". On the conceptual level,
+it is very easy to understand how the priority messaging works assuming that
+you understand how asynchronous signaling in the language works.
+
+Reference Implementation
+========================
+
+The reference implementation can be found in [pull request 9269][] of the
+[Erlang/OTP repository][].
+
+Care has been taken to have as small impact as possible on processes not
+using priority messages. Processes not enabling reception of priority
+messages will not use any more memory at all due to the priority messages
+implementation.
+
+A Few Notes on the Implementation
+---------------------------------
+
+### The Message Queue
+
+The message queue may contain messages as well as receive markers used by the
+selective receive optimization. Receive markers are currently also used for
+adjustments that need to be done to the message queue during certain
+operations. That is, the current code traversing the message queue needs to be
+prepared to encounter receive markers of different types.
+
+When the user enables reception of priority messages, a block containing three
+receive markers and an area for auxiliary data is allocated. The receive
+markers are of new types distinguishable from the already existing receive
+markers. The auxiliary data, among other things, contains a red/black search
+tree containing information about what type of messages the process accepts as
+priority messages. All memory allocated for handling of priority messages is
+referred to from this memory block.
+
+The first and the second receive markers are inserted at the start of the
+message queue. The first marker marks the start of priority messages and the
+second marks the end of priority messages. The first marker also serves as an
+entrance for finding all information about the priority message handling. When
+a priority message is accepted it will be inserted just before the end marker.
+The third marker is inserted in the message queue when we need to remember a
+place in the message queue. This is needed when a priority message is accepted
+while we currently are traversing the message queue.
+
+#### Receive Optimization
+
+If we have active receive markers for the selective receive optimization in the
+message queue and a priority message is accepted, we scan the message for
+references. If a reference corresponding to a receive marker is found, we mark
+in the receive marker that the reference has been seen in the part of the
+message queue containing priority messages. When we enter a `receive`
+expression where a receive marker is used and it has been marked in the
+receive marker that the reference has been seen in a priority message, we
+search the priority messages prior to continuing with the messages after the
+receive marker.
+
+A further optimization that could be done to the receive optimization is to
+insert yet another receive marker before the first priority message containing
+the reference, but I see that as a premature optimization. A process is not
+expected to accumulate a large number of priority messages. If so, the process
+has used priority messages in a way not intended.
+
+### Priority Messages in Transit
+
+There exist a number of different types of signals. For each type of signal an
+action is taken when the signal is received. Ordinary messages are special
+since they are very common and the only action taken upon reception of an
+ordinary message is to add it to the end of the message queue. Due to this,
+the signal queue for incoming signals is arranged as a skip list where each
+non-ordinary message signal points to the next non-ordinary message signal.
+This way we can move a whole batch of ordinary messages into the message queue
+at once.
+
+Priority marked message signals need to be sent as non-ordinary message
+signals, since they need to have another action taken than the default. There
+are other signals that are received as non-ordinary message signals, but then
+transformed into ordinary messages depending on the state of the receiving
+process. An example of such a signal is a message sent using an alias. Upon
+reception of such a message the receiver checks if the alias is still active.
+If it is, the receiver adds the message to the end of the message queue;
+otherwise, it drops the message. Since a message sent using an alias is very
+similar to a priority marked message, the implementation for alias messages has
+been generalized to handle *alternate action messages*. Both a priority marked
+message and a message sent using an alias are just messages with another action
+to take upon reception than the default, so both of them will use the alternate
+action message implementation.
+
+[Message Reception]: eep-0076-1.png
+    "Message Reception"
+
+[Priority Message Reception]: eep-0076-2.png
+    "Priority Message Reception"
+
+[the signal ordering guarantee]: https://www.erlang.org/doc/system/ref_man_processes.html#delivery-of-signals
+
+[Pull request 8371]: https://github.com/erlang/otp/pull/8371
+
+[pull request 9269]: https://github.com/erlang/otp/pull/9269
+
+[Erlang/OTP repository]: https://github.com/erlang/otp
+
+Copyright
+=========
+
+This document is placed in the public domain or under the CC0-1.0-Universal
+license, whichever is more permissive.
+
+[EmacsVar]: <> "Local Variables:"
+[EmacsVar]: <> "mode: indented-text"
+[EmacsVar]: <> "indent-tabs-mode: nil"
+[EmacsVar]: <> "sentence-end-double-space: t"
+[EmacsVar]: <> "fill-column: 70"
+[EmacsVar]: <> "coding: utf-8"
+[EmacsVar]: <> "End:"
+[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "