krarup is a dialect of Erlang for quickly creating highly concurrent data processing flows with ease.
% src/krarup_example.krp
-module(krarup_example).
async add(A, B) ->
A + B.
main() ->
Result = await add(1, 2),
io:format("Count Result: ~p~n", [Result]),
Results2 = await [add(2, 3), add(4, 5)],
[io:format("Count Result: ~p~n", [R]) || R <- Results2],
> krarup_example:main().
Count Result: 3
Count Result: 5
Count Result: 7
% rebar.config
{plugins, [{krarup, {git, "https://github.com/mpope9/krarup/", {branch, "main"}}}]}.
The Erlang language is simple. Once fluency is achieved rather quickly then productivity is unrivaled.
We created krarup, a new Erlang dialect and a superset of Erlang, to provide convenience keywords take care of some common process-related boilerplate. These keywords can be used as quick building blocks to quickly create and prototype concurrent data processing flows that can be promoted to production.
The key additions are async
and await
to simplify spawning worker processes.
Not everything is required to be a gen_server
, and often simple processes
will do the trick. However, async task spawning in Erlang is a bit cumbersome.
A common pattern is to create a function that replies to a caller. This is
rather explicit boilerplate and it is often wrapped in a helper function,
but the pattern still rather common.
add(A + B) ->
Caller = self(),
spawn(fun() ->
Result = A + B,
Caller ! {self(), Result}
end),
Pid = add(1, 2),
receive
{Pid, Result} ->
io:format("", [Result])
end.
In Elixir the story is different with the Task
module
(example from the docs):
task = Task.async(fn -> do_some_work() end)
res = do_some_other_work()
Task.await(task)
This module provides great ergonomics around spawning and waiting for tasks.
However, Elixir.Task
s come with a high level of 'safety'. Links and monitors are
automatically created and it is expected that the consequences of these, such as
linked termination without error trapping and stray messages are dealt with by the
user as development mounts.
Further the Elixir.Task
and Elixir.Task.Supervisor
modules are not readily
available to Erlang users and are non-trivial to integrate. The difficultly
in using Elixir in Erlang is a well-known issue with no clear answer for
every environment.
For simple repeatable tasks new syntactic sugar can be added to Erlang which can increase developer productivity in designing, prototyping, and iterating on process based or data-flow based models.
The benefit of being an Erlang superset is that once sufficient complexity has been reached and one must reach for the more 'advanced' runtime features they are readily available, and further a complete removal of this syntactic sugar should be trivial. An other benefit is full interoperability throughout the BEAM ecosystem. It is easy to call Erlang from other languages.
However, since krarup programs are just Erlang programs a meta-layer can also be used. A top-level Erlang application can compose and control multiple lower-level krarup-written applications. krarup fits naturally into the Erlang structured concurrently and application building story. Further, this controller can be any other BEAM language, if an Elixir or Gleam or LFE controller is required then it can be used with ease.
This introduces a few new keywords to the language:
- Async is defined on a function's definitions.
- This signals that a function can be awaited, or it can be called directly and pid will be returned.
- Unlike other languages
async
does not 'bleed'. Async functions can be called from non-async functions.- Everything in Erlang can be
async
, unlike languages that addedasync
after the fact.
- Everything in Erlang can be
- A pid and not a
Future
equivalent is returned.
await
is used at the expression level.- Indicates that the current expression should block for a reply.
await
has the built in ability to wait on multiple async-spawned results in the order that they are defined.await
can be called from any function, even non-async
functions.
The semantics of async
/await
are notability different than other language
implementations. await
can be used anywhere, not just in functions defined
with async
. Erlang can be considered async
by default, so function
coloring and
propagation are not required.
The ergonomics of async
/await
are much more friendly than other languages.
linked
is equivalent to calling link/1
in the calling process.
Example:
async add(A, B) -> A + B.
main() ->
% Links to the child process and waits for the response.
await linked add(1, 2).
krarup by default encourages the user to define the control and data planes
of their programs separately. To achieve this, await
can only wait on async
functions defined in the current module. We refer to the code that deals with
concurrency and data-flow definition as the 'Control Plane'.
Code that performs data manipulation and validation is refered to as
the 'Data Plane'. The krarup programming data plane is slightly
similar to the networking concept, but less so than the Control Plane.
Here is a small example that demonstrates how the Control and Data Planes enforce
simplicity to the 'hard parts' of concurrent programming, that averages the ages
cross 3 JSON documents stored in a directory called data_dir/
.
% src/averager.krp
-module(averager).
async process_file(FileName) ->
json_decoder:handle_file(FileName).
async process_results(Ages) ->
lists:sum(Ages) / length(Ages).
main() ->
{ok, Files} = file:list_dirs("data_dir/"),
Ages = await [linked process_file(File) || File <- Files],
AverageAge = await process_results(Ages).
io:format("Average Age: ~p~n", [AverageAge]).
% src/processor.erl
-module(processor).
handle_file(FileName) ->
{ok, Binary} = file:read_file(FileName),
JSON = json_decoder:decode(Binary),
maps:get(age, JSON).
This structures krarup programs naturally into their separate concerns. The
code that deals with concurrency structure is separate from that which handles
data processing. The logical split encourages simple and readable code for
future readers. Of course, nothing prevents one from adding all of the JSON
decoding and manipulation into the process_file
directly. This does prevent
moving the concurrency story outside of the current module.
!
vsreceive
vsspawn
/link
(operator, keyword, BIF function, BIF function).
The use of a single keyword in front of an expression is an already supported
pattern in Erlang that krarup does not introduce. The example is catch
:
catch asdf.
> asdf
krarup is an extension of this syntax.
There are multiple existing langauge elements that perform runtime checks,
especially on pid()
inputs. For example, the !
operator checks the lhs
at runtime:
2> asdf ! asdf.
** exception error: bad argument
in operator !/2
called as asdf ! asdf
krarup leans on this pattern for it's support for pid-based keywords and extends this pattern to lists of pids.
A natural question that comes up is: why not just a new or existing library? Beyond the inconsistency story, these primitives should exist at the language level as they do in other languages, but probably not in Erlang itsself. The ergonomics of these tools being built into a language vs having an utility library is also a major difference, they should be a natural extension.
An EPP is an official proposal to add a new feature to Erlang.
We don't necessarily feel that most of krarup should necessarily belong in Erlang. While some of the concepts would be very welcome to be integrated they can be done so by people more familiar with creating languages. Parts of krarup are quite distinct and do not feel completely natural compared to vanilla Erlang. So we went with an implementation instead of a proposal.
-
linked
support.- Basic
linked
support on a singlepid()
. -
linked
support on list ofpid()
s. - Use
spawn_link
inasync
function instead oflink/1
function.
- Basic
-
supervised
support. -
monitored
support. - More user-friendly messages during parsing.
- Sane error for in-module
await
violations.
- Sane error for in-module
- Better recompile detection?
- Stronger safety checks for
await
.- Currently only checks if the expression is a
pid()
or a[pid()]
. - Could consider using a
{awaitable, pid()}
for more safety.
- Currently only checks if the expression is a
- Better code generation.
- The code that is currently generated is quite heavy and a bit ugly.
- Add default timeout to match
gen_server
behavior. -
timeout
support. - Consider a
dispatcher
behaviour and keyword to implement some kind of pooler.