An escript that generates protobuffers and Erlang CRUDL (Create, Read, Update, Delete, List/Lookup) code based on your relational data model. The tool will generate code that supports either records or maps, and it supports the Erlang gpb library, though gpb is not required. Currently, only Postgres is supported.
Aside from simple CRUDL, this tool allows you to generate functions based on any standard SQL query, lookups, and transformations. The custom mappings and transformations are very powerful.
Please look at the example directory for an example schema and configuration file that generate not just simple CRUD, but search/lookup operations based on indexed fields, as well as custom mapping queries, including how to use transformations to work with PostGIS when you want to transform lat/lon to geography columns.
Make sure you have a minimum of Erlang/OTP 23 installed. Currently, this has only been tested on macOS.
Using Homebrew:
brew install erlang
Using MacPorts:
sudo port install erlang
Most operating systems have pre-built Erlang distributions in their package management systems. For Ubuntu/Debian:
sudo apt-get update
sudo apt-get install erlang
For Fedora:
sudo yum install erlang
For FreeBSD:
sudo pkg update
sudo pkg install erlang
Please follow the getting started instructions.
If you are setting up docker for the first time on Ubuntu using snap, then follow these instructions. Due to confinement issues on snappy, some manual setup is required to make docker-snap work on your machine.
Using Homebrew:
brew install --cask docker
open /Applications/Docker.app
On Ubuntu classic, before installing the docker snap, please run the following commands to add the login user to the docker group.
sudo addgroup --system docker
sudo adduser $USER docker
newgrp docker
Once installed, you can then manage the services:
sudo snap services
To start the docker service:
sudo snap start docker
First, start the docker image in the example subdirectory, and then create the proto_crudl role and database with create database and superuser privileges and the password proto_crudl:
cd example
docker-compose up
Now log in locally as the proto_crudl user to Postgres running in the container:
psql -h localhost -p 5432 -U proto_crudl
or you can try...
docker run -it --link some-postgis:postgres --rm postgres sh -c 'exec psql -h "$POSTGRES_PORT_5432_TCP_ADDR" -p "$POSTGRES_PORT_5432_TCP_PORT" -U proto_crudl'
This helper script will allow you to rapidly drop and recreate your database. It expects the role and database proto_crudl to exist:
bin/reset_db.sh
From the top level directory:
rebar3 escriptize
cd example
bin/generate.sh
or directly:
_build/default/bin/proto_crudl <config>
Several of the eunit tests rely on the presence of the example database. Currently unit and functional tests are commingled.
rebar3 eunit
Note: when rerunning eunit tests after generating the example code, you must reset the database, otherwise the version columns may still be present and cause the tests to fail.
The project includes an example schema and build scripts located in the example directory. You will find the build scripts in example/bin to create or reset the example database, as well as to generate the code from the example schema, which is located in example/database. The schema was generated using DbSchema.
PLEASE NOTE: You will see errors and warnings in the output. This is intentional, as the example schema includes a lot of corner cases, like a table without a primary key and unsupported Postgres types.
The example code also contains tests that exercise the results of the generated code. They are located in example/apps/test/example_test.erl and do a full CRUDL test. To run these tests:
cd example
rebar3 eunit
Please refer to these tests to better understand how to use the generated code.
Currently, I have not had time to write a GitHub pull script for the escript executable artifact: _build/default/bin/proto_crudl. At this time, I manually copy the file to my project, so it is not build friendly just yet.
This is important to read. One of the most significant disjointed aspects of using Erlang records that are mapped to protobuffers is how timestamps are supported. Protobuffers have only a timestamp data type, while Erlang and relational databases support multiple datetime data types, including date and timestamp.
Since proto_crudl is a framework that generates Erlang code that maps between a relational database, Erlang maps or records, and protobuffers, we made a conscious choice to err on the side of native support. If you are using the default gpb generated records, it presumes that all datetime fields are timestamps. This means that proto_crudl will treat datetimes as native Erlang data types (i.e. {{Year, Month, Day}, {Hour, Minute, Seconds}} and {Year, Month, Day}). When serializing a protobuffer across the wire with gpb, you must call to_proto/1 and from_proto/1 to encode and decode to a google.protobuf.Timestamp.
There are built-in helper functions called ts_encode/1, ts_decode/1, date_encode/1, and date_decode/1 for operating on the record values natively in Erlang.
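As a hypothetical illustration of how these pieces fit together (the module layout, the User variable, and the exact direction of the helpers are assumptions here, not verified generated output):

%% Sketch only: assumes the helpers are reachable from the generated module
%% and that ts_encode/1 maps a native datetime tuple to a
%% google.protobuf.Timestamp value, with ts_decode/1 as its inverse.
Native = {{2021, 6, 1}, {12, 30, 0}},
Ts = test_schema_user_db:ts_encode(Native),    %% native -> google.protobuf.Timestamp
Native = test_schema_user_db:ts_decode(Ts),    %% asserts the round trip back to the tuple

%% Before putting a whole row on the wire with gpb, convert its datetimes
%% (User is a row previously fetched via the generated module):
Proto = test_schema_user_db:to_proto(User),    %% datetimes -> Timestamp messages
User2 = test_schema_user_db:from_proto(Proto). %% Timestamp messages -> native tuples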
You will want to look at example/config/proto_crudl.config as a guide for your own config. It gives a complete example with inline documentation of the current functionality of the tool.
Almost all of the configs follow the pattern of a list of tuples, where the first element of the tuple is the table and the second element is a list, as the transforms example below illustrates.
The more complex feature of proto_crudl is the ability to apply transformations or SQL functions. If you are using PostGIS, or need to apply other functions, you will need to use this feature:
{transforms, [
{"test_schema.user", [
{insert, [{"geog", "ST_POINT($lon, $lat)::geography"}]},
{update, [{"geog", "ST_POINT($lon, $lat)::geography"}]},
{select, [{"lat", "ST_Y(geog::geometry)"},
{"lon", "ST_X(geog::geometry)"}]}]},
{"public.foo", [
{select, [{"foobar", "1"}]}]}
]}.
This follows the same pattern of a list of tables, each with a list of tuples. In the case of converting a lat and lon to a geography, you must define, for each of the insert/create, update, and select/read operations, how the column values will be handled to and from the database. The result is that the columns lat and lon will be generated as virtual columns in the mapping. Note that when referencing them in the function body (the second element of the tuple), you will need to prepend them with $ so that proto_crudl knows they are the virtual columns being operated on.
For the insert operation, a single tuple is defined, which results in the extension function ST_POINT($lon, $lat)::geography being applied to the bind values of the INSERT statement, producing the following code:
-define(INSERT, "INSERT INTO test_schema.user (first_name, last_name, email, user_token, enabled, aka_id, my_array, user_type, number_value, created_on, updated_on, due_date, version, geog) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, 0, ST_POINT($14, $13)::geography) RETURNING user_id, first_name, last_name, email, user_token, enabled, aka_id, my_array, user_type, number_value, created_on, updated_on, due_date, version, ST_Y(geog::geometry) AS lat, ST_X(geog::geometry) AS lon").
and
create(M = #{lon := Lon, lat := Lat, due_date := DueDate, updated_on := UpdatedOn, created_on := CreatedOn, number_value := NumberValue, user_type := UserType, my_array := MyArray, aka_id := AkaId, enabled := Enabled, user_token := UserToken, email := Email, last_name := LastName, first_name := FirstName}) when is_map(M) ->
Params = [FirstName, LastName, Email, UserToken, Enabled, AkaId, MyArray, user_type_value(UserType), NumberValue, ts_decode_map(CreatedOn), ts_decode_map(UpdatedOn), DueDate, Lat, Lon],
case pgo:query(?INSERT, Params, #{decode_opts => [{return_rows_as_maps, true}, {column_name_as_atom, true}, {decode_fun, fun decode_row/2}]}) of
#{command := insert, num_rows := 0} ->
{error, failed_to_insert};
#{command := insert, num_rows := 1, rows := [Row]} ->
{ok, Row};
{error, {pgsql_error, #{code := <<"23505">>}}} ->
{error, exists};
{error, Reason} ->
{error, Reason}
end;
create(M = #{lon := Lon, lat := Lat, due_date := DueDate, updated_on := UpdatedOn, created_on := CreatedOn, number_value := NumberValue, user_type := UserType, my_array := MyArray, aka_id := AkaId, email := Email, last_name := LastName, first_name := FirstName}) when is_map(M) ->
Params = [FirstName, LastName, Email, AkaId, MyArray, user_type_value(UserType), NumberValue, ts_decode_map(CreatedOn), ts_decode_map(UpdatedOn), DueDate, Lat, Lon],
case pgo:query(?INSERT_DEFAULTS, Params, #{decode_opts => [{return_rows_as_maps, true}, {column_name_as_atom, true}, {decode_fun, fun decode_row/2}]}) of
#{command := insert, num_rows := 0} ->
{error, failed_to_insert};
#{command := insert, num_rows := 1, rows := [Row]} ->
{ok, Row};
{error, {pgsql_error, #{code := <<"23505">>}}} ->
{error, exists};
{error, Reason} ->
{error, Reason}
end;
create(_M) ->
{error, invalid_map}.
I would recommend that you build the example project and then review the generated code for test_schema_user_db.erl to get a better understanding.
proto_crudl generates all the Create, Read, Update, Delete, and List/Lookup functions based on the table schema while supporting several key features. It is important to note that guards are needed to keep from accidentally updating foreign key relationships. To this end, the update function does not include any foreign key fields; rather, it generates a specific update function for each foreign key.
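For example, for the aka_id foreign key on test_schema.user, one would expect the generated module to carry a dedicated setter along these lines (a sketch of the shape only; the actual generated name, SQL, and return values may differ):

%% Hypothetical shape of a generated foreign-key update function.
update_aka_id(UserId, AkaId) ->
    case pgo:query("UPDATE test_schema.user SET aka_id = $2 WHERE user_id = $1",
                   [UserId, AkaId]) of
        #{command := update, num_rows := 1} -> ok;
        #{command := update, num_rows := 0} -> notfound;
        {error, Reason} -> {error, Reason}
    end.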
The proto_crudl framework implements all the necessary code to support handling stale changes to a record with a version column.
{options, [{version_column, "version"}, indexed_lookups, check_constraints_as_enums]},
In this example, the column is called version. When a table has this column, proto_crudl will generate code that automatically handles updating the column value on a successful update, as well as guards against updates from a stale record. If the current value of version is 100 and you attempt to update using a map or record that has a version value of 90, the update will return notfound; otherwise, update returns {ok, Map} or {ok, Record} (depending on how you generated your code).
If the column is not present, the tool will automatically inject the column into the table by performing an ALTER TABLE ...
In our example database, the user table has a column named version.
-define(UPDATE, "UPDATE test_schema.user SET first_name = $2, last_name = $3, email = $4, user_token = $5, enabled = $6, aka_id = $7, my_array = $8, user_type = $9, number_value = $10, created_on = $11, updated_on = $12, due_date = $13, version = version + 1, geog = ST_POINT($16, $15)::geography WHERE user_id = $1 AND version = $14 RETURNING user_id, first_name, last_name, email, user_token, enabled, aka_id, my_array, user_type, number_value, created_on, updated_on, due_date, version, ST_Y(geog::geometry) AS lat, ST_X(geog::geometry) AS lon").
This results in the following code generation and logic for updates. Please note that the INSERT SQL and code are also different.
update(M = #{lon := Lon, lat := Lat, version := Version, due_date := DueDate, updated_on := UpdatedOn, created_on := CreatedOn, number_value := NumberValue, user_type := UserType, my_array := MyArray, aka_id := AkaId, enabled := Enabled, user_token := UserToken, email := Email, last_name := LastName, first_name := FirstName, user_id := UserId}) when is_map(M) ->
Params = [UserId, FirstName, LastName, Email, UserToken, Enabled, AkaId, MyArray, user_type_value(UserType), NumberValue, ts_decode_map(CreatedOn), ts_decode_map(UpdatedOn), DueDate, Version, Lat, Lon],
case pgo:query(?UPDATE, Params, #{decode_opts => [{return_rows_as_maps, true}, {column_name_as_atom, true}, {decode_fun, fun decode_row/2}]}) of
#{command := update, num_rows := 0} ->
notfound;
#{command := update, num_rows := 1, rows := [Row]} ->
{ok, Row};
{error, Reason} ->
{error, Reason}
end;
update(_M) ->
{error, invalid_map}.
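A usage sketch of the optimistic-locking flow (the read call and its signature are assumptions; update/1 is the generated function shown above):

%% Sketch: handle a stale write by surfacing a conflict to the caller.
UserId = 1,
{ok, User} = test_schema_user_db:read(#{user_id => UserId}),   %% hypothetical read signature
case test_schema_user_db:update(User#{first_name := <<"Ada">>}) of
    {ok, Updated} ->
        {ok, Updated};            %% success: version was incremented server-side
    notfound ->
        {error, stale_version};   %% a concurrent writer bumped version first, or the row is gone
    {error, Reason} ->
        {error, Reason}
end.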
Finally, proto_crudl uses erleans/pgo for its Postgres connectivity (Postgres is currently the only supported database). If you want the pgo client to return UUIDs as binary strings, set an application environment variable or add the following to your sys.config:
{pg_types, [{uuid_format, string}]},
{pgo, [{pools, [{default, #{pool_size => 10,
host => "127.0.0.1",
database => "proto_crudl",
user => "proto_crudl",
password => "proto_crudl"}}]}]}
Size the pool according to your requirements.
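If you prefer starting the pool programmatically rather than via sys.config, pgo supports that as well; a minimal sketch using the same settings as above:

%% Sketch: start the default pool at runtime instead of via sys.config.
{ok, _} = application:ensure_all_started(pgo),
_ = pgo:start_pool(default, #{pool_size => 10,
                              host => "127.0.0.1",
                              database => "proto_crudl",
                              user => "proto_crudl",
                              password => "proto_crudl"}),
%% quick smoke test; the exact row shape depends on your decode options
#{command := select} = pgo:query("SELECT 1").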
proto_crudl will generate a .proto file that maps to each table in the schema (unless explicitly excluded). The package for each proto will be the schema the table is located in. The .proto files will be generated in the output directory specified in the config file, with the proto-to-table mappings written to a subdirectory that corresponds to the schema the table is in.
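For instance, the example test_schema.user table would yield something along these lines (a hypothetical abbreviation to show the package mapping; field numbers, types, and names are illustrative, not the exact generated output):

// Hypothetical sketch of a generated test_schema/user.proto
syntax = "proto3";
package test_schema;

import "google/protobuf/timestamp.proto";

message User {
  uint64 user_id = 1;
  string first_name = 2;
  string last_name = 3;
  string email = 4;
  google.protobuf.Timestamp created_on = 5;
  double lat = 6;   // virtual column from the PostGIS transform
  double lon = 7;   // virtual column from the PostGIS transform
  // ... remaining columns elided ...
}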
Relational databases are inherently namespaced by schema. Since Erlang has no concept of namespaces, when supporting multiple schemas the module name will be the name of the table prefixed with the schema name (if set in the config). Note that protobuffers are correctly generated such that the schema maps to a package.
It is recommended that you download and install the protocol buffer compiler, which is necessary to compile to other languages such as Java, Objective-C, or even Go. If you are new to protocol buffers, start by reading the developer docs.
It is important to note that the default configuration is to use gpb to compile and support protobuffers in Erlang. PLEASE NOTE: A bug has been filed against gpb where protos of the same name in different packages will get overwritten, since there is no namespacing.
You will need to copy the proto_crudl escript to your project. Next, create your proto_crudl.config; you can simply copy and modify the one in the repo. Currently, I have a manual generate script that I keep in the top-level bin directory:
#!/usr/bin/env bash
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
"$DIR"/proto_crudl "$DIR"/../config/proto_crudl.config
proto_crudl supports code generation using both maps and records. It is important to note that there are several issues with using maps with gpb and pgo, which do not align very well when used together.
First, pgo favors maps and, as expected, uses the atom null for NULL values. Unfortunately, gpb seems to favor records when serializing and deserializing; this is likely a bug. For now, proto_crudl generates code based on each library implementation's support for maps and records. When using the maps config option with gpb, fields that are explicitly undefined are not handled properly and cause an exception as the library attempts to serialize them. The gpb generated code for records does not have this issue; as expected, undefined is treated as missing.
Next, pgo does not handle query parameters that have undefined present. While this is strictly correct, it breaks when trying to map between the code generated by gpb and the query parameters expected by pgo. This project includes a forked version of pgo that is more lenient and allows undefined to also mean null. This has not been committed back to the original project yet.
Because of the subtle misalignment between the libraries, using maps requires the use of to_proto/1 and from_proto/1 to correctly convert serialized protobuffers as maps into input values for queries. It also means that the developer has to handle null values coming from pgo (and ultimately proto_crudl) and remember to convert them to undefined when serializing. Maps are very flexible, but also very problematic.
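For example, when working in maps mode you end up writing glue like the following (a sketch; drop_nulls/1 is a hypothetical helper you would write yourself, not generated code). Dropping the keys outright sidesteps gpb's maps-mode issue with explicitly-undefined fields:

%% Hypothetical helper: strip pgo's null values before serializing with gpb
%% in maps mode (explicitly-undefined fields would raise there).
drop_nulls(Map) when is_map(Map) ->
    maps:filter(fun(_K, V) -> V =/= null end, Map).

%% Usage sketch, where Row is a map returned by a generated query:
Proto = test_schema_user_db:to_proto(drop_nulls(Row)).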