Problem
Namespaces require semantics that will prepare us to work with distributed systems and allow us to do data migrations. So far, we have generated information systems with one unified namespace. Up to Ampersand version 5.0, the semantics of the INCLUDE statement is set union. To support data migration, we need to support three systems, one of which has an INCLUDE relation with the other two.
Requirements
Proposed solution
In issue #850 we decided to borrow Haskell's module mechanism, with one file for each module. Each file starts with a MODULE statement, so let's replace the CONTEXT statement from Ampersand with the MODULE statement. Without any INCLUDE statements, Ampersand compiles the entire file into one information system containing a dataset, a schema, and a set of interfaces. So it compiles a module called ${\tt bar}$ to a triple $\langle D_{\tt bar}, S_{\tt bar}, F_{\tt bar}\rangle$. With an INCLUDE statement, we need to define that every identifier in the included module is known in the including module under the prefix " ${\tt bar.}$ ". To define renaming, we need an operator $\downarrow$, used only for defining the semantics in the compiler: ${\tt x\downarrow y\ =\ x<>}$ "." ${\tt<>y}$
I will overload this operator to work for information systems, datasets, schemas, interface sets, and their constituent elements as well, meaning that $x\downarrow y$ prefixes the name $x$ together with a dot to every identifier in the namespace of $y$. For example, if $y$ contains the name client, then $x\downarrow y$ contains the name x.client wherever $y$ has a qualifying occurrence of client.
Let ${\tt foo}$ and ${\tt bar}$ be information systems. Each has a dataset, a schema, and zero or more interfaces.
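As a minimal sketch of the renaming operator on plain identifiers, assuming names are simple strings: the function names `qualify` and `qualify_all` are hypothetical and only illustrate the definition above, they are not part of the Ampersand compiler.

```python
from typing import Set


def qualify(prefix: str, name: str) -> str:
    """x ↓ y for a single identifier: the prefix, a dot, then the name."""
    return f"{prefix}.{name}"


def qualify_all(prefix: str, names: Set[str]) -> Set[str]:
    """The overloaded form: apply the prefix to every identifier in a set."""
    return {qualify(prefix, name) for name in names}


print(qualify("x", "client"))  # prints x.client
```

The same pattern would extend to datasets, schemas, and interface sets by mapping `qualify` over the names they contain.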
Let $D_{\tt foo}$ and $D_{\tt bar}$ be datasets. Let $S_{\tt foo}$ and $S_{\tt bar}$ be schemas. Let $F_{\tt foo}$ and $F_{\tt bar}$ be sets of interfaces. Now we can define the system ${\tt foo\ INCLUDES\ bar}$ as:
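One componentwise reading that is consistent with the renaming operator and with the disjoint-union semantics discussed in this issue is sketched here. The use of $\cup$ on each component is an assumption, not a quotation of the definition:

$$\langle D_{\tt foo}, S_{\tt foo}, F_{\tt foo}\rangle\ {\tt INCLUDES}\ \langle D_{\tt bar}, S_{\tt bar}, F_{\tt bar}\rangle\ =\ \langle D_{\tt foo}\cup({\tt bar}\downarrow D_{\tt bar}),\ S_{\tt foo}\cup({\tt bar}\downarrow S_{\tt bar}),\ F_{\tt foo}\cup({\tt bar}\downarrow F_{\tt bar})\rangle$$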
For the datasets, this means that all relation names and concept names in ${\tt bar}$ are prefixed with ${\tt bar}$. Atoms are left alone. In the schema of ${\tt bar}$, all rule names, relation names, concept names, pattern names, and view names are prefixed with ${\tt bar}$. All rule names, relation names, concept names, and interface names from $F_{\tt bar}$ are prefixed with ${\tt bar}$.
Surely, name clashes can occur. If, for example, system ${\tt foo}$ contains a name bar.account and ${\tt bar}$ contains a name account, the dataset $D_{\tt foo\ INCLUDES\ bar}$ has a name clash. We will forbid such clashes to ensure disjoint-union semantics.
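The dataset part of this can be sketched as follows, assuming a dataset is modelled as a mapping from relation names to sets of atom pairs. The `Dataset` type, `NameClash`, and `include_dataset` are hypothetical names chosen for illustration; the sketch prefixes relation names, leaves atoms untouched, and rejects any clash.

```python
from typing import Dict, Set, Tuple

# Assumed model: relation name -> set of (source atom, target atom) pairs.
Dataset = Dict[str, Set[Tuple[str, str]]]


class NameClash(Exception):
    """Raised when an included name collides with a name in the including system."""


def include_dataset(d_foo: Dataset, prefix: str, d_bar: Dataset) -> Dataset:
    # Prefix every relation name of the included dataset; atoms are left alone.
    renamed = {f"{prefix}.{rel}": set(pairs) for rel, pairs in d_bar.items()}
    # Forbid clashes so that the result is a disjoint union.
    clashes = sorted(d_foo.keys() & renamed.keys())
    if clashes:
        raise NameClash(", ".join(clashes))
    return {**d_foo, **renamed}
```

With this model, a `d_foo` that already holds a relation named `bar.account` cannot include a `bar` dataset that holds `account`, which is the clash described above.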
Alias
In the current implementation, two relation declarations with the same name, source, and target are treated as the same. I don't mind keeping this behaviour, but it does not work across the INCLUDE mechanism (because we forbid name clashes). I propose to do this explicitly with an ALIAS statement, for example:
ALIAS client, bar.client
This statement presumes that aliases have the same type, or else we get type errors. Needless to say, the ALIAS statement can also work inside one namespace. It is not linked to the INCLUDE mechanism. Aliasing works for concepts and relations, but not for other named entities.
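A rough sketch of how an alias table with a type check might behave, under the assumption that each name carries an opaque type tag that can be compared for equality. How types are compared across namespaces is left open here, and `add_alias`, `canonical`, and `AliasTypeError` are hypothetical names.

```python
from typing import Dict, Hashable


class AliasTypeError(Exception):
    """Raised when two names with different types are aliased."""


def canonical(aliases: Dict[str, str], name: str) -> str:
    """Follow alias links until a name without a further alias is reached."""
    while name in aliases:
        name = aliases[name]
    return name


def add_alias(types: Dict[str, Hashable], aliases: Dict[str, str],
              a: str, b: str) -> None:
    """Record ALIAS a, b after checking that both names have the same type."""
    if types[a] != types[b]:
        raise AliasTypeError(f"{a} and {b} have different types")
    root_a, root_b = canonical(aliases, a), canonical(aliases, b)
    if root_a != root_b:  # already the same entity otherwise
        aliases[root_b] = root_a
```

Nothing in the sketch depends on the dot in a name, which matches the remark that ALIAS also works inside one namespace.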
Consequences
This mechanism excludes cyclic INCLUDE-dependencies. I expect the proposed mechanism to meet the requirements of the migration mechanism, but I will leave that to @sjcjoosten to verify. I hope that this include-relation between information systems is transitive. If not, I would like to fix that, so we can draw an include-graph of the system.
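To illustrate the acyclicity requirement, here is a small sketch that walks an include-graph depth-first, rejects cycles, and returns an order in which every module comes after the modules it includes. The graph representation and the names `include_order` and `CyclicInclude` are assumptions made for the example.

```python
from typing import Dict, List


class CyclicInclude(Exception):
    """Raised when the include-graph contains a cycle."""


def include_order(includes: Dict[str, List[str]]) -> List[str]:
    """Return the modules so that each one follows everything it includes."""
    order: List[str] = []
    state: Dict[str, str] = {}  # module -> "visiting" or "done"

    def visit(module: str, path: List[str]) -> None:
        if state.get(module) == "done":
            return
        if state.get(module) == "visiting":
            raise CyclicInclude(" -> ".join(path + [module]))
        state[module] = "visiting"
        for dep in includes.get(module, []):
            visit(dep, path + [module])
        state[module] = "done"
        order.append(module)

    for module in includes:
        visit(module, [])
    return order
```

For the three-system migration case, a hypothetical graph `{"migration": ["old", "new"], "old": [], "new": []}` yields the order `old`, `new`, `migration`.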
If module ${\tt foo}$ includes module ${\tt bar}$, we currently implement both ${\tt foo}$ and ${\tt bar}$ on the same database. For distributed systems, we will have to allow them to be implemented on different databases. I suggest we do that in another issue.