Commit
Showing 186 changed files with 17,957 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
--- | ||
sidebar_position: 1 | ||
sidebar_label: Introduction | ||
--- | ||
|
||
# The Zed Project | ||
|
||
Zed offers a new approach to data that makes it easier to manipulate and manage | ||
your data. | ||
|
||
With Zed's new [super-structured data model](formats/README.md#2-zed-a-super-structured-pattern), | ||
messy JSON data can easily be given the fully-typed precision of relational tables | ||
without giving up JSON's uncanny ability to represent eclectic data. | ||
|
||
## Getting Started | ||
|
||
Trying out Zed is easy: just [install](install.md) the command-line tool | ||
[`zq`](commands/zq.md) and run through the [zq tutorial](tutorials/zq.md). | ||
|
||
`zq` is a lot like [`jq`](https://stedolan.github.io/jq/) | ||
but is built from the ground up as a search and analytics engine based | ||
on the [Zed data model](formats/zed.md). Since Zed data is a | ||
proper superset of JSON, `zq` also works natively with JSON. | ||
|
||
While `zq` and the Zed data formats are production quality, the Zed project's | ||
[Zed data lake](commands/zed.md) is a bit [earlier in development](commands/zed.md#status). | ||
|
||
For a non-technical user, Zed is as easy to use as web search | ||
while for a technical user, Zed exposes its technical underpinnings | ||
in a gradual slope, providing as much detail as desired, | ||
packaged up in the easy-to-understand | ||
[ZSON data format](formats/zson.md) and | ||
[Zed language](language/README.md). | ||
|
||
## Terminology | ||
|
||
"Zed" is an umbrella term that describes | ||
a number of different elements of the system: | ||
* The [Zed data model](formats/zed.md) is the abstract definition of the data types and semantics | ||
that underlie the Zed formats. | ||
* The [Zed formats](formats/README.md) are a family of | ||
[sequential (ZNG)](formats/zng.md), [columnar (VNG)](formats/vng.md), | ||
and [human-readable (ZSON)](formats/zson.md) formats that all adhere to the | ||
same abstract Zed data model. | ||
* A [Zed lake](commands/zed.md) is a collection of Zed data stored | ||
across one or more [data pools](commands/zed.md#data-pools) with ACID commit semantics and | ||
accessed via a [Git](https://git-scm.com/)-like API. | ||
* The [Zed language](language/README.md) is the system's dataflow language for performing | ||
queries, searches, analytics, transformations, or any of the above combined together. | ||
* A [Zed query](language/overview.md) is a Zed script that performs | ||
search and/or analytics. | ||
* A [Zed shaper](language/shaping.md) is a Zed script that performs | ||
data transformation to _shape_ | ||
the input data into the desired set of organizing Zed data types called "shapes", | ||
which are traditionally called _schemas_ in relational systems but are | ||
much more flexible in the Zed system. | ||
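
To make the idea of shaping more concrete, here is a minimal conceptual sketch in Python. It is not the Zed shaper or any Zed API: the `TARGET_SHAPE` table and the `shape` function are hypothetical names invented for this illustration, and the behavior (cast known fields, fill missing ones with `None`, keep extra fields at the end) is an assumption of the sketch rather than a statement of how Zed behaves.

```python
import ipaddress

# Hypothetical target shape: field name -> function that casts a raw value.
TARGET_SHAPE = {
    "ts": float,
    "src": ipaddress.ip_address,
    "port": int,
}


def shape(record, target):
    """Cast, fill, and order the fields of a loosely typed record.

    Fields named in `target` come first, in the target's order; a missing
    field is filled with None. Fields not named in `target` are kept
    unchanged after the shaped ones (an assumption of this sketch).
    """
    shaped = {}
    for name, cast in target.items():
        value = record.get(name)
        shaped[name] = None if value is None else cast(value)
    for name, value in record.items():
        if name not in target:
            shaped[name] = value
    return shaped
```

Calling `shape({"port": "80", "src": "10.0.0.1"}, TARGET_SHAPE)` turns the string `"80"` into an integer and the address string into an IP address object, which is the kind of typed precision that messy JSON input lacks.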
|
||
## Digging Deeper | ||
|
||
The [Zed language documentation](language/README.md) | ||
is the best way to learn about `zq` in depth. | ||
All of its examples use `zq` commands run on the command line. | ||
Run `zq -h` for a list of command options and online help. | ||
|
||
The [Zed lake documentation](commands/zed.md) | ||
is the best way to learn about `zed`. | ||
All of its examples use `zed` commands run on the command line. | ||
Run `zed -h` or `-h` with any subcommand for a list of command options | ||
and online help. The same Zed language query that works for `zq` operating | ||
on local files or streams also works for `zed query` operating on a lake. | ||
|
||
## Design Philosophy | ||
|
||
The design philosophy for Zed is based on composable building blocks | ||
built from self-describing data structures. Everything in a Zed lake | ||
is built from Zed data and each system component can be run and tested in isolation. | ||
|
||
Since Zed data is self-describing, this approach makes stream composition | ||
very easy. Data from a Zed query can trivially be piped to a local | ||
instance of `zq` by feeding the resulting Zed stream to stdin of `zq`, for example, | ||
``` | ||
zed query "from pool | ...remote query..." | zq "...local query..." - | ||
``` | ||
There is no need to configure the Zed entities with schema information | ||
like [protobuf configs](https://developers.google.com/protocol-buffers/docs/proto3) | ||
or connections to | ||
[schema registries](https://docs.confluent.io/platform/current/schema-registry/index.html). | ||
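
The following Python sketch illustrates, under simplifying assumptions, why a self-describing stream needs no external schema configuration. It is a toy line-oriented encoding, not the ZNG wire format: `write_stream`, `read_stream`, and the `typedef`/`type` message keys are all hypothetical names chosen for this example. The writer announces each record type in the stream the first time it appears, so the reader can decode every value using only what it has already read.

```python
import io
import json


def write_stream(records, out):
    """Write records, emitting each new record type inline before its first use."""
    type_ids = {}
    for rec in records:
        # A record's "type" here is just its ordered (field, Python type name) pairs.
        rec_type = tuple((k, type(v).__name__) for k, v in rec.items())
        if rec_type not in type_ids:
            type_ids[rec_type] = len(type_ids)
            typedef = {"typedef": type_ids[rec_type],
                       "fields": [list(f) for f in rec_type]}
            out.write(json.dumps(typedef) + "\n")
        value = {"type": type_ids[rec_type], "values": list(rec.values())}
        out.write(json.dumps(value) + "\n")


def read_stream(inp):
    """Decode records using only the type definitions found in the stream."""
    types = {}
    for line in inp:
        msg = json.loads(line)
        if "typedef" in msg:
            types[msg["typedef"]] = msg["fields"]
        else:
            fields = types[msg["type"]]
            yield {name: val for (name, _), val in zip(fields, msg["values"])}
```

Because the reader builds its type table from the stream itself, any two components that speak this encoding can be connected with a pipe, which is the property the pipeline above relies on.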
|
||
A Zed lake is completely self-contained, requiring no auxiliary databases | ||
(like the [Hive metastore](https://cwiki.apache.org/confluence/display/hive/design)) | ||
or other third-party services to interpret the lake data. | ||
Once a lake has been copied, a new service can be instantiated by pointing | ||
`zed serve` at the copy of the lake. | ||
|
||
Functionality like [data compaction](commands/zed.md#manage) and retention is entirely API-driven. | ||
|
||
Bite-sized components are unified by the Zed data, usually in the ZNG format: | ||
* All lake meta-data is available via meta-queries. | ||
* All lake operations available through the service API are also available | ||
directly via the `zed` command. | ||
* Lake management is agent-driven through the API. For example, instead of complex policies | ||
like data compaction being implemented in the core with some fixed set of | ||
algorithms and policies, an agent can simply hit the API to obtain the meta-data | ||
of the objects in the lake, analyze the objects (e.g., looking for too much | ||
key space overlap) and issue API commands to merge overlapping objects | ||
and delete the old fragmented objects, all with the transactional consistency | ||
of the commit log. | ||
* Components are easily tested and debugged in isolation. |
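
The agent-driven compaction described in the list above can be sketched roughly as follows. This is a conceptual Python illustration only: `DataObject`, `overlapping_groups`, and `plan_compaction` are hypothetical names, the key ranges are plain integers, and no real lake API is called. An actual agent would fetch object meta-data from the lake and submit the resulting plan through the API so that the merge and delete land in one commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    """Hypothetical stand-in for the meta-data of one data object in a pool."""
    id: str
    min_key: int
    max_key: int


def overlapping_groups(objects):
    """Return groups of two or more objects whose key ranges overlap transitively."""
    groups = []
    current = []
    current_max = None
    for obj in sorted(objects, key=lambda o: o.min_key):
        if current and obj.min_key <= current_max:
            # Range starts inside the running span, so it joins the group.
            current.append(obj)
            current_max = max(current_max, obj.max_key)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [obj]
            current_max = obj.max_key
    if len(current) > 1:
        groups.append(current)
    return groups


def plan_compaction(objects):
    """Build one merge-and-delete step per overlapping group.

    Each step names the objects to merge and the same fragmented
    objects to delete once the merged object has been written.
    """
    plan = []
    for group in overlapping_groups(objects):
        ids = [o.id for o in group]
        plan.append({"merge": ids, "delete": ids})
    return plan
```

Keeping the policy in an external agent like this means the overlap heuristic can be changed or replaced without touching the core, which only needs to expose meta-data and transactional commit operations.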
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Command Tooling | ||
|
||
The Zed system is managed and queried with the [`zed` command](zed.md), | ||
which is organized into numerous subcommands like the familiar command patterns | ||
of `docker` or `kubectl`. | ||
Built-in help for the `zed` command and all of its subcommands is always | ||
accessible with the `-h` flag. | ||
|
||
The [`zq` command](zq.md) offers a convenient slice of `zed` for running | ||
stand-alone, command-line queries on inputs from files, HTTP URLs, or [S3](../integrations/amazon-s3.md). | ||
`zq` is like [`jq`](https://stedolan.github.io/jq/) but is easier and faster, utilizes the richer | ||
Zed data model, and interoperates with a number of other formats beyond JSON. | ||
If you don't need a Zed lake, you can install just the | ||
slimmer `zq` command which omits lake support and dev tools. | ||
|
||
`zq` is always installed alongside `zed`. You might find yourself mixing and | ||
matching `zed` lake queries with `zq` local queries and stitching them | ||
all together with Unix pipelines. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
position: 3 | ||
label: Commands |