Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

This is a CLI companion to DataStation (a GUI) for running SQL queries against data files. So if you want the GUI version of this, check out DataStation.

Install

Binaries for amd64 (x86_64) are provided for each release.

macOS, Linux

On macOS or Linux, you can run the following:

$ VERSION=0.1.0
$ curl -LO "https://github.com/multiprocessio/dsq/releases/download/$VERSION/dsq-$(uname -s | awk '{ print tolower($0) }')-x64-$VERSION.zip"
$ unzip dsq*.zip
$ sudo mv dsq /usr/local/bin/dsq

Or install manually from the releases page, unzip and add dsq to your $PATH.

Windows

Download the latest Windows release, unzip it, and add dsq to your $PATH.

Manual

If you are on another platform or architecture or want to grab the latest release, you can do so with Go 1.17+:

$ go install github.com/multiprocessio/dsq@latest

Usage

You can either pipe data to dsq or you can pass a file name to it.

If you are passing a file, it must have the usual extension for its content type.

For example:

$ dsq testdata.json "SELECT * FROM {} WHERE x > 10"

Or:

$ dsq testdata.ndjson "SELECT name, AVG(time) FROM {} GROUP BY name ORDER BY AVG(time) DESC"

Piping data to dsq

When piping data to dsq you need to set the -s flag and specify the file extension or MIME type.

For example:

$ cat testdata.csv | dsq -s csv "SELECT * FROM {} LIMIT 1"

Or:

$ cat testdata.parquet | dsq -s parquet "SELECT COUNT(1) FROM {}"

Multiple files and joins

You can pass multiple files to DSQ. As long as they are supported data files in a valid format, you can run SQL against all files as tables. Each table can be accessed by the string {N} where N is the 0-based index of the file in the list of files passed on the commandline.

For example this joins two datasets of differing origin types (CSV and JSON).

$ dsq testdata/join/users.csv testdata/join/ages.json \
      "select {0}.name, {1}.age from {0} join {1} on {0}.id = {1}.id"

Transforming data to JSON without querying

As a shorthand for dsq testdata.csv "SELECT * FROM {}" to convert supported file types to JSON you can skip the query and the converted JSON will be dumped to stdout.

For example:

$ dsq testdata.csv
[{...some csv data...},{...some csv data...},...]

Supported Data Types

Name	File Extension(s)	Notes
CSV	`csv`
TSV	`tsv`, `tab`
JSON	`json`	Must be an array of objects. Nested object fields are ignored.
Newline-delimited JSON	`ndjson`, `jsonl`
Parquet	`parquet`
Excel	`xlsx`, `xls`	Currently only works if there is only one sheet.
ODS	`ods`	Currently only works if there is only one sheet.
Apache Error Logs	`text/apache2error`	Currently only works if being piped in.
Apache Access Logs	`text/apache2access`	Currently only works if being piped in.
Nginx Access Logs	`text/nginxaccess`	Currently only works if being piped in.

Engine

Under the hood dsq uses DataStation as a library and under that hood DataStation uses SQLite to power these kinds of SQL queries on arbitrary (structured) data.

Comparisons

The speed column is based on rough benchmarks based on q's benchmarks. Eventually I'll do a more thorough and public benchmark.

Name	Link	Speed	Supported File Types	Engine
q	http://harelba.github.io/q/	Fast	CSV, TSV	Uses SQLite
textql	https://github.com/dinedal/textql	Ok	CSV, TSV	Uses SQLite
octoql	https://github.com/cube2222/octosql	Slow	JSON, CSV, Excel, Parquet	Custom engine missing many features from SQLite
dsq	Here	Ok	CSV, TSV, JSON, Newline-delimited JSON, Parquet, Excel, ODS (OpenOffice Calc), Logs	Uses SQLite

Community

Join us on Discord.

How can I help?

Download the app and use it! Report bugs on Discord.

Before starting on any new feature though, check in on Discord!

Subscribe

If you want to hear about new features and how this works under the hood, sign up here.

License

This software is licensed under an Apache 2.0 license.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.github/workflows		.github/workflows
scripts		scripts
testdata		testdata
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
go.mod		go.mod
go.sum		go.sum
main.go		main.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Install

macOS, Linux

Windows

Manual

Usage

Piping data to dsq

Multiple files and joins

Transforming data to JSON without querying

Supported Data Types

Engine

Comparisons

Community

How can I help?

Subscribe

License

About

Releases

Packages

Languages

License

hackstrap/dsq

Folders and files

Latest commit

History

Repository files navigation

Commandline tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Install

macOS, Linux

Windows

Manual

Usage

Piping data to dsq

Multiple files and joins

Transforming data to JSON without querying

Supported Data Types

Engine

Comparisons

Community

How can I help?

Subscribe

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages