Releases: stripe-archive/sequins
v1.0.0
This release represents a major rewrite of sequins. We've been running this new code internally for more than six months, and we think it's in pretty good shape!
The main difference is that sequins can now run as a cluster, partitioning and replicating data across multiple machines. There are lots of other changes, as well, including:
- A better on-disk format, including on-disk compression
- File-based configuration, with lots of configuration options
- A web interface with status information, and lots of monitoring hooks
- A comprehensive manual
To say the least, this release is highly backwards incompatible with sequins 0.4.x. Moreover, input data that was organized for 0.4.x needs to be organized slightly differently in S3 or HDFS. You can read more about the new source root layout here.
Additionally, since v1.0.0rc1, a few issues have been fixed:
v1.0.0rc1
This is the first draft release of v1.0, a major rewrite of sequins. We've been running this new code internally for more than six months, and we think it's in pretty good shape!
The main difference is that sequins can now run as a cluster, partitioning and replicating data across multiple machines. There are lots of other changes, as well, including:
- A better on-disk format, including on-disk compression
- File-based configuration, with lots of configuration options
- A web interface with status information, and lots of monitoring hooks
- A comprehensive manual
To say the least, this release is highly backwards incompatible with sequins 0.4.x. Moreover, input data that was organized for 0.4.x needs to be organized slightly differently in S3 or HDFS. You can read more about the new source root layout here.
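To make the idea of partitioning and replicating data across machines concrete, here is a minimal sketch. It is not sequins' actual scheme: the CRC32 hash, the consecutive-node placement, and the names `partition_for`, `replicas_for`, and `node-a` through `node-c` are all assumptions chosen for illustration.

```python
import zlib

def partition_for(key, num_partitions):
    """Map a key (bytes) to a partition using a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key) % num_partitions

def replicas_for(partition, nodes, replication):
    """Place a partition on `replication` consecutive nodes, wrapping around the list."""
    count = min(replication, len(nodes))
    return [nodes[(partition + i) % len(nodes)] for i in range(count)]

# Hypothetical three-node cluster with 8 partitions and 2 copies of each.
nodes = ["node-a", "node-b", "node-c"]
partition = partition_for(b"Alice", 8)
owners = replicas_for(partition, nodes, 2)
```

With a layout like this, any node can compute which peers hold a key, and losing one machine still leaves a second copy of every partition.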
v0.4.7
This minor release fixes a bug in the previous release - downloading data to temporary directories before renaming.
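The pattern named above, downloading into a temporary directory and then renaming it into place, can be sketched roughly as follows. This is an illustration rather than sequins' own code: a local file copy stands in for the S3 or HDFS download, the source is assumed to be a flat directory of files, the destination is assumed not to exist yet, and `fetch_into_place` is a hypothetical name.

```python
import os
import shutil
import tempfile

def fetch_into_place(source_dir, final_dir):
    """Copy source_dir into a temporary sibling of final_dir, then rename it into place."""
    parent = os.path.dirname(os.path.abspath(final_dir))
    os.makedirs(parent, exist_ok=True)
    # Stage next to the destination so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".tmp-", dir=parent)
    try:
        for name in os.listdir(source_dir):
            shutil.copy2(os.path.join(source_dir, name), os.path.join(staging, name))
        # Readers only ever see final_dir once every file is present.
        os.rename(staging, final_dir)
    except BaseException:
        # Clean up the partial download instead of leaving it behind.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final_dir
```

The point of the rename is that a crash mid-download leaves only a stray temporary directory, never a half-populated dataset under its final name.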
It also includes a new utility, sequins-dump, to dump out the data in sequencefiles. You use it like this:
$ sequins-dump test_data/0/
Alice Practice
Bob Hope
Charlie Horse
It also has options to dump any combination of offsets, keys, and values:
$ sequins-dump --help
usage: sequins-dump [<flags>] <PATH>

Flags:
      --help     Show help (also see --help-long and --help-man).
  -o, --offsets  Display offsets.
  -k, --keys     Display keys.
  -v, --values   Display values.
      --version  Show application version.
v0.4.6
v0.4.5
v0.4.4
v0.4.3
This minor release brings two small changes:
- A bug has been fixed where sequins would keep indexed files open long after it had switched to a new version.
- Sequins now writes out a .manifest file once it has fully indexed a dataset, and it can load from that on startup instead of indexing everything all over again.
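The manifest behavior described above amounts to caching the result of indexing next to the data. A minimal sketch of that idea follows; only the .manifest file name comes from these notes. The JSON layout, the file-size "index", and the function names are assumptions, not the real on-disk format.

```python
import json
import os

MANIFEST_NAME = ".manifest"  # name from the release notes; the contents are an assumption

def build_index(dataset_dir):
    """Stand-in for the expensive indexing pass: record each data file's size."""
    return {
        name: os.path.getsize(os.path.join(dataset_dir, name))
        for name in sorted(os.listdir(dataset_dir))
        if not name.startswith(".")
    }

def load_or_build_index(dataset_dir):
    """Return (index, loaded_from_manifest); write the manifest after a full build."""
    manifest_path = os.path.join(dataset_dir, MANIFEST_NAME)
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            return json.load(f), True
    index = build_index(dataset_dir)
    # Write under a temporary name first so a crash never leaves a partial manifest.
    tmp_path = manifest_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(index, f)
    os.replace(tmp_path, manifest_path)
    return index, False
```

Because the manifest is written only after indexing completes, its presence on startup is enough to tell that the dataset was fully indexed before.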
v0.4.2
v0.4.1
v0.4.0
This release comes with some performance optimizations, as well as some code cleanup. The two major changes are:
- Reading SequenceFiles now causes significantly less heap bloat; this should reduce memory usage for large datasets considerably.
- Indexing now happens in serial, instead of in parallel - this means it's slightly slower, but thrashes the disk a lot less.
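The trade-off between serial and parallel indexing can be illustrated with a small sketch. It is not sequins' indexing code: `index_file` merely reads a file and counts its bytes as a stand-in for real indexing, and all three function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def index_file(path):
    """Stand-in for indexing one file: read it sequentially and count its bytes."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(1 << 20):
            total += len(chunk)
    return total

def index_serial(paths):
    """Index files one after another, so the disk serves one sequential read at a time."""
    return {path: index_file(path) for path in paths}

def index_parallel(paths, workers=4):
    """Index files concurrently; `paths` must be a list. Reads compete for the disk."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(index_file, paths)))
```

Both functions return the same result; the difference is only in the access pattern. On spinning disks, several concurrent readers force extra seeking, which is the thrashing the serial version avoids.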