Releases: TIBCOSoftware/snappydata
Project SnappyData 1.3.1 Release
The SnappyData team is pleased to announce the availability of version 1.3.1 of the platform. The release artifacts are listed at the end.
The major change in this release is the move from log4j 1.x to log4j 2.x (2.17.2) for all the components. The Spark connector component still allows usage of log4j 1.x for compatibility with upstream Spark releases, but SnappyData and its Spark distribution neither include nor depend on any log4j 1.x components.
Prior to release 1.3.0 there were two editions: the Community Edition, a fully functional core OSS distribution under the Apache License v2.0, and the Enterprise Edition, sold by TIBCO Software under the name TIBCO ComputeDB™, which included everything offered in the OSS version along with additional closed-source capabilities available only as part of a licensed subscription. Since the release of OSS 1.3.0, all the platform's private modules have been made open source (under the Apache License v2.0) apart from the streaming GemFire connector (which depends on non-OSS Pivotal GemFire product jars).
The release notes are available here. The full set of documentation is here.
Update: Hotfix-1 release
A hotfix release with version 1.3.1-HF-1 was released subsequently to address a late-breaking issue:
SDSNAP-842: ComputeDB failed to start due to some errors
This issue was resolved by porting patches for the following issues from Apache Geode:
- GEODE-8029: IllegalArgumentException: Too large (805306401 expected elements with load factor 0.75)
- Other related Oplog fixes: GEODE-1969, GEODE-5302, GEODE-8667, GEODE-9881, GEODE-9854
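The numbers in the GEODE-8029 message are consistent with a hash table that rounds its backing array up to a power of two and refuses to grow past 2^30 slots. The sketch below works through that arithmetic. The sizing rule and the `MAX_SLOTS` cap are assumptions made for illustration, since the release notes do not say which code raises the error.

```python
import math

# Assumed cap on the backing array: 2**30 slots (an illustration, not a
# value taken from the release notes).
MAX_SLOTS = 1 << 30


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def required_slots(expected: int, load_factor: float) -> int:
    """Slots needed to hold `expected` entries within the load factor."""
    return max(2, next_power_of_two(math.ceil(expected / load_factor)))


def fits(expected: int, load_factor: float = 0.75) -> bool:
    """True when the table for `expected` entries stays within the cap."""
    return required_slots(expected, load_factor) <= MAX_SLOTS


largest_ok = int(MAX_SLOTS * 0.75)  # 805306368 entries fill 2**30 slots
reported = 805306401                # value quoted in the GEODE-8029 title
```

Under these assumptions the reported count is 33 entries over `largest_ok`, so the next power of two would be 2^31 slots and the request is rejected.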
Only the product binary has changed; the rest of the artifacts are unchanged.
If you have not installed the 1.3.1 release tarball, it is recommended to install the hotfix snappydata-1.3.1-HF-1-bin.tar.gz instead.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.3.1-HF-1-bin.tar.gz | Full product binary for 1.3.1 hotfix-1 (includes Hadoop 3.2.0) - recommended. |
snappydata-1.3.1-bin.tar.gz | Full product binary for 1.3.1 release (includes Hadoop 3.2.0). |
snappydata-jdbc_2.11-1.3.1.jar | JDBC client driver and push down JDBC data source for Spark. Compatible with Java 8, Java 11 and higher. |
snappydata-odbc_1.3.0_win64.zip | 32-bit and 64-bit ODBC client drivers from 1.3.0 release for Windows 64-bit platform. |
snappydata-spark-connector_2.11-1.3.1.jar | The single jar needed in Smart Connector mode; an alternative to --packages option. Compatible with Spark versions 2.1.1, 2.1.2 and 2.1.3 and with the Spark distribution included in SnappyData. |
snappydata-zeppelin_2.11-0.8.2.1.jar | The Zeppelin interpreter jar for SnappyData compatible with Apache Zeppelin 0.8.2. The standard jdbc interpreter is now recommended instead of this. See How to Use Apache Zeppelin with SnappyData. |
snappydata-1.3.1.sha256 | The SHA256 checksums of the product artifacts. On Linux verify using sha256sum --check snappydata-1.3.1.sha256 . |
snappydata-1.3.1.sha256.asc | PGP signature for snappydata-1.3.1.sha256 in ASCII format. Get the public key using gpg --keyserver hkps://keyserver.ubuntu.com --recv-keys A7994CE77A24E5511A68727D8CED09EB8184C4D6 . Then verify using gpg --verify snappydata-1.3.1.sha256.asc which should show a good signature using that key having [email protected] as the email. |
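Where sha256sum is not available (for example on Windows), the checksum check described in the table can be approximated in Python. This is a minimal sketch, assuming the checksum file uses the usual sha256sum layout of one `<hex digest>  <file name>` entry per line and that the artifacts sit in the same directory. The `verify_checksums` helper is illustrative and not part of SnappyData, and it does not replace the PGP signature check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file: Path) -> dict:
    """Map each listed artifact to 'OK', 'FAILED' or 'MISSING'.

    Assumes the sha256sum layout: '<hex digest>  <file name>' per line
    (an optional '*' before the name marks binary mode).
    """
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")
        artifact = checksum_file.parent / name
        if not artifact.exists():
            results[name] = "MISSING"
        elif sha256_of(artifact) == expected.lower():
            results[name] = "OK"
        else:
            results[name] = "FAILED"
    return results
```

Calling `verify_checksums(Path("snappydata-1.3.1.sha256"))` from the download directory returns one status per artifact; files that were not downloaded are reported as MISSING rather than treated as failures.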
Project SnappyData 1.3.0 Release
The SnappyData team is pleased to announce the availability of version 1.3.0 of the platform. The release artifacts are listed at the end.
In previous releases there were two editions: the Community Edition, a fully functional core OSS distribution under the Apache License v2.0, and the Enterprise Edition, sold by TIBCO Software under the name TIBCO ComputeDB™, which included everything offered in the OSS version along with additional closed-source capabilities available only as part of a licensed subscription.
The SnappyData team is pleased to announce that starting from this release, all the platform's private modules have been made open source (under the Apache License v2.0) apart from the streaming GemFire connector (which depends on non-OSS Pivotal GemFire product jars). These include the Approximate Query Processing (AQP) and JDBC connector repositories, which also include the off-heap storage support for column tables and the security modules. In addition, the ODBC driver has also been made open source. With this, the entire code base of the platform (apart from the GemFire connector) is open source and there is no longer an Enterprise Edition distributed by TIBCO.
The release notes are available here. The full set of documentation is here.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.3.0-bin.tar.gz | Full product binary (includes Hadoop 3.2.0). |
snappydata-jdbc_2.11-1.3.0.jar | JDBC client driver and push down JDBC data source for Spark. Compatible with Java 8, Java 11 and higher. |
snappydata-core_2.11-1.3.0.jar | The single jar needed in Smart Connector mode; an alternative to --packages option. Compatible with Spark versions 2.1.1, 2.1.2 and 2.1.3. |
snappydata-odbc_1.3.0_win64.zip | 32-bit and 64-bit ODBC client drivers for Windows 64-bit platform. |
snappydata-zeppelin_2.11-0.8.2.1.jar | The Zeppelin interpreter jar for SnappyData compatible with Apache Zeppelin 0.8.2. The standard jdbc interpreter is now recommended instead of this. See How to Use Apache Zeppelin with SnappyData. |
snappydata-1.3.0.sha256 | The SHA256 checksums of the product artifacts. On Linux verify using sha256sum --check snappydata-1.3.0.sha256 . |
snappydata-1.3.0.sha256.asc | PGP signature for snappydata-1.3.0.sha256 in ASCII format. Get the public key using gpg --keyserver hkps://keyserver.ubuntu.com --recv-keys A7994CE77A24E5511A68727D8CED09EB8184C4D6 . Then verify using gpg --verify snappydata-1.3.0.sha256.asc which should show a good signature using that key having [email protected] as the email. |
Project SnappyData Community Edition 1.2.0 Release
The SnappyData team is pleased to announce the availability of version 1.2.0 of the platform. The release artifacts of its Community Edition are listed below.
You can also download the TIBCO ComputeDB Enterprise Edition from here.
The release notes are available here.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.2.0-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
zeppelin-0.8.2-snappydata-1.2.0.zip | Apache Zeppelin distribution with sample SnappyData notebooks |
snappydata-jdbc_2.11-1.2.0.jar | Client (JDBC) JAR |
snappydata-core_2.11-1.2.0.jar | The single jar needed in Smart Connector mode; an alternative to --packages option |
snappydata_1.2.0.md5 | The MD5 checksum of the product artifacts |
An EC2 script to launch a SnappyData cluster on AWS EC2 instances is available here.
Project SnappyData Community Edition 1.1.1 Release
The SnappyData team is pleased to announce the availability of version 1.1.1 of the platform. The release artifacts of its Community Edition are listed below.
You can also download the TIBCO ComputeDB Enterprise Edition from here.
The release notes are available here.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.1.1-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-jdbc_2.11-1.1.1.jar | Client (JDBC) JAR |
snappydata_1.1.1.md5 | The MD5 checksum of the product artifacts |
An EC2 script to launch a SnappyData cluster on AWS EC2 instances is available here.
Project SnappyData Community Edition 1.1.0 Release
The SnappyData team is pleased to announce the availability of version 1.1.0 of the platform. The release artifacts of its Community Edition are listed below.
You can also download the TIBCO ComputeDB Enterprise Edition from here.
The release notes are available here.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.1.0-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-jdbc_2.11-1.1.0.jar | Client (JDBC) JAR |
snappydata_1.1.0.md5 | The MD5 checksum of the product artifacts |
An EC2 script to launch a SnappyData cluster on AWS EC2 instances is available here.
SnappyData OSS 1.0.2.1 Release
The SnappyData team is pleased to announce the availability of version 1.0.2.1 of the platform. You can find the release artifacts of its Community Edition towards the end of this page.
Note: A new version (1.0.2.2) of the JDBC Client jar has been released with a fix for an issue seen when it is used with Apache Spark 2.3.1 or later. Users are advised to use the latest JDBC Client jar instead of the older one.
You can also download the Enterprise Edition here. The following table summarizes the features available in Enterprise and OSS editions.
Feature | Community | Enterprise |
---|---|---|
Mutable Row & Column Store | X | X |
Compatibility with Spark | X | X |
Shared Nothing Persistence and HA | X | X |
REST API for Spark Job Submission | X | X |
Fault Tolerance for Driver | X | X |
Access to the system using JDBC Driver | X | X |
CLI for backup, restore, and export data | X | X |
Spark console extensions | X | X |
System Perf/Behavior statistics | X | X |
Support for transactions in Row tables | X | X |
Support for indexing in Row Tables | X | X |
SQL extensions for stream processing | X | X |
Runtime deployment of packages and jars | X | X |
Synopsis Data Engine for Approximate Querying | | X |
ODBC Driver with High Concurrency | | X |
Off-heap data storage for column tables | | X |
CDC Stream receiver for SQL Server into SnappyData | | X |
GemFire/Apache Geode connector | | X |
Row Level Security | | X |
Use encrypted password instead of clear text password | | X |
Restrict Table, View, Function creation even in user’s own schema | | X |
LDAP security interface | | X |
New Features
- Support Spark's HiveServer2 in SnappyData cluster. Enable starting an embedded Spark HiveServer2 on leads, in embedded mode.
- Provided a default Structured Streaming Sink implementation for Snappy column and row tables. Conflation of events with same key columns can be enabled by a sink property.
- Added a -agent JVM argument in the launch commands to kill the JVM as soon as an OOM occurs. This is important because the VM sometimes used to crash later in very unexpected ways as a side effect, corrupting internal metadata which then caused trouble on restart.
- Allow NONE as a valid policy for server-auth-provider. Essentially, the cluster can now be configured only for user authentication, and mutual peer-to-peer authentication of cluster members can be disabled by specifying this property as NONE.
- Add support for query hints to force a join type. This may be useful for cases where the result is known to be small, for example, but plan rules cannot determine so.
- Allow the deleteFrom API to work as long as the dataframe contains key columns.
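The conflation option mentioned for the Structured Streaming sink can be pictured with a small model. This is a sketch of the idea only, assuming conflation means that within one batch just the last event for each key is applied. The `conflate` function and the event layout are hypothetical and are not the sink's actual API.

```python
from typing import Dict, Iterable, List, Tuple


def conflate(events: Iterable[dict], key_columns: List[str]) -> List[dict]:
    """Keep only the last event seen for each distinct key.

    The result is ordered by the first appearance of each key.
    """
    latest: Dict[Tuple, dict] = {}
    for event in events:
        key = tuple(event[c] for c in key_columns)
        latest[key] = event  # a later event replaces an earlier one
    return list(latest.values())


batch = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
    {"id": 1, "status": "shipped"},
]
conflated = conflate(batch, ["id"])
```

Here the two events for id 1 collapse into the later one, so the table would receive two writes instead of three.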
Performance Enhancements
- Avoid shuffle when join key columns are a superset of child partitioning.
- Added a pooled version of SnappyData JDBC driver for Spark to connect to SnappyData cluster as jdbc datasource.
- Added caching for hive catalog lookups. Meta-data queries with a large number of tables used to take quite long because of nested loop joins between SYSTABLES and HIVETABLES for most meta-data queries, even when the number of tables was only in the hundreds. (SNAP-2657)
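The shuffle-avoidance item above rests on a simple observation: if every partitioning column is also a join key, rows with equal join keys already sit in the same partition. The toy model below illustrates this with a made-up partitioner. The function names are hypothetical, and the real planner rule has further conditions (such as both sides being partitioned compatibly) that are not modelled here.

```python
from collections import defaultdict


def partition_of(row: dict, partition_columns: list, num_partitions: int) -> int:
    """Toy partitioner: sum of the partitioning column values modulo N."""
    return sum(row[c] for c in partition_columns) % num_partitions


def needs_shuffle(join_keys: list, partition_columns: list) -> bool:
    """A shuffle can be skipped when every partitioning column is a join key."""
    return not set(partition_columns) <= set(join_keys)


def partitions_per_join_key(rows, join_keys, partition_columns, num_partitions):
    """For each join-key value, collect the partitions its rows live in."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in join_keys)
        seen[key].add(partition_of(row, partition_columns, num_partitions))
    return seen


rows = [{"a": a, "b": b} for a in range(6) for b in range(4)]

# Partitioned on (a), joined on (a, b): each join key maps to one partition.
colocated = partitions_per_join_key(rows, ["a", "b"], ["a"], 3)

# Partitioned on (a, b), joined on (a): one key is spread over partitions.
scattered = partitions_per_join_key(rows, ["a"], ["a", "b"], 3)
```

In the first case no data movement is needed before the join; in the second, rows sharing a join key are spread out and an exchange is still required.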
Select Fixes and Performance Related Fixes
- Reset the pool at the end of collect to avoid spillover of the lowlatency pool setting to later operations that may not use the CachedDataFrame execution paths. (SNAP-2659)
- Fixed: Column added using 'ALTER TABLE ... ADD COLUMN ...' through snappy shell does not reflect in spark-shell. (SNAP-2491)
- Fixed occasional failures in serialization using CachedDataFrame if a node is just starting/stopping. Also fixed a hang in shutdown for cases where hive client close is trying to boot up the node again, waiting on the locks taken during shutdown.
- Lead and Lag window functions were failing due to incorrect analysis error. (SNAP-2566)
- Fixed the validate-disk-store tool. It was not getting initialized with registered types, which is required to deserialize byte arrays being read from persisted files.
- Fix schema in ResultSet metadata. It used to show default schema 'APP' always.
- Sometimes a false unique constraint violation happened due to a removed or destroyed AbstractRegionEntry. Now an attempt is made to remove it from the index, followed by another try to put the new value against the index key. (SNAP-2627)
- Fix for memory leak in oldEntrieMap leading to LowMemoryException and OutOfMemoryException. (SNAP-2654)
- Note the change in the name of SnappyData JDBC Client jar.
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.0.2.1-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.2.1-without-hadoop-bin.tar.gz | Product without the Hadoop dependency JARs |
snappydata-jdbc_2.11-1.0.2.2.jar | Client (JDBC) JAR |
snappydata-jdbc_2.11-1.0.2.1.jar | Client (JDBC) JAR |
snappydata-zeppelin_2.11-0.7.3.4.jar | The Zeppelin interpreter jar for SnappyData, compatible with Apache Zeppelin 0.7.3 |
snappydata-ec2-0.8.2.1.tar.gz | Script to Launch SnappyData cluster on AWS EC2 instances |
SnappyData OSS 1.0.2 Release
The SnappyData team is pleased to announce the availability of version 1.0.2 of the platform. You can find the release artifacts of its Community Edition towards the end of this page.
You can also download the Enterprise Edition here. The following table summarizes the features available in Enterprise and OSS editions.
Feature | Community | Enterprise |
---|---|---|
Mutable Row & Column Store | X | X |
Compatibility with Spark | X | X |
Shared Nothing Persistence and HA | X | X |
REST API for Spark Job Submission | X | X |
Fault Tolerance for Driver | X | X |
Access to the system using JDBC Driver | X | X |
CLI for backup, restore, and export data | X | X |
Spark console extensions | X | X |
System Perf/Behavior statistics | X | X |
Support for transactions in Row tables | X | X |
Support for indexing in Row Tables | X | X |
SQL extensions for stream processing | X | X |
Runtime deployment of packages and jars | X | X |
Synopsis Data Engine for Approximate Querying | | X |
ODBC Driver with High Concurrency | | X |
Off-heap data storage for column tables | | X |
CDC Stream receiver for SQL Server into SnappyData | | X |
GemFire/Apache Geode connector | | X |
Row Level Security | | X |
Use encrypted password instead of clear text password | | X |
Restrict Table, View, Function creation even in user’s own schema | | X |
LDAP security interface | | X |
New Features
- Introduced an API in snappy session catalog to get Primary Key of Row tables or Key Columns of Column Tables, as DataFrame. (SNAP-2459)
- Introduced an API in snappy session catalog to get table type as String (SNAP-2477).
- Added support for arbitrary size view definitions. It used to fail when the view text size went beyond 32k. Support for displaying VIEWTEXT for views in SYS.HIVETABLES. For example: SELECT viewtext FROM sys.hivetables WHERE tablename = 'view_name' will give the text with which the view was created.
- Added Row Level Security feature. Admins can define multiple security policies on tables for different users or LDAP groups. Refer Row Level Security.
- Auto refresh of UI page. Now the SnappyData UI page gets updated automatically and frequently. The user does not have to refresh or reload. Refer SnappyData Pulse.
- Richer user interface. Added graphs for memory, CPU consumption etc. for the last 15 minutes. The user can see how the cluster health has been for the last 15 minutes instead of just the current state.
- Total CPU core count capacity of the cluster is now displayed on the UI. Refer SnappyData Pulse.
- Bucket counts of tables are now also displayed on the user interface.
- Support deployment of packages and jars as DDL command.
- Added support for reading maven dependencies using --packages option in our job server scripts.
- Changes to procedure sys.repair_catalog to execute it on the server (earlier this was run on lead by sending a message to it). This will be useful to repair catalog even when lead is down. Refer Catalog Repair.
- Added support for **PreparedStatement.getMetaData() JDBC API**. This is on an experimental basis.
- Added support for execution of some DDL commands, viz. CREATE/DROP DISKSTORE, GRANT, REVOKE and CALL procedures, from snappy session as well.
- Quote table names in all store DDL/DML/query strings to allow for special characters and keywords in table names.
- Spark applications with the same name cannot be submitted to SnappyData. This has been done so that individual apps can be killed by name when required.
- Users are not allowed to create tables in their own schema based on system property - snappydata.RESTRICT_TABLE_CREATION. In some cases it may be required to control use of cluster resources, in which case table creation is done only by authorized owners of the schema.
- Schema can be owned by an LDAP group also and not necessarily by a single user.
- Support for deploying SnappyData on Kubernetes using Helm charts. This feature is currently experimental. Refer Kubernetes.
- Disk Store Validate tool enhancement. Validation of disk store can find out all the inconsistencies at once.
- BINARY data type is same as Blob data type.
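The item on quoting table names can be illustrated with standard SQL delimited identifiers. This is a minimal sketch, assuming double quotes as the delimiter with embedded quotes doubled; the helper names are hypothetical and the exact quoting used inside the product is not described in these notes.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def qualified_name(schema: str, table: str) -> str:
    """Build a schema-qualified name with both parts quoted."""
    return f"{quote_identifier(schema)}.{quote_identifier(table)}"


# A keyword and a name with special characters both become usable.
drop_statement = f"DROP TABLE {qualified_name('APP', 'order')}"
odd_name = qualified_name("APP", 'my "odd" table')
```

Quoted identifiers are matched case-sensitively in standard SQL, so a quoted name has to use the same case as the stored one.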
Performance Enhancements
- Fixed concurrent query performance issue by resolving the incorrect output partition choice. Due to numBucket check, all the partition pruned queries were converted to hash partition with one partition. This was causing an exchange node to be introduced. (SNAP-2421)
- Fixed SnappyData UI becoming unresponsive on LowMemoryException. (SNAP-2071)
- Cleaning up tokenization handling and fixes. Main change is addition of the following two separate classes for tokenization:
- ParamLiteral
- TokenLiteral
Both classes extend a common trait TokenizedLiteral. Tokenization will always happen independently of plan caching, unless it is explicitly turned off. (SNAP-1932)
- Procedure for smart connector iteration and fixes. Includes fixes for perf issues as noted for all iterators (disk iterator, smart connector and remote iterator). (SNAP-2243)
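Tokenization, as described in the list above, lets queries that differ only in their constants share one cached plan. The sketch below shows the idea on raw SQL text with a regular expression. It is an illustration under simplifying assumptions: the product works on the parsed plan through its ParamLiteral and TokenLiteral classes, while the `tokenize` and `plan_for` helpers here are hypothetical.

```python
import re
from typing import List, Tuple

# Single-quoted strings (with '' as an escaped quote) or standalone numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def tokenize(sql: str) -> Tuple[str, List[str]]:
    """Replace literals with '?' and return (template, extracted literals)."""
    literals: List[str] = []

    def _swap(match) -> str:
        literals.append(match.group(0))
        return "?"

    return _LITERAL.sub(_swap, sql), literals


plan_cache = {}


def plan_for(sql: str) -> Tuple[str, List[str]]:
    """Look up (or create) a stand-in 'plan' keyed by the tokenized text."""
    template, literals = tokenize(sql)
    plan = plan_cache.setdefault(template, f"plan#{len(plan_cache) + 1}")
    return plan, literals
```

Two calls such as `plan_for("SELECT * FROM t WHERE id = 1")` and `plan_for("SELECT * FROM t WHERE id = 2")` resolve to the same cached entry, with the constants handed back separately so they can be bound at execution time.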
Select Fixes and Performance Related Fixes
- Fixed incorrect server status shown on the UI. Sometimes, due to a race condition, two entries showed up on the UI for the same member. (SNAP-2433)
- Fixed missing SQL tab on SnappyData UI in local mode. (SNAP-2470)
- Fixed a few issues related to wrong results for Row tables due to plan caching. (SNAP-2463 - Incorrect pushing down of OR and AND clause filter combination in push down query, SNAP-2351 - re-evaluation of filter was not happening due to plan caching, SNAP-2451, SNAP-2457)
- Skip batch, if the stats row is missing while scanning column values from disk. This was already handled for in-memory batches and the same has been added for on-disk batches. (SNAP-2364)
- Fixes in the UI so that unauthorized users cannot see any tab. (ENT-21)
- Fixes in the SnappyData parser to create inlined tables (SNAP-2302) and to treat '()' as optional in some functions like 'current_date()', 'current_timestamp()' etc. (SNAP-2303)
- Consider the current schema name also as part of Caching Key for plan caching. So same query on same table but from different schema should not clash with each other. (SNAP-2438)
- Fix for COLUMN table mysteriously shown as ROW table on dashboard after LME in data server. (SNAP-2382)
- Fixed off-heap size for Partitioned Regions, showed on UI. (SNAP-2186)
- Fixed failure when query on view does not fallback to Spark plan in case Code Generation fails. (SNAP-2363)
- Fixed an invalid decompress call on the stats row, which used to fail at run time while scanning column tables. (SNAP-2348)
- Fixed negative bucket size with eviction. (GITHUB-982)
- Fixed the issue of incorrect LowMemoryException, even if a lot of memory was left. (SNAP-2356)
- Handled an int overflow case in memory accounting. Due to this, ExecutionMemoryPool used to throw an AssertionError about releasing more memory than it has. (SNAP-2312)
- Fixed the pooled connection not being returned to the pool after authorization check failure which led to unusable cluster. (SNAP-2255)
- Fixed different results of nearly identical queries, due to join order. It was due to EXCHANGE hash ordering being different from table partitioning. It happens in the specific case when the query join order is different from the partitioning of one of the tables while the other table being joined is partitioned differently. (SNAP-2225)
- Corrected row count updated/inserted in a column table via putInto. (SNAP-2220)
- Fixed the OOM issue due to hive queries. This was a memory leak, due to which the system became very slow after some time even if idle. (SNAP-2248)
- Fixed the issue of incomplete plan and query string info in UI due to plan caching changes.
- Corrected the logic of existence join.
- Sensitive information, like user password, LDAP password etc, which are passed as properties to the cluster are masked on the UI now.
- Schema with boolean columns sometimes returned incorrect null values. Fixed. (SNAP-2436)
- Fixed the scenario where break in colocation chain of buckets due to crash led to disk store metadata going bad causing restart failure.
- Fixed wrong entry count on restart if a region got closed on a server due to DiskAccessException, which gave the impression of data loss. The region is no longer closed in case of LME. This has been done by not letting non-IOExceptions get wrapped in DiskAccessException. (SNAP-2375)
- Fix to avoid hang or delay in stop when stop is issued and the component has gone into reconnect cycle. (SNAP-2380)
- Handle joining of new servers better. Avoid ConflictingPersistentDataException when a new server starts before any of the old servers start. (SNAP-2236)
- ODBC driver bug fix. Added EmbedDatabaseMetaData.getTableSchemas.
- Change the order in which backup is taken. Internal DD diskstore of backup is taken first followed by rest of the disk stores. This helps in stream apps which want to store offset of replayable source in snappydata. They can create the offset table backed up by the internal DD store instead of default or custom disk store.
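The plan-caching fix for SNAP-2438 in the list above comes down to what goes into the cache key. The toy cache below keys on the current schema together with the query text, so the same unqualified query issued from two schemas cannot pick up the other's plan. The class and the `build_plan` stand-in are hypothetical and only model the idea.

```python
class PlanCache:
    """Toy plan cache keyed on (current schema, query text)."""

    def __init__(self):
        self._plans = {}

    def get_or_build(self, current_schema: str, sql: str, build):
        key = (current_schema.upper(), sql)
        if key not in self._plans:
            self._plans[key] = build(current_schema, sql)
        return self._plans[key]

    def __len__(self):
        return len(self._plans)


def build_plan(schema: str, sql: str) -> str:
    # Stand-in for real planning: record which schema the table resolved in.
    return f"scan {schema.upper()}.ORDERS"


cache = PlanCache()
query = "SELECT * FROM orders"
app_plan = cache.get_or_build("app", query, build_plan)
sales_plan = cache.get_or_build("sales", query, build_plan)
```

Had the key been the query text alone, the second lookup would have returned the plan built for the first schema.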
Description of download artifacts
Artifact Name | Description |
---|---|
snappydata-1.0.2-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.2-without-hadoop-bin.tar.gz | Product without the Hadoop dependen... |
SnappyData OSS 1.0.1 Release
The SnappyData team is pleased to announce the availability of version 1.0.1 of the platform. You can find the release artifacts of its Community Edition towards the end of this page.
You can also download the Enterprise Edition here. The table below summarizes the features available in Enterprise and OSS editions.
Feature | Community | Enterprise |
---|---|---|
Mutable Row & Column Store | X | X |
Compatibility with Spark | X | X |
Shared Nothing Persistence & HA | X | X |
REST API for Spark Job Submission | X | X |
Fault Tolerance for Driver | X | X |
JDBC Driver | X | X |
CLI for backup, restore & export | X | X |
Spark console extensions | X | X |
System Perf/Behavior statistics | X | X |
Support for transactions in Row tables | X | X |
Support for indexing in Row Tables | X | X |
SQL extensions for stream processing | X | X |
Synopsis Data Engine for Approximate Querying | | X |
ODBC Driver with High Concurrency | | X |
Off-heap data storage for column tables | | X |
CDC Stream receiver for SQL Server into SnappyData | | X |
GemFire/Apache Geode connector | | X |
LDAP security interface | | X |
More details about the release:
New Features:
- putInto and deleteFrom bulk operations support for column tables (SNAP-2092, SNAP-2093, SNAP-2094):
- ability to specify "key columns" in the table DDL to use for putInto and deleteFrom APIs
- "PUT INTO" SQL or putInto API extension to overwrite existing rows and insert non-existing ones
- "DELETE FROM" SQL or deleteFrom API extension to delete a set of matching rows
- UPDATE SQL now supports using expressions with column references of another table in RHS of SET
- Improvements in cluster restart with off-line, failed nodes or with corrupt meta-data (SNAP-2096)
- new admin command "unblock" to allow the initialization of a table even if it is waiting for offline members
- retain data unlike revoke and initialize with the latest online working copy (SNAP-2143)
- parallel recovery of data regions to break any cyclic dependencies between the nodes, and allow reporting on all off-line nodes that may have more recent copy of data
- many bug-fixes related to startup issues due to meta-data inconsistencies: incorrect data conflicts (SNAP-2097, SNAP-2098), metadata corruption (SNAP-2140)
- Compression of column batches in disk storage and over the network (SNAP-1743)
- support for LZ4, SNAPPY compression codecs in disk storage and transport of column table data
- new SOURCEPATH and COMPRESSION columns in SYS.HIVETABLES virtual table
- Support for temporary, global temporary and persistent VIEWs (SNAP-2072):
- CREATE VIEW, CREATE TEMPORARY VIEW and CREATE GLOBAL TEMPORARY VIEW DDLs
- No jar dependencies in snappydata cluster for external datasources of smart connector (SNAP-2072)
- External tables display in dashboard and snappy command-line (SNAP-2086)
- Auto-configuration of SPARK_PUBLIC_DNS, hostname-for-clients etc in AWS environment (SNAP-2116)
- GRANT/REVOKE SQL support in SnappySession.sql() earlier only allowed from JDBC/ODBC (SNAP-2042)
- LATERAL VIEW support in SnappySession.sql() (SNAP-1283)
- FETCH FIRST syntax as an alternative to LIMIT to support some SQL tools that use former
- Addition of IndexStats for local row table index lookup and range scans
- SYS.DISKSTOREIDS virtual table to show the disk-store IDs being used in the cluster by all members (SNAP-2113)
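The putInto and deleteFrom semantics listed at the top of these features (overwrite existing rows, insert missing ones, delete matching rows, all driven by key columns) can be modelled in a few lines. This is a plain-Python model of the behaviour, not the SnappyData API; the table is represented as a dictionary and all names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = dict
Table = Dict[Tuple, Row]  # rows indexed by their key-column values


def _key(row: Row, key_columns: List[str]) -> Tuple:
    return tuple(row[c] for c in key_columns)


def put_into(table: Table, rows: Iterable[Row], key_columns: List[str]) -> None:
    """Overwrite rows whose key already exists, insert the rest."""
    for row in rows:
        table[_key(row, key_columns)] = dict(row)


def delete_from(table: Table, rows: Iterable[Row], key_columns: List[str]) -> int:
    """Delete rows whose key matches one of the given rows; return the count."""
    deleted = 0
    for row in rows:
        if table.pop(_key(row, key_columns), None) is not None:
            deleted += 1
    return deleted


table: Table = {}
put_into(table, [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}], ["id"])
put_into(table, [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}], ["id"])  # upsert
removed = delete_from(table, [{"id": 1}, {"id": 9}], ["id"])
```

After these calls the model holds rows 2 and 3, with row 2 carrying the overwritten value; the row passed to delete_from needs only the key columns.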
Performance Enhancements:
- Major performance improvements in smart connector mode (SNAP-2101, SNAP-2084)
- minimized buffer copying, key lookups in column table rather than full scan for filters, reduce round-trips
- allow using SnappyUnifiedMemoryManager with smart connector (SNAP-2084)
- New memory and disk iterator to minimize faultins and serialize disk reads (SNAP-2102):
- reduce faultins and cross-iterator serial disk reads per diskstore to minimize random reads from disk
- new remote iterator that substantially reduces the memory overhead and caches only current batch
- Startup performance improvements to cut down on locator/server/lead start and restart times (SNAP-338)
- Improve performance of reads of variable length data for some queries (SNAP-2118)
- Use colocated joins with VIEWs when possible (SNAP-2204)
- Separate disk store for delta buffer regions to substantially improve column table compaction (SNAP-2121)
- Projection push-down to scan layer for non-deterministic expressions like spark_partition_id() (SNAP-2036)
- code-generation cache is larger by default and configurable (SNAP-2120)
Select bug fixes and performance related fixes:
A sample of bug fixes done as part of this release are noted below. For a more comprehensive list, see ReleaseNotes.txt.
- Now only overflow-to-disk is allowed as eviction action for tables (SNAP-1501):
- only overflow-to-disk is allowed as a valid eviction action and cannot be explicitly specified
- OVERFLOW=false property can be used to disable eviction which is true by default
- Memory accounting fixes:
- incorrect initial memory accounting causing insert failure even with memory available (SNAP-2084)
- zero usage shown in UI on restart (SNAP-2180)
- Disable embedded Zeppelin interpreter in a secure cluster which can bypass security (SNAP-2191)
- Fix import of JSON data (SNAP-2087)
- selects missing results or failing during node failures (SNAP-889, SNAP-1547)
- fixes and improvements to server and lead status in both the launcher status and SYS.MEMBERS table (SNAP-1960, SNAP-2060, SNAP-1645)
- fix updates on complex types (SNAP-2141)
- column table scan fixes related to null value reads (SNAP-2088)
- disable tokenization for external tables, flags to disable it and plan caching (SNAP-2114, SNAP-2124)
- deadlock in transactional operations with GII (SNAP-1950)
- couple of fixes in UPDATE SQL: unexpected rollover (SNAP-2192), show as update count (SNAP-2156)
- fixes ported from Apache Geode (GEODE-2109, GEODE-2240)
- fixes to all failures in snappy-spark test suite which includes both product and test changes
- more comprehensive python API testing (SNAP-2044)
Description of download artifacts:
Artifact Name | Description |
---|---|
snappydata-1.0.1-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.1-without-hadoop-bin.tar.gz | Product without the Hadoop dependency JARs |
snappydata-client-1.6.1.jar | Client (JDBC) JAR |
snappydata-zeppelin-0.7.3.jar | The Zeppelin interpreter jar for SnappyData, compatible with Apache Zeppelin 0.7.3 |
snappydata-ec2-0.8.1.tar.gz | Script to Launch SnappyData cluster on AWS EC2 instances |
SnappyData Community Edition (OSS) 1.0.0 GA Release
The SnappyData team is pleased to announce the availability of version 1.0.0 GA of the platform.
Download the Enterprise Edition here
New Features:
- Fully compatible with Apache Spark 2.1.1
- Mutability support for column store (SNAP-1389):
- UPDATE and DELETE operations are now supported on column tables.
- ALTER TABLE support for row table (SNAP-1326).
- Security Support (available in enterprise edition): This release introduces cluster security with authentication and authorisation based on LDAP mechanism. Will be extended to other mechanisms in future (SNAP-1656, SNAP-1813).
- DEB and RPM installers (distProduct target in source build).
- Support for setting scheduler pools using the set command.
- Multi-node cluster now boots up quickly as background start of server processes is enabled by default.
- Pulse Console: SnappyData Pulse has been enhanced to be more useful to both developers and operations personnel (SNAP-1890, SNAP-1792). Improvements include
- Ability to sort members list based on members type.
- Added new UI view named SnappyData Member Details Page which includes, among other things, latest logs.
- Added members Heap and Off-Heap memory usage details along with their storage and execution splits.
- Users can specify streaming batch interval when submitting a stream job via conf/snappy-job.sh (SNAP-1948).
- Row tables now support LONG, SHORT, TINYINT and BYTE datatypes (SNAP-1722).
- The history file for snappy shell has been renamed from .gfxd.history to .snappy.history. You may copy your existing ~/.gfxd.history to ~/.snappy.history to be able to access your historical snappy shell commands.
Performance Enhancements:
- Performance enhancements with dictionary decoder when dictionary is large. (SNAP-1877)
- Using a consistent sort for pushed down predicates so that different sessions do not end up creating different generated code.
- Reduced the size of generated code.
- Indexed cursors in decoders to improve heavily filtered queries (SNAP-1936)
- Performance improvements in Smart Connector mode, especially with queries on tables with wide schema (SNAP-1363, SNAP-1699)
- Several other performance improvements.
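The consistent-sort item above can be read as a cache-key normalisation: if pushed-down predicates are always put in the same order, two sessions that list them differently end up with the same generated code. A minimal sketch of that idea, with a hypothetical `codegen_cache_key` helper and predicates represented as plain strings:

```python
def codegen_cache_key(table: str, predicates: list) -> tuple:
    """Build a cache key that does not depend on predicate order."""
    return (table, tuple(sorted(predicates)))


session_a = codegen_cache_key("ORDERS", ["status = ?", "region = ?"])
session_b = codegen_cache_key("ORDERS", ["region = ?", "status = ?"])
```

Both sessions produce the same key, so the second one can reuse the code generated for the first.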
Select bug fixes and performance related fixes:
There have been numerous bug fixes done as part of this release. Some of these are included below. For a more comprehensive list, see ReleaseNotes.txt.
- Fixed data inconsistency issues when a new node is joining the cluster and at the same time write operations are going on. (SNAP-1756).
- The product internally retries on a redundant copy of partitions in the event of a node failure (SNAP-1377, SNAP-902).
- Fixed the wrong status of locators on restarts. After cluster restart, snappy-status-all.sh used to show locators in waiting state even when the actual status changed to running (SNAP-1893).
- Fixed the SnappyData Pulse freezing when loading data sets (SNAP-1426).
- More accurate accounting of execution and storage memory (SNAP-1688, SNAP-1798).
- Corrected case-sensitivity handling for query API calls (SNAP-1714).
Description of download artifacts:
Artifact Name | Description |
---|---|
snappydata-1.0.0-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.0-without-hadoop-bin.tar.gz | Product without the Hadoop dependency JARs |
snappydata-client-1.6.0.jar | Client (JDBC) JAR |
snappydata-zeppelin-0.7.2.jar | The Zeppelin interpreter jar for SnappyData, compatible with Apache Zeppelin 0.7.2 |
SnappyData OSS 1.0.0-RC1 Release
The SnappyData team is pleased to announce the availability of version 1.0.0-RC1 of the platform.
New Features:
- Fully compatible with Apache Spark 2.1.1
- Mutability support for column store (SNAP-1389):
-- UPDATE and DELETE operations are now supported on column tables.
- ALTER TABLE support for row table (SNAP-1326).
- Security Support (available in enterprise edition): This release introduces cluster security with authentication and authorisation based on the LDAP mechanism. Support will be extended to other mechanisms in the future (SNAP-1656, SNAP-1813).
- Support for setting scheduler pools using the set command.
- Multi-node clusters now boot up faster because server processes are started in the background by default.
- Pulse Console: SnappyData Pulse has been enhanced to be more useful to both developers and operations personnel (SNAP-1890, SNAP-1792). Improvements include:
-- Ability to sort the members list by member type.
-- Added a new UI view named SnappyData Member Details Page which includes, among other things, the latest logs.
-- Added members' Heap and Off-Heap memory usage details along with their storage and execution splits.
- Users can specify streaming batch interval when submitting a stream job via conf/snappy-job.sh (SNAP-1948).
- Row tables now support LONG, SHORT, TINYINT and BYTE datatypes (SNAP-1722).
- The history file for snappy shell has been renamed from .gfxd.history to .snappy.history. You may copy your existing ~/.gfxd.history to ~/.snappy.history to be able to access your historical snappy shell commands.
Performance Enhancements:
- Performance enhancements with dictionary decoder when dictionary is large. (SNAP-1877)
-- Different sessions used to generate different code because of indeterminate statsPredicate ordering. A consistent sort order is now used so that the generated code is identical across sessions for the same query.
-- Reduced the size of generated code.
- Indexed cursors in decoders to improve heavily filtered queries (SNAP-1936)
- Performance improvements in Smart Connector mode, especially for queries on tables with wide schemas (SNAP-1363, SNAP-1699)
- Several other performance improvements.
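To illustrate why a consistent predicate order matters for the generated-code item above, the sketch below derives a cache key from a set of pushed-down predicates. It is a toy model and not SnappyData's code generator: `Predicate` and `codegen_key` are hypothetical names, and hashing a rendered filter string stands in for compiling generated code.

```python
import hashlib
from typing import Iterable, NamedTuple


class Predicate(NamedTuple):
    """Hypothetical stand-in for a pushed-down filter predicate."""
    column: str
    op: str
    value: str


def codegen_key(predicates: Iterable[Predicate]) -> str:
    """Return a key identifying the code generated for a filter.

    Sorting first makes the key independent of the order in which a
    session happened to collect its predicates.
    """
    ordered = sorted(predicates)  # tuples sort by column, then op, then value
    source = " && ".join(f"{p.column} {p.op} {p.value}" for p in ordered)
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# Two sessions collect the same predicates in a different order.
session_a = [Predicate("price", ">", "10"), Predicate("id", "=", "7")]
session_b = [Predicate("id", "=", "7"), Predicate("price", ">", "10")]

# Both sessions map to the same key, so compiled code could be shared.
assert codegen_key(session_a) == codegen_key(session_b)
```

Without the `sorted` call the two sessions would render different source strings and therefore compile and cache the same logical filter twice.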
Select bug fixes and performance related fixes:
A selection of the bug fixes in this release is listed below. For the complete list, see ReleaseNotes.txt.
- Fixed data inconsistency issues that occurred when a new node joined the cluster while write operations were in progress (SNAP-1756).
- The product internally retries on a redundant copy of a partition in the event of a node failure (SNAP-1377, SNAP-902).
- Fixed the wrong status of locators on restarts. After a cluster restart, snappy-status-all.sh showed locators in the waiting state even after their actual status had changed to running (SNAP-1893).
- Fixed SnappyData Pulse freezing when loading data sets (SNAP-1426).
- More accurate accounting of execution and storage memory (SNAP-1688, SNAP-1798).
- Corrected case-sensitivity handling for query API calls (SNAP-1714).
Description of download artifacts:
Artifact Name | Description |
---|---|
snappydata-1.0.0-rc1-bin.tar.gz | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.0-rc1-bin.zip | Full product binary (includes Hadoop 2.7) |
snappydata-1.0.0-rc1-without-hadoop-bin.tar.gz | Product without the Hadoop dependency JARs |
snappydata-1.0.0-rc1-without-hadoop-bin.zip | Product without the Hadoop dependency JARs |
snappydata-client-1.5.6-rc1.jar | Client (JDBC) JAR |
snappydata-core_2.11-1.0.0-rc1.jar | The only dependency needed to connect to SnappyStore from Apache Spark 2.1.1 cluster (Smart Connector mode) |
snappydata-zeppelin-0.7.1.jar | The Zeppelin interpreter jar for SnappyData, compatible with Apache Zeppelin 0.7 |