diff --git a/CHANGELOG.md b/CHANGELOG.md index 6cedf32df628..8fa4e4242e59 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -17,13 +17,119 @@ under the License. --> +# Apache Arrow 0.6.0 (14 August 2017) + +## Bug + +* ARROW-1192 - [JAVA] Improve splitAndTransfer performance for List and Union vectors +* ARROW-1195 - [C++] CpuInfo doesn't get cache size on Windows +* ARROW-1204 - [C++] lz4 ExternalProject fails in Visual Studio 2015 +* ARROW-1225 - [Python] pyarrow.array does not attempt to convert bytes to UTF8 when passed a StringType +* ARROW-1237 - [JAVA] Expose the ability to set lastSet +* ARROW-1239 - issue with current version of git-commit-id-plugin +* ARROW-1240 - security: upgrade logback to address CVE-2017-5929 +* ARROW-1242 - [Java] security - upgrade Jackson to mitigate 3 CVE vulnerabilities +* ARROW-1245 - [Integration] Java Integration Tests Disabled +* ARROW-1248 - [Python] C linkage warnings in Clang with public Cython API +* ARROW-1249 - [JAVA] Expose the fillEmpties function from NullableVector.mutator +* ARROW-1263 - [C++] CpuInfo should be able to get CPU features on Windows +* ARROW-1265 - [Plasma] Plasma store memory leak warnings in Python test suite +* ARROW-1267 - [Java] Handle zero length case in BitVector.splitAndTransfer +* ARROW-1269 - [Packaging] Add Windows wheel build scripts from ARROW-1068 to arrow-dist +* ARROW-1275 - [C++] Default static library prefix for Snappy should be "\_static" +* ARROW-1276 - Cannot serializer empty DataFrame to parquet +* ARROW-1283 - [Java] VectorSchemaRoot should be able to be closed() more than once +* ARROW-1285 - PYTHON: NotImplemented exception creates empty parquet file +* ARROW-1287 - [Python] Emulate "whence" argument of seek in NativeFile +* ARROW-1290 - [C++] Use array capacity doubling in arrow::BufferBuilder +* ARROW-1291 - [Python] `pa.RecordBatch.from_pandas` doesn't accept DataFrame with numeric column names +* ARROW-1294 - [C++] New Appveyor build failures +* ARROW-1296 - [Java] 
templates/FixValueVectors reset() method doesn't set allocationSizeInBytes correctly +* ARROW-1300 - [JAVA] Fix ListVector Tests +* ARROW-1306 - [Python] Encoding? issue with error reporting for `parquet.read_table` +* ARROW-1308 - [C++] ld tries to link `arrow_static` even when -DARROW_BUILD_STATIC=off +* ARROW-1309 - [Python] Error inferring List type in `Array.from_pandas` when inner values are all None +* ARROW-1310 - [JAVA] Revert ARROW-886 +* ARROW-1312 - [C++] Set default value to `ARROW_JEMALLOC` to OFF until ARROW-1282 is resolved +* ARROW-1326 - [Python] Fix Sphinx build in Travis CI +* ARROW-1327 - [Python] Failing to release GIL in `MemoryMappedFile._open` causes deadlock +* ARROW-1328 - [Python] `pyarrow.Table.from_pandas` option `timestamps_to_ms` changes column values +* ARROW-1330 - [Plasma] Turn on plasma tests on manylinux1 +* ARROW-1335 - [C++] `PrimitiveArray::raw_values` has inconsistent semantics re: offsets compared with subclasses +* ARROW-1338 - [Python] Investigate non-deterministic core dump on Python 2.7, Travis CI builds +* ARROW-1340 - [Java] NullableMapVector field doesn't maintain metadata +* ARROW-1342 - [Python] Support strided array of lists +* ARROW-1343 - [Format/Java/C++] Ensuring encapsulated stream / IPC message sizes are always a multiple of 8 +* ARROW-1350 - [C++] Include Plasma source tree in source distribution +* ARROW-187 - [C++] Decide on how pedantic we want to be about exceptions +* ARROW-276 - [JAVA] Nullable Value Vectors should extend BaseValueVector instead of BaseDataValueVector +* ARROW-573 - [Python/C++] Support ordered dictionaries data, pandas Categorical +* ARROW-884 - [C++] Exclude internal classes from documentation +* ARROW-932 - [Python] Fix compiler warnings on MSVC +* ARROW-968 - [Python] RecordBatch [i:j] syntax is incomplete + +## Improvement + +* ARROW-1093 - [Python] Fail Python builds if flake8 yields warnings +* ARROW-1121 - [C++] Improve error message when opening OS file fails +* ARROW-1140 - 
[C++] Allow optional build of plasma +* ARROW-1149 - [Plasma] Create Cython client library for Plasma +* ARROW-1173 - [Plasma] Blog post for Plasma +* ARROW-1211 - [C++] Consider making `default_memory_pool()` the default for builder classes +* ARROW-1213 - [Python] Enable s3fs to be used with ParquetDataset and reader/writer functions +* ARROW-1219 - [C++] Use more vanilla Google C++ formatting +* ARROW-1224 - [Format] Clarify language around buffer padding and alignment in IPC +* ARROW-1230 - [Plasma] Install libraries and headers +* ARROW-1243 - [Java] security: upgrade all libraries to latest stable versions +* ARROW-1251 - [Python/C++] Revise build documentation to account for latest build toolchain +* ARROW-1253 - [C++] Use pre-built toolchain libraries where prudent to speed up CI builds +* ARROW-1255 - [Plasma] Check plasma flatbuffer messages with the flatbuffer verifier +* ARROW-1257 - [Plasma] Plasma documentation +* ARROW-1258 - [C++] Suppress dlmalloc warnings on Clang +* ARROW-1259 - [Plasma] Speed up Plasma tests +* ARROW-1260 - [Plasma] Use factory method to create Python PlasmaClient +* ARROW-1264 - [Plasma] Don't exit the Python interpreter if the plasma client can't connect to the store +* ARROW-1274 - [C++] `add_compiler_export_flags()` throws warning with CMake >= 3.3 +* ARROW-1288 - Clean up many ASF license headers +* ARROW-1289 - [Python] Add `PYARROW_BUILD_PLASMA` option like Parquet +* ARROW-1301 - [C++/Python] Add remaining supported libhdfs UNIX-like filesystem APIs +* ARROW-1303 - [C++] Support downloading Boost +* ARROW-1315 - [GLib] Status check of arrow::ArrayBuilder::Finish() is missing +* ARROW-1323 - [GLib] Add `garrow_boolean_array_get_values()` +* ARROW-1333 - [Plasma] Sorting example for DataFrames in plasma +* ARROW-1334 - [C++] Instantiate arrow::Table from vector of Array objects (instead of Columns) + +## New Feature + +* ARROW-1076 - [Python] Handle nanosecond timestamps more gracefully when writing to Parquet format +* 
ARROW-1104 - Integrate in-memory object store from Ray +* ARROW-1246 - [Format] Add Map logical type to metadata +* ARROW-1268 - [Website] Blog post on Arrow integration with Spark +* ARROW-1281 - [C++/Python] Add Docker setup for running HDFS tests and other tests we may not run in Travis CI +* ARROW-1305 - [GLib] Add GArrowIntArrayBuilder +* ARROW-1336 - [C++] Add arrow::schema factory function +* ARROW-439 - [Python] Add option in `to_pandas` conversions to yield Categorical from String/Binary arrays +* ARROW-622 - [Python] Investigate alternatives to `timestamps_to_ms` argument in pandas conversion + +## Task + +* ARROW-1270 - [Packaging] Add Python wheel build scripts for macOS to arrow-dist +* ARROW-1272 - [Python] Add script to arrow-dist to generate and upload manylinux1 Python wheels +* ARROW-1273 - [Python] Add convenience functions for reading only Parquet metadata or effective Arrow schema from a particular Parquet file +* ARROW-1297 - 0.6.0 Release +* ARROW-1304 - [Java] Fix checkstyle checks warning + +## Test + +* ARROW-1241 - [C++] Visual Studio 2017 Appveyor build job + # Apache Arrow 0.5.0 (23 July 2017) ## Bug -* ARROW-1074 - from_pandas doesnt convert ndarray to list +* ARROW-1074 - `from_pandas` doesnt convert ndarray to list * ARROW-1079 - [Python] Empty "private" directories should be ignored by Parquet interface -* ARROW-1081 - C++: arrow::test::TestBase::MakePrimitive doesn't fill null_bitmap +* ARROW-1081 - C++: arrow::test::TestBase::MakePrimitive doesn't fill `null_bitmap` * ARROW-1096 - [C++] Memory mapping file over 4GB fails on Windows * ARROW-1097 - Reading tensor needs file to be opened in writeable mode * ARROW-1098 - Document Error? 
diff --git a/site/_posts/2017-08-16-0.6.0-release.md b/site/_posts/2017-08-16-0.6.0-release.md
new file mode 100644
index 000000000000..2796c4b821c6
--- /dev/null
+++ b/site/_posts/2017-08-16-0.6.0-release.md
@@ -0,0 +1,112 @@
+---
+layout: post
+title: "Apache Arrow 0.6.0 Release"
+date: "2017-08-16 00:00:00 -0400"
+author: wesm
+categories: [release]
+---
+
+
+The Apache Arrow team is pleased to announce the 0.6.0 release. It includes
+[**90 resolved JIRAs**][1] with the new Plasma shared memory object store, and
+improvements and bug fixes to the various language implementations. The Arrow
+memory format has remained stable since the 0.3.x release.
+
+See the [Install Page][2] to learn how to get the libraries for your
+platform. The [complete changelog][5] is also available.
+
+## Plasma Shared Memory Object Store
+
+This release includes the [Plasma Store][7], which you can read more about in
+the linked blog post. This system was originally developed as part of the [Ray
+Project][8] at the [UC Berkeley RISELab][9]. We recognized that Plasma would be
+highly valuable to the Arrow community as a tool for shared memory management
+and zero-copy deserialization. Additionally, we believe we will be able to
+develop a stronger software stack through sharing of IO and buffer management
+code.
+
+The Plasma store is a server application which runs as a separate process. A
+reference C++ client, with Python bindings, is made available in this
+release. Clients can be developed in Java or other languages in the future to
+enable simple sharing of complex datasets through shared memory.
+
+## Arrow Format Addition: Map type
+
+We added a Map logical type to represent ordered and unordered maps
+in memory. This corresponds to the `MAP` logical type annotation in the Parquet
+format (where maps are represented as repeated structs).
+
+Map is represented as a list of structs. It is the first example of a logical
+type whose physical representation is a nested type. We have not yet
+implemented Map containers in any of the language implementations, but this can
+be done in a future release.
+
+As an example, the Python data:
+
+```
+data = [{'a': 1, 'bb': 2, 'cc': 3}, {'dddd': 4}]
+```
+
+Could be represented in an Arrow `Map<String, Int32>` as:
+
+```
+Map<String, Int32> = List<Struct<keys: String, values: Int32>>
+  is_valid: [true, true]
+  offsets: [0, 3, 4]
+  values: Struct<keys: String, values: Int32>
+    children:
+      - keys: String
+          is_valid: [true, true, true, true]
+          offsets: [0, 1, 3, 5, 9]
+          data: abbccdddd
+      - values: Int32
+          is_valid: [true, true, true, true]
+          data: [1, 2, 3, 4]
+```
+## Python Changes
+
+Some highlights of Python development outside of bug fixes and general API
+improvements include:
+
+* New `strings_to_categorical=True` option when calling `Table.to_pandas` will
+  yield pandas `Categorical` types from Arrow binary and string columns
+* Expanded Hadoop Filesystem (HDFS) functionality to improve compatibility with
+  Dask and other HDFS-aware Python libraries.
+* s3fs and other Dask-oriented filesystems can now be used with
+  `pyarrow.parquet.ParquetDataset`
+* More graceful handling of pandas's nanosecond timestamps when writing to
+  Parquet format. You can now pass `coerce_timestamps='ms'` to cast to
+  milliseconds, or `'us'` for microseconds.
+
+## Toward Arrow 1.0.0 and Beyond
+
+We are still discussing the roadmap to the 1.0.0 release on the [developer mailing
+list][6]. The focus of the 1.0.0 release will likely be memory format stability
+and hardening integration tests across the remaining data types implemented in
+Java and C++. Please join the discussion there.
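Reviewer note on the Map section of the blog post above: the list-of-structs layout can be sketched in plain Python to check the example's offsets by hand. This is only an illustration under stated assumptions. The helper name `map_layout` and the returned dictionary keys are hypothetical, nothing here uses the Arrow libraries, and validity bitmaps are omitted because every entry in the example is valid.

```python
def map_layout(rows):
    """Flatten a list of dicts into a list-of-structs style layout.

    Hypothetical helper for illustration only; not part of any Arrow API.
    """
    offsets = [0]   # one entry per row boundary in the flattened children
    keys = []       # flattened "keys" child (strings)
    values = []     # flattened "values" child (integers)
    for row in rows:
        for key, value in row.items():
            keys.append(key)
            values.append(value)
        offsets.append(len(keys))

    # A string column is itself stored as byte offsets plus concatenated data.
    key_offsets = [0]
    for key in keys:
        key_offsets.append(key_offsets[-1] + len(key.encode("utf-8")))

    return {
        "offsets": offsets,
        "key_offsets": key_offsets,
        "key_data": "".join(keys),
        "values": values,
    }


layout = map_layout([{'a': 1, 'bb': 2, 'cc': 3}, {'dddd': 4}])
print(layout)
```

For the sample data in the post, the computed list offsets, key offsets, key bytes, and values match the figures shown in the post's Map example.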
+ +[1]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.6.0 +[2]: http://arrow.apache.org/install +[3]: http://github.com/apache/parquet-cpp +[5]: http://arrow.apache.org/release/0.6.0.html +[6]: http://mail-archives.apache.org/mod_mbox/arrow-dev/ +[7]: http://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/ +[8]: https://ray-project.github.io/ray/ +[9]: https://rise.cs.berkeley.edu/ \ No newline at end of file diff --git a/site/_release/0.6.0.md b/site/_release/0.6.0.md new file mode 100644 index 000000000000..061393c383cd --- /dev/null +++ b/site/_release/0.6.0.md @@ -0,0 +1,156 @@ +--- +layout: default +title: Apache Arrow 0.6.0 Release +permalink: /release/0.6.0.html +--- + + +# Apache Arrow 0.6.0 (14 August 2017) + +This is a major release. Read more in the [release blog post][3]. + +## Download + +* [**Source Artifacts**][2] +* [Git tag][1] + +## Contributors + +```shell +$ git shortlog -sn apache-arrow-0.5.0..apache-arrow-0.6.0 + 48 Wes McKinney + 7 siddharth + 5 Matt Darwin + 5 Max Risuhin + 5 Philipp Moritz + 4 Kouhei Sutou + 3 Bryan Cutler + 2 Emilio Lahr-Vivaz + 2 Li Jin + 2 Robert Nishihara + 1 Antony Mayi + 1 Marco Neumann + 1 Stepan Kadlec + 1 Steven Phillips + 1 Yeolar + 1 fjetter + 1 rendel +``` + +# Changelog + +## New Features and Improvements + +* [ARROW-1076](https://issues.apache.org/jira/browse/ARROW-1076) - [Python] Handle nanosecond timestamps more gracefully when writing to Parquet format +* [ARROW-1093](https://issues.apache.org/jira/browse/ARROW-1093) - [Python] Fail Python builds if flake8 yields warnings +* [ARROW-1104](https://issues.apache.org/jira/browse/ARROW-1104) - Integrate in-memory object store from Ray +* [ARROW-1121](https://issues.apache.org/jira/browse/ARROW-1121) - [C++] Improve error message when opening OS file fails +* [ARROW-1140](https://issues.apache.org/jira/browse/ARROW-1140) - [C++] Allow optional build of 
plasma +* [ARROW-1149](https://issues.apache.org/jira/browse/ARROW-1149) - [Plasma] Create Cython client library for Plasma +* [ARROW-1173](https://issues.apache.org/jira/browse/ARROW-1173) - [Plasma] Blog post for Plasma +* [ARROW-1211](https://issues.apache.org/jira/browse/ARROW-1211) - [C++] Consider making default_memory_pool() the default for builder classes +* [ARROW-1213](https://issues.apache.org/jira/browse/ARROW-1213) - [Python] Enable s3fs to be used with ParquetDataset and reader/writer functions +* [ARROW-1219](https://issues.apache.org/jira/browse/ARROW-1219) - [C++] Use more vanilla Google C++ formatting +* [ARROW-1224](https://issues.apache.org/jira/browse/ARROW-1224) - [Format] Clarify language around buffer padding and alignment in IPC +* [ARROW-1230](https://issues.apache.org/jira/browse/ARROW-1230) - [Plasma] Install libraries and headers +* [ARROW-1241](https://issues.apache.org/jira/browse/ARROW-1241) - [C++] Visual Studio 2017 Appveyor build job +* [ARROW-1243](https://issues.apache.org/jira/browse/ARROW-1243) - [Java] security: upgrade all libraries to latest stable versions +* [ARROW-1246](https://issues.apache.org/jira/browse/ARROW-1246) - [Format] Add Map logical type to metadata +* [ARROW-1251](https://issues.apache.org/jira/browse/ARROW-1251) - [Python/C++] Revise build documentation to account for latest build toolchain +* [ARROW-1253](https://issues.apache.org/jira/browse/ARROW-1253) - [C++] Use pre-built toolchain libraries where prudent to speed up CI builds +* [ARROW-1255](https://issues.apache.org/jira/browse/ARROW-1255) - [Plasma] Check plasma flatbuffer messages with the flatbuffer verifier +* [ARROW-1257](https://issues.apache.org/jira/browse/ARROW-1257) - [Plasma] Plasma documentation +* [ARROW-1258](https://issues.apache.org/jira/browse/ARROW-1258) - [C++] Suppress dlmalloc warnings on Clang +* [ARROW-1259](https://issues.apache.org/jira/browse/ARROW-1259) - [Plasma] Speed up Plasma tests +* 
[ARROW-1260](https://issues.apache.org/jira/browse/ARROW-1260) - [Plasma] Use factory method to create Python PlasmaClient +* [ARROW-1264](https://issues.apache.org/jira/browse/ARROW-1264) - [Plasma] Don't exit the Python interpreter if the plasma client can't connect to the store +* [ARROW-1268](https://issues.apache.org/jira/browse/ARROW-1268) - [Website] Blog post on Arrow integration with Spark +* [ARROW-1270](https://issues.apache.org/jira/browse/ARROW-1270) - [Packaging] Add Python wheel build scripts for macOS to arrow-dist +* [ARROW-1272](https://issues.apache.org/jira/browse/ARROW-1272) - [Python] Add script to arrow-dist to generate and upload manylinux1 Python wheels +* [ARROW-1273](https://issues.apache.org/jira/browse/ARROW-1273) - [Python] Add convenience functions for reading only Parquet metadata or effective Arrow schema from a particular Parquet file +* [ARROW-1274](https://issues.apache.org/jira/browse/ARROW-1274) - [C++] add_compiler_export_flags() throws warning with CMake >= 3.3 +* [ARROW-1281](https://issues.apache.org/jira/browse/ARROW-1281) - [C++/Python] Add Docker setup for running HDFS tests and other tests we may not run in Travis CI +* [ARROW-1288](https://issues.apache.org/jira/browse/ARROW-1288) - Clean up many ASF license headers +* [ARROW-1289](https://issues.apache.org/jira/browse/ARROW-1289) - [Python] Add PYARROW_BUILD_PLASMA option like Parquet +* [ARROW-1297](https://issues.apache.org/jira/browse/ARROW-1297) - 0.6.0 Release +* [ARROW-1301](https://issues.apache.org/jira/browse/ARROW-1301) - [C++/Python] Add remaining supported libhdfs UNIX-like filesystem APIs +* [ARROW-1303](https://issues.apache.org/jira/browse/ARROW-1303) - [C++] Support downloading Boost +* [ARROW-1304](https://issues.apache.org/jira/browse/ARROW-1304) - [Java] Fix checkstyle checks warning +* [ARROW-1305](https://issues.apache.org/jira/browse/ARROW-1305) - [GLib] Add GArrowIntArrayBuilder +* [ARROW-1315](https://issues.apache.org/jira/browse/ARROW-1315) - 
[GLib] Status check of arrow::ArrayBuilder::Finish() is missing +* [ARROW-1323](https://issues.apache.org/jira/browse/ARROW-1323) - [GLib] Add garrow_boolean_array_get_values() +* [ARROW-1333](https://issues.apache.org/jira/browse/ARROW-1333) - [Plasma] Sorting example for DataFrames in plasma +* [ARROW-1334](https://issues.apache.org/jira/browse/ARROW-1334) - [C++] Instantiate arrow::Table from vector of Array objects (instead of Columns) +* [ARROW-1336](https://issues.apache.org/jira/browse/ARROW-1336) - [C++] Add arrow::schema factory function +* [ARROW-439](https://issues.apache.org/jira/browse/ARROW-439) - [Python] Add option in "to_pandas" conversions to yield Categorical from String/Binary arrays +* [ARROW-622](https://issues.apache.org/jira/browse/ARROW-622) - [Python] Investigate alternatives to timestamps_to_ms argument in pandas conversion + +## Bug Fixes + +* [ARROW-1192](https://issues.apache.org/jira/browse/ARROW-1192) - [JAVA] Improve splitAndTransfer performance for List and Union vectors +* [ARROW-1195](https://issues.apache.org/jira/browse/ARROW-1195) - [C++] CpuInfo doesn't get cache size on Windows +* [ARROW-1204](https://issues.apache.org/jira/browse/ARROW-1204) - [C++] lz4 ExternalProject fails in Visual Studio 2015 +* [ARROW-1225](https://issues.apache.org/jira/browse/ARROW-1225) - [Python] pyarrow.array does not attempt to convert bytes to UTF8 when passed a StringType +* [ARROW-1237](https://issues.apache.org/jira/browse/ARROW-1237) - [JAVA] Expose the ability to set lastSet +* [ARROW-1239](https://issues.apache.org/jira/browse/ARROW-1239) - issue with current version of git-commit-id-plugin +* [ARROW-1240](https://issues.apache.org/jira/browse/ARROW-1240) - security: upgrade logback to address CVE-2017-5929 +* [ARROW-1242](https://issues.apache.org/jira/browse/ARROW-1242) - [Java] security - upgrade Jackson to mitigate 3 CVE vulnerabilities +* [ARROW-1245](https://issues.apache.org/jira/browse/ARROW-1245) - [Integration] Java Integration 
Tests Disabled +* [ARROW-1248](https://issues.apache.org/jira/browse/ARROW-1248) - [Python] C linkage warnings in Clang with public Cython API +* [ARROW-1249](https://issues.apache.org/jira/browse/ARROW-1249) - [JAVA] Expose the fillEmpties function from NullableVector.mutator +* [ARROW-1263](https://issues.apache.org/jira/browse/ARROW-1263) - [C++] CpuInfo should be able to get CPU features on Windows +* [ARROW-1265](https://issues.apache.org/jira/browse/ARROW-1265) - [Plasma] Plasma store memory leak warnings in Python test suite +* [ARROW-1267](https://issues.apache.org/jira/browse/ARROW-1267) - [Java] Handle zero length case in BitVector.splitAndTransfer +* [ARROW-1269](https://issues.apache.org/jira/browse/ARROW-1269) - [Packaging] Add Windows wheel build scripts from ARROW-1068 to arrow-dist +* [ARROW-1275](https://issues.apache.org/jira/browse/ARROW-1275) - [C++] Default static library prefix for Snappy should be "_static" +* [ARROW-1276](https://issues.apache.org/jira/browse/ARROW-1276) - Cannot serializer empty DataFrame to parquet +* [ARROW-1283](https://issues.apache.org/jira/browse/ARROW-1283) - [Java] VectorSchemaRoot should be able to be closed() more than once +* [ARROW-1285](https://issues.apache.org/jira/browse/ARROW-1285) - PYTHON: NotImplemented exception creates empty parquet file +* [ARROW-1287](https://issues.apache.org/jira/browse/ARROW-1287) - [Python] Emulate "whence" argument of seek in NativeFile +* [ARROW-1290](https://issues.apache.org/jira/browse/ARROW-1290) - [C++] Use array capacity doubling in arrow::BufferBuilder +* [ARROW-1291](https://issues.apache.org/jira/browse/ARROW-1291) - [Python] pa.RecordBatch.from_pandas doesn't accept DataFrame with numeric column names +* [ARROW-1294](https://issues.apache.org/jira/browse/ARROW-1294) - [C++] New Appveyor build failures +* [ARROW-1296](https://issues.apache.org/jira/browse/ARROW-1296) - [Java] templates/FixValueVectors reset() method doesn't set allocationSizeInBytes correctly +* 
[ARROW-1300](https://issues.apache.org/jira/browse/ARROW-1300) - [JAVA] Fix ListVector Tests +* [ARROW-1306](https://issues.apache.org/jira/browse/ARROW-1306) - [Python] Encoding? issue with error reporting for parquet.read_table +* [ARROW-1308](https://issues.apache.org/jira/browse/ARROW-1308) - [C++] ld tries to link 'arrow_static' even when -DARROW_BUILD_STATIC=off +* [ARROW-1309](https://issues.apache.org/jira/browse/ARROW-1309) - [Python] Error inferring List type in Array.from_pandas when inner values are all None +* [ARROW-1310](https://issues.apache.org/jira/browse/ARROW-1310) - [JAVA] Revert ARROW-886 +* [ARROW-1312](https://issues.apache.org/jira/browse/ARROW-1312) - [C++] Set default value to ARROW_JEMALLOC to OFF until ARROW-1282 is resolved +* [ARROW-1326](https://issues.apache.org/jira/browse/ARROW-1326) - [Python] Fix Sphinx build in Travis CI +* [ARROW-1327](https://issues.apache.org/jira/browse/ARROW-1327) - [Python] Failing to release GIL in MemoryMappedFile._open causes deadlock +* [ARROW-1328](https://issues.apache.org/jira/browse/ARROW-1328) - [Python] pyarrow.Table.from_pandas option timestamps_to_ms changes column values +* [ARROW-1330](https://issues.apache.org/jira/browse/ARROW-1330) - [Plasma] Turn on plasma tests on manylinux1 +* [ARROW-1335](https://issues.apache.org/jira/browse/ARROW-1335) - [C++] PrimitiveArray::raw_values has inconsistent semantics re: offsets compared with subclasses +* [ARROW-1338](https://issues.apache.org/jira/browse/ARROW-1338) - [Python] Investigate non-deterministic core dump on Python 2.7, Travis CI builds +* [ARROW-1340](https://issues.apache.org/jira/browse/ARROW-1340) - [Java] NullableMapVector field doesn't maintain metadata +* [ARROW-1342](https://issues.apache.org/jira/browse/ARROW-1342) - [Python] Support strided array of lists +* [ARROW-1343](https://issues.apache.org/jira/browse/ARROW-1343) - [Format/Java/C++] Ensuring encapsulated stream / IPC message sizes are always a multiple of 8 +* 
[ARROW-1350](https://issues.apache.org/jira/browse/ARROW-1350) - [C++] Include Plasma source tree in source distribution +* [ARROW-187](https://issues.apache.org/jira/browse/ARROW-187) - [C++] Decide on how pedantic we want to be about exceptions +* [ARROW-276](https://issues.apache.org/jira/browse/ARROW-276) - [JAVA] Nullable Value Vectors should extend BaseValueVector instead of BaseDataValueVector +* [ARROW-573](https://issues.apache.org/jira/browse/ARROW-573) - [Python/C++] Support ordered dictionaries data, pandas Categorical +* [ARROW-884](https://issues.apache.org/jira/browse/ARROW-884) - [C++] Exclude internal classes from documentation +* [ARROW-932](https://issues.apache.org/jira/browse/ARROW-932) - [Python] Fix compiler warnings on MSVC +* [ARROW-968](https://issues.apache.org/jira/browse/ARROW-968) - [Python] RecordBatch [i:j] syntax is incomplete + +[1]: https://github.com/apache/arrow/releases/tag/apache-arrow-0.6.0 +[2]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.6.0/ +[3]: http://arrow.apache.org/blog/2017/08/16/0.6.0-release/ \ No newline at end of file diff --git a/site/_release/index.md b/site/_release/index.md index f18cff3b649e..b373d8bfe199 100644 --- a/site/_release/index.md +++ b/site/_release/index.md @@ -26,6 +26,7 @@ limitations under the License. Navigate to the release page for downloads and the changelog. +* [0.6.0 (14 August 2017)][7] * [0.5.0 (23 July 2017)][6] * [0.4.1 (9 June 2017)][5] * [0.4.0 (22 May 2017)][4] @@ -39,3 +40,4 @@ Navigate to the release page for downloads and the changelog. [4]: {{ site.baseurl }}/release/0.4.0.html [5]: {{ site.baseurl }}/release/0.4.1.html [6]: {{ site.baseurl }}/release/0.5.0.html +[7]: {{ site.baseurl }}/release/0.6.0.html diff --git a/site/index.html b/site/index.html index 8a06c6acec58..224e5da35833 100644 --- a/site/index.html +++ b/site/index.html @@ -7,10 +7,10 @@

 Apache Arrow
 Powering Columnar In-Memory Analytics
 Join Mailing List
-Install (0.5.0 Release - July 23, 2017)
+Install (0.6.0 Release - August 14, 2017)
-Latest News: Apache Arrow 0.5.0 release
+Latest News: Apache Arrow 0.6.0 release
 Fast
diff --git a/site/install.md b/site/install.md index bd45642fe201..bfea0b179d00 100644 --- a/site/install.md +++ b/site/install.md @@ -20,17 +20,17 @@ limitations under the License. {% endcomment %} --> -## Current Version: 0.5.0 +## Current Version: 0.6.0 -### Released: 23 July 2017 +### Released: 14 August 2017 -See the [release notes][10] and [blog post][11] for more about what's new. +See the [release notes][10] for more about what's new. ### Source release -* **Source Release**: [apache-arrow-0.5.0.tar.gz][6] +* **Source Release**: [apache-arrow-0.6.0.tar.gz][6] * **Verification**: [md5][3], [asc][7] -* [Git tag e9f76e1][2] +* [Git tag b173334][2] ### Java Packages @@ -38,7 +38,7 @@ See the [release notes][10] and [blog post][11] for more about what's new. ## Binary Installers for C, C++, Python -It may take a little time for the binary packages to get updated +Binary packages may not be updated immediately after the source release is posted. ### C++ and Python Conda Packages (Unofficial) @@ -52,8 +52,8 @@ Install them with: ```shell -conda install arrow-cpp=0.5.0 -c conda-forge -conda install pyarrow=0.5.0 -c conda-forge +conda install arrow-cpp=0.6.* -c conda-forge +conda install pyarrow==0.6.* -c conda-forge ``` ### Python Wheels on PyPI (Unofficial) @@ -61,9 +61,12 @@ conda install pyarrow=0.5.0 -c conda-forge We have provided binary wheels on PyPI for Linux, macOS, and Windows: ```shell -pip install pyarrow==0.5.0 +pip install pyarrow==0.6.* ``` +We recommend pinning `0.6.*` in `requirements.txt` to install the latest patch +release. + These include the Apache Arrow and Apache Parquet C++ binary libraries bundled with the wheel. @@ -133,14 +136,13 @@ These repositories are managed at [red-data-tools/arrow-packages][9]. If you have any feedback, please send it to the project instead of Apache Arrow project. 
-[1]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.5.0/ -[2]: https://github.com/apache/arrow/releases/tag/apache-arrow-0.5.0 -[3]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.5.0/apache-arrow-0.5.0.tar.gz.md5 -[4]: http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.arrow%22%20AND%20v%3A%220.5.0%22 +[1]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.6.0/ +[2]: https://github.com/apache/arrow/releases/tag/apache-arrow-0.6.0 +[3]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.6.0/apache-arrow-0.6.0.tar.gz.md5 +[4]: http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.arrow%22%20AND%20v%3A%220.6.0%22 [5]: http://conda-forge.github.io -[6]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.5.0/apache-arrow-0.5.0.tar.gz -[7]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.5.0/apache-arrow-0.5.0.tar.gz.asc +[6]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.6.0/apache-arrow-0.6.0.tar.gz +[7]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.6.0/apache-arrow-0.6.0.tar.gz.asc [8]: https://github.com/red-data-tools/parquet-glib [9]: https://github.com/red-data-tools/arrow-packages -[10]: http://arrow.apache.org/release/0.5.0.html -[11]: http://arrow.apache.org/blog/2017/07/25/0.5.0-release/ +[10]: http://arrow.apache.org/release/0.6.0.html