Commit

Fix typos (Datafusion -> DataFusion) (apache#1993)
* Fix typos (Datafusion -> DataFusion)

* revert change to proto files
andygrove authored Mar 12, 2022
1 parent 8e09b49 commit b702e08
Showing 18 changed files with 48 additions and 48 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -20,7 +20,7 @@
Changelogs are maintained separately for each subproject. Please check out the
changelog file within each subproject folder for more details:

* [Datafusion CHANGELOG](./datafusion/CHANGELOG.md)
* [DataFusion CHANGELOG](./datafusion/CHANGELOG.md)
* [Ballista CHANGELOG](./ballista/CHANGELOG.md)

For older versions, see [apache/arrow/CHANGELOG.md](https://github.com/apache/arrow/blob/master/CHANGELOG.md).
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -132,7 +132,7 @@ python -m pytest -v integration-tests/test_psql_parity.py

### Criterion Benchmarks

[Criterion](https://docs.rs/criterion/latest/criterion/index.html) is a statistics-driven micro-benchmarking framework used by Datafusion for evaluating the performance of specific code-paths. In particular, the criterion benchmarks help to both guide optimisation efforts, and prevent performance regressions within Datafusion.
[Criterion](https://docs.rs/criterion/latest/criterion/index.html) is a statistics-driven micro-benchmarking framework used by DataFusion for evaluating the performance of specific code-paths. In particular, the criterion benchmarks help to both guide optimisation efforts, and prevent performance regressions within DataFusion.

Criterion integrates with Cargo's built-in [benchmark support](https://doc.rust-lang.org/cargo/commands/cargo-bench.html) and a given benchmark can be run with

@@ -160,7 +160,7 @@ The benchmark will automatically remove any generated parquet file on exit, howe

### Upstream Benchmark Suites

Instructions and tooling for running upstream benchmark suites against Datafusion and/or Ballista can be found in [benchmarks](./benchmarks).
Instructions and tooling for running upstream benchmark suites against DataFusion and/or Ballista can be found in [benchmarks](./benchmarks).

These are valuable for comparative evaluation against alternative Arrow implementations and query engines.

@@ -227,7 +227,7 @@ dot -Tpdf < /tmp/plan.dot > /tmp/plan.pdf

## Specification

We formalize Datafusion semantics and behaviors through specification
We formalize DataFusion semantics and behaviors through specification
documents. These specifications are useful to be used as references to help
resolve ambiguities during development or code reviews.

2 changes: 1 addition & 1 deletion ballista/rust/core/src/execution_plans/mod.rs
@@ -15,7 +15,7 @@
// specific language governing permissions and limitations
// under the License.

//! This module contains execution plans that are needed to distribute Datafusion's execution plans into
//! This module contains execution plans that are needed to distribute DataFusion's execution plans into
//! several Ballista executors.
mod distributed_query;
2 changes: 1 addition & 1 deletion conbench/benchmarks.py
@@ -38,4 +38,4 @@ def _f(self):
@conbench.runner.register_benchmark
class CargoBenchmarks(_criterion.CriterionBenchmark):
name = "datafusion"
description = "Run Arrow Datafusion micro benchmarks."
description = "Run Arrow DataFusion micro benchmarks."
2 changes: 1 addition & 1 deletion datafusion-physical-expr/src/expressions/cast.rs
@@ -30,7 +30,7 @@ use datafusion_common::ScalarValue;
use datafusion_common::{DataFusionError, Result};
use datafusion_expr::ColumnarValue;

/// provide Datafusion default cast options
/// provide DataFusion default cast options
pub const DEFAULT_DATAFUSION_CAST_OPTIONS: CastOptions = CastOptions { safe: false };

/// CAST expression casts an expression to a specific data type and returns a runtime error on invalid cast
18 changes: 9 additions & 9 deletions datafusion/CHANGELOG.md
@@ -32,7 +32,7 @@
- Remove non idiomatic `DataFusionError::into_arrow_external_error` in favor of From conversion [\#1645](https://github.com/apache/arrow-datafusion/pull/1645) ([alamb](https://github.com/alamb))
- Remove `Accumulator::update` and `Accumulator::merge` [\#1582](https://github.com/apache/arrow-datafusion/pull/1582) ([Jimexist](https://github.com/Jimexist))
- implement `Hash` for various types and replace `PartialOrd` [\#1580](https://github.com/apache/arrow-datafusion/pull/1580) ([Jimexist](https://github.com/Jimexist))
- Replace `DatafusionError` with `GenericError` in `ObjectStore` interface [\#1541](https://github.com/apache/arrow-datafusion/pull/1541) ([matthewmturner](https://github.com/matthewmturner))
- Replace `DataFusionError` with `GenericError` in `ObjectStore` interface [\#1541](https://github.com/apache/arrow-datafusion/pull/1541) ([matthewmturner](https://github.com/matthewmturner))
- Make `FLOAT` SQL type map to `Float32` rather than `Float64` [\#1423](https://github.com/apache/arrow-datafusion/pull/1423) [[sql](https://github.com/apache/arrow-datafusion/labels/sql)] ([liukun4515](https://github.com/liukun4515))
- Map `REAL` SQL type to `Float32` rather than `Float64` to be consistent with pg [\#1390](https://github.com/apache/arrow-datafusion/pull/1390) [[sql](https://github.com/apache/arrow-datafusion/labels/sql)] ([hntd187](https://github.com/hntd187))

@@ -79,7 +79,7 @@
- Add support for `ORDER BY` on unprojected columns [\#1415](https://github.com/apache/arrow-datafusion/pull/1415) ([viirya](https://github.com/viirya))
- Support decimal for `min` and `max` aggregate [\#1407](https://github.com/apache/arrow-datafusion/pull/1407) ([liukun4515](https://github.com/liukun4515))
- Consolidate `ConstantFolding` and `SimplifyExpression` [\#1375](https://github.com/apache/arrow-datafusion/pull/1375) ([alamb](https://github.com/alamb))
- Datafusion cli quiet mode command to contain option bool [\#1345](https://github.com/apache/arrow-datafusion/pull/1345) ([Jimexist](https://github.com/Jimexist))
- DataFusion cli quiet mode command to contain option bool [\#1345](https://github.com/apache/arrow-datafusion/pull/1345) ([Jimexist](https://github.com/Jimexist))
- Implement `array_agg` aggregate function [\#1300](https://github.com/apache/arrow-datafusion/pull/1300) ([viirya](https://github.com/viirya))
- Add a command to switch output format in cli [\#1284](https://github.com/apache/arrow-datafusion/pull/1284) ([capkurmagati](https://github.com/capkurmagati))
- Support `=`, `<`, `<=`, `>`, `>=`, `!=`, `is distinct from`, `is not distinct from` for `BooleanArray` [\#1163](https://github.com/apache/arrow-datafusion/pull/1163) ([alamb](https://github.com/alamb))
@@ -94,7 +94,7 @@
- CTE/WITH .. UNION ALL confuses name resolution in WHERE [\#1509](https://github.com/apache/arrow-datafusion/issues/1509)
- ORDER BY min\(x\) results in error `Plan("No field named 'foo.x'. Valid fields are 'MIN(foo.x)'.")` [\#1479](https://github.com/apache/arrow-datafusion/issues/1479)
- Sort discards field metadata on the output schema [\#1476](https://github.com/apache/arrow-datafusion/issues/1476)
- Datafusion should not strip out timezone information from existing types [\#1454](https://github.com/apache/arrow-datafusion/issues/1454)
- DataFusion should not strip out timezone information from existing types [\#1454](https://github.com/apache/arrow-datafusion/issues/1454)
- Error on some queries: "column types must match schema types, expected XXX but found YYY" [\#1447](https://github.com/apache/arrow-datafusion/issues/1447)
- Query failing to return any results when filter is an equality check on strings \(bad statistics in parquet\) [\#1433](https://github.com/apache/arrow-datafusion/issues/1433)
- Field names containing period such as `f.c1` cannot be named in SQL query [\#1432](https://github.com/apache/arrow-datafusion/issues/1432)
@@ -111,7 +111,7 @@
- Fix single\_distinct\_to\_groupby for arbitrary expressions [\#1519](https://github.com/apache/arrow-datafusion/pull/1519) ([james727](https://github.com/james727))
- Fix SortExec discards field metadata on the output schema [\#1477](https://github.com/apache/arrow-datafusion/pull/1477) ([alamb](https://github.com/alamb))
- fix calculate in many\_to\_many\_hash\_partition test. [\#1463](https://github.com/apache/arrow-datafusion/pull/1463) ([Ted-Jiang](https://github.com/Ted-Jiang))
- Add Timezone to Scalar::Time\* types, and better timezone awareness to Datafusion's time types [\#1455](https://github.com/apache/arrow-datafusion/pull/1455) ([maxburke](https://github.com/maxburke))
- Add Timezone to Scalar::Time\* types, and better timezone awareness to DataFusion's time types [\#1455](https://github.com/apache/arrow-datafusion/pull/1455) ([maxburke](https://github.com/maxburke))
- Support identifiers with `.` in them [\#1449](https://github.com/apache/arrow-datafusion/pull/1449) [[sql](https://github.com/apache/arrow-datafusion/labels/sql)] ([alamb](https://github.com/alamb))
- Fixes for working with functions in dataframes, additional documentation [\#1430](https://github.com/apache/arrow-datafusion/pull/1430) ([tobyhede](https://github.com/tobyhede))
- \[Minor\] Fix `send_time` metric for hash-repartition [\#1421](https://github.com/apache/arrow-datafusion/pull/1421) ([Dandandan](https://github.com/Dandandan))
@@ -130,7 +130,7 @@

- Clarify docs about `Accumulator::update` and `Accumulator::update_batch` [\#1542](https://github.com/apache/arrow-datafusion/pull/1542) ([alamb](https://github.com/alamb))
- Fix duplicated `cargo run --example parquet_sql` [\#1482](https://github.com/apache/arrow-datafusion/pull/1482) ([sergey-melnychuk](https://github.com/sergey-melnychuk))
- add documentation to Datafusion cli's new commands [\#1348](https://github.com/apache/arrow-datafusion/pull/1348) ([liukun4515](https://github.com/liukun4515))
- add documentation to DataFusion cli's new commands [\#1348](https://github.com/apache/arrow-datafusion/pull/1348) ([liukun4515](https://github.com/liukun4515))
- fix some clippy warnings from nightly channel [\#1277](https://github.com/apache/arrow-datafusion/pull/1277) [[sql](https://github.com/apache/arrow-datafusion/labels/sql)] ([Jimexist](https://github.com/Jimexist))

**Performance improvements:**
@@ -470,7 +470,7 @@
- delete redundant code [\#973](https://github.com/apache/arrow-datafusion/issues/973)
- How to build DataFusion python wheel [\#853](https://github.com/apache/arrow-datafusion/issues/853)
- Add support for partition pruning [\#204](https://github.com/apache/arrow-datafusion/issues/204)
- \[Datafusion\] Support joins on TimestampMillisecond columns [\#187](https://github.com/apache/arrow-datafusion/issues/187)
- \[DataFusion\] Support joins on TimestampMillisecond columns [\#187](https://github.com/apache/arrow-datafusion/issues/187)
- TPC-H Query 21 [\#173](https://github.com/apache/arrow-datafusion/issues/173)
- TPC-H Query 13 [\#164](https://github.com/apache/arrow-datafusion/issues/164)
- TPC-H Query 8 [\#162](https://github.com/apache/arrow-datafusion/issues/162)
@@ -509,7 +509,7 @@ For older versions, see [apache/arrow/CHANGELOG.md](https://github.com/apache/ar
- Box ScalarValue:Lists, reduce size by half size [\#788](https://github.com/apache/arrow-datafusion/pull/788) ([alamb](https://github.com/alamb))
- JOIN conditions are order dependent [\#778](https://github.com/apache/arrow-datafusion/pull/778) ([seddonm1](https://github.com/seddonm1))
- Show the result of all optimizer passes in EXPLAIN VERBOSE [\#759](https://github.com/apache/arrow-datafusion/pull/759) ([alamb](https://github.com/alamb))
- \#723 Datafusion add option in ExecutionConfig to enable/disable parquet pruning [\#749](https://github.com/apache/arrow-datafusion/pull/749) ([lvheyang](https://github.com/lvheyang))
- \#723 DataFusion add option in ExecutionConfig to enable/disable parquet pruning [\#749](https://github.com/apache/arrow-datafusion/pull/749) ([lvheyang](https://github.com/lvheyang))
- Update API for extension planning to include logical plan [\#643](https://github.com/apache/arrow-datafusion/pull/643) ([alamb](https://github.com/alamb))
- Rename MergeExec to CoalescePartitionsExec [\#635](https://github.com/apache/arrow-datafusion/pull/635) ([andygrove](https://github.com/andygrove))
- fix 593, reduce cloning by taking ownership in logical planner's `from` fn [\#610](https://github.com/apache/arrow-datafusion/pull/610) ([Jimexist](https://github.com/Jimexist))
@@ -520,7 +520,7 @@ For older versions, see [apache/arrow/CHANGELOG.md](https://github.com/apache/ar
- Use 4.x arrow-rs from crates.io rather than git sha [\#395](https://github.com/apache/arrow-datafusion/pull/395) ([alamb](https://github.com/alamb))
- Return Vec\<bool\> from PredicateBuilder rather than an `Fn` [\#370](https://github.com/apache/arrow-datafusion/pull/370) ([alamb](https://github.com/alamb))
- Refactor: move RowGroupPredicateBuilder into its own module, rename to PruningPredicateBuilder [\#365](https://github.com/apache/arrow-datafusion/pull/365) ([alamb](https://github.com/alamb))
- \[Datafusion\] NOW\(\) function support [\#288](https://github.com/apache/arrow-datafusion/pull/288) ([msathis](https://github.com/msathis))
- \[DataFusion\] NOW\(\) function support [\#288](https://github.com/apache/arrow-datafusion/pull/288) ([msathis](https://github.com/msathis))
- Implement select distinct [\#262](https://github.com/apache/arrow-datafusion/pull/262) ([Dandandan](https://github.com/Dandandan))
- Refactor datafusion/src/physical\_plan/common.rs build\_file\_list to take less param and reuse code [\#253](https://github.com/apache/arrow-datafusion/pull/253) ([Jimexist](https://github.com/Jimexist))
- Support qualified columns in queries [\#55](https://github.com/apache/arrow-datafusion/pull/55) ([houqp](https://github.com/houqp))
@@ -718,7 +718,7 @@ For older versions, see [apache/arrow/CHANGELOG.md](https://github.com/apache/ar
- RFC Roadmap for 2021 \(DataFusion\) [\#140](https://github.com/apache/arrow-datafusion/issues/140)
- Implement hash partitioning [\#131](https://github.com/apache/arrow-datafusion/issues/131)
- Grouping by column position [\#110](https://github.com/apache/arrow-datafusion/issues/110)
- \[Datafusion\] GROUP BY with a high cardinality doesn't seem to finish [\#107](https://github.com/apache/arrow-datafusion/issues/107)
- \[DataFusion\] GROUP BY with a high cardinality doesn't seem to finish [\#107](https://github.com/apache/arrow-datafusion/issues/107)
- \[Rust\] Add support for JSON data sources [\#103](https://github.com/apache/arrow-datafusion/issues/103)
- \[Rust\] Implement metrics framework [\#95](https://github.com/apache/arrow-datafusion/issues/95)
- Publically export Arrow crate from datafusion [\#36](https://github.com/apache/arrow-datafusion/issues/36)
2 changes: 1 addition & 1 deletion datafusion/src/execution/context.rs
@@ -833,7 +833,7 @@ pub struct ExecutionConfig {
/// Should DataFusion repartition data using the partition keys to execute window functions in
/// parallel using the provided `target_partitions` level
pub repartition_windows: bool,
/// Should Datafusion parquet reader using the predicate to prune data
/// Should DataFusion parquet reader using the predicate to prune data
parquet_pruning: bool,
/// Runtime configurations such as memory threshold and local disk for spill
pub runtime: RuntimeConfig,
2 changes: 1 addition & 1 deletion datafusion/src/physical_plan/planner.rs
@@ -583,7 +583,7 @@ impl DefaultPhysicalPlanner {
// columns with names like `SUM(t1.c1)`, `t1.c1 + t1.c2`, etc.
//
// If we run these logical columns through physical_name function, we will
// get physical names with column qualifiers, which violates Datafusion's
// get physical names with column qualifiers, which violates DataFusion's
// field name semantics. To account for this, we need to derive the
// physical name from physical input instead.
//
20 changes: 10 additions & 10 deletions dev/release/README.md
@@ -21,18 +21,18 @@

## Sub-projects

The Datafusion repo contains 2 different releasable sub-projects: Datafusion, Ballista
The DataFusion repo contains 2 different releasable sub-projects: DataFusion, Ballista

We use Datafusion release to drive the release for the other sub-projects. As a
result, Datafusion version bump is required for every release while version
We use DataFusion release to drive the release for the other sub-projects. As a
result, DataFusion version bump is required for every release while version
bumps for the Python binding and Ballista are optional. In other words, we can
release a new version of Datafusion without releasing a new version of the
release a new version of DataFusion without releasing a new version of the
Python binding or Ballista. On the other hand, releasing a new version of the
Python binding or Ballista always requires a new Datafusion version release.
Python binding or Ballista always requires a new DataFusion version release.

## Branching

Datafusion currently only releases from the `master` branch. Given the project
DataFusion currently only releases from the `master` branch. Given the project
is still in early development state, we are not maintaining an active stable
release backport branch.

@@ -177,11 +177,11 @@ Send the email output from the script to [email protected]. The email should

```
To: [email protected]
Subject: [VOTE][Datafusion] Release Apache Arrow Datafusion 5.1.0 RC0
Subject: [VOTE][DataFusion] Release Apache Arrow DataFusion 5.1.0 RC0
Hi,
I would like to propose a release of Apache Arrow Datafusion Implementation,
I would like to propose a release of Apache Arrow DataFusion Implementation,
version 5.1.0.
This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1]
@@ -193,9 +193,9 @@ and vote on the release.
The vote will be open for at least 72 hours.
[ ] +1 Release this as Apache Arrow Datafusion 5.1.0
[ ] +1 Release this as Apache Arrow DataFusion 5.1.0
[ ] +0
[ ] -1 Do not release this as Apache Arrow Datafusion 5.1.0 because...
[ ] -1 Do not release this as Apache Arrow DataFusion 5.1.0 because...
[1]: https://github.com/apache/arrow-datafusion/tree/a5dd428f57e62db20a945e8b1895de91405958c4
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-5.1.0
10 changes: 5 additions & 5 deletions dev/release/create-tarball.sh
@@ -80,10 +80,10 @@ echo ""
echo "---------------------------------------------------------"
cat <<MAIL
To: [email protected]
Subject: [VOTE][RUST][Datafusion] Release Apache Arrow Datafusion ${version} RC${rc}
Subject: [VOTE][RUST][DataFusion] Release Apache Arrow DataFusion ${version} RC${rc}
Hi,
I would like to propose a release of Apache Arrow Datafusion Implementation,
I would like to propose a release of Apache Arrow DataFusion Implementation,
version ${version}.
This release candidate is based on commit: ${release_hash} [1]
@@ -98,9 +98,9 @@ encouraged to test the release and vote with "(non-binding)".
The standard verification procedure is documented at https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md#verifying-release-candidates.
[ ] +1 Release this as Apache Arrow Datafusion ${version}
[ ] +1 Release this as Apache Arrow DataFusion ${version}
[ ] +0
[ ] -1 Do not release this as Apache Arrow Datafusion ${version} because...
[ ] -1 Do not release this as Apache Arrow DataFusion ${version} because...
[1]: https://github.com/apache/arrow-datafusion/tree/${release_hash}
[2]: ${url}
@@ -129,4 +129,4 @@ gpg --armor --output ${tarball}.asc --detach-sig ${tarball}
echo "Uploading to apache dist/dev to ${url}"
svn co --depth=empty https://dist.apache.org/repos/dist/dev/arrow ${SOURCE_TOP_DIR}/dev/dist
svn add ${distdir}
svn ci -m "Apache Arrow Datafusion ${version} ${rc}" ${distdir}
svn ci -m "Apache Arrow DataFusion ${version} ${rc}" ${distdir}
2 changes: 1 addition & 1 deletion dev/release/release-tarball.sh
@@ -65,7 +65,7 @@ cp -r ${tmp_dir}/dev/* ${tmp_dir}/release/${release_version}/
svn add ${tmp_dir}/release/${release_version}

echo "Commit release"
svn ci -m "Apache Arrow Datafusion ${version}" ${tmp_dir}/release
svn ci -m "Apache Arrow DataFusion ${version}" ${tmp_dir}/release

echo "Clean up"
rm -rf ${tmp_dir}
2 changes: 1 addition & 1 deletion docs/README.md
@@ -26,7 +26,7 @@ inside a Python virtualenv.

- Python
- `pip install -r requirements.txt`
- Datafusion python package. You can install the latest version by running `maturin develop` inside `../python` directory.
- DataFusion python package. You can install the latest version by running `maturin develop` inside `../python` directory.

## Build

4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -35,9 +35,9 @@

# -- Project information -----------------------------------------------------

project = 'Arrow Datafusion'
project = 'Arrow DataFusion'
copyright = '2022, Apache Software Foundation'
author = 'Arrow Datafusion Authors'
author = 'Arrow DataFusion Authors'


# -- General configuration ---------------------------------------------------