forked from aserebryakov/trie-rs
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
doc: Update readme, add contributing.md, add git cliff to generate changelog, add pre-commit to automatically fmt and update changelog
Showing 7 changed files with 300 additions and 86 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,3 @@ | ||
# Generated by Cargo | ||
# will have compiled files and executables | ||
/target/ | ||
|
||
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries | ||
# More information here http://doc.crates.io/guide.html#cargotoml-vs-cargolock | ||
Cargo.lock | ||
|
||
# These are backup files generated by rustfmt | ||
**/*.rs.bk | ||
/target/ | ||
**/*.rs.bk | ||
Cargo.lock | ||
tarpaulin-report.html |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# See https://pre-commit.com for more information | ||
# See https://pre-commit.com/hooks.html for more hooks | ||
repos: | ||
  - repo: https://github.com/pre-commit/pre-commit-hooks | ||
    rev: v4.3.0 | ||
    hooks: | ||
      - id: check-added-large-files | ||
        name: 🐘 Check for added large files | ||
      - id: check-toml | ||
        name: ✔️ Check TOML | ||
      - id: check-yaml | ||
        name: ✔️ Check YAML | ||
        args: | ||
          - --unsafe | ||
      - id: end-of-file-fixer | ||
        name: 🪚 Fix end of files | ||
      - id: trailing-whitespace | ||
        name: ✂️ Trim trailing whitespaces | ||
  - repo: local | ||
    hooks: | ||
      - id: rustfmt | ||
        name: 🦀 Format Rust files | ||
        description: Check if all files follow the rustfmt style | ||
        entry: cargo fmt | ||
        language: system | ||
        pass_filenames: false | ||
      - id: git-cliff | ||
        name: 🏔️ Update changelog | ||
        entry: git cliff -o CHANGELOG.md | ||
        language: system | ||
        pass_filenames: false | ||
ci: | ||
  autofix_commit_msg: 🎨 [pre-commit.ci] Auto format from pre-commit.com hooks | ||
  autoupdate_commit_msg: ⬆ [pre-commit.ci] pre-commit autoupdate |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# 📜 Changelog | ||
|
||
All notable changes to this project will be documented in this file. | ||
|
||
## [unreleased] | ||
|
||
### ⛰️ Features | ||
|
||
- Add functions to find_prefixes, find_postfixes, and find_longest_prefix. Rename files - ([ccbae67](https://github.com/vemonet/ptrie/commit/ccbae673304c0d052e8625f2040b2a2005afc408)) | ||
|
||
### 🧪 Testing | ||
|
||
- Improve tests, add GitHub actions workflows for testing and releasing, remove travis CI, update benchmark script - ([8391056](https://github.com/vemonet/ptrie/commit/839105644ff00e1ac9a8fee08bf0c5f6eb2fddf8)) | ||
|
||
## [0.4.0](https://github.com/vemonet/ptrie/compare/0.3.0..0.4.0) - 2018-07-09 | ||
|
||
### ⚡ Performance | ||
|
||
- Extracts values to a vector from nodes - ([8032921](https://github.com/vemonet/ptrie/commit/8032921117659093525956f35b0bee8c2b508b5b)) | ||
- Improves the performance - ([1542fd9](https://github.com/vemonet/ptrie/commit/1542fd90728d6e4c5123af031b635d9c7e282e81)) | ||
- Fixes the case of existing value overriding in the trie - ([89c08ad](https://github.com/vemonet/ptrie/commit/89c08ad74d7994efc97f307f46c78b537e80a3c2)) | ||
|
||
### 🎨 Styling | ||
|
||
- Fixes formatting with rust-fmt - ([57b8acf](https://github.com/vemonet/ptrie/commit/57b8acf6ddaed88c391a7548982fcef8fa7eb491)) | ||
|
||
## [0.3.0](https://github.com/vemonet/ptrie/compare/0.2.1..0.3.0) - 2017-12-19 | ||
|
||
### ⚙️ Miscellaneous Tasks | ||
|
||
- Improves the performance by keys localization in memory | ||
|
||
Previous version of the TrieNode structure caused cache miss on each | ||
comparison iteration. | ||
|
||
Placing the child key in the node itself makes these comparisons much | ||
faster because the keys are localized in CPU cache | ||
- ([2cc8e88](https://github.com/vemonet/ptrie/commit/2cc8e882f32e99044b8e6a89a236de4accb9f5b0)) | ||
|
||
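The 0.3.0 entry above describes moving child keys next to the node that scans them. One possible reading of that change is sketched below with two hypothetical layouts; the names `FarNode`, `NearNode`, `find_child_far`, and `find_child_near` are assumptions made for illustration and are not taken from the crate's source.

```rust
/// Hypothetical "before" layout: a parent holds only child indices and each
/// child stores its own key, so every comparison loads a different node.
struct FarNode {
    key: u8,
    children: Vec<usize>,
}

/// Hypothetical "after" layout: the parent stores (key, index) pairs inline,
/// so the scan reads one contiguous slice.
struct NearNode {
    children: Vec<(u8, usize)>,
}

/// Looks up a child by key, dereferencing each candidate child node.
fn find_child_far(nodes: &[FarNode], parent: usize, key: u8) -> Option<usize> {
    nodes[parent]
        .children
        .iter()
        .copied()
        .find(|&i| nodes[i].key == key)
}

/// Looks up a child by key without touching any child node.
fn find_child_near(nodes: &[NearNode], parent: usize, key: u8) -> Option<usize> {
    nodes[parent]
        .children
        .iter()
        .find(|&&(k, _)| k == key)
        .map(|&(_, i)| i)
}
```

Both functions return the same index for the same logical tree. The difference is only in which memory the scan touches, which is the effect the changelog entry attributes the speedup to.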
## [0.2.1](https://github.com/vemonet/ptrie/compare/0.2.0..0.2.1) - 2017-12-19 | ||
|
||
## [0.2.0](https://github.com/vemonet/ptrie/compare/0.1.2..0.2.0) - 2017-12-17 | ||
|
||
## [0.1.2](https://github.com/vemonet/ptrie/compare/0.1.1..0.1.2) - 2017-12-12 | ||
|
||
## [0.1.1](https://github.com/vemonet/ptrie/tree/0.1.1) - 2017-12-12 | ||
|
||
<!-- generated by git-cliff --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# 🧑‍💻 Contributing | ||
|
||
The usual process to make a contribution is to: | ||
|
||
1. Check for existing related [issues on GitHub](https://github.com/vemonet/ptrie/issues) | ||
2. [Fork](https://github.com/vemonet/ptrie/fork) the repository and create a new branch | ||
3. Make your changes | ||
4. Make sure formatting, linting, and tests pass. | ||
5. Add tests if possible to cover the lines you added. | ||
6. Commit, and send a Pull Request. | ||
|
||
## 🛠️ Development | ||
|
||
Install dependencies: | ||
|
||
```bash | ||
rustup update | ||
rustup toolchain install nightly | ||
rustup component add rustfmt clippy | ||
cargo install cargo-tarpaulin git-cliff cargo-outdated | ||
pipx install pre-commit | ||
pre-commit install | ||
``` | ||
|
||
### 🧪 Tests | ||
|
||
Run tests: | ||
|
||
```bash | ||
cargo test | ||
``` | ||
|
||
Tests with coverage: | ||
|
||
```bash | ||
cargo tarpaulin -p ptrie --doc --tests --out html | ||
``` | ||
|
||
> Start web server for the cov report: `python -m http.server` | ||
|
||
### 📚 Docs | ||
|
||
Generate docs locally: | ||
|
||
```bash | ||
cargo doc --all --all-features | ||
``` | ||
|
||
> Start web server for the generated docs: `python -m http.server --directory target/doc` | ||
|
||
### ⏱️ Benchmark | ||
|
||
Running benchmarks requires Rust nightly. Enable it with `rustup default nightly`: | ||
|
||
```bash | ||
cargo bench | ||
``` | ||
|
||
## 🏷️ New release | ||
|
||
Publishing artifacts is handled by the `build.yml` workflow. Make sure the following tokens are set as secrets for this repository: `CRATES_IO_TOKEN`, `CODECOV_TOKEN` | ||
|
||
1. Make sure dependencies have been updated: | ||
|
||
```bash | ||
cargo update | ||
cargo outdated | ||
``` | ||
|
||
2. Bump the version in the `Cargo.toml` file, create a new tag with `git`, and update changelog using [`git-cliff`](https://git-cliff.org): | ||
|
||
```bash | ||
git tag -a 0.5.0 -m "v0.5.0" | ||
git cliff -o CHANGELOG.md | ||
``` | ||
|
||
3. Commit, and push. The `release.yml` workflow will automatically create the release on GitHub, and publish to crates.io. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,102 +1,94 @@ | ||
# GTrie | ||
<h1 align="center"> | ||
🎄 Prefix Trie | ||
</h1> | ||
|
||
[![Build Status](https://travis-ci.org/aserebryakov/trie-rs.svg?branch=master)](https://travis-ci.org/aserebryakov/trie-rs) | ||
<p align="center"> | ||
<a href="https://crates.io/crates/ptrie"> | ||
<img alt="Crates.io" src="https://img.shields.io/crates/v/ptrie" /> | ||
</a> | ||
<a href="https://github.com/vemonet/ptrie/actions/workflows/test.yml"> | ||
<img alt="Test" src="https://github.com/vemonet/ptrie/actions/workflows/test.yml/badge.svg" /> | ||
</a> | ||
<a href="https://github.com/vemonet/ptrie/actions/workflows/release.yml"> | ||
<img alt="Release" src="https://github.com/vemonet/ptrie/actions/workflows/release.yml/badge.svg" /> | ||
</a> | ||
<a href="https://docs.rs/ptrie"> | ||
<img alt="Documentation" src="https://docs.rs/ptrie/badge.svg" /> | ||
</a> | ||
<a href="https://codecov.io/gh/vemonet/ptrie/branch/main"> | ||
<img src="https://codecov.io/gh/vemonet/ptrie/branch/main/graph/badge.svg" alt="Codecov status" /> | ||
</a> | ||
<a href="https://github.com/vemonet/ptrie/blob/main/LICENSE"> | ||
<img alt="MIT license" src="https://img.shields.io/badge/License-MIT-brightgreen.svg" /> | ||
</a> | ||
</p> | ||
|
||
Trie is the library that implements the [trie](https://en.wikipedia.org/wiki/Trie). | ||
`PTrie` is a versatile implementation of the [trie data structure](https://en.wikipedia.org/wiki/Trie), tailored for efficient prefix searching within a collection of objects, such as strings, with no dependencies. | ||
|
||
Trie is a generic data structure, written `Trie<T, U>` where `T` is node key type and `U` is a | ||
value type. | ||
The structure is defined as `Trie<K, V>`, where `K` represents the type of keys in each node, and `V` is the type of the associated values. | ||
|
||
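To make the `Trie<K, V>` description concrete, here is a minimal sketch of a trie generic over key elements `K` and values `V`. It assumes a simple `HashMap`-based node and uses hypothetical names (`SketchTrie`, `SketchNode`, `get`); it is not the crate's actual layout or API.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical node: an optional value plus children keyed by `K`.
struct SketchNode<K, V> {
    value: Option<V>,
    children: HashMap<K, SketchNode<K, V>>,
}

/// Hypothetical trie, generic over key elements `K` and values `V`.
struct SketchTrie<K, V> {
    root: SketchNode<K, V>,
}

impl<K: Eq + Hash, V> SketchTrie<K, V> {
    fn new() -> Self {
        SketchTrie {
            root: SketchNode { value: None, children: HashMap::new() },
        }
    }

    /// Walks the key one element at a time, creating nodes as needed,
    /// then stores the value on the final node.
    fn insert<I: Iterator<Item = K>>(&mut self, key: I, value: V) {
        let mut node = &mut self.root;
        for part in key {
            node = node.children.entry(part).or_insert_with(|| SketchNode {
                value: None,
                children: HashMap::new(),
            });
        }
        node.value = Some(value);
    }

    /// Returns the value stored at exactly this key, if any.
    fn get<I: Iterator<Item = K>>(&self, key: I) -> Option<&V> {
        let mut node = &self.root;
        for part in key {
            node = node.children.get(&part)?;
        }
        node.value.as_ref()
    }
}
```

Because keys are consumed as iterators, the same sketch accepts `"abc".bytes()` (with `K = u8`) or `"abc".chars()` (with `K = char`), which mirrors how the README examples pass keys.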
## 💭 Motivation | ||
|
||
# Motivation | ||
The trie is particularly effective for operations involving common prefix identification and retrieval, making it a good choice for applications that require fast and efficient prefix-based search functionalities. | ||
|
||
Trie may be faster than other data structures in some cases. | ||
## 🚀 Usage | ||
|
||
For example, `Trie` may be used as a replacement for `std::HashMap` in case of a dictionary where | ||
the number of words in dictionary is significantly less than number of different words in the | ||
input and matching probability is low. | ||
### ✨ Find prefixes | ||
|
||
|
||
# Usage | ||
PTrie can return the values of all keys in the trie that are prefixes of a given string, sorted in ascending order of key length. | ||
|
||
```rust | ||
use gtrie::Trie; | ||
|
||
let mut t = Trie::new(); | ||
|
||
t.insert("this".chars(), 1); | ||
t.insert("trie".chars(), 2); | ||
t.insert("contains".chars(), 3); | ||
t.insert("a".chars(), 4); | ||
t.insert("number".chars(), 5); | ||
t.insert("of".chars(), 6); | ||
t.insert("words".chars(), 7); | ||
|
||
assert_eq!(t.contains_key("number".chars()), true); | ||
assert_eq!(t.contains_key("not_existing_key".chars()), false); | ||
assert_eq!(t.get_value("words".chars()), Some(7)); | ||
assert_eq!(t.get_value("none".chars()), None); | ||
``` | ||
use ptrie::Trie; | ||
|
||
# Benchmarks | ||
let mut trie = Trie::new(); | ||
|
||
Benchmark `std::HashMap<String, String>` vs `gtrie::Trie` shows that `Trie` is | ||
significantly faster in the case of key mismatch but significantly slower in the case of | ||
matching key. | ||
trie.insert("a".bytes(), "A"); | ||
trie.insert("ab".bytes(), "AB"); | ||
trie.insert("abc".bytes(), "ABC"); | ||
trie.insert("abcde".bytes(), "ABCDE"); | ||
|
||
let prefixes = trie.find_prefixes("abcd".bytes()); | ||
assert_eq!(prefixes, vec!["A", "AB", "ABC"]); | ||
``` | ||
$ cargo bench | ||
test hash_map_massive_match ... bench: 150,127 ns/iter (+/- 12,986) | ||
test hash_map_massive_mismatch_on_0 ... bench: 93,246 ns/iter (+/- 5,108) | ||
test hash_map_massive_mismatch_on_0_one_symbol_key ... bench: 93,706 ns/iter (+/- 5,908) | ||
test hash_map_match ... bench: 24 ns/iter (+/- 3) | ||
test hash_map_mismatch ... bench: 20 ns/iter (+/- 0) | ||
test trie_massive_match ... bench: 231,343 ns/iter (+/- 4,940) | ||
test trie_massive_mismatch_on_0 ... bench: 28,743 ns/iter (+/- 8,401) | ||
test trie_massive_mismatch_on_1 ... bench: 28,734 ns/iter (+/- 1,839) | ||
test trie_massive_mismatch_on_2 ... bench: 28,760 ns/iter (+/- 2,582) | ||
test trie_massive_mismatch_on_3 ... bench: 28,829 ns/iter (+/- 2,504) | ||
test trie_match ... bench: 10 ns/iter (+/- 1) | ||
test trie_mismatch ... bench: 5 ns/iter (+/- 0) | ||
``` | ||
|
||
## Important | ||
|
||
Search performance is highly dependent on the data stored in `Trie` and may be | ||
as significantly faster than `std::HashMap` as significantly slower. | ||
|
||
|
||
# Contribution | ||
|
||
Source code and issues are hosted on GitHub: | ||
|
||
https://github.com/aserebryakov/trie-rs | ||
### 🔍 Find postfixes | ||
|
||
PTrie can also find all strings in the trie that begin with a specified prefix. | ||
|
||
# License | ||
|
||
[MIT License](https://opensource.org/licenses/MIT) | ||
```rust | ||
use ptrie::Trie; | ||
|
||
let mut trie = Trie::new(); | ||
|
||
# Changelog | ||
trie.insert("app".bytes(), "App"); | ||
trie.insert("apple".bytes(), "Apple"); | ||
trie.insert("applet".bytes(), "Applet"); | ||
trie.insert("apricot".bytes(), "Apricot"); | ||
|
||
#### 0.4.0 | ||
let strings = trie.find_postfixes("app".bytes()); | ||
assert_eq!(strings, vec!["App", "Apple", "Applet"]); | ||
``` | ||
|
||
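The two searches shown in the README examples can be understood as two different walks over the same tree. The sketch below is a self-contained illustration under stated assumptions: byte keys, `BTreeMap` children for a deterministic order, and a hypothetical `SketchNode` type. The method names mirror the README, but the bodies are not the crate's implementation.

```rust
use std::collections::BTreeMap;

/// Hypothetical byte-keyed node; `BTreeMap` keeps children in sorted order.
struct SketchNode<V> {
    value: Option<V>,
    children: BTreeMap<u8, SketchNode<V>>,
}

impl<V: Clone> SketchNode<V> {
    fn new() -> Self {
        SketchNode { value: None, children: BTreeMap::new() }
    }

    fn insert(&mut self, key: &[u8], value: V) {
        let mut node = self;
        for &b in key {
            node = node.children.entry(b).or_insert_with(SketchNode::new);
        }
        node.value = Some(value);
    }

    /// Values of every stored key that is a prefix of `key`, shortest first.
    /// The empty key is ignored in this sketch.
    fn find_prefixes(&self, key: &[u8]) -> Vec<V> {
        let mut found = Vec::new();
        let mut node = self;
        for b in key {
            match node.children.get(b) {
                Some(child) => node = child,
                None => break,
            }
            if let Some(v) = &node.value {
                found.push(v.clone());
            }
        }
        found
    }

    /// Values of every stored key that starts with `prefix`, depth-first.
    fn find_postfixes(&self, prefix: &[u8]) -> Vec<V> {
        let mut node = self;
        for b in prefix {
            match node.children.get(b) {
                Some(child) => node = child,
                None => return Vec::new(),
            }
        }
        let mut found = Vec::new();
        node.collect(&mut found);
        found
    }

    /// Pushes this node's value, then recurses into children in key order.
    fn collect(&self, out: &mut Vec<V>) {
        if let Some(v) = &self.value {
            out.push(v.clone());
        }
        for child in self.children.values() {
            child.collect(out);
        }
    }
}
```

A prefix search follows a single path and collects values along the way, so its cost is bounded by the length of the query. A postfix search first descends to the node for the prefix and then visits the whole subtree below it.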
* Significant performance improvement due to switch to data oriented model | ||
### 🔑 Key-based Retrieval Functions | ||
|
||
#### 0.3.0 | ||
PTrie provides functions to check for the existence of a key and to retrieve the associated value. | ||
|
||
* Significantly improved performance of the key mismatch case | ||
* API is updated to be closer to `std::HashMap` | ||
```rust | ||
use ptrie::Trie; | ||
|
||
#### 0.2.1 | ||
let mut trie = Trie::new(); | ||
trie.insert("app".bytes(), "App"); | ||
|
||
* Benchmarks are improved | ||
assert!(trie.contains_key("app".bytes())); | ||
assert!(!trie.contains_key("not_existing_key".bytes())); | ||
assert_eq!(trie.get_value("app".bytes()), Some("App")); | ||
assert_eq!(trie.get_value("none".bytes()), None); | ||
``` | ||
|
||
#### 0.2.0 | ||
## 🏷️ Features | ||
|
||
* API is updated to be closer to `std::HashMap` | ||
The `serde` feature implements Serde's `Serialize` and `Deserialize` traits for the `Trie` and `TrieNode` structs. | ||
|
||
#### 0.1.1 | ||
## 📜 License | ||
|
||
* Basic trie implementation | ||
[MIT License](https://opensource.org/licenses/MIT) |