Query Stat framework v3 #2304
if stats_data:
    key_type_data["storage_ops"][task_type_name] = stats_data

if has_data:
I think it will be unfriendly to have the schema of the return type change dynamically like this, but again something we can iterate on later.
result = {}

for key_type_idx in range(len(raw_stats.op_stats_)):
This is really hard to follow. I think it needs some docs showing an example of the input data and the output format that you're trying to coerce it to.
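One way to provide the documentation requested here is a docstring with a worked input and output. The sketch below is illustrative only: `coerce_op_stats`, the key type and task type names, and the counter fields are hypothetical stand-ins, not the PR's actual data layout.

```python
from typing import Dict, List

# Hypothetical lookup tables, invented for this illustration.
KEY_TYPE_NAMES = ["TABLE_DATA", "SYMBOL_LIST"]
TASK_TYPE_NAMES = ["S3_ListObjectsV2", "S3_GetObject"]


def coerce_op_stats(op_stats: List[List[Dict[str, int]]]) -> Dict[str, dict]:
    """Turn index-addressed raw stats into a name-addressed nested dict.

    Input (indexed by key type, then by task type)::

        [[{}, {}], [{"count": 2, "total_time_ms": 15}, {}]]

    Output (empty entries are dropped)::

        {"SYMBOL_LIST": {"storage_ops": {"S3_ListObjectsV2":
            {"count": 2, "total_time_ms": 15}}}}
    """
    result: Dict[str, dict] = {}
    for key_type_idx, per_task in enumerate(op_stats):
        storage_ops = {
            TASK_TYPE_NAMES[task_idx]: dict(stats)
            for task_idx, stats in enumerate(per_task)
            if stats  # skip tasks that recorded nothing
        }
        if storage_ops:  # skip key types with no recorded tasks
            result[KEY_TYPE_NAMES[key_type_idx]] = {"storage_ops": storage_ops}
    return result
```

Keeping the example inside the docstring means the input and output shapes stay next to the loop that produces them.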
@@ -0,0 +1,70 @@
import arcticdb.toolbox.query_stats as qs

def verify_list_symbool_stats(count):
typo symbool
Rename `count` to `expected_symbol_list_keys_count` or something similar.
You need docstrings for these APIs.
Please document that this API is unstable and not governed by our semantic versioning.
Once you instrument the storages completely, you should write a docs page with some tutorials and examples.
poodlewars
left a comment
Just this to sort out please https://github.com/man-group/ArcticDB/pull/2304/files#r2070143373
391e32c to
6da1173
Compare
commit facc33bead487490322ba9cc973ed86dc9b5c4c6
Merge: bc68ed467 85d51e3b7
Author: Vasil Danielov Pashov <vasil.pashov1@gmail.com>
Date: Tue May 27 20:15:59 2025 +0300
Merge branch 'master' into vasil.pashov/coverity-test-existing-code-with-errors
commit bc68ed467842b510bbd7001175cc8eecefc29e1c
Merge: e68ec0146 91a076cc2
Author: Vasil Pashov <vasil.pashov1@gmail.com>
Date: Tue May 27 20:12:57 2025 +0300
Merge branch 'master' into vasil.pashov/coverity-test-existing-file
commit 85d51e3b748982dc9121026a4dfcbd9f5a1dc2fb
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Tue May 27 10:54:08 2025 +0100
Bugfix 9209057536: Allow concatenation of uint64 columns with int* columns (#2365)
#### Reference Issues/PRs
Fixes
[9209057536](https://man312219.monday.com/boards/7852509418/pulses/9209057536)
#### What does this implement or fix?
Allows concatenating columns of type uint64 with columns of type int*
commit 91a076cc267caf549ff38cb532dd76c5e4e168ba
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Fri May 23 17:46:47 2025 +0100
Enhancement 7992967434: filters and projections ternary operator (#2103)
#### Reference Issues/PRs
Implements
[7992967434](https://man312219.monday.com/boards/7852509418/pulses/7992967434)
#### What does this implement or fix?
Implements a ternary operator equivalent to `numpy.where`, primarily for
projecting new columns based on some condition, although it can also be
used for filtering. Semantically the same as `left if condition else
right`, although this Pythonic syntax cannot be made to work due to
limitations of the language.
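To illustrate the semantics described above, here is a minimal pure-Python sketch of an element-wise ternary, together with the reason the `left if condition else right` spelling cannot be overloaded: Python evaluates the condition through `__bool__`, which must produce a single truth value. The `where` helper and the `Column` class are hypothetical and are not ArcticDB APIs.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def where(condition: Sequence[bool], left: Sequence[T], right: Sequence[T]) -> List[T]:
    """Row-by-row `left if condition else right`, like numpy.where on 1-D inputs."""
    if not (len(condition) == len(left) == len(right)):
        raise ValueError("all inputs must have the same length")
    return [l if c else r for c, l, r in zip(condition, left, right)]


class Column:
    """Toy column expression used only to show the language limitation."""

    def __init__(self, values: Sequence[bool]):
        self.values = list(values)

    def __bool__(self) -> bool:
        # `x if cond else y` calls bool(cond); it cannot yield a per-row result.
        raise TypeError("a column has no single truth value")


picked = where([True, False, True], [1, 2, 3], [10, 20, 30])  # [1, 20, 3]

try:
    _ = "left" if Column([True, False]) else "right"
    overloadable = True
except TypeError:
    overloadable = False
```

Because the conditional expression collapses the condition to one boolean before choosing a branch, a function-style operator is the only way to express a per-row choice.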
#### Any other comments?
See `test_ternary.py` for a plethora of examples and the expected
behaviour in each case.
Example benchmark output with annotations below.
The first parameter to all benchmarks is the number of rows (100k for
all of them right now), so the single-threaded per-row time can be
calculated by dividing by 100,000.
e.g. projecting a new column of 100k rows by choosing from 2 dense
columns (likely a common use case) takes 424us, or just over 4ns per
row.
Other parameters are explained for each individual benchmark.
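The per-row figure quoted above is a plain unit conversion. As a small worked example (the helper name `per_row_ns` is made up for this illustration), using two timings from the benchmark output:

```python
ROWS = 100_000  # first benchmark parameter


def per_row_ns(total_us: float, rows: int = ROWS) -> float:
    """Convert a benchmark time in microseconds to nanoseconds per row."""
    return total_us * 1_000 / rows


dense_dense = per_row_ns(424)     # BM_ternary_numeric_dense_col_dense_col
sparse_sparse = per_row_ns(3548)  # BM_ternary_numeric_sparse_col_sparse_col
```

This gives roughly 4.24 ns per row for the dense case and roughly 35.5 ns per row for the sparse case.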
```
Run on (20 X 2918.4 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x10)
L1 Instruction 32 KiB (x10)
L2 Unified 1280 KiB (x10)
L3 Unified 24576 KiB (x1)
Load Average: 4.23, 6.56, 6.73
--------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------------
BM_ternary_bitset_bitset/100000 13.1 us 13.1 us 58099
# Second arg is whether the boolean argument is true or false, third is whether the arguments are swapped
BM_ternary_bitset_bool/100000/1/1 2.00 us 2.00 us 363634
BM_ternary_bitset_bool/100000/1/0 7.43 us 7.43 us 101700
BM_ternary_bitset_bool/100000/0/1 7.28 us 7.28 us 88907
BM_ternary_bitset_bool/100000/0/0 2.45 us 2.45 us 307832
BM_ternary_numeric_dense_col_dense_col/100000 424 us 424 us 1276
BM_ternary_numeric_sparse_col_sparse_col/100000 3548 us 3548 us 185
# Second arg is whether the arguments are swapped
BM_ternary_numeric_dense_col_sparse_col/100000/1 2555 us 2555 us 258
BM_ternary_numeric_dense_col_sparse_col/100000/0 2800 us 2800 us 262
# Second arg is the number of unique strings in each string column, third is whether the columns have the same string pool or not
BM_ternary_string_dense_col_dense_col/100000/100000/1 438 us 438 us 1534
BM_ternary_string_dense_col_dense_col/100000/100000/0 16257 us 16258 us 43
BM_ternary_string_dense_col_dense_col/100000/2/1 441 us 441 us 1603
BM_ternary_string_dense_col_dense_col/100000/2/0 4219 us 4219 us 186
BM_ternary_string_sparse_col_sparse_col/100000/100000/1 3854 us 3854 us 191
BM_ternary_string_sparse_col_sparse_col/100000/100000/0 10753 us 10754 us 67
BM_ternary_string_sparse_col_sparse_col/100000/2/1 3655 us 3655 us 183
BM_ternary_string_sparse_col_sparse_col/100000/2/0 4592 us 4592 us 123
BM_ternary_string_dense_col_sparse_col/100000/100000/1 2957 us 2957 us 236
BM_ternary_string_dense_col_sparse_col/100000/100000/0 13980 us 13980 us 50
BM_ternary_string_dense_col_sparse_col/100000/2/1 2967 us 2966 us 237
BM_ternary_string_dense_col_sparse_col/100000/2/0 5179 us 5179 us 160
# Second arg is whether the arguments are swapped
BM_ternary_numeric_dense_col_val/100000/1 360 us 359 us 1871
BM_ternary_numeric_dense_col_val/100000/0 388 us 388 us 1692
BM_ternary_numeric_sparse_col_val/100000/1 2244 us 2244 us 292
BM_ternary_numeric_sparse_col_val/100000/0 2385 us 2385 us 283
# Second arg is whether the arguments are swapped, third is the number of unique strings in the column
BM_ternary_string_dense_col_val/100000/1/100000 8259 us 8258 us 82
BM_ternary_string_dense_col_val/100000/0/100000 7683 us 7683 us 93
BM_ternary_string_dense_col_val/100000/1/2 2578 us 2578 us 261
BM_ternary_string_dense_col_val/100000/0/2 2385 us 2385 us 297
BM_ternary_string_sparse_col_val/100000/1/100000 6302 us 6302 us 129
BM_ternary_string_sparse_col_val/100000/0/100000 5792 us 5792 us 115
BM_ternary_string_sparse_col_val/100000/1/2 2903 us 2903 us 249
BM_ternary_string_sparse_col_val/100000/0/2 3095 us 3095 us 232
# Second arg is whether the arguments are swapped
BM_ternary_numeric_dense_col_empty/100000/1 1269 us 1269 us 584
BM_ternary_numeric_dense_col_empty/100000/0 1354 us 1354 us 512
BM_ternary_numeric_sparse_col_empty/100000/1 1363 us 1363 us 572
BM_ternary_numeric_sparse_col_empty/100000/0 1374 us 1374 us 484
# Second arg is whether the arguments are swapped, third is the number of unique strings in the column
BM_ternary_string_dense_col_empty/100000/1/100000 1217 us 1217 us 587
BM_ternary_string_dense_col_empty/100000/0/100000 1343 us 1343 us 577
BM_ternary_string_dense_col_empty/100000/1/2 1287 us 1287 us 574
BM_ternary_string_dense_col_empty/100000/0/2 1363 us 1363 us 518
BM_ternary_string_sparse_col_empty/100000/1/100000 1413 us 1413 us 524
BM_ternary_string_sparse_col_empty/100000/0/100000 1343 us 1343 us 517
BM_ternary_string_sparse_col_empty/100000/1/2 1293 us 1293 us 540
BM_ternary_string_sparse_col_empty/100000/0/2 1235 us 1235 us 480
BM_ternary_numeric_val_val/100000 368 us 368 us 2039
BM_ternary_string_val_val/100000 376 us 376 us 1862
# Second arg is whether the arguments are swapped
BM_ternary_numeric_val_empty/100000/1 40.7 us 40.7 us 16491
BM_ternary_numeric_val_empty/100000/0 36.7 us 36.7 us 17836
BM_ternary_string_val_empty/100000/1 40.8 us 40.8 us 17892
BM_ternary_string_val_empty/100000/0 58.2 us 58.2 us 13825
# Second arg is whether the left argument is true or false, third is whether the right argument is true or false
BM_ternary_bool_bool/100000/1/1 1.43 us 1.43 us 518204
BM_ternary_bool_bool/100000/1/0 1.99 us 1.99 us 378598
BM_ternary_bool_bool/100000/0/1 4.52 us 4.52 us 157505
BM_ternary_bool_bool/100000/0/0 0.020 us 0.020 us 37060921
```
commit 3c059f4d4030dc73594f277d8754918c698a2969
Author: Phoebus Mak <61957902+phoebusm@users.noreply.github.com>
Date: Thu May 22 09:50:29 2025 +0100
Fix gcp lib unreachable after making it read only (#2349)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
https://man312219.monday.com/boards/7852509418/pulses/8985074856
#### What does this implement or fix?
`create_store_from_lib_config` took only the protobuf settings.
Unlike the settings of other storages, GCP settings are stored only natively.
So when a new store was created with the above function, the GCP settings
were not passed to the new store. The SDK therefore fell back to
default but incorrect settings, which caused errors.
S3 and GCPXML native settings are now given default values to avoid
uninitialized values being used in the tests.
#### Any other comments?
Test in the CI:
https://github.com/man-group/ArcticDB/actions/runs/15164054821/job/42638155043
```
test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-True]
[gw0] [ 95%] PASSED tests/integration/arcticdb/version_store/test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-True]
test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-False]
[gw0] [ 95%] PASSED tests/integration/arcticdb/version_store/test_symbol_list.py::test_symbol_list_read_only_compaction_needed[real_gcp_store_factory-False]
```
(Other unrelated tests failed in the flaky real storage CI)
#### Checklist
<details>
<summary>
Checklist for code changes...
</summary>
- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
- [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>
<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->
commit 9d98a4436e376fa1623af92f23153cde5b68a68b
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Wed May 21 18:03:28 2025 +0100
Fix multiindex series (#2363)
#### What does this implement or fix?
Fixes roundtripping of multiindexed Series with timestamps as the first
level and strings as the second level.
Broken by #2142
---------
Co-authored-by: Alex Owens <alex.owens@man.com>
commit c3c7c2ac5d7d98d16305e6914713f03454d30a57
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Wed May 21 16:42:34 2025 +0100
Docs 8975554293: Add concat demo notebook (#2361)
#### Reference Issues/PRs
Completes
[8975554293](https://man312219.monday.com/boards/7852509418/pulses/8975554293)
#### What does this implement or fix?
Adds a notebook demonstrating the new `concat` functionality added in
https://github.com/man-group/ArcticDB/pull/2142
---------
Co-authored-by: Alex Owens <alex.owens@man.com>
commit 17ea0e49deba0a3a1b8e6267e9516b14ea34b3ef
Author: grusev <george_rusev@yahoo.com>
Date: Wed May 21 18:31:23 2025 +0300
Update installation_tests.yml with 5.3 and 5.4 final versions (#2362)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
#### Any other comments?
Moved 5.2.6 to a different timeslot to rule out the timeslot as the cause
of the failures. However, a manual execution suggests that the problem
with 5.2.6 most probably persists
(https://github.com/man-group/ArcticDB/actions/runs/15139549472/job/42559651096).
Added:
5.3.4 https://github.com/man-group/ArcticDB/actions/runs/15133764164/
5.4.1 https://github.com/man-group/ArcticDB/actions/runs/15133923361
commit e68ec014683d00f095e4efbe5d72b81b7509299d
Author: Vasil Pashov <vasil.pashov1@gmail.com>
Date: Wed May 21 11:38:45 2025 +0300
Temporary disable tests
commit e3afff2115d4f0038d13a5327a8c7b7779552a99
Merge: bdbc17028 424cd56e2
Author: Vasil Pashov <vasil.pashov1@gmail.com>
Date: Wed May 21 11:17:22 2025 +0300
Merge branch 'master' into vasil.pashov/coverity-test-existing-file
commit 424cd56e295afafd64444420b92fcf89a82dd1ea
Author: grusev <george_rusev@yahoo.com>
Date: Tue May 20 11:09:42 2025 +0300
Schedule S3 tests and fix STS to run only against AWS S3 (#2356)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Scheduled for now to run twice a week.
Also contains a couple of other workflow fixes:
- Seeding tests were not executed previously because the workflow
parameter for GCP tests changed from boolean to choice. Seeding tests
are now executed.
- STS role creation was executed for GCP tests, which was unnecessary.
It is now executed only with AWS S3.
- Persistent test cleaning had a problem with the context, which resulted
in a crash because storage_tests.py could not be loaded. This is now fixed
to allow proper loading of mark.py in different contexts.
Results:
https://github.com/man-group/ArcticDB/actions/runs/15061574677/job/42337724260
(NOTE: the failures in the above run occur because PR
https://github.com/man-group/ArcticDB/pull/2353 is not part of the current
one. Once it is merged, the S3 tests will run without problems.)
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit a158b0c2e684c9389691744c001192ce94ddc79d
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Mon May 19 13:28:51 2025 +0100
Bugfix 9123099670: fix resampling of old updated data (#2351)
#### Reference Issues/PRs
Fixes
[9123099670](https://man312219.monday.com/boards/7852509418/views/168855452/pulses/9123099670)
#### What does this implement or fix?
Fixes three separate resampling bugs:
1. Old versions of `update` (changed sometime between `4.1.0` and
`4.4.0`, I haven't pinned down exactly where) had a behaviour in which
the `end_index` value in the data key of the segment overlapping with
the start of the date range provided to the `update` call was set to the
first value of the date range in the `update` call. For all other
modification methods, this is set to 1 nanosecond larger than the last
index value in the contained segment. Resampling assumed this to be the
case, and had an assertion verifying it. Relaxing this assertion is
sufficient to fix the issue.
2. Providing a `date_range` argument with a resample where the provided
date range did not overlap with the timerange covered by the index of
the symbol led to trying to reserve a vector with a negative size. This
now correctly returns an empty result.
3. Previously, checks that a symbol being resampled had a timestamp
index occurred after some operations which also require this to be true,
which could lead to the same vector reserve issue above. It is now
checked in advance, and a suitable exception raised.
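The second fix can be pictured as a guard on the overlap between the requested date range and the range covered by the index. The sketch below is a hypothetical Python illustration only (the actual fix is not shown in this message, and `overlap` is an invented helper): without such a guard, a disjoint pair of ranges produces an end that precedes the start, which is the kind of value that can turn into a negative size.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # inclusive (start, end) timestamps in nanoseconds


def overlap(index_range: Range, date_range: Range) -> Optional[Range]:
    """Return the intersection of two inclusive ranges, or None if disjoint."""
    start = max(index_range[0], date_range[0])
    end = min(index_range[1], date_range[1])
    if end < start:
        return None  # caller should return an empty result, not size a buffer
    return (start, end)
```

A caller that receives `None` can return an empty result immediately instead of attempting to size any buffers.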
commit 9edc74a89102b4ab66fbd7911a31322425dfcacc
Author: grusev <george_rusev@yahoo.com>
Date: Mon May 19 12:54:07 2025 +0300
nfs backed tests for v1 API (#2350)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
The arctic_* fixtures (v2 API) are already covered by the NFS-backed S3 tests.
What is needed now is to add tests for the v1 API fixtures as well.
New Fixtures:
- nfs_backed_s3_store_factory
- nfs_backed_s3_version_store_v1
- nfs_backed_s3_version_store_v2
- nfs_backed_s3_version_store_dynamic_schema_v1
- nfs_backed_s3_version_store_dynamic_schema_v2
- nfs_backed_s3_version_store

Added to:
- object_store_factory
  - s3_store_factory -> nfs_backed_s3_store_factory
- object_and_mem_and_lmdb_version_store
  - s3_version_store_v1 -> nfs_backed_s3_version_store_v1
  - s3_version_store_v2 -> nfs_backed_s3_version_store_v2
- object_and_mem_and_lmdb_version_store_dynamic_schema
  - s3_version_store_dynamic_schema_v1 -> nfs_backed_s3_version_store_dynamic_schema_v1
  - s3_version_store_dynamic_schema_v2 -> nfs_backed_s3_version_store_dynamic_schema_v2
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit 67d2bbe530f96a0aa5412f479e123da480ba2d99
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Fri May 16 15:20:37 2025 +0100
Enhancement 8277989680: symbol concatenation poc (#2142)
#### Reference Issues/PRs
8277989680
#### What does this implement or fix?
Implements symbol concatenation. Inner and outer joins over columns are both
supported. Expected usage:
```
# Read requests can contain usual as_of, date_range, columns, etc arguments
lazy_dfs = lib.read_batch([read_request_1, read_request_2, ...])
# Potentially apply some processing to all or individual constituent lazy dataframes here, that will be applied before the join
lazy_dfs = lazy_dfs[lazy_dfs["col"].notnull()]
# Join here
lazy_df = adb.concat(lazy_dfs)
# Perform more processing if desired
lazy_df = lazy_df.resample("15min").agg({"col": "mean"})
# Collect result
res = lazy_df.collect()
# res contains a list of VersionedItems from the constituent symbols that went into the join with data=None, and a data member with the joined Series/DataFrame
```
See `test_symbol_concatenation.py` for thorough examples of how the API
works.
For outer joins, if a column is not present in one of the input symbols,
then the same type-specific behaviour as used for dynamic schema is used
to backfill the missing values.
Not all symbols can be concatenated together. Attempting to concatenate
any of the following throws an exception:
- a Series with a DataFrame
- Different index types, including multiindexes with different numbers
of levels
- Incompatible column types, e.g. if `col` has type `INT64` in one
symbol, and is a string column in another symbol. This only applies if
the column would be in the result, which is always the case for all
columns with an outer join, but may not always be for inner joins.
Where possible, the implementation is permissive about what can be joined,
and produces as sensible an output as possible:
- Joining two or more Series with different names that are otherwise
compatible will produce a Series with no name
- Joining two or more timeseries where the indexes have different names
will produce a timeseries with an unnamed index
- Joining two or more timeseries where the indexes have different
timezones will produce a timeseries with a UTC index
- Joining two or more multiindexed Series/DataFrames where the levels
have compatible types but different names will produce a multiindexed
Series/DataFrame with unnamed levels where they differed between some of
the inputs.
- Joining two or more Series/DataFrames that all have `RangeIndex`. If
the index `step` does not match between all of the inputs, then the
output will have a `RangeIndex` with `start=0` and `step=1`. **This is
different behaviour to Pandas, which converts to an Int64 index in this
case. For this reason, a warning is logged when this happens.**
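The permissive rules for names and timezones in the list above can be read as "keep the value if every input agrees, otherwise fall back to a neutral default". A minimal sketch of that reading, with hypothetical helper names that are not part of ArcticDB:

```python
from typing import Optional, Sequence


def joined_name(names: Sequence[Optional[str]]) -> Optional[str]:
    """Common name if all inputs share it, otherwise None (unnamed)."""
    unique = set(names)
    return names[0] if len(unique) == 1 else None


def joined_timezone(timezones: Sequence[str]) -> str:
    """Common timezone if all inputs share it, otherwise UTC."""
    unique = set(timezones)
    return timezones[0] if len(unique) == 1 else "UTC"
```

For example, `joined_name(["price", "volume"])` gives `None`, and `joined_timezone(["Europe/London", "America/New_York"])` gives `"UTC"`.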
The only known major limitation is that all of the symbols being joined
together (after any pre-join processing) must fit into memory. Relaxing
this constraint would require much more sophisticated query planning
than we currently support, in which all of the clauses both for
individual symbols pre-join, the join, and any post-join clauses, are
all taken into account when scheduling both IO and individual processing
tasks.
commit c1c7a8cff3193dcf4aefee268cd3feea01c68bd9
Author: grusev <george_rusev@yahoo.com>
Date: Fri May 16 13:55:12 2025 +0300
Patch for Real S3 library names (#2353)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Currently we create library names which are too long for real S3. This
is a patch for the tests until the real bug is addressed.
Manually triggered run:
https://github.com/man-group/ArcticDB/actions/runs/15013824867
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit bb65a85ab82dd7fec5297b258956545f8b4adea7
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Fri May 16 11:41:18 2025 +0100
Add resolve_defaults back in as a static method of NativeVersionStore (#2358)
#### Reference Issues/PRs
Was removed in #2345 , but is needed at least by some internal tests,
and technically constitutes an API break (although we don't expect
anybody to be using it)
commit e78758a7fe5fbb02085dcfae01218903d6dad6d9
Author: grusev <george_rusev@yahoo.com>
Date: Fri May 16 13:25:24 2025 +0300
Installation Tests Workflow Fixes (#2354)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Fixes a failure when the job is triggered on schedule: the string
contained extra single quotes. The order of two steps is also changed for
a scheduling-specific use case.
Changes to the workflow dispatch simplify execution and
leave some parts for later enhancement, i.e. the selection of an exact
os-python-repo combination, which actually needs a single flow of steps
rather than a matrix.
S3 tests are also enabled to run along with the LMDB tests by default.
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit 9e544da9d823c3a4e76b256b741925af52a20742
Author: grusev <george_rusev@yahoo.com>
Date: Tue May 13 13:45:53 2025 +0300
Installation tests v4 (#2339)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Successful execution 5.2.6:
https://github.com/man-group/ArcticDB/actions/runs/14641126753/job/41083591802
5.1.2: https://github.com/man-group/ArcticDB/actions/runs/14637571996
4.5.1:
https://github.com/man-group/ArcticDB/actions/runs/14639124835/job/41077126258
1.6.2:
https://github.com/man-group/ArcticDB/actions/runs/14701046721/job/41250511273
The PR contains a workflow definition to execute tests on an installed
arcticdb. It is a combination of the approaches in:
https://github.com/man-group/ArcticDB/pull/2330
https://github.com/man-group/ArcticDB/pull/2316
Installation tests are now in a separate folder
(python/installation_tests), not part of tests. They have their own
fixtures, making them independent from the rest of the code base.
The tests are direct copies of the originals, with one modified to use
version 2 of the API. If the API changes, each test in the installation
set can now be adapted. As the tests run very fast, there is no need to
use simulators; real S3 storage is used directly.
The tests are executed by a workflow.
Currently each test is executed against LMDB and real S3. The moto
simulated version is not available at the moment due to tight coupling
with protobufs, which differ for each version, as well as tight coupling
with the whole existing test code.
The workflow has 2 triggers:
- manual trigger: allows tests to be executed manually on demand
- on schedule: the scheduled execution is overnight. The tests for each
arcticdb version are executed 1 hour apart from the others, because
executing all at once is likely to generate errors
with real storages
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
commit 2612fb45f15350dc483ddde1c8d43c2d6a02731b
Author: grusev <george_rusev@yahoo.com>
Date: Mon May 12 15:39:20 2025 +0300
Asv v2 s3 tests (Refactored) (#2249)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
Contains a refactored framework for setting up shared storages, plus tests
for AWS S3 storage.
Merges 3 PRs into one:
- https://github.com/man-group/ArcticDB/pull/2185
- https://github.com/man-group/ArcticDB/pull/2227
- https://github.com/man-group/ArcticDB/pull/2204
Important: the benchmark tests in this PR cannot run successfully,
so do not take them as criteria. All tests need to be run
manually. Here are runs from 27 March:
LMDB set:
https://github.com/man-group/ArcticDB/actions/runs/14100376040/job/39495398374
Real set:
https://github.com/man-group/ArcticDB/actions/runs/14100497273/job/39495728734
#### What does this implement or fix?
#### Any other comments?
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit 3c2fe145cad45797356a4ec5fbd42e4dac57681a
Author: William Dealtry <william.dealtry@gmail.com>
Date: Mon May 12 09:57:15 2025 +0100
size_t size in MacOS
commit bb54de8879ab57c37093a62c5282e405fc9a834b
Author: William Dealtry <william.dealtry@gmail.com>
Date: Mon May 12 09:03:04 2025 +0100
resolve defaults is a free function
commit e973f8dbd898aedc747bc232e022c9a1137d882c
Author: willdealtry <william.dealtry@gmail.com>
Date: Wed Apr 16 14:49:46 2025 +0100
Fix up file operations
commit af1a171eab284902db4333946b732de7d9ec2b18
Author: Phoebus Mak <61957902+phoebusm@users.noreply.github.com>
Date: Mon May 12 10:00:32 2025 +0100
Disable s3 checksumming (#2337)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
https://github.com/man-group/ArcticDB/issues/2251
#### What does this implement or fix?
Disable S3 checksumming by setting an environment variable in the wheel.
#### Any other comments?
This will also unblock the upgrade of `aws-sdk-cpp` on vcpkg.
The upgrade will not be made in this PR.
One of the newly added tests has to be skipped because the `conda` CI has
`aws-sdk-cpp` pinned at a version without S3 checksumming, due to the `libarrow`
pin.
`environment-dev.yml` doesn't align with its counterpart in the
feedstock. Therefore the new version of `aws-sdk-cpp` is only used in
the feedstock, and thus in the release wheel, but not in the local and CI builds here.
This will be addressed in a separate ticket.
[Commit](https://github.com/man-group/ArcticDB/pull/2337/commits/245a02cd455e39fb8f976301ccd5409e6ae88b13)
to remove the `libarrow` pin so that a more recent `aws-sdk-cpp`, which supports S3
checksumming, is used in conda.
It is there to verify the change with the newly added test. The
[test](https://github.com/man-group/ArcticDB/actions/runs/14732394443/job/41349695905)
is successful.
commit b808afac25bed84595b874f28b6b3ce2407fbd0c
Author: grusev <george_rusev@yahoo.com>
Date: Fri May 9 15:46:17 2025 +0300
Delete STS roles regularly (#2344)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Because the number of STS roles is limited, we should regularly clean up
roles that failed to be deleted. The PR contains a scheduled job that does
that every Saturday. The Python script can also be executed at any time and
will delete only roles created before today, leaving all currently
running jobs unaffected.
As roles cannot be guaranteed to be cleaned up after test execution, due to
many factors, we should remove them on a regular basis, and this is perhaps
the quickest and most reliable approach.
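The "only roles created before today" rule can be sketched as a small filter. This is a minimal illustration, not the script from the PR: the function name `roles_to_delete`, the role names and the `RoleName`/`CreateDate` dictionary keys are assumptions chosen for the example, and the actual deletion call is left out.

```python
from datetime import datetime, timezone


def roles_to_delete(roles, now=None):
    """Return the names of roles created before the start of today (UTC)."""
    now = now or datetime.now(timezone.utc)
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return [r["RoleName"] for r in roles if r["CreateDate"] < start_of_today]


# Hypothetical inventory: one leftover role and one used by a running job.
now = datetime(2025, 5, 10, 9, 0, tzinfo=timezone.utc)
roles = [
    {"RoleName": "test-role-old",
     "CreateDate": datetime(2025, 5, 8, 23, 0, tzinfo=timezone.utc)},
    {"RoleName": "test-role-running",
     "CreateDate": datetime(2025, 5, 10, 8, 30, tzinfo=timezone.utc)},
]
stale = roles_to_delete(roles, now)
print(stale)
```

Cutting off at the start of the current day, rather than at "now", is what keeps roles belonging to jobs started today out of the deletion list.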
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit 0136f4ca52559e0640dc1b7518d6a8b0773ed3a8
Author: Ognyan Stoimenov <ostoimenov@icloud.com>
Date: Fri May 9 14:36:54 2025 +0300
Fix permissions for the automatic docs building (#2347)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Fixes failures when building the docs automatically on release like:
https://github.com/man-group/ArcticDB/actions/runs/14832306883
#### Any other comments?
commit 652d968561d473599e90508078005c4fd00a1ba4
Author: Phoebus Mak <61957902+phoebusm@users.noreply.github.com>
Date: Sat May 3 02:03:44 2025 +0100
Query Stat framework v3 (#2304)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
New query stats implementation whose schema is static.
The feature of linking arcticdb API calls to storage operations has been
dropped. Now only storage operation stats are logged. The schema of the
stats is therefore hardcoded and, since only the sum of each stat is
logged, one static object with numerous atomic ints is enough to do
the job.
No fancy map, and no modification of the folly executor.
#### Any other comments?
Sample output:
```
{ // Stats
"SYMBOL_LIST": // std::array<std::array<OpStats, NUMBER_OF_TASK_TYPES>, NUMBER_OF_KEYS>
{
"storage_ops": {
"S3_ListObjectsV2":
{ // OpStats
"result_count": 1,
"total_time_ms": 34
}
}
}
}
```
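A rough Python model of the static schema may help when reading the sample above. It is a sketch under assumptions, not the C++ implementation: `KEY_TYPES` and `TASK_TYPES` are small hypothetical subsets, the `Stats` and `OpStats` classes only mirror the names in the sample's comments, plain Python ints stand in for the atomic counters, and dropping empty entries in `to_dict` is an assumption about how the output is built.

```python
from dataclasses import dataclass

# Hypothetical subsets; the real enumerations are fixed at compile time.
KEY_TYPES = ("SYMBOL_LIST", "VERSION_REF")
TASK_TYPES = ("S3_ListObjectsV2", "S3_GetObject")


@dataclass
class OpStats:
    result_count: int = 0
    total_time_ms: int = 0


class Stats:
    """Fixed-size grid of counters, one cell per (key type, task type)."""

    def __init__(self):
        self.op_stats = [[OpStats() for _ in TASK_TYPES] for _ in KEY_TYPES]

    def add(self, key_type, task_type, count, elapsed_ms):
        cell = self.op_stats[KEY_TYPES.index(key_type)][TASK_TYPES.index(task_type)]
        cell.result_count += count
        cell.total_time_ms += elapsed_ms

    def to_dict(self):
        """Convert the grid to nested dicts, skipping cells that saw no activity."""
        result = {}
        for key_type, row in zip(KEY_TYPES, self.op_stats):
            ops = {
                task: {"result_count": s.result_count, "total_time_ms": s.total_time_ms}
                for task, s in zip(TASK_TYPES, row)
                if s.result_count or s.total_time_ms
            }
            if ops:
                result[key_type] = {"storage_ops": ops}
        return result


stats = Stats()
stats.add("SYMBOL_LIST", "S3_ListObjectsV2", count=1, elapsed_ms=34)
report = stats.to_dict()
print(report)
```

Because every cell exists up front, recording a stat is just two additions on a known slot, which is why no map or executor changes are needed.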
commit 9b93303adf8d5c436ae267be4d950fc5e55139de
Author: Vasil Danielov Pashov <vasil.pashov1@gmail.com>
Date: Fri May 2 17:29:18 2025 +0300
Hold the GIL when incrementing None's refcount to prevent race conditions when there are multiple Python threads (#2334)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
`None` is a global static object in Python which is also refcounted. When
ArcticDB creates `None` objects it must increase their refcount, and it must
hold the GIL while the refcount is increased. Currently we don't
acquire the GIL when we do this; we only hold a SpinLock protecting
other ArcticDB threads from racing on the `None` refcount. With this change
we add an atomic variable in the PythonHandler data which will
accumulate the refcount. Then at the end of the operation, when we
reacquire the GIL, we will increase the refcount. The same is done for
the NaN refcount. Note that we don't really need the GIL to increase
NaN's refcount, as we create it internally and don't hand it to Python
until the read operation is done. Currently only read operations need to
work with the `None` object.
`apply_global_refcounts` must be called at the very end, before passing
the dataframe to Python, to prevent something raising an exception
after the refcount is applied but before Python receives the data.
Increasing `None`'s refcount but never decreasing it doesn't seem to be
fatal, but we're trying to be good citizens. The best place for that is
`adapt_read_df` or `adapt_read_dfs`, as they are called at the end of all
read functions. The code is changed so that the type handler data is
always created in the Python bindings file, as it's easier to track.
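The accumulate-then-apply idea can be illustrated with a small Python analogy. This is only a sketch: `PendingRefcount` is a hypothetical stand-in for the atomic variable in the handler data, a `threading.Lock` replaces the atomic, and a dictionary entry replaces the real `None` refcount, which this example does not touch.

```python
import threading


class PendingRefcount:
    """Thread-safe accumulator standing in for the atomic counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0

    def increment(self, n=1):
        # Called from worker threads that must not touch the real refcount.
        with self._lock:
            self._pending += n

    def apply(self, target):
        """Apply everything accumulated so far in one step and reset."""
        with self._lock:
            pending, self._pending = self._pending, 0
        target["refcount"] += pending
        return pending


none_like = {"refcount": 0}  # stand-in for the shared object
pending = PendingRefcount()


def fill_column(cells):
    for _ in range(cells):
        pending.increment()


workers = [threading.Thread(target=fill_column, args=(1000,)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Single update at the very end, from the one thread allowed to do it.
applied = pending.apply(none_like)
print(applied, none_like["refcount"])
```

The workers never write to the shared object directly; only the final `apply` does, which mirrors deferring the refcount change until the GIL is held again.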
#### What does this implement or fix?
#### Any other comments?
---------
Co-authored-by: Vasil Pashov <vasil.pashov@man.com>
commit d4b40e287863960d608d52131471a88a435bf844
Author: Phoebus Mak <61957902+phoebusm@users.noreply.github.com>
Date: Fri May 2 11:13:30 2025 +0100
Update docs for sts ca issue (#2265)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Clarify when the workaround for the STS CA issue is needed
#### Any other comments?
commit a9d0e41e47c40a34e2e146a4297b5c638375fe85
Author: Phoebus Mak <61957902+phoebusm@users.noreply.github.com>
Date: Tue Apr 29 17:44:08 2025 +0100
Skip azurite api check (#2288)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
The API check in Azurite has been a pain for local tests, as the Azurite
version needs to keep up with the SDK version. We only use a very
simple API, so it is safe to skip the check.
#### Any other comments?
commit 550d3e7c29a5f9d67a0e993bbabc1cbf88295ef1
Author: grusev <george_rusev@yahoo.com>
Date: Thu Apr 24 17:45:21 2025 +0300
initial version fix for GCP (#2326)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
#### Any other comments?
---------
Co-authored-by: Georgi Rusev <Georgi Rusev>
commit 41a2086963e018ffe0ac90e6fea72d3577d463f3
Author: Alex Owens <73388657+alexowens90@users.noreply.github.com>
Date: Wed Apr 23 12:31:26 2025 +0100
Timeseries defrag function (#2319)
#### What does this implement or fix?
Adds a (private) function to defragment timeseries data. See the list of
caveats in the code comments for its limitations.
commit 61b00e99ce7861a0fd767572be0d58600c065b53
Author: Vasil Danielov Pashov <vasil.pashov1@gmail.com>
Date: Thu Apr 17 16:04:41 2025 +0300
Fix race conditions on the None object refcount during a multithreaded read (#2320)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
**Bugfix**
Columns are handled in multiple threads during read calls. String
columns can contain `None` values. `None` is a global static refcounted
object and its refcount is not atomic. When ArcticDB places `None`
objects in columns it must increment the refcount. Currently `None`
objects are allocated only via type handlers. ArcticDB has a global
spin-lock that is shared by all type handlers. The bug is caused by
[this
line](https://github.com/man-group/ArcticDB/blob/300e121e1be47ecfbabba78f077851a9c3b0772c/cpp/arcticdb/python/python_utils.hpp#L117):
the spin-lock is wrapped in a `std::lock_guard` but there is also an explicit call to
`unlock`. When `unlock` is called another thread will take the lock and
start calling `Py_INCREF(Py_None)`, but when the function exits the
`std::lock_guard` will call `unlock` again, allowing another thread to
start calling `Py_INCREF(Py_None)` in parallel.
**Refactoring**
- Remove the GIL-safe py none. It was created because pybind11 wraps
`Py_None` in an object and calls `Py_INCREF(Py_None)`, and we must hold
the GIL when incrementing the refcount. The wrapper we had was used
only to get the pointer to the `Py_None` object. We don't need pybind11
to do that. Using the C API we can directly get `Py_None`, which is a
global object
- Add a function to check if a Python object is `None`
- Remove uses of `py::none{}` in places where we don't hold the GIL (most
of those were just to get the `Py_None` object that's inside `py::none`)
#### Any other comments?
---------
Co-authored-by: Vasil Pashov <vasil.pashov@man.com>
commit 396757028afbd460fd6325fd2403636ed8482d56
Author: Julien Jerphanion <git@jjerphan.xyz>
Date: Thu Apr 17 11:39:55 2025 +0200
Support MSVC 19.29 (#2332)
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
commit b89fc53dbd7cd1eee783fed1fba7b401d69b6ffd
Author: Georgi Petrov <32372905+G-D-Petrov@users.noreply.github.com>
Date: Wed Apr 16 15:35:56 2025 +0300
Increase tolerance to arithmetic mismatches with Pandas with floats (#2333)
#### Reference Issues/PRs
https://github.com/man-group/ArcticDB/actions/runs/14487537861/job/40636907727?pr=2331
#### What does this implement or fix?
To resolve this type of flakiness:
``` python
FAILED tests/hypothesis/arcticdb/test_resample.py::test_resample - AssertionError: Series are different
Series values are different (100.0 %)
[index]: [1969-12-31T23:59:01.000000000]
[left]: [-1706666.6666666667]
[right]: [-1706325.3333333333]
At positional index 0, first diff: -1706666.6666666667 != -1706325.3333333333
Falsifying example: test_resample(
df=
col_float col_int col_uint
1970-01-01 00:00:00.000000000 0.0 9223372036849590785 0
1970-01-01 00:00:00.000000001 0.0 512 0
1970-01-01 00:00:00.000000002 0.0 -9223372036854710785 0
,
rule='1min',
origin='start',
offset='1s',
)
You can reproduce this example by temporarily adding @reproduce_failure('6.72.4', b'AXicY2RgYGQAYxCCUEwMyAAkzVD/Hwg2PGIEq2ACqgASjBDR/0yMMFUwAAB9FAui') as a decorator on your test case
```
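The effect of a looser tolerance on the two values in the failure above can be checked with `math.isclose`. This is a sketch only: the `rel_tol` values are assumptions picked for illustration, not the tolerance actually used in the test suite.

```python
import math

# The two means reported in the failing assertion above.
left = -1706666.6666666667
right = -1706325.3333333333

# Relative size of the mismatch between the two results.
relative_gap = abs(left - right) / max(abs(left), abs(right))

strict = math.isclose(left, right, rel_tol=1e-9)   # near-exact comparison
relaxed = math.isclose(left, right, rel_tol=1e-3)  # hypothetical looser tolerance

print(f"relative gap: {relative_gap:.2e}, strict: {strict}, relaxed: {relaxed}")
```

The inputs include integers close to the int64 limits, which float64 cannot represent exactly, so two correct implementations can round differently and disagree by this much.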
#### Any other comments?
A similar fix was done here:
https://github.com/man-group/ArcticDB/commit/fe9de294580526e921102fbdedda736f20596fc7
commit 30f4c48db0d742898f629d129b5d1caa83091662
Author: Alex Seaton <alexbseaton@gmail.com>
Date: Wed Apr 16 13:08:30 2025 +0100
Symbol sizes API (#2266)
Add Python APIs to get sizes of symbols, in a new `AdminTools` class.
Add documentation for this feature to our website.
You can access the new tools with:
```
lib: Library
lib.admin_tools(): AdminTools
```
Refactor the existing symbol scanning APIs to a visitor pattern so they
can all share as much of the implementation as possible.
Monday: 8560764974
commit 6b3c593924808d33a39e275f921f613f77139d06
Author: Georgi Petrov <32372905+G-D-Petrov@users.noreply.github.com>
Date: Wed Apr 16 14:32:57 2025 +0300
Prevent exceptions in ReliableStorageLockGuard destructor (#2331)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Sometimes, when trying to release the lock, exceptions can occur
(either storage related or others).
This PR tries to catch all exceptions, mainly to prevent unnecessary
seg faults in enterprise.
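The same "never let the release step throw" pattern can be sketched in Python with a context manager. This is an analogy, not the C++ destructor: `FlakyLock` and `guarded` are hypothetical names, and the failing `release` simulates a storage error.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger("lock_guard_sketch")


class FlakyLock:
    """Hypothetical lock whose release fails, e.g. because storage is down."""

    def __init__(self):
        self.held = False

    def acquire(self):
        self.held = True

    def release(self):
        raise RuntimeError("storage unavailable")


@contextmanager
def guarded(lock):
    lock.acquire()
    try:
        yield lock
    finally:
        # Cleanup must not raise: log the failure and carry on.
        try:
            lock.release()
        except Exception:
            logger.exception("Failed to release lock; continuing")


completed = False
with guarded(FlakyLock()):
    completed = True

print("work completed:", completed)
```

The trade-off is that a failed release is only logged, so the lock may stay held until something else clears it.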
#### Any other comments?
commit aa585fc0a5ae60f61f1752d78614e0951047d21e
Author: Julien Jerphanion <git@jjerphan.xyz>
Date: Wed Apr 16 10:10:11 2025 +0200
conda-build: Extend development environment for Windows (#2328)
#### Reference Issues/PRs
Extracted from https://github.com/man-group/ArcticDB/pull/2252.
#### What does this implement or fix?
#### Any other comments?
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
commit 42091dbe1ea4b7b827cad4f53b2ef099eb43b4fb
Author: Ognyan Stoimenov <ostoimenov@icloud.com>
Date: Tue Apr 15 18:13:47 2025 +0300
Fix pr getting action (#2323)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
https://github.com/VanOns/get-merged-pull-requests-action was updated to
fix some issues but changed its API
* Accommodate the new API
* Remove the previous workaround (now fixed)
* Pin the action to 1.3.0 so that no such breakages happen in the future
* The changelog generator was not skipping release candidates when comparing
versions. Fixed now
* Fix docs building permission
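Skipping release candidates when picking the version to compare against could look roughly like this. It is a sketch under assumptions: the `rcN` suffix, the tag names and the helper `previous_release` are hypothetical, and the tag list is assumed to be ordered from oldest to newest.

```python
import re

# Assumed naming scheme: release candidates end in "rc" plus a number.
RC_PATTERN = re.compile(r"rc\d+$")


def previous_release(tags, current):
    """Return the latest tag before `current` that is not a release candidate."""
    earlier = tags[: tags.index(current)]
    finals = [t for t in earlier if not RC_PATTERN.search(t)]
    return finals[-1] if finals else None


tags = ["v5.3.0", "v5.4.0rc0", "v5.4.0rc1", "v5.4.0"]
base = previous_release(tags, "v5.4.0")
print(base)
```

Without the filter, the changelog for `v5.4.0` would be compared against `v5.4.0rc1` and miss everything merged before that candidate.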
#### Any other comments?
commit 311c1bf8099a491bf1dd85c09e83d640f9d6ce74
Author: Julien Jerphanion <git@jjerphan.xyz>
Date: Tue Apr 15 17:13:05 2025 +0200
ci: Benchmark workflow adaptations (#2327)
#### Reference Issues/PRs
#### What does this implement or fix?
Fixes the import error, working around
https://github.com/airspeed-velocity/asv/issues/1465.
#### Any other comments?
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
commit 7b37536b67b8410d2d890b8ee8bf38b05181aa61
Author: Vasil Danielov Pashov <vasil.pashov1@gmail.com>
Date: Tue Apr 15 11:25:03 2025 +0300
Refactor to_atom and to_ref to properly use forwarding references (#2321)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
This solves two problems:
- Code duplication. `to_atom` had 3 overloads (value, reference and rvalue
reference) for the same thing. Forwarding references were invented to solve this
problem.
- There were unnecessary copies. `to_atom` had an overload taking
`VariantKey` by value. At some point some APIs changed and started
returning `AtomKey` instead of `VariantKey`; due to the excessive use of
`auto`, nobody noticed the difference. We thus ended up calling
`to_atom` on an atom key, which worked because `VariantKey` can be
constructed implicitly from an `AtomKey`, so we constructed a
`VariantKey` from an `AtomKey` only to extract the `AtomKey` from it.
Forwarding references do not allow implicit conversions, so the
compiler pointed out all the places in the code where this happens.
#### Any other comments?
commit 300e121e1be47ecfbabba78f077851a9c3b0772c
Author: grusev <george_rusev@yahoo.com>
Date: Fri Apr 11 14:07:36 2025 +0300
Update s3.py moto*.create_fixture - add retry attempts (#2311)
#### Reference Issues/PRs
<!--Example: Fixes #1234. See also #3456.-->
#### What does this implement or fix?
Addresses couple of flaky tests opened due to NFS or S3…
#### Reference Issues/PRs
Continuing #2304 for S3 query stats
https://man312219.monday.com/boards/7852509418/pulses/8417112339
#### What does this implement or fix?
Add S3 query stats support
#### Any other comments?
For S3 asynchronous calls, the timing measurement begins when the SDK
API is initially called, although the actual processing may be delayed
until the SDK executor processes the request. This differs from
synchronous calls, where timing also starts at the API call, but since
these calls are blocking, the requests are processed immediately within
the same execution stack.
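The difference between timing from the API call and timing from the start of processing can be reproduced with a single-worker thread pool. This is a sketch, not the S3 SDK: `timed_submit` and `fake_request` are hypothetical helpers, and `ThreadPoolExecutor` stands in for the SDK executor.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_submit(executor, fn):
    """Submit fn and measure its duration from submission and from actual start."""
    submitted = time.perf_counter()  # clock starts when the "API" is called

    def wrapper():
        started = time.perf_counter()  # clock starts when the executor runs it
        fn()
        finished = time.perf_counter()
        return finished - submitted, finished - started

    return executor.submit(wrapper)


def fake_request():
    time.sleep(0.05)  # stands in for a storage round trip


# One worker, so later requests wait in the queue before being processed.
with ThreadPoolExecutor(max_workers=1) as executor:
    futures = [timed_submit(executor, fake_request) for _ in range(3)]
    timings = [f.result() for f in futures]

for from_submit, from_start in timings:
    print(f"from submit: {from_submit * 1000:.0f} ms, from start: {from_start * 1000:.0f} ms")
```

Requests queued behind others report a longer time when measured from submission, so totals for asynchronous calls include queueing delay as well as the request itself.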
Reference Issues/PRs
What does this implement or fix?
New query stats implementation whose schema is static.
The feature of linking arcticdb API calls to storage operations has been dropped. Now only storage operation stats are logged. The schema of the stats is therefore hardcoded and, since only the sum of each stat is logged, one static object with numerous atomic ints is enough to do the job.
No fancy map, and no modification of the folly executor.
Any other comments?
Sample output: