Partial context updates #93

Open
Wants to merge 417 commits into base: dev
Changes from 184 commits

Commits
fca7c42
json storage fixed
pseusys Aug 4, 2023
33dabc8
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Aug 4, 2023
c5ad6d5
test pickle save and load with logging
pseusys Aug 4, 2023
8deaabd
timestamp conversion test for windows
pseusys Aug 7, 2023
cbe7c70
time in nanoseconds for windows
pseusys Aug 7, 2023
b14239e
ok ok windows take this
pseusys Aug 7, 2023
ed888d4
some other idea to trick windows
pseusys Aug 7, 2023
998fb2c
excessive logging removed
pseusys Aug 8, 2023
12f938e
config dicts fixed + module docstrings added
pseusys Aug 9, 2023
dbc8928
linting and formatting fixed
pseusys Aug 9, 2023
ab43a98
s's removed from docstrings
pseusys Aug 10, 2023
cc18acc
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 10, 2023
7f850ee
type defined
pseusys Aug 11, 2023
9fe28c9
property docstring added
pseusys Aug 14, 2023
6e4eb75
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 30, 2023
c25c48d
dff installation cell added to tutorial 8
pseusys Aug 30, 2023
6856ee5
shelve improved
pseusys Sep 5, 2023
5314e31
partial review reaction
pseusys Sep 19, 2023
7f1835e
more documentation added
pseusys Sep 20, 2023
cd76105
finished review
pseusys Sep 24, 2023
4238f9b
Merge branch 'dev' into feat/partial_context_updates
RLKRo Mar 22, 2024
50cda47
put benchmark tutorial after partial updates one
RLKRo Mar 22, 2024
4cc055a
Merge branch 'dev' into feat/partial_context_updates
pseusys Jul 4, 2024
7f77c8f
context storages updated
pseusys Jul 4, 2024
2617255
old naming reset
pseusys Jul 4, 2024
4fb8f67
context merge fixed
pseusys Jul 4, 2024
1230d16
context ids removed
pseusys Jul 4, 2024
c3d82da
context equality tested
pseusys Jul 4, 2024
0bd6347
framework data comparison removed
pseusys Jul 4, 2024
4a15bf0
context id removed from everywhere
pseusys Jul 4, 2024
9b3dd80
lint applied
pseusys Jul 4, 2024
4f0562a
documentation building fixed
pseusys Jul 4, 2024
ef0a9ee
RST syntax fixed
pseusys Jul 4, 2024
3d364bc
context dict added
pseusys Jul 29, 2024
e7ad269
async + pydantic
pseusys Jul 30, 2024
be34714
fixes
pseusys Jul 31, 2024
b8701a0
hashes manipulation only on `write_full_diff`
pseusys Jul 31, 2024
a58eace
ctx_dict + ctx updated
pseusys Aug 5, 2024
33f2823
setting removed
pseusys Aug 5, 2024
c4f9fce
sets added
pseusys Aug 6, 2024
e892a52
serialization added, sample context storage class created
pseusys Aug 6, 2024
1b8aa0d
iterative async access made synchronous
pseusys Aug 6, 2024
173b1fe
sql prototype
pseusys Aug 6, 2024
9665038
context API updated proposal
pseusys Aug 7, 2024
3468af5
context schema and serializer removed
pseusys Aug 7, 2024
71bd9f3
context API updated once again
pseusys Aug 7, 2024
2e6b334
review notes fixed
pseusys Aug 8, 2024
830ea40
ContextDictView made mutable
pseusys Aug 8, 2024
5d3dd95
context dict file split
pseusys Aug 8, 2024
f00ba02
turn introduction reverted
pseusys Aug 8, 2024
1af24db
turns separated (again)
pseusys Aug 13, 2024
3616ac0
key deletion now nullifies value
pseusys Aug 13, 2024
81ce7ba
memory storage
pseusys Aug 16, 2024
1f9e653
ctx_dict tests done
pseusys Aug 17, 2024
c981cc5
general context storages tests created
pseusys Aug 27, 2024
5002dda
ctx_dict updated not to use serializer
pseusys Sep 18, 2024
3e6a8f4
merge dev
RLKRo Sep 18, 2024
6991fb6
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Sep 18, 2024
5b80818
fix imports in newly added files
RLKRo Sep 19, 2024
96af9bc
hide circular imports behind type checking
RLKRo Sep 19, 2024
000fb0d
fix imports in test files
RLKRo Sep 19, 2024
2c2ab9d
merge context.init into context.connected
RLKRo Sep 19, 2024
2eb5a2c
remove get_last_index imports
RLKRo Sep 19, 2024
06d54b9
update pipeline.context_storage type
RLKRo Sep 19, 2024
f80e6a3
fix bug with setting sequence type values under a single key
RLKRo Sep 19, 2024
c5311f6
revert primary_id renaming
RLKRo Sep 19, 2024
d43752a
memory test (almost!) finished
pseusys Sep 23, 2024
1ae3e4f
ctx_dict tests fixed
pseusys Sep 23, 2024
85315a6
add overload for getitem
RLKRo Sep 23, 2024
351a43e
split typevar definitions
RLKRo Sep 23, 2024
e9eb2fb
remove asyncio mark
RLKRo Sep 23, 2024
6d93399
allow using negative indexes for context dict
RLKRo Sep 23, 2024
e2053dc
add validation on setitem for context dict
RLKRo Sep 24, 2024
acdcd3c
fixes
RLKRo Sep 24, 2024
16a3d77
allow non-str context ids
RLKRo Sep 24, 2024
9a76ae3
add current_turn_id
RLKRo Sep 24, 2024
5e37651
fix tests
RLKRo Sep 24, 2024
d376e49
update doc
RLKRo Sep 24, 2024
256e296
integer keysreversed
pseusys Sep 24, 2024
e2ffa0a
sql storage update function fix
pseusys Sep 24, 2024
9043dca
move context factory and pipeline fixtures to global conftest
RLKRo Sep 24, 2024
d58ce7c
unbound V from BaseModel
RLKRo Sep 24, 2024
6905bcd
remove default marker; return None by default
RLKRo Sep 24, 2024
0ac3c1e
fix key slicing
RLKRo Sep 24, 2024
3956348
use current_turn_id in check_happy_path
RLKRo Sep 24, 2024
d37c4e2
use context_factory to initialize context in non-core tests
RLKRo Sep 24, 2024
2bf82f9
fix: await misc get
RLKRo Sep 24, 2024
8a4d8be
update pipeline tutorials
RLKRo Sep 24, 2024
6404eb4
allow initializing MemoryContextStoraeg via context_storage_factory
RLKRo Sep 25, 2024
240cded
move all db tests into a single parametrized test class
RLKRo Sep 25, 2024
535d524
SQL testing fixed
pseusys Sep 27, 2024
6e0a103
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Sep 27, 2024
862e7d3
test_dbs fixed
pseusys Sep 27, 2024
e82d086
file context storages implemented
pseusys Sep 27, 2024
59f91c1
file and sql fixed
pseusys Sep 28, 2024
1c97303
async file dependency removed
pseusys Sep 30, 2024
f5ceb2f
rename delete_main_info to delete_context
RLKRo Sep 30, 2024
cf27afa
fix load_field_items typing
RLKRo Oct 1, 2024
c1a24ee
rewrite db tests
RLKRo Oct 1, 2024
f2ec013
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 1, 2024
cb22d12
small None checking update
pseusys Oct 3, 2024
8ba5aed
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 3, 2024
d9b95f6
tests updated
pseusys Oct 3, 2024
7277bf9
mongo done
pseusys Oct 3, 2024
e1cb50d
redis done
pseusys Oct 4, 2024
782bf66
ydb finished
pseusys Oct 4, 2024
0fb487b
raise error in abstract method
RLKRo Oct 4, 2024
ff70324
update service tests
RLKRo Oct 7, 2024
b59cf95
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Oct 7, 2024
d3af3b2
update lock file
RLKRo Oct 7, 2024
e38e2d4
fieldconfig removed
pseusys Oct 10, 2024
de739f2
update benchmark utils
RLKRo Oct 11, 2024
eaa8a87
aiofile reverted
pseusys Oct 13, 2024
53bf877
misc tables removed
pseusys Oct 13, 2024
7629fbc
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 13, 2024
757fe48
denchmark awaiting removed
pseusys Oct 17, 2024
a001c27
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 18, 2024
96d05dc
update lock file
RLKRo Oct 18, 2024
1430544
fix context size calculation
RLKRo Oct 18, 2024
403e2e1
change model_dump mode
RLKRo Oct 18, 2024
5340256
key filter implementation
pseusys Oct 21, 2024
9aad1bb
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 21, 2024
b32b367
ctx_dict hashes update added
pseusys Oct 24, 2024
edc85bd
added and removed sets cleared upon storage
pseusys Oct 24, 2024
e61b1b7
Revert "key filter implementation"
RLKRo Oct 24, 2024
d114d42
sql and file logging added
pseusys Oct 28, 2024
3619125
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 28, 2024
5618484
debug logging added
pseusys Oct 28, 2024
5e6e223
use standard logging practices
RLKRo Oct 30, 2024
4323871
make logging more uniform across the methods and collapse long lists
RLKRo Oct 31, 2024
93144df
fix potential error in prefix parsing
RLKRo Oct 31, 2024
83c7b33
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 31, 2024
b763f21
create tmp file only for file dbs
RLKRo Nov 1, 2024
69d1520
add test for load_field_items
RLKRo Nov 2, 2024
291396f
test fix: misc no longer context dict
RLKRo Nov 2, 2024
c3d8c73
test fix: load_field_items no longer returns dict
RLKRo Nov 2, 2024
4bb6ca7
test fix: field config was removed
RLKRo Nov 2, 2024
dbbbb28
remove debug artefact
RLKRo Nov 2, 2024
710554c
all user input escapedin ydb
pseusys Nov 6, 2024
20b6b5f
ctx_dict moved
pseusys Nov 8, 2024
2b6eebf
async lock introduced
pseusys Nov 8, 2024
6c458c6
codestyle fixed
pseusys Nov 14, 2024
46e0112
Merge branch 'dev' into feat/partial_context_updates
pseusys Nov 14, 2024
e263fa1
SOME of the errors FIXED!!!
pseusys Nov 20, 2024
1f96f6d
rebuild script updated
pseusys Nov 22, 2024
ce6c8b6
turns added, empty ctx_dict method also added
pseusys Nov 22, 2024
9e7cf47
context creation field set removed
pseusys Nov 22, 2024
c34f8e7
contex storage class splitted
pseusys Nov 22, 2024
1d3859c
rebuild was cleaned (once again)
pseusys Nov 22, 2024
5514c7b
turns added and tested
pseusys Nov 25, 2024
2b9b947
splitted database methods + locks and validations
pseusys Nov 27, 2024
86d745c
insert limit removed
pseusys Nov 27, 2024
214fb92
_locks removed from subclasses
pseusys Nov 27, 2024
5a8d0d5
lazy connection
pseusys Nov 27, 2024
abbd920
uuid length and name changed
pseusys Nov 28, 2024
b9a0680
logs location changed
pseusys Nov 28, 2024
0115b83
none and empty subscript forbidden
pseusys Nov 28, 2024
0587881
names extracted to a special class
pseusys Nov 28, 2024
e756f75
set strings removed
pseusys Nov 28, 2024
61619e3
configuration name changed
pseusys Nov 28, 2024
aad2c49
literal keys instead of strings
pseusys Nov 28, 2024
539005d
loggers from SQL removed
pseusys Nov 29, 2024
2feb094
connect before load in file
pseusys Nov 29, 2024
2ac91a2
logging moved to commect
pseusys Nov 29, 2024
f4e5f33
context dict made abstract
pseusys Nov 29, 2024
68a1c5f
connect moved to pipeline.run
pseusys Nov 29, 2024
8671233
ctx_dict overloads fixed
pseusys Nov 29, 2024
48b6444
configuration renamed
pseusys Nov 29, 2024
e40786c
context_info dataclass added
pseusys Nov 29, 2024
a54df18
test-time comparison fixed
pseusys Nov 29, 2024
49d3bff
lock staticmethod extracted
pseusys Nov 29, 2024
6fd0e1a
initial locking system fixed
pseusys Nov 29, 2024
47edbda
codestyle
pseusys Nov 29, 2024
567fdb1
id_length moved to init arguments
Jan 10, 2025
5d5879d
tutorial updated
Jan 13, 2025
c19947f
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Jan 16, 2025
95b7cbe
fix: use context factory in slot partial extraction tests
RLKRo Jan 16, 2025
fcf3739
revert misc await
RLKRo Jan 16, 2025
cdca730
another documentation update
Jan 17, 2025
3d73555
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Jan 17, 2025
683f23d
lint fixed
Jan 20, 2025
78c7d7c
tests fixed
Jan 20, 2025
59b0aa0
mini-fix for context copy :)
Jan 23, 2025
28e2d25
synchronization attempt!
Jan 23, 2025
b5eb543
explicit connect made mandatory
Jan 23, 2025
9317cb1
context storage initialization rules relaxed
Jan 23, 2025
1441814
doc fixed and one silly error also fixed
Jan 23, 2025
52fd530
generic removed at all :(
Jan 24, 2025
7ba5b39
last generic remains removed
pseusys Jan 28, 2025
1f04c35
overloaded methods added
pseusys Jan 28, 2025
ba602b3
serializer and validator replaced
pseusys Jan 28, 2025
da1f4bc
super validate and serialize model methods made static
pseusys Jan 28, 2025
7d9bdb6
serializer and validator added to context dict generic class
pseusys Jan 28, 2025
a540fbc
naming conflict resolved
pseusys Jan 28, 2025
b1f9e68
validators names changed
pseusys Jan 28, 2025
e3e24f2
child class validator and serializer removed
pseusys Jan 28, 2025
bb6a41d
fix type adapter typing
RLKRo Jan 28, 2025
bd37d13
potential warnings fixed
pseusys Jan 28, 2025
60231e5
subscript set test added
pseusys Jan 28, 2025
c85ef7e
clear all test added and also small YDB set fix
pseusys Jan 28, 2025
0cfebbe
lint fixed
pseusys Jan 28, 2025
03ddeab
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Jan 29, 2025
b2dca52
remove unnecessary union
RLKRo Jan 29, 2025
e01e739
coverage increased further
pseusys Jan 31, 2025
e23de94
fix indent in context dict serialization test
RLKRo Feb 3, 2025
eaccf80
overload setitem signature
RLKRo Feb 3, 2025
e0a0996
lint
RLKRo Feb 3, 2025
0ef6d97
context guide tutorial update prototype
Feb 3, 2025
10ae3b1
formatting fixed
Feb 3, 2025
ad67ade
fix incorrect rewrite_existing flag usage & add tests
RLKRo Feb 6, 2025
6ae396a
context storing API updated
Feb 7, 2025
c67dd72
legacy API removed from tests
Feb 10, 2025
817cb67
context updating fixture updated
Feb 10, 2025
1e529b7
old API remains removed
Feb 10, 2025
502bb15
some more tests updated
Feb 10, 2025
e110c06
field validation returned
Feb 10, 2025
7baf4a9
field name fixed
Feb 10, 2025
68b2d1e
sql length updated
Feb 10, 2025
aa02ef4
mongo session introduced
Feb 10, 2025
25675be
motor asyncio transaction
Feb 10, 2025
10619a4
mongo and postgres updated AGAIN
Feb 11, 2025
50b0ce5
mongo transacrion made optional
Feb 11, 2025
0820450
import fixed
Feb 11, 2025
87ec2a0
few more tests fixed
Feb 11, 2025
3646271
one more small redis update
Feb 11, 2025
191303b
SQL column binary defaults added
Feb 11, 2025
25a0b7d
empty SQL queries not sent
Feb 11, 2025
1a6efa7
different field updates collapsed
Feb 11, 2025
f4eb249
assignment changed
Feb 11, 2025
9677882
lint applied
Feb 11, 2025
ff6661c
sql updates separated
Feb 11, 2025
9148970
docs updated and type updated
Feb 11, 2025
758c65e
add documentation for context dict fields and module
RLKRo Feb 12, 2025
55e5fff
minor changes to tutorial text
RLKRo Feb 12, 2025
86ac873
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Feb 12, 2025
57a147c
update rewrite existing tests
RLKRo Feb 12, 2025
882f030
update context info signature changed
Feb 13, 2025
ee2b967
context data moved away from context file
Feb 13, 2025
37bd14d
one method name corrected
Feb 13, 2025
fdb3db9
context info field naming updated
Feb 13, 2025
2213fe8
method transformed to class property
Feb 13, 2025
83bd9ad
async transaction
Feb 13, 2025
0135c57
context fields moved to utils
Feb 13, 2025
0caa820
pool size added
Feb 13, 2025
af64ae2
updated something :)
Feb 13, 2025
5ad4676
no query sessions anymore :/
Feb 13, 2025
95d92f4
newly-published fields excluded
Feb 13, 2025
b117591
lint fixed
Feb 13, 2025
9b7781e
field name fixed
Feb 13, 2025
6ca4523
docs updated
Feb 13, 2025
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
dist/
venv/
build/
dbs/
docs/source/apiref
docs/source/release_notes.rst
docs/source/tutorials
2 changes: 2 additions & 0 deletions dff/context_storages/__init__.py
@@ -10,3 +10,5 @@
from .mongo import MongoContextStorage, mongo_available
from .shelve import ShelveContextStorage
from .protocol import PROTOCOLS, get_protocol_install_suggestion
from .context_schema import ContextSchema, ALL_ITEMS
from .serializer import DefaultSerializer
286 changes: 286 additions & 0 deletions dff/context_storages/context_schema.py
@@ -0,0 +1,286 @@
"""
Context Schema
--------------
The `ContextSchema` module provides a class for managing context storage rules.
The :py:class:`~.Context` will be stored in two instances, `CONTEXT` and `LOGS`,
that can be either files, databases or namespaces. The context itself, along with
several latest requests, responses and labels, is stored in the `CONTEXT` table,
while the older ones are kept in the `LOGS` table and not accessed too often.
"""

import time
from asyncio import gather
from uuid import uuid4
from enum import Enum
from pydantic import BaseModel, Field
from typing import Any, Coroutine, List, Dict, Optional, Callable, Tuple, Union, Awaitable
from typing_extensions import Literal

from dff.script import Context

ALL_ITEMS = "__all__"
"""
The default value for all `DictSchemaField`:
it means that all keys of the dictionary or list will be read or written.
Can be used as a value of `subscript` parameter for `DictSchemaField` and `ListSchemaField`.
"""

_ReadPackedContextFunction = Callable[[str], Awaitable[Tuple[Dict, Optional[str]]]]
"""
Type alias of asynchronous function that should be called in order to retrieve context
data from `CONTEXT` table. Matches type of :py:func:`DBContextStorage._read_pac_ctx` method.
"""

_ReadLogContextFunction = Callable[[Optional[int], str, str], Awaitable[Dict]]
"""
Type alias of asynchronous function that should be called in order to retrieve context
data from `LOGS` table. Matches type of :py:func:`DBContextStorage._read_log_ctx` method.
"""

_WritePackedContextFunction = Callable[[Dict, int, int, str, str], Awaitable]
"""
Type alias of asynchronous function that should be called in order to write context
data to `CONTEXT` table. Matches type of :py:func:`DBContextStorage._write_pac_ctx` method.
"""

_WriteLogContextFunction = Callable[[List[Tuple[str, int, Any]], int, str], Coroutine]
"""
Type alias of asynchronous function that should be called in order to write context
data to `LOGS` table. Matches type of :py:func:`DBContextStorage._write_log_ctx` method.
"""

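The four aliases above only fix argument and return types, so a toy pair of in-memory "tables" can show what functions of these shapes might look like. This is a minimal sketch, not the storage code from this PR: the table layout, the parameter names and the choice to return the most recent log keys are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical in-memory "tables", for illustration only.
CONTEXT_TABLE: Dict[str, Tuple[Dict, Optional[str]]] = {}
LOGS_TABLE: Dict[Tuple[str, str, int], Any] = {}


async def read_pac_ctx(storage_key: str) -> Tuple[Dict, Optional[str]]:
    """Has the shape of `_ReadPackedContextFunction`; primary id is None if the key is unknown."""
    return CONTEXT_TABLE.get(storage_key, ({}, None))


async def write_pac_ctx(data: Dict, created: int, updated: int, storage_key: str, primary_id: str) -> None:
    """Has the shape of `_WritePackedContextFunction` (timestamps are ignored in this toy version)."""
    CONTEXT_TABLE[storage_key] = (data, primary_id)


async def write_log_ctx(data: List[Tuple[str, int, Any]], updated: int, primary_id: str) -> None:
    """Has the shape of `_WriteLogContextFunction`: a list of (field, key, value) triples."""
    for field, key, value in data:
        LOGS_TABLE[(primary_id, field, key)] = value


async def read_log_ctx(keys_limit: Optional[int], field_name: str, primary_id: str) -> Dict:
    """Has the shape of `_ReadLogContextFunction`; returning the newest keys is an assumption."""
    keys = sorted(k for (pid, field, k) in LOGS_TABLE if pid == primary_id and field == field_name)
    if keys_limit is not None:
        keys = keys[max(len(keys) - keys_limit, 0):]
    return {k: LOGS_TABLE[(primary_id, field_name, k)] for k in keys}


async def demo() -> Dict:
    await write_pac_ctx({"labels": {2: "c"}}, 0, 1, "user-1", "pk-1")
    await write_log_ctx([("labels", 0, "a"), ("labels", 1, "b")], 1, "pk-1")
    _, primary_id = await read_pac_ctx("user-1")
    return await read_log_ctx(1, "labels", primary_id)


print(asyncio.run(demo()))
```

Returning `None` as the primary id for an unknown key is what lets `read_context` detect a missing entry.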

class SchemaField(BaseModel, validate_assignment=True):
    """
    Schema for :py:class:`~.Context` fields that are dictionaries with numeric keys.
    Used for controlling the read and write policy of a particular field.
    """

    name: str = Field(default_factory=str, frozen=True)
    """
    `name` is the name of the backing :py:class:`~.Context` field.
    It cannot (and should not) be changed at runtime.
    """

    subscript: Union[Literal["__all__"], int] = 3
    """
    `subscript` is used for limiting keys for reading and writing.
    It can be the string `__all__`, meaning all existing keys, or a number,
    positive for first **N** keys and negative for last **N** keys.
    Keys should be sorted as numbers.
    Default: 3.
    """

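As a rough illustration of how an integer `subscript` is applied by `read_context` and `write_context` later in this diff: both sort the numeric keys and slice with `[-subscript:]`, so a positive value keeps the entries with the highest keys in `CONTEXT`. The sketch below assumes a positive integer subscript; `select_window` is a hypothetical helper name, not part of the PR.

```python
from typing import Any, Dict, Union


def select_window(nest_dict: Dict[int, Any], subscript: Union[str, int]) -> Dict[int, Any]:
    """Return the entries that would stay in CONTEXT (hypothetical helper)."""
    if not isinstance(subscript, int):
        # "__all__": nothing is trimmed.
        return dict(nest_dict)
    # Same slicing as in the diff: sort keys numerically, keep the trailing ones.
    kept_keys = sorted(nest_dict.keys())[-subscript:]
    return {key: nest_dict[key] for key in kept_keys}


requests = {turn: f"request {turn}" for turn in range(7)}
print(select_window(requests, 3))
print(len(select_window(requests, "__all__")))
```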

class ExtraFields(str, Enum):
    """
    Enum containing special :py:class:`~.Context` field names.
    These fields can only be used for data manipulation within context storage.
    """

    active_ctx = "active_ctx"
    primary_id = "_primary_id"
    storage_key = "_storage_key"
    created_at = "_created_at"
    updated_at = "_updated_at"


class ContextSchema(BaseModel, validate_assignment=True, arbitrary_types_allowed=True):
    """
    Schema describing how :py:class:`~.Context` fields should be stored in and retrieved from storage.
    The default behaviour is the following: all the context data, except for the fields that are
    dictionaries with numeric keys, is serialized and stored in the `CONTEXT` **table** (a table
    for SQL context storages only; it can also be a file or a namespace for other backends).
    For the dictionaries with numeric keys, their entries are sorted by key and the last
    few are included in the `CONTEXT` table, while the rest are stored in the `LOGS` table.

    This behaviour allows the context storage to minimize the number of operations for context reading and
    writing.
    """

    requests: SchemaField = Field(default_factory=lambda: SchemaField(name="requests"), frozen=True)
    """
    Field for storing Context field `requests`.
    """

    responses: SchemaField = Field(default_factory=lambda: SchemaField(name="responses"), frozen=True)
    """
    Field for storing Context field `responses`.
    """

    labels: SchemaField = Field(default_factory=lambda: SchemaField(name="labels"), frozen=True)
    """
    Field for storing Context field `labels`.
    """

    append_single_log: bool = True
    """
    If set, only one value is written to the `LOGS` table each turn.

    Example:
    If `labels` field contains 7 entries and its subscript equals 3 (that means that 4 labels
    were added during current turn), if `duplicate_context_in_logs` is set to False:

    - If `append_single_log` is True:
        only one label (the most recent one that no longer fits into `CONTEXT`) will be written to `LOGS`.
    - If `append_single_log` is False:
        all 4 first labels will be written to `LOGS`.

    """

    duplicate_context_in_logs: bool = False
    """
    If set, will *always* back up all items of the `CONTEXT` table in the `LOGS` table.

    Example:
    If `labels` field contains 7 entries and its subscript equals 3 and `append_single_log`
    is set to False:

    - If `duplicate_context_in_logs` is False:
        the last 3 entries will be stored in `CONTEXT` table and 4 first will be stored in `LOGS`.
    - If `duplicate_context_in_logs` is True:
        the last 3 entries will be stored in `CONTEXT` table and all 7 will be stored in `LOGS`.

    """

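The two docstring examples above can be replayed with a small stand-alone function. This is only a sketch that mirrors the branching of `write_context` from this diff for a single field; `split_field` is a hypothetical name and a positive integer subscript is assumed.

```python
from typing import Any, Dict, Tuple


def split_field(
    nest_dict: Dict[int, Any],
    subscript: int,
    append_single_log: bool,
    duplicate_context_in_logs: bool,
) -> Tuple[Dict[int, Any], Dict[int, Any]]:
    """Return (CONTEXT part, LOGS part) for one field (hypothetical helper)."""
    keys = sorted(nest_dict.keys())
    if append_single_log and len(nest_dict) > subscript:
        # Only the newest entry that no longer fits into CONTEXT is logged.
        unfit = keys[-subscript - 1]
        logs = {unfit: nest_dict[unfit]}
    elif duplicate_context_in_logs:
        # Everything, including the CONTEXT window, is copied to LOGS.
        logs = dict(nest_dict)
    else:
        # Only the entries outside the CONTEXT window are logged.
        logs = {key: nest_dict[key] for key in keys[:-subscript]}
    context = {key: nest_dict[key] for key in keys[-subscript:]}
    return context, logs


labels = {turn: f"label {turn}" for turn in range(7)}
for single, duplicate in [(True, False), (False, False), (False, True)]:
    context, logs = split_field(labels, 3, single, duplicate)
    print(single, duplicate, sorted(context), sorted(logs))
```

In this sketch, as in `write_context`, the `append_single_log` branch is checked first, so it takes priority over `duplicate_context_in_logs` whenever the field has more entries than the subscript.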
    supports_async: bool = False
    """
    If set, will try to perform *some* operations asynchronously.

    WARNING! Be careful with this flag. Some databases support asynchronous reads and writes,
    and some do not. For all `DFF` context storages it will be set automatically.
    Change it only if you implement a custom context storage.
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    async def read_context(
        self, pac_reader: _ReadPackedContextFunction, log_reader: _ReadLogContextFunction, storage_key: str
    ) -> Context:
        """
        Read context from storage.
        Calculate what fields to read, call reader function and cast result to context.
        Also set `primary_id` and `storage_key` attributes of the read context.

        :param pac_reader: the function used for reading context from
            `CONTEXT` table (see :py:const:`~._ReadPackedContextFunction`).
        :param log_reader: the function used for reading context from
            `LOGS` table (see :py:const:`~._ReadLogContextFunction`).
        :param storage_key: the key the context is stored with.

        :return: the read :py:class:`~.Context` object.
        """
        ctx_dict, primary_id = await pac_reader(storage_key)
        if primary_id is None:
            # `primary_id` is None here, so report the key that was actually looked up.
            raise KeyError(f"No entry for key {storage_key}.")

        tasks = dict()
        for field_props in [value for value in dict(self).values() if isinstance(value, SchemaField)]:
            field_name = field_props.name
            nest_dict: Dict[int, Any] = ctx_dict[field_name]
            if isinstance(field_props.subscript, int):
                sorted_dict = sorted(list(nest_dict.keys()))
                last_read_key = sorted_dict[-1] if len(sorted_dict) > 0 else 0
                if len(nest_dict) > field_props.subscript:
                    limit = -field_props.subscript
                    last_keys = sorted(nest_dict.keys())[limit:]
                    ctx_dict[field_name] = {k: v for k, v in nest_dict.items() if k in last_keys}
                elif len(nest_dict) < field_props.subscript and last_read_key > field_props.subscript:
                    limit = field_props.subscript - len(nest_dict)
                    tasks[field_name] = log_reader(limit, field_name, primary_id)
            else:
                tasks[field_name] = log_reader(None, field_name, primary_id)

        if self.supports_async:
            tasks = dict(zip(tasks.keys(), await gather(*tasks.values())))
        else:
            tasks = {key: await task for key, task in tasks.items()}

        for field_name in tasks.keys():
            log_dict = {k: v for k, v in tasks[field_name].items()}
            ctx_dict[field_name].update(log_dict)

        ctx = Context.cast(ctx_dict)
        setattr(ctx, ExtraFields.primary_id.value, primary_id)
        setattr(ctx, ExtraFields.storage_key.value, storage_key)
        return ctx

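The `supports_async` switch in `read_context` comes down to awaiting a dictionary of pending coroutines either concurrently or one by one. A minimal sketch of that pattern in isolation follows; `resolve_tasks` and `fake_log_reader` are hypothetical names used only for this example, and the reader is a stand-in rather than a real storage method.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def resolve_tasks(tasks: Dict[str, Awaitable[Any]], supports_async: bool) -> Dict[str, Any]:
    """Await a dict of coroutines concurrently or sequentially (hypothetical helper)."""
    if supports_async:
        # gather preserves argument order, so results can be zipped back to the keys.
        results = await asyncio.gather(*tasks.values())
        return dict(zip(tasks.keys(), results))
    return {name: await task for name, task in tasks.items()}


async def fake_log_reader(field_name: str) -> Dict[int, str]:
    """Stand-in for a `LOGS` reader; not a real storage method."""
    await asyncio.sleep(0)
    return {0: f"old {field_name}"}


async def main() -> None:
    for supports_async in (True, False):
        tasks = {name: fake_log_reader(name) for name in ("requests", "labels")}
        print(supports_async, await resolve_tasks(tasks, supports_async))


asyncio.run(main())
```

Both branches produce the same mapping; the flag only decides whether the backend is asked to serve the reads at the same time.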
    async def write_context(
        self,
        ctx: Context,
        pac_writer: _WritePackedContextFunction,
        log_writer: _WriteLogContextFunction,
        storage_key: str,
        chunk_size: Union[Literal[False], int] = False,
    ):
        """
        Write context to storage.
        Calculate what fields to write, split large data into chunks if needed and call writer function.
        Also update `updated_at` attribute of the given context with current time, set `primary_id` and `storage_key`.

        :param ctx: the context to store.
        :param pac_writer: the function used for writing context to
            `CONTEXT` table (see :py:const:`~._WritePackedContextFunction`).
        :param log_writer: the function used for writing context to
            `LOGS` table (see :py:const:`~._WriteLogContextFunction`).
        :param storage_key: the key to store the context with.
        :param chunk_size: maximum number of items that can be inserted simultaneously, False if no such limit exists.
        """
        updated_at = time.time_ns()
        setattr(ctx, ExtraFields.updated_at.value, updated_at)
        created_at = getattr(ctx, ExtraFields.created_at.value, updated_at)

        ctx_dict = ctx.model_dump()
        logs_dict = dict()
        primary_id = getattr(ctx, ExtraFields.primary_id.value, str(uuid4()))

        for field_props in [value for value in dict(self).values() if isinstance(value, SchemaField)]:
            nest_dict = ctx_dict[field_props.name]
            last_keys = sorted(nest_dict.keys())

            if (
                self.append_single_log
                and isinstance(field_props.subscript, int)
                and len(nest_dict) > field_props.subscript
            ):
                unfit = -field_props.subscript - 1
                logs_dict[field_props.name] = {last_keys[unfit]: nest_dict[last_keys[unfit]]}
            else:
                if self.duplicate_context_in_logs or not isinstance(field_props.subscript, int):
                    logs_dict[field_props.name] = nest_dict
                else:
                    limit = -field_props.subscript
                    logs_dict[field_props.name] = {key: nest_dict[key] for key in last_keys[:limit]}

            if isinstance(field_props.subscript, int):
                limit = -field_props.subscript
                last_keys = last_keys[limit:]

            ctx_dict[field_props.name] = {k: v for k, v in nest_dict.items() if k in last_keys}

        await pac_writer(ctx_dict, created_at, updated_at, storage_key, primary_id)

        flattened_dict: List[Tuple[str, int, Dict]] = list()
        for field, payload in logs_dict.items():
            for key, value in payload.items():
                flattened_dict += [(field, key, value)]
        if len(flattened_dict) > 0:
            if not bool(chunk_size):
                await log_writer(flattened_dict, updated_at, primary_id)
            else:
                tasks = list()
                for ch in range(0, len(flattened_dict), chunk_size):
                    next_ch = ch + chunk_size
                    chunk = flattened_dict[ch:next_ch]
                    tasks += [log_writer(chunk, updated_at, primary_id)]
                if self.supports_async:
                    await gather(*tasks)
                else:
                    for task in tasks:
                        await task

        setattr(ctx, ExtraFields.primary_id.value, primary_id)
        setattr(ctx, ExtraFields.storage_key.value, storage_key)
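The `chunk_size` handling at the end of `write_context` flattens the log entries into `(field, key, value)` triples and slices that list into fixed-size batches. A minimal sketch of just the slicing step, with `split_into_chunks` as a hypothetical helper name:

```python
from typing import Any, List, Tuple, Union

LogItem = Tuple[str, int, Any]


def split_into_chunks(items: List[LogItem], chunk_size: Union[bool, int]) -> List[List[LogItem]]:
    """Split flattened log items into batches; a falsy chunk_size means no limit (hypothetical helper)."""
    if not items:
        return []
    if not chunk_size:
        return [items]
    return [items[start:start + chunk_size] for start in range(0, len(items), chunk_size)]


flattened = [("labels", key, f"label {key}") for key in range(10)]
print([len(chunk) for chunk in split_into_chunks(flattened, 4)])
print([len(chunk) for chunk in split_into_chunks(flattened, False)])
```

Each batch would then be handed to one `log_writer` call, which is where a backend-specific limit on rows per insert can be respected.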