Partial context updates #93

Open
wants to merge 392 commits into base: dev
Changes from 112 commits
Commits
39d0da7
file-based
pseusys Jul 19, 2023
1ca66ed
with_stem removed
pseusys Jul 19, 2023
9bb3eb7
ydb ??? again??
pseusys Jul 19, 2023
a8c6497
len and prune
pseusys Jul 21, 2023
2bbf6e4
redis delete number of args changed
pseusys Jul 21, 2023
7aefa5b
Update community.rst, revert some changes
pseusys Jul 21, 2023
6fa0542
one line reverted
pseusys Jul 21, 2023
fa9359f
double serialization removed
pseusys Jul 21, 2023
9fdf5bd
no_dependencies_tests_fixed
pseusys Jul 21, 2023
c70157a
serializer changed
pseusys Jul 21, 2023
05f0d94
serializer unchanged (example)
pseusys Jul 21, 2023
95ba296
partial tutorials started
pseusys Jul 30, 2023
cd020c9
context storages made async
pseusys Jul 31, 2023
687ba7e
tutorials added
pseusys Jul 31, 2023
425a744
example context storage removed
pseusys Jul 31, 2023
2403aed
docs added
pseusys Aug 1, 2023
b4546b4
storages docs updated
pseusys Aug 1, 2023
bdda5ff
reviewed problems fixed
pseusys Aug 2, 2023
414e4a0
file-based dbs made sync
pseusys Aug 3, 2023
e5357fc
quickle removed
pseusys Aug 3, 2023
edeb376
Excessive description removed
pseusys Aug 4, 2023
895011c
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 4, 2023
78d2ccc
migrated to pydantic 2.0
pseusys Aug 4, 2023
d4bff86
Documentation building fixes (#186)
pseusys Aug 4, 2023
821713c
add patch for json context storage
ruthenian8 Aug 4, 2023
fca7c42
json storage fixed
pseusys Aug 4, 2023
33dabc8
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Aug 4, 2023
c5ad6d5
test pickle save and load with logging
pseusys Aug 4, 2023
8deaabd
timestamp conversion test for windows
pseusys Aug 7, 2023
cbe7c70
time in nanoseconds for windows
pseusys Aug 7, 2023
b14239e
ok ok windows take this
pseusys Aug 7, 2023
ed888d4
some other idea to trick windows
pseusys Aug 7, 2023
998fb2c
excessive logging removed
pseusys Aug 8, 2023
12f938e
config dicts fixed + module docstrings added
pseusys Aug 9, 2023
dbc8928
linting and formatting fixed
pseusys Aug 9, 2023
ab43a98
s's removed from docstrings
pseusys Aug 10, 2023
cc18acc
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 10, 2023
7f850ee
type defined
pseusys Aug 11, 2023
9fe28c9
property docstring added
pseusys Aug 14, 2023
6e4eb75
Merge branch 'dev' into feat/partial_context_updates
pseusys Aug 30, 2023
c25c48d
dff installation cell added to tutorial 8
pseusys Aug 30, 2023
6856ee5
shelve improved
pseusys Sep 5, 2023
5314e31
partial review reaction
pseusys Sep 19, 2023
7f1835e
more documentation added
pseusys Sep 20, 2023
cd76105
finished review
pseusys Sep 24, 2023
4238f9b
Merge branch 'dev' into feat/partial_context_updates
RLKRo Mar 22, 2024
50cda47
put benchmark tutorial after partial updates one
RLKRo Mar 22, 2024
4cc055a
Merge branch 'dev' into feat/partial_context_updates
pseusys Jul 4, 2024
7f77c8f
context storages updated
pseusys Jul 4, 2024
2617255
old naming reset
pseusys Jul 4, 2024
4fb8f67
context merge fixed
pseusys Jul 4, 2024
1230d16
context ids removed
pseusys Jul 4, 2024
c3d82da
context equality tested
pseusys Jul 4, 2024
0bd6347
framework data comparison removed
pseusys Jul 4, 2024
4a15bf0
context id removed from everywhere
pseusys Jul 4, 2024
9b3dd80
lint applied
pseusys Jul 4, 2024
4f0562a
documentation building fixed
pseusys Jul 4, 2024
ef0a9ee
RST syntax fixed
pseusys Jul 4, 2024
3d364bc
context dict added
pseusys Jul 29, 2024
e7ad269
async + pydantic
pseusys Jul 30, 2024
be34714
fixes
pseusys Jul 31, 2024
b8701a0
hashes manipulation only on `write_full_diff`
pseusys Jul 31, 2024
a58eace
ctx_dict + ctx updated
pseusys Aug 5, 2024
33f2823
setting removed
pseusys Aug 5, 2024
c4f9fce
sets added
pseusys Aug 6, 2024
e892a52
serialization added, sample context storage class created
pseusys Aug 6, 2024
1b8aa0d
iterative async access made synchronous
pseusys Aug 6, 2024
173b1fe
sql prototype
pseusys Aug 6, 2024
9665038
context API updated proposal
pseusys Aug 7, 2024
3468af5
context schema and serializer removed
pseusys Aug 7, 2024
71bd9f3
context API updated once again
pseusys Aug 7, 2024
2e6b334
review notes fixed
pseusys Aug 8, 2024
830ea40
ContextDictView made mutable
pseusys Aug 8, 2024
5d3dd95
context dict file split
pseusys Aug 8, 2024
f00ba02
turn introduction reverted
pseusys Aug 8, 2024
1af24db
turns separated (again)
pseusys Aug 13, 2024
3616ac0
key deletion now nullifies value
pseusys Aug 13, 2024
81ce7ba
memory storage
pseusys Aug 16, 2024
1f9e653
ctx_dict tests done
pseusys Aug 17, 2024
c981cc5
general context storages tests created
pseusys Aug 27, 2024
5002dda
ctx_dict updated not to use serializer
pseusys Sep 18, 2024
3e6a8f4
merge dev
RLKRo Sep 18, 2024
6991fb6
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Sep 18, 2024
5b80818
fix imports in newly added files
RLKRo Sep 19, 2024
96af9bc
hide circular imports behind type checking
RLKRo Sep 19, 2024
000fb0d
fix imports in test files
RLKRo Sep 19, 2024
2c2ab9d
merge context.init into context.connected
RLKRo Sep 19, 2024
2eb5a2c
remove get_last_index imports
RLKRo Sep 19, 2024
06d54b9
update pipeline.context_storage type
RLKRo Sep 19, 2024
f80e6a3
fix bug with setting sequence type values under a single key
RLKRo Sep 19, 2024
c5311f6
revert primary_id renaming
RLKRo Sep 19, 2024
d43752a
memory test (almost!) finished
pseusys Sep 23, 2024
1ae3e4f
ctx_dict tests fixed
pseusys Sep 23, 2024
85315a6
add overload for getitem
RLKRo Sep 23, 2024
351a43e
split typevar definitions
RLKRo Sep 23, 2024
e9eb2fb
remove asyncio mark
RLKRo Sep 23, 2024
6d93399
allow using negative indexes for context dict
RLKRo Sep 23, 2024
e2053dc
add validation on setitem for context dict
RLKRo Sep 24, 2024
acdcd3c
fixes
RLKRo Sep 24, 2024
16a3d77
allow non-str context ids
RLKRo Sep 24, 2024
9a76ae3
add current_turn_id
RLKRo Sep 24, 2024
5e37651
fix tests
RLKRo Sep 24, 2024
d376e49
update doc
RLKRo Sep 24, 2024
256e296
integer keysreversed
pseusys Sep 24, 2024
e2ffa0a
sql storage update function fix
pseusys Sep 24, 2024
9043dca
move context factory and pipeline fixtures to global conftest
RLKRo Sep 24, 2024
d58ce7c
unbound V from BaseModel
RLKRo Sep 24, 2024
6905bcd
remove default marker; return None by default
RLKRo Sep 24, 2024
0ac3c1e
fix key slicing
RLKRo Sep 24, 2024
3956348
use current_turn_id in check_happy_path
RLKRo Sep 24, 2024
d37c4e2
use context_factory to initialize context in non-core tests
RLKRo Sep 24, 2024
2bf82f9
fix: await misc get
RLKRo Sep 24, 2024
8a4d8be
update pipeline tutorials
RLKRo Sep 24, 2024
6404eb4
allow initializing MemoryContextStoraeg via context_storage_factory
RLKRo Sep 25, 2024
240cded
move all db tests into a single parametrized test class
RLKRo Sep 25, 2024
535d524
SQL testing fixed
pseusys Sep 27, 2024
6e0a103
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Sep 27, 2024
862e7d3
test_dbs fixed
pseusys Sep 27, 2024
e82d086
file context storages implemented
pseusys Sep 27, 2024
59f91c1
file and sql fixed
pseusys Sep 28, 2024
1c97303
async file dependency removed
pseusys Sep 30, 2024
f5ceb2f
rename delete_main_info to delete_context
RLKRo Sep 30, 2024
cf27afa
fix load_field_items typing
RLKRo Oct 1, 2024
c1a24ee
rewrite db tests
RLKRo Oct 1, 2024
f2ec013
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 1, 2024
cb22d12
small None checking update
pseusys Oct 3, 2024
8ba5aed
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 3, 2024
d9b95f6
tests updated
pseusys Oct 3, 2024
7277bf9
mongo done
pseusys Oct 3, 2024
e1cb50d
redis done
pseusys Oct 4, 2024
782bf66
ydb finished
pseusys Oct 4, 2024
0fb487b
raise error in abstract method
RLKRo Oct 4, 2024
ff70324
update service tests
RLKRo Oct 7, 2024
b59cf95
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Oct 7, 2024
d3af3b2
update lock file
RLKRo Oct 7, 2024
e38e2d4
fieldconfig removed
pseusys Oct 10, 2024
de739f2
update benchmark utils
RLKRo Oct 11, 2024
eaa8a87
aiofile reverted
pseusys Oct 13, 2024
53bf877
misc tables removed
pseusys Oct 13, 2024
7629fbc
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 13, 2024
757fe48
denchmark awaiting removed
pseusys Oct 17, 2024
a001c27
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 18, 2024
96d05dc
update lock file
RLKRo Oct 18, 2024
1430544
fix context size calculation
RLKRo Oct 18, 2024
403e2e1
change model_dump mode
RLKRo Oct 18, 2024
5340256
key filter implementation
pseusys Oct 21, 2024
9aad1bb
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 21, 2024
b32b367
ctx_dict hashes update added
pseusys Oct 24, 2024
edc85bd
added and removed sets cleared upon storage
pseusys Oct 24, 2024
e61b1b7
Revert "key filter implementation"
RLKRo Oct 24, 2024
d114d42
sql and file logging added
pseusys Oct 28, 2024
3619125
Merge branch 'feat/partial_context_updates' of https://github.com/dee…
pseusys Oct 28, 2024
5618484
debug logging added
pseusys Oct 28, 2024
5e6e223
use standard logging practices
RLKRo Oct 30, 2024
4323871
make logging more uniform across the methods and collapse long lists
RLKRo Oct 31, 2024
93144df
fix potential error in prefix parsing
RLKRo Oct 31, 2024
83c7b33
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Oct 31, 2024
b763f21
create tmp file only for file dbs
RLKRo Nov 1, 2024
69d1520
add test for load_field_items
RLKRo Nov 2, 2024
291396f
test fix: misc no longer context dict
RLKRo Nov 2, 2024
c3d8c73
test fix: load_field_items no longer returns dict
RLKRo Nov 2, 2024
4bb6ca7
test fix: field config was removed
RLKRo Nov 2, 2024
dbbbb28
remove debug artefact
RLKRo Nov 2, 2024
710554c
all user input escapedin ydb
pseusys Nov 6, 2024
20b6b5f
ctx_dict moved
pseusys Nov 8, 2024
2b6eebf
async lock introduced
pseusys Nov 8, 2024
6c458c6
codestyle fixed
pseusys Nov 14, 2024
46e0112
Merge branch 'dev' into feat/partial_context_updates
pseusys Nov 14, 2024
e263fa1
SOME of the errors FIXED!!!
pseusys Nov 20, 2024
1f96f6d
rebuild script updated
pseusys Nov 22, 2024
ce6c8b6
turns added, empty ctx_dict method also added
pseusys Nov 22, 2024
9e7cf47
context creation field set removed
pseusys Nov 22, 2024
c34f8e7
contex storage class splitted
pseusys Nov 22, 2024
1d3859c
rebuild was cleaned (once again)
pseusys Nov 22, 2024
5514c7b
turns added and tested
pseusys Nov 25, 2024
2b9b947
splitted database methods + locks and validations
pseusys Nov 27, 2024
86d745c
insert limit removed
pseusys Nov 27, 2024
214fb92
_locks removed from subclasses
pseusys Nov 27, 2024
5a8d0d5
lazy connection
pseusys Nov 27, 2024
abbd920
uuid length and name changed
pseusys Nov 28, 2024
b9a0680
logs location changed
pseusys Nov 28, 2024
0115b83
none and empty subscript forbidden
pseusys Nov 28, 2024
0587881
names extracted to a special class
pseusys Nov 28, 2024
e756f75
set strings removed
pseusys Nov 28, 2024
61619e3
configuration name changed
pseusys Nov 28, 2024
aad2c49
literal keys instead of strings
pseusys Nov 28, 2024
539005d
loggers from SQL removed
pseusys Nov 29, 2024
2feb094
connect before load in file
pseusys Nov 29, 2024
2ac91a2
logging moved to commect
pseusys Nov 29, 2024
f4e5f33
context dict made abstract
pseusys Nov 29, 2024
68a1c5f
connect moved to pipeline.run
pseusys Nov 29, 2024
8671233
ctx_dict overloads fixed
pseusys Nov 29, 2024
48b6444
configuration renamed
pseusys Nov 29, 2024
e40786c
context_info dataclass added
pseusys Nov 29, 2024
a54df18
test-time comparison fixed
pseusys Nov 29, 2024
49d3bff
lock staticmethod extracted
pseusys Nov 29, 2024
6fd0e1a
initial locking system fixed
pseusys Nov 29, 2024
47edbda
codestyle
pseusys Nov 29, 2024
567fdb1
id_length moved to init arguments
Jan 10, 2025
5d5879d
tutorial updated
Jan 13, 2025
c19947f
Merge branch 'refs/heads/dev' into feat/partial_context_updates
RLKRo Jan 16, 2025
95b7cbe
fix: use context factory in slot partial extraction tests
RLKRo Jan 16, 2025
fcf3739
revert misc await
RLKRo Jan 16, 2025
cdca730
another documentation update
Jan 17, 2025
3d73555
Merge remote-tracking branch 'origin/feat/partial_context_updates' in…
RLKRo Jan 17, 2025
683f23d
lint fixed
Jan 20, 2025
78c7d7c
tests fixed
Jan 20, 2025
59b0aa0
mini-fix for context copy :)
Jan 23, 2025
28e2d25
synchronization attempt!
Jan 23, 2025
b5eb543
explicit connect made mandatory
Jan 23, 2025
9317cb1
context storage initialization rules relaxed
Jan 23, 2025
1441814
doc fixed and one silly error also fixed
Jan 23, 2025
52fd530
generic removed at all :(
Jan 24, 2025
7ba5b39
last generic remains removed
pseusys Jan 28, 2025
1f04c35
overloaded methods added
pseusys Jan 28, 2025
ba602b3
serializer and validator replaced
pseusys Jan 28, 2025
da1f4bc
super validate and serialize model methods made static
pseusys Jan 28, 2025
7d9bdb6
serializer and validator added to context dict generic class
pseusys Jan 28, 2025
a540fbc
naming conflict resolved
pseusys Jan 28, 2025
b1f9e68
validators names changed
pseusys Jan 28, 2025
e3e24f2
child class validator and serializer removed
pseusys Jan 28, 2025
bd37d13
potential warnings fixed
pseusys Jan 28, 2025
60231e5
subscript set test added
pseusys Jan 28, 2025
c85ef7e
clear all test added and also small YDB set fix
pseusys Jan 28, 2025
0cfebbe
lint fixed
pseusys Jan 28, 2025
e01e739
coverage increased further
pseusys Jan 31, 2025
0ef6d97
context guide tutorial update prototype
Feb 3, 2025
10ae3b1
formatting fixed
Feb 3, 2025
6ae396a
context storing API updated
Feb 7, 2025
c67dd72
legacy API removed from tests
Feb 10, 2025
817cb67
context updating fixture updated
Feb 10, 2025
1e529b7
old API remains removed
Feb 10, 2025
502bb15
some more tests updated
Feb 10, 2025
e110c06
field validation returned
Feb 10, 2025
7baf4a9
field name fixed
Feb 10, 2025
68b2d1e
sql length updated
Feb 10, 2025
aa02ef4
mongo session introduced
Feb 10, 2025
25675be
motor asyncio transaction
Feb 10, 2025
10619a4
mongo and postgres updated AGAIN
Feb 11, 2025
50b0ce5
mongo transacrion made optional
Feb 11, 2025
0820450
import fixed
Feb 11, 2025
87ec2a0
few more tests fixed
Feb 11, 2025
3646271
one more small redis update
Feb 11, 2025
191303b
SQL column binary defaults added
Feb 11, 2025
25a0b7d
empty SQL queries not sent
Feb 11, 2025
1a6efa7
different field updates collapsed
Feb 11, 2025
f4eb249
assignment changed
Feb 11, 2025
9677882
lint applied
Feb 11, 2025
ff6661c
sql updates separated
Feb 11, 2025
9148970
docs updated and type updated
Feb 11, 2025
1 change: 1 addition & 0 deletions dff/context_storages/__init__.py
@@ -10,3 +10,4 @@
from .mongo import MongoContextStorage, mongo_available
from .shelve import ShelveContextStorage
from .protocol import PROTOCOLS, get_protocol_install_suggestion
from .context_schema import ContextSchema, SchemaFieldReadPolicy, SchemaFieldWritePolicy, ALL_ITEMS
346 changes: 346 additions & 0 deletions dff/context_storages/context_schema.py
@@ -0,0 +1,346 @@
import time
from hashlib import sha256
from enum import Enum
import uuid
from pydantic import BaseModel, Field
from typing import Dict, List, Optional, Tuple, Callable, Any, Union, Awaitable, Hashable
from typing_extensions import Literal

from dff.script import Context

ALL_ITEMS = "__all__"
"""
`__all__` - the default value for all `DictSchemaField`s:
it means that all keys of the dictionary or list will be read or written.
Can be used as a value of `subscript` parameter for `DictSchemaField`s and `ListSchemaField`s.
"""


class SchemaFieldReadPolicy(str, Enum):
    """
    Read policy of context field.
    The following policies are supported:

    - READ: the context field is read from the context storage (default),
    - IGNORE: the context field is completely ignored at runtime
      (it can still be used by other tools that access the database, such as statistics).
    """

    READ = "read"
    IGNORE = "ignore"


class SchemaFieldWritePolicy(str, Enum):
    """
    Write policy of context field.
    The following policies are supported:

    - IGNORE: the context field is completely ignored at runtime,
    - UPDATE: the context field is unconditionally updated every time (default for `ValueSchemaField`s),
    - HASH_UPDATE: the context field is updated only if it differs from the value in storage
      (sha256 is used to detect the difference; for dictionaries the difference is calculated key-wise),
    - APPEND: the context field is updated only if it doesn't exist in storage
      (for dictionaries only the missing keys are added).
    """

    IGNORE = "ignore"
    UPDATE = "update"
    HASH_UPDATE = "hash_update"
    APPEND = "append"
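
Review note: the four write policies can be read as a small decision table. The sketch below is a standalone illustration, not code from this PR: `WritePolicy`, `digest` and `plan_write` are hypothetical names, and the mapping to an "enforce" flag follows what `write_context` further down in this file appears to do.

```python
from enum import Enum
from hashlib import sha256
from typing import Any, Dict, Optional, Tuple


class WritePolicy(str, Enum):
    """Hypothetical stand-in for SchemaFieldWritePolicy."""

    IGNORE = "ignore"
    UPDATE = "update"
    HASH_UPDATE = "hash_update"
    APPEND = "append"


def digest(value: Any) -> str:
    """Hex digest of the string form of a value (comparable as a plain string)."""
    return sha256(str(value).encode("utf-8")).hexdigest()


def plan_write(
    policy: WritePolicy, new: Dict[str, Any], old_hashes: Optional[Dict[str, str]]
) -> Optional[Tuple[Dict[str, Any], bool]]:
    """Return (values to send, enforce flag), or None when the field is skipped."""
    if policy == WritePolicy.IGNORE:
        return None
    if policy == WritePolicy.APPEND:
        # Not enforced: the writer is expected to add only the missing keys.
        return dict(new), False
    if policy == WritePolicy.HASH_UPDATE and old_hashes is not None:
        # Key-wise comparison: keep keys that are new or whose hash changed.
        changed = {k: v for k, v in new.items() if old_hashes.get(k) != digest(v)}
        return changed, True
    # UPDATE, or HASH_UPDATE without previously stored hashes: write everything.
    return dict(new), True
```

With hashes taken from `{"a": 1, "b": 2}`, changing `b` and adding `c` makes the `HASH_UPDATE` branch send only those two keys.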


FieldDescriptor = Union[Dict[str, Tuple[Any, bool]], Tuple[Any, bool]]
"""
Field descriptor type.
It contains the data and a boolean flag (whether writing of the data should be enforced).
A field can be a dictionary or a single value.
If the field is a dictionary:
the field descriptor is a dictionary in which every value is a tuple of the value and the enforced boolean.
If the field is a value:
the field descriptor is a tuple of the value and the enforced boolean.
"""

_ReadContextFunction = Callable[[Dict[str, Union[bool, int, List[Hashable]]], str], Awaitable[Dict]]
"""
Context reader function type.
The function accepts a subscript: a dict whose keys are the context field names to read.
The dict values are:
- booleans: that means that the whole field should be read (`True`) or ignored (`False`),
- ints: that means that if the field is a dict, only **N** first keys should be read
if **N** is positive, else last **N** keys. Keys should be sorted as numbers if they are numeric
or lexicographically if at least some of them are strings,
- list: that means that only keys that belong to the list should be read, others should be ignored.
The function is asynchronous, it returns dictionary representation of Context.
"""

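Review note: the subscript rules described for `_ReadContextFunction` can be sketched as a key-selection helper. This is only an illustration under stated assumptions: `sort_keys` and `select_keys` are hypothetical names, and a real storage backend would apply the same rules inside its query rather than on an in-memory dict.

```python
from typing import Any, Dict, Hashable, List, Union

Subscript = Union[bool, int, List[Hashable]]


def sort_keys(keys) -> list:
    """Sort numerically when every key is a number, otherwise lexicographically."""
    keys = list(keys)
    if all(isinstance(k, (int, float)) and not isinstance(k, bool) for k in keys):
        return sorted(keys)
    return sorted(keys, key=str)


def select_keys(field: Dict[Hashable, Any], subscript: Subscript) -> Dict[Hashable, Any]:
    """Pick the part of a dictionary field that a subscript asks for."""
    # bool is checked before int because bool is a subclass of int.
    if isinstance(subscript, bool):
        return dict(field) if subscript else {}
    if isinstance(subscript, int):
        if subscript == 0:
            return {}
        ordered = sort_keys(field)
        # Positive N: first N keys; negative N: last N keys.
        chosen = ordered[:subscript] if subscript > 0 else ordered[subscript:]
        return {k: field[k] for k in chosen}
    # List subscript: only the listed keys that actually exist.
    return {k: field[k] for k in subscript if k in field}
```

For a `requests`-like field with keys `0, 1, 2, 3, 10`, a subscript of `-3` selects keys `2`, `3` and `10`, which matches the default of `ListSchemaField`.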
_WriteContextFunction = Callable[[Optional[str], FieldDescriptor, bool, str], Awaitable]
"""
Context writer function type.
The function will be called multiple times: once for each dictionary field of Context.
It will be called once more for the whole context itself for writing its value fields.
The function accepts:
- field name: string, the name of field to write, None if writing the whole context,
- field descriptor: dictionary, representing data to be written and if writing of the data should be enforced,
- nested flag: boolean, `True` if writing dictionary field of Context, `False` if writing the Context itself,
- primary id: string, the primary identifier of the context.
The function is asynchronous, it returns None.
"""

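Review note: a minimal in-memory writer that matches the `_WriteContextFunction` signature might look like the sketch below. It is an assumption-laden illustration: `STORAGE`, `memory_writer` and `demo` are hypothetical names, and the argument shapes follow the way `write_context` calls the writer later in this file (a `(dict, enforce)` tuple for nested fields, a dict of `(value, enforce)` tuples for the whole context).

```python
import asyncio
from typing import Any, Dict, Optional

# Hypothetical in-memory "database": primary id -> stored context data.
STORAGE: Dict[str, Dict[str, Any]] = {}


async def memory_writer(field_name: Optional[str], data: Any, nested: bool, primary_id: str) -> None:
    record = STORAGE.setdefault(primary_id, {})
    if nested:
        # Dictionary field: data is (dict of values, enforce flag).
        values, enforce = data
        target = record.setdefault(field_name, {})
        for key, value in values.items():
            # Without enforcement only missing keys are added.
            if enforce or key not in target:
                target[key] = value
    else:
        # Whole context: data maps each value field name to (value, enforce flag).
        for name, (value, enforce) in data.items():
            if enforce or name not in record:
                record[name] = value


async def demo() -> None:
    await memory_writer("misc", ({"a": 1}, True), True, "ctx-1")
    # Not enforced: "a" is kept, "b" is added.
    await memory_writer("misc", ({"a": 2, "b": 3}, False), True, "ctx-1")
    await memory_writer(None, {"created_at": (100, False), "updated_at": (100, True)}, False, "ctx-1")
    # created_at is not enforced, so only updated_at changes.
    await memory_writer(None, {"created_at": (200, False), "updated_at": (200, True)}, False, "ctx-1")


asyncio.run(demo())
```

The non-enforced path is what makes an `APPEND`-style field such as `created_at` keep its first value across repeated writes.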

class BaseSchemaField(BaseModel):
    """
    Base class for context field schema.
    Used for controlling read / write policy of the particular field.
    """

    name: str = Field("", allow_mutation=False)
    """
    `name` is the name of backing Context field.
    It cannot (and should not) be changed at runtime.
    """
    on_read: SchemaFieldReadPolicy = SchemaFieldReadPolicy.READ
    """
    `on_read` is the default field read policy.
    Default: :py:const:`~.SchemaFieldReadPolicy.READ`.
    """
    on_write: SchemaFieldWritePolicy = SchemaFieldWritePolicy.IGNORE
    """
    `on_write` is the default field write policy.
    Default: :py:const:`~.SchemaFieldWritePolicy.IGNORE`.
    """

    class Config:
        validate_assignment = True


class ListSchemaField(BaseSchemaField):
    """
    Schema for context fields that are dictionaries with numeric keys.
    """

    on_write: SchemaFieldWritePolicy = SchemaFieldWritePolicy.APPEND
    """
    Default: :py:const:`~.SchemaFieldWritePolicy.APPEND`.
    """
    subscript: Union[Literal["__all__"], int] = -3
    """
    `subscript` is used for limiting keys for reading and writing.
    It can be the string `__all__`, meaning all existing keys, or a number:
    positive for the first **N** keys and negative for the last **N** keys.
    Keys should be sorted as numbers.
    Default: -3.
    """


class DictSchemaField(BaseSchemaField):
    """
    Schema for context fields that are dictionaries with string keys.
    """

    on_write: SchemaFieldWritePolicy = SchemaFieldWritePolicy.HASH_UPDATE
    """
    Default: :py:const:`~.SchemaFieldWritePolicy.HASH_UPDATE`.
    """
    subscript: Union[Literal["__all__"], List[Hashable]] = ALL_ITEMS
    """
    `subscript` is used for limiting keys for reading and writing.
    It can be the string `__all__`, meaning all existing keys, or a list of keys to read and write.
    Default: `__all__`.
    """


class ValueSchemaField(BaseSchemaField):
    """
    Schema for context fields that aren't dictionaries.
    """

    on_write: SchemaFieldWritePolicy = SchemaFieldWritePolicy.UPDATE
    """
    Default: :py:const:`~.SchemaFieldWritePolicy.UPDATE`.
    """


class FrozenValueSchemaField(ValueSchemaField):
    """
    Immutable schema for context fields that aren't dictionaries.
    Schema should be used for keys that are used to keep database integrity
    and whose policies shouldn't be changed by user.
    """

    class Config:
        allow_mutation = False


class ExtraFields(str, Enum):
    """
    Enum containing special :py:class:`dff.script.Context` field names.
    These fields can only be used for data manipulation within context storage.
    """

    primary_id = "primary_id"
    storage_key = "_storage_key"
    active_ctx = "active_ctx"
    created_at = "created_at"
    updated_at = "updated_at"


class ContextSchema(BaseModel):
    """
    Schema describing how :py:class:`dff.script.Context` fields should be stored in and retrieved from storage.
    Allows ignoring, filtering and sorting fields, as well as partial reading and writing of dictionary fields.
    """

    active_ctx: ValueSchemaField = Field(FrozenValueSchemaField(name=ExtraFields.active_ctx), allow_mutation=False)
    """
    Special field for marking the currently active context.
    Inactive contexts are still kept in storage for statistical purposes.
    Properties of this field can't be changed.
    """
    storage_key: ValueSchemaField = Field(FrozenValueSchemaField(name=ExtraFields.storage_key), allow_mutation=False)
    """
    Special field for the key under which the context was stored (Context property `storage_key`).
    Properties of this field can't be changed.
    """
    requests: ListSchemaField = ListSchemaField(name="requests")
    """
    Field for storing Context field `requests`.
    """
    responses: ListSchemaField = ListSchemaField(name="responses")
    """
    Field for storing Context field `responses`.
    """
    labels: ListSchemaField = ListSchemaField(name="labels")
    """
    Field for storing Context field `labels`.
    """
    misc: DictSchemaField = DictSchemaField(name="misc")
    """
    Field for storing Context field `misc`.
    """
    framework_states: DictSchemaField = DictSchemaField(name="framework_states")
    """
    Field for storing Context field `framework_states`.
    """
    created_at: ValueSchemaField = ValueSchemaField(name=ExtraFields.created_at, on_write=SchemaFieldWritePolicy.APPEND)
    """
    Special field for keeping track of the time the context was first stored.
    """
    updated_at: ValueSchemaField = ValueSchemaField(name=ExtraFields.updated_at)
    """
    Special field for keeping track of the time the context was last updated.
    """

    class Config:
        validate_assignment = True

    def _calculate_hashes(self, value: Union[Dict[str, Any], Any]) -> Union[Dict[str, Any], Hashable]:
        """
        Calculate hashes for a context field: a single hash for value fields
        and a dictionary of hashes for dictionary fields.
        """
        # Keep hex digests (plain strings): raw hashlib objects do not compare by content,
        # so comparing them with `!=` would always report a difference.
        if isinstance(value, dict):
            return {k: sha256(str(v).encode("utf-8")).hexdigest() for k, v in value.items()}
        else:
            return sha256(str(value).encode("utf-8")).hexdigest()

    async def read_context(
        self, ctx_reader: _ReadContextFunction, storage_key: str, primary_id: str
    ) -> Tuple[Context, Dict]:
        """
        Read context from storage.
        Calculate which fields (and which keys of which fields) to read,
        call the reader function and cast the result to a context.
        `ctx_reader` - the function used for context reading from a storage (see :py:const:`~._ReadContextFunction`).
        `storage_key` - the key the context is stored with (used in cases when the key is not preserved in storage).
        `primary_id` - the context unique identifier.
        returns a tuple of the context and the context hashes
        (hashes should be kept and passed to :py:func:`~.ContextSchema.write_context`).
        """
        fields_subscript = dict()

        # Build the subscript: one entry per schema field, depending on its read policy.
        field_props: BaseSchemaField
        for field_props in dict(self).values():
            field = field_props.name
            if field_props.on_read == SchemaFieldReadPolicy.IGNORE:
                fields_subscript[field] = False
            elif isinstance(field_props, (ListSchemaField, DictSchemaField)):
                fields_subscript[field] = field_props.subscript
            else:
                fields_subscript[field] = True

        # Hash everything that was read so that later writes can skip unchanged data.
        hashes = dict()
        ctx_dict = await ctx_reader(fields_subscript, primary_id)
        for key in ctx_dict.keys():
            hashes[key] = self._calculate_hashes(ctx_dict[key])

        ctx = Context.cast(ctx_dict)
        ctx.__setattr__(ExtraFields.storage_key.value, storage_key)
        return ctx, hashes

    async def write_context(
        self,
        ctx: Context,
        hashes: Optional[Dict],
        val_writer: _WriteContextFunction,
        storage_key: str,
        primary_id: Optional[str],
        chunk_size: Union[Literal[False], int] = False,
    ) -> str:
        """
        Write context to storage.
        Calculate which fields (and which keys of which fields) to write,
        split large data into chunks if needed and call the writer function.
        `ctx` - the context to write.
        `hashes` - hashes calculated for the context during the previous reading,
        used only for :py:const:`~.SchemaFieldWritePolicy.HASH_UPDATE`.
        `val_writer` - the function used for context writing to a storage (see :py:const:`~._WriteContextFunction`).
        `storage_key` - the key the context is stored with.
        `primary_id` - the context unique identifier,
        should be None if this is the first time writing this context,
        otherwise the context will be overwritten.
        `chunk_size` - chunk size for writing large dictionaries,
        should be set to an integer in case the storage has any writing query limitations,
        otherwise should be boolean `False` or number `0`.
        returns string, the context primary id.
        """
        ctx.__setattr__(ExtraFields.storage_key.value, storage_key)
        ctx_dict = ctx.dict()
        primary_id = str(uuid.uuid4()) if primary_id is None else primary_id

        ctx_dict[ExtraFields.storage_key.value] = storage_key
        ctx_dict[self.active_ctx.name] = True
        ctx_dict[self.created_at.name] = ctx_dict[self.updated_at.name] = time.time_ns()

        flat_values = dict()
        field_props: BaseSchemaField
        for field_props in dict(self).values():
            field = field_props.name
            update_values = ctx_dict[field]
            update_nested = not isinstance(field_props, ValueSchemaField)
            if field_props.on_write == SchemaFieldWritePolicy.IGNORE:
                continue
            elif field_props.on_write == SchemaFieldWritePolicy.HASH_UPDATE:
                update_enforce = True
                if hashes is not None and hashes.get(field) is not None:
                    new_hashes = self._calculate_hashes(ctx_dict[field])
                    if isinstance(new_hashes, dict):
                        # `.get` treats keys that were absent during reading as changed.
                        update_values = {k: v for k, v in ctx_dict[field].items() if hashes[field].get(k) != new_hashes[k]}
                    else:
                        update_values = ctx_dict[field] if hashes[field] != new_hashes else False
            elif field_props.on_write == SchemaFieldWritePolicy.APPEND:
                update_enforce = False
            else:
                update_enforce = True
            if update_nested:
                if not bool(chunk_size):
                    await val_writer(field, (update_values, update_enforce), True, primary_id)
                else:
                    # Write the dictionary field in slices of at most `chunk_size` keys.
                    for ch in range(0, len(update_values), chunk_size):
                        next_ch = ch + chunk_size
                        chunk = {k: update_values[k] for k in list(update_values.keys())[ch:next_ch]}
                        await val_writer(field, (chunk, update_enforce), True, primary_id)
            else:
                flat_values.update({field: (update_values, update_enforce)})
        # All value fields are written together in a single final call.
        await val_writer(None, flat_values, False, primary_id)
        return primary_id
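
Review note: the chunked writing in `write_context` boils down to slicing the key list. The helper below is a standalone sketch of that step only, assuming the same semantics as the loop above (`False` or `0` means a single unchunked write); `iter_chunks` is a hypothetical name and is not part of this PR.

```python
from typing import Any, Dict, Iterator


def iter_chunks(values: Dict[str, Any], chunk_size: int) -> Iterator[Dict[str, Any]]:
    """Yield consecutive slices of a dictionary with at most `chunk_size` keys each."""
    if not chunk_size:
        # Chunking disabled: everything goes out in one piece.
        yield dict(values)
        return
    keys = list(values.keys())
    for start in range(0, len(keys), chunk_size):
        yield {k: values[k] for k in keys[start:start + chunk_size]}
```

Each yielded dictionary would correspond to one `val_writer` call, which is useful for storages that limit the size of a single write query.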