Support bulk operations #448

Open
wants to merge 7 commits into
base: master
from

@kirillpechurin kirillpechurin commented May 14, 2023

To keep the documents in the indexes up to date, I propose supporting bulk operations through a manager implementation.

A manager is described for working with the following bulk operations:

  • bulk_create
  • bulk_update
  • update
  • delete

The manager can be attached to a model by calling as_manager() and assigning the result to the objects attribute.

The manager itself is implemented in a similar way to the signal processor. It is written as a mixin, so the additional functionality can be attached to existing managers. The mixin contains functions that call the registry. For bulk operations, working with the registry boils down to the following: entities are fetched by their IDs, normalized, and sent to the registry.
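The flow described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `FakeRegistry`, `InMemoryManager`, `BulkIndexMixin`, and `filter_pks` are hypothetical stand-ins for the real registry and a Django manager, used only so the example runs without Django or Elasticsearch.

```python
class FakeRegistry:
    """Hypothetical stand-in for the document registry; records calls only."""

    def __init__(self):
        self.calls = []

    def update(self, instances, action="index"):
        self.calls.append((action, [obj.pk for obj in instances]))


class Car:
    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


class InMemoryManager:
    """Hypothetical stand-in for a base manager: rows live in a dict by pk."""

    def __init__(self):
        self.rows = {}

    def bulk_create(self, objs):
        for obj in objs:
            self.rows[obj.pk] = obj
        return list(objs)

    def filter_pks(self, pks):
        return [self.rows[pk] for pk in pks if pk in self.rows]


class BulkIndexMixin:
    """Mixin: run the base operation, then notify the registry."""

    registry = None

    def bulk_create(self, objs):
        created = super().bulk_create(objs)
        # Entities are fetched again by ID, normalized to a list,
        # and then handed to the registry.
        fresh = self.filter_pks([obj.pk for obj in created])
        self.registry.update(list(fresh), action="index")
        return created


class CarManager(BulkIndexMixin, InMemoryManager):
    registry = FakeRegistry()


manager = CarManager()
manager.bulk_create([Car(1, "a"), Car(2, "b")])
```

Because the mixin comes first in the base class list, its `bulk_create` wraps the base manager's method through `super()`, which is what lets the same mixin be combined with an existing manager.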

Related entities are updated with an additional many flag.
The many flag makes it possible to obtain the target entities from a collection of related entities, given either as a queryset or as a list.

The following has been adjusted for registry:

  • A many flag for the update functions on related entities. When the many flag is set, the document function get_instances_from_many_related is called. It receives the entity class and the selection of related entities as input
  • Getting the entity class according to the passed argument. Three types of support have been introduced:
    • Model
    • QuerySet
    • list
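The two registry adjustments above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: `Model` and `QuerySet` are plain stand-ins for the Django classes, and `resolve_model_class`, `instances_to_update`, `Manufacturer`, and `CarDocument` are hypothetical names. Only `get_instances_from_related` and `get_instances_from_many_related` are taken from the description.

```python
class Model:
    """Stand-in for a Django model base class."""


class QuerySet:
    """Stand-in for a Django QuerySet: iterable and aware of its model."""

    def __init__(self, model, items):
        self.model = model
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


class Manufacturer(Model):
    def __init__(self, name):
        self.name = name


def resolve_model_class(target):
    """Return the model class for a Model instance, a QuerySet, or a list."""
    if isinstance(target, Model):
        return type(target)
    if isinstance(target, QuerySet):
        return target.model
    if isinstance(target, list):
        if not target:
            raise ValueError("cannot infer a model class from an empty list")
        return type(target[0])
    raise TypeError(f"unsupported target type: {type(target).__name__}")


class CarDocument:
    """Hypothetical document exposing single and many related lookups."""

    def get_instances_from_related(self, related):
        return [f"cars of {related.name}"]

    def get_instances_from_many_related(self, related_class, related_items):
        return [f"cars of {item.name}" for item in related_items]


def instances_to_update(doc, target, many=False):
    """Pick the lookup that matches the many flag."""
    if many:
        related_class = resolve_model_class(target)
        return doc.get_instances_from_many_related(related_class, target)
    return doc.get_instances_from_related(target)
```

In this sketch an empty list raises an error because there is nothing to infer the class from; how the actual change treats that case is not stated here.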

A check has been introduced into the existing signal processor for the correct processing of the following deletion signals:

  • handle_pre_delete
  • handle_delete

For these functions, a check of the origin keyword argument has been introduced so that the signal is handled only when a specific entity is deleted.
To process the signals for m2m changes, origin is now passed as a keyword argument as well. Its value is obtained by calling the class of the model from which the signal was received.
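As a rough sketch of the origin check: recent Django versions pass an `origin` argument to the `pre_delete` and `post_delete` signals, holding the model instance or queryset that started the deletion. The handler below is a simplified guess at the idea and not the PR's code; `Model`, `QuerySet`, and `SignalProcessorSketch` are stand-ins, and the exact condition used in the PR may differ.

```python
class Model:
    """Stand-in for a Django model base class."""


class QuerySet:
    """Stand-in for a Django QuerySet."""


class Car(Model):
    def __init__(self, pk):
        self.pk = pk


class SignalProcessorSketch:
    """Hypothetical processor that records which deletions it handled."""

    def __init__(self):
        self.deleted = []

    def handle_delete(self, sender, instance, origin=None, **kwargs):
        # Assumption: deletions that started from a queryset are left to
        # the bulk manager, so the signal handler skips them.
        if isinstance(origin, QuerySet):
            return
        self.deleted.append(instance.pk)


processor = SignalProcessorSketch()
car = Car(1)
# Deleting one specific entity: origin is the instance itself.
processor.handle_delete(sender=Car, instance=car, origin=car)
# Deleting a selection: origin is the queryset, so the handler skips it.
processor.handle_delete(sender=Car, instance=Car(2), origin=QuerySet())
```

With this split, a single delete goes through the signal processor, while a queryset delete is expected to reach the registry through the manager instead.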

Added support in the get_value_from_instance function (DEDField) for working with:

  1. QuerySet as ignored entities for a specific entity (Model-QuerySet)
  2. QuerySet as ignored entities for ignored entities (QuerySet-QuerySet)

Added and updated tests.

Updated the README: added a section on support for bulk operations.

@safwanrahman
Collaborator

I have not understood the feature exactly. Can you elaborate on it and add some documentation?

@kirillpechurin
Author

kirillpechurin commented May 18, 2023

Thanks for the answer!

Of course, I'll tell you more about the feature.

While working with the library, I found that bulk creation and updating of objects does not trigger an update of the documents in the index.

As a result, the following constructs change rows in the database, but the changes are not reflected in the Elasticsearch indexes:

  1. Update on a filtered queryset
Car.objects.filter(...).update(...)
  2. Deletion on a filtered queryset
Car.objects.filter(...).delete()
  3. Bulk creation of objects
Car.objects.bulk_create([...])
  4. Bulk updating of objects
Car.objects.bulk_update([...], fields=[...])

All of these are bulk create, update, or delete operations.

Since Django does not send signals for the operations listed above, except for delete, a model Manager can serve as the place to track calls to these operations.

So, to keep the Elasticsearch index up to date, I wrote a manager for bulk operations that can be attached, through the objects attribute of the model class, to the models that need it.

In the manager implementation, I call the registry in the same way the signal processor does. The general mechanism of the manager is described above.

I also want to draw special attention to signal processing for bulk deletion of objects.

To process the built-in Django signal correctly, the signal processor now checks the origin keyword argument, which shows how the signal was triggered (from a model instance or from a queryset). That is, when a specific entity is deleted, a signal is triggered and processing goes through the signal processor. When a selection of entities is deleted, the registry is called through the implemented manager.

@safwanrahman
Collaborator

Thanks for describing the scenario @kirillpechurin, I understand it now. Let me think about whether it can be solved in an easier way. I will get back to it within one week. Feel free to ping me if I do not write any comment here! 😅

@kirillpechurin
Author

@safwanrahman
Hello!

I have not received an answer from you. Could you tell me, please, whether you managed to think of an easier way? 🙂

@safwanrahman
Collaborator

Hi @kirillpechurin,
Thanks for pinging. I will check the PR again this week. Sorry for the delay.


2 participants