@Doma1612
Contributor

This PR introduces the Document Tag Recommendation backend, which suggests tags for untagged documents by identifying similar documents with existing tags.
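The core idea can be sketched in a few lines. This is only an illustration, not the code in this PR: it swaps the vector store for a brute-force cosine similarity over plain Python lists, and the function names, data shapes, and the 0.8 threshold are all assumptions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_tags(untagged, tagged, threshold=0.8):
    """Suggest tags for untagged documents (hypothetical helper).

    untagged: {doc_id: vector}
    tagged:   {doc_id: (vector, set_of_tags)}
    Returns   {doc_id: {tag: best_similarity_score}}
    The threshold value is an assumption, not taken from the PR.
    """
    recommendations = {}
    for doc_id, vector in untagged.items():
        best_score_per_tag = {}
        for other_vector, tags in tagged.values():
            score = cosine_similarity(vector, other_vector)
            if score < threshold:
                continue  # not similar enough to borrow tags from
            for tag in tags:
                # keep the highest score seen for each candidate tag
                best_score_per_tag[tag] = max(score, best_score_per_tag.get(tag, 0.0))
        recommendations[doc_id] = best_score_per_tag
    return recommendations
```

Keeping the score next to each suggested tag lets a caller rank or filter suggestions later instead of treating every match as equally strong.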

Main Functionalities
Three primary functionalities are added as API endpoints, extending through the service, CRUD, DTO, and ORM layers:

1️⃣ create_new_doc_tag_rec_task
Starts a Celery job that utilizes simsearch and Weaviate to compare document vectors.
Documents with high similarity are assumed to share tags.
The recommendations are stored in PostgreSQL for later retrieval.
2️⃣ get_recommendations_from_task_endpoint
Fetches recommended tags for untagged source documents using a task ID.
Each recommendation includes metadata and can be displayed to users in the frontend.
3️⃣ update_document_tag_recommendations
Accepts suggested tags by updating the document records in the database.
Takes a list of recommendation_ids, setting is_accepted = true for each.
Links the confirmed tags to their respective documents.
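The lifecycle described by the three endpoints (store recommendations per task, fetch them by task ID, then accept a subset) might look roughly like the following. This is a hedged, in-memory sketch: the `Recommendation` and `InMemoryStore` classes and their fields are hypothetical stand-ins for the PostgreSQL-backed ORM and CRUD layers, and only `is_accepted` and the list of recommendation IDs come from the description above.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Recommendation:
    # Hypothetical record; field names other than is_accepted are assumptions.
    id: int
    task_id: int
    document_id: int
    tag_id: int
    score: float
    is_accepted: bool = False


@dataclass
class InMemoryStore:
    """Stand-in for the database tables (assumption, not the real schema)."""
    recommendations: dict = field(default_factory=dict)  # id -> Recommendation
    document_tags: dict = field(default_factory=dict)    # document_id -> set of tag_ids
    _ids: count = field(default_factory=lambda: count(1))

    def add(self, task_id, document_id, tag_id, score):
        """Persist one recommendation produced by a task."""
        rec = Recommendation(next(self._ids), task_id, document_id, tag_id, score)
        self.recommendations[rec.id] = rec
        return rec

    def get_by_task(self, task_id):
        """Return all recommendations that belong to the given task."""
        return [r for r in self.recommendations.values() if r.task_id == task_id]

    def accept(self, recommendation_ids):
        """Mark each recommendation as accepted and link its tag to the document."""
        for rec_id in recommendation_ids:
            rec = self.recommendations[rec_id]
            rec.is_accepted = True
            self.document_tags.setdefault(rec.document_id, set()).add(rec.tag_id)
        return len(recommendation_ids)
```

A short usage example under the same assumptions:

```python
store = InMemoryStore()
first = store.add(task_id=7, document_id=1, tag_id=100, score=0.9)
store.add(task_id=7, document_id=1, tag_id=101, score=0.85)
store.accept([first.id])  # only tag 100 is linked to document 1
```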

backend/src/api/document_tag_recommendations.py → Implements the three new endpoints.

Let me know if any refinements are needed!

@Doma1612 Doma1612 requested review from bigabig and fynnos February 18, 2025 09:52
@Doma1612 Doma1612 force-pushed the integrate-recommendation-backend branch from accc0d5 to e99a79b Compare February 18, 2025 10:32
@bigabig
Member

bigabig commented Feb 19, 2025

Shouldn't DocumentTagRecomendation be called DocumentTagRecommendationJob?

@Doma1612 Doma1612 requested a review from bigabig February 19, 2025 15:38
Collaborator

@fynnos fynnos left a comment

looks good already! added a few comments regarding DB operations

@Doma1612 Doma1612 requested review from bigabig and fynnos February 25, 2025 10:58
@fynnos fynnos merged commit 8723074 into main Feb 28, 2025
3 checks passed
@fynnos fynnos deleted the integrate-recommendation-backend branch February 28, 2025 09:30
noahscheld pushed a commit that referenced this pull request Mar 3, 2025
* implement tag-recommendation data model

* implement crud and dto

* implement dummy classification

* document tag recommendation

* update alembic version

* remove faulty alembic version

* Address Tim's findings

* update None value query

* revert changes

* remove dto transformation

---------

Co-authored-by: Dominik Martens <[email protected]>