-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: tag cleanup job #12654
feat: tag cleanup job #12654
Conversation
jrasm91
commented
Sep 13, 2024
•
edited
Loading
edited
- Add a job to remove empty tags
- Add a button in the UI to manually run a few different clean up jobs
Light | Dark |
---|---|
async deleteEmptyTags() { | ||
await this.dataSource.transaction(async (manager) => { | ||
const ids = new Set<string>(); | ||
const tags = await manager.find(TagEntity); | ||
for (const tag of tags) { | ||
const count = await manager | ||
.createQueryBuilder('assets', 'asset') | ||
.innerJoin( | ||
'asset.tags', | ||
'asset_tags', | ||
'asset_tags.id IN (SELECT id_descendant FROM tags_closure WHERE id_ancestor = :tagId)', | ||
{ tagId: tag.id }, | ||
) | ||
.getCount(); | ||
|
||
if (count === 0) { | ||
this.logger.debug(`Found empty tag: ${tag.id} - ${tag.value}`); | ||
ids.add(tag.id); | ||
} | ||
} | ||
|
||
if (ids.size > 0) { | ||
await manager.delete(TagEntity, { id: In([...ids]) }); | ||
this.logger.log(`Deleted ${ids.size} empty tags`); | ||
} | ||
}); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mertalev if you know a better way to do this, feel free to replace it.
7be665b
to
197d8f2
Compare
197d8f2
to
ecc97d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks slick 🚀
await this.dataSource.transaction(async (manager) => { | ||
const ids = new Set<string>(); | ||
const tags = await manager.find(TagEntity); | ||
for (const tag of tags) { | ||
const count = await manager | ||
.createQueryBuilder('assets', 'asset') | ||
.innerJoin( | ||
'asset.tags', | ||
'asset_tags', | ||
'asset_tags.id IN (SELECT id_descendant FROM tags_closure WHERE id_ancestor = :tagId)', | ||
{ tagId: tag.id }, | ||
) | ||
.getCount(); | ||
|
||
if (count === 0) { | ||
this.logger.debug(`Found empty tag: ${tag.id} - ${tag.value}`); | ||
ids.add(tag.id); | ||
} | ||
} | ||
|
||
if (ids.size > 0) { | ||
await manager.delete(TagEntity, { id: In([...ids]) }); | ||
this.logger.log(`Deleted ${ids.size} empty tags`); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
await this.dataSource.transaction(async (manager) => { | |
const ids = new Set<string>(); | |
const tags = await manager.find(TagEntity); | |
for (const tag of tags) { | |
const count = await manager | |
.createQueryBuilder('assets', 'asset') | |
.innerJoin( | |
'asset.tags', | |
'asset_tags', | |
'asset_tags.id IN (SELECT id_descendant FROM tags_closure WHERE id_ancestor = :tagId)', | |
{ tagId: tag.id }, | |
) | |
.getCount(); | |
if (count === 0) { | |
this.logger.debug(`Found empty tag: ${tag.id} - ${tag.value}`); | |
ids.add(tag.id); | |
} | |
} | |
if (ids.size > 0) { | |
await manager.delete(TagEntity, { id: In([...ids]) }); | |
this.logger.log(`Deleted ${ids.size} empty tags`); | |
} | |
}); | |
await this.dataSource.query(` | |
delete from tags | |
where not exists( | |
select 1 | |
from tag_asset | |
where tag_asset."tagsId" = tags.id | |
)`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I have a tag Parent/Child I don't want to delete Parent or Child if Child is connected to an asset. Does this handle that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, assuming that if the child has a row in tag_asset, the parent also has a row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parent doesn't necessarily have a row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, why not? It isn't a parent/child relationship if the parent doesn't always include the child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We link a single tag "Event/Year/Title" to a single asset. We don't want to have to link every tag in the hierarchy, eg Event and Event/Year and Event/Year/Title - that is implied by the fact that Event/Year/Title is the tag name. In terms of the database we track that relationship in a separate table. We have one link per tag hierarchy, instead of one record per asset/nested/tag combination, which would be a lot more records to read, write, and sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how to exactly write the not exists query with the current query builder, but I can try to take a look at this in the future.
I find the button |
Maybe. I think there are plans to add more manual jobs and potentially trigger all jobs to be run through a separate interface like this. I'm open to different UX though. The current problem is that we only have two options - force=true or force=false - and not all of our jobs work with just that option. Some examples:
Maybe we can work towards separating the queue stats from the way we queue jobs and the types of jobs we can queue. |
@jrasm91 sounds good, we can think of a better way of representing these new jobs in a different PR |