Cartographer and PixPlot Image preview #400

Open
wants to merge 84 commits into
base: master

Conversation

dale-wahl
Member

@dale-wahl dale-wahl commented Nov 30, 2023

This is merge-able!

  • Cartographer processor developed and functional; it allows, in principle, the creation of manifests (coordinate maps) that can be read and displayed by PixPlot. For the moment, I have it set to create a default grid mapping, but it should be easy to expand and add to if we would like to design different mappings.
  • This basic grid mapping is now created automatically (via a preset) when images are downloaded. It is then used as a preview, allowing users to easily explore their downloaded images!
  • I took the PixPlot template, modified it, and incorporated it into 4CAT. Primarily this means that it does not create a new static site (as PixPlot was designed to do) and instead loads the page with our data (e.g., it uses the already downloaded images and templates instead of new hard-coded HTML files).
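
As an illustration of the default grid mapping described above, here is a minimal sketch that assigns each image a cell in a near-square grid. It is not the Cartographer's actual code, and it does not reproduce PixPlot's manifest format; `grid_positions` and `cell_size` are hypothetical names.

```python
import math


def grid_positions(filenames, cell_size=1.0):
    """Assign each filename an (x, y) position in a near-square grid, row by row.

    Hypothetical helper: the real Cartographer may lay images out differently.
    """
    if not filenames:
        return {}
    # Use roughly sqrt(n) columns so the grid is close to square.
    columns = math.ceil(math.sqrt(len(filenames)))
    positions = {}
    for index, name in enumerate(filenames):
        row, column = divmod(index, columns)
        positions[name] = (column * cell_size, row * cell_size)
    return positions
```

A different mapping would only need to return the same kind of filename-to-coordinate dictionary, which is one way new mappings could be slotted in.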

Notes:

  • The @app.route('/result/<path:query_file>') now can also load archived files. This may not be the best design and we might perhaps not archive the images since they have this new use. Recommend reviewing this part. Updated to open files and generate response on the fly.
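
The "open files and generate response on the fly" approach can be sketched roughly as follows, assuming the archive is a standard zip file. `read_archived_file` is a hypothetical helper for illustration, not the code in the 4CAT route.

```python
import mimetypes
import zipfile


def read_archived_file(archive, member_name):
    """Return (data, mimetype) for one archive member, or None if it is missing.

    Only the requested member is read; the rest of the archive is not extracted.
    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        try:
            data = zf.read(member_name)
        except KeyError:
            # ZipFile.read raises KeyError when the member does not exist.
            return None
    mimetype = mimetypes.guess_type(member_name)[0] or "application/octet-stream"
    return data, mimetype
```

In a Flask view, the returned bytes and mimetype could then be passed to `flask.Response(data, mimetype=mimetype)`, with a 404 when the helper returns `None`.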

To-do:

  • Expand the Cartographer options and make it a bit easier to integrate new mappings.
  • I have done some basic editing to the PixPlot template, but it could benefit from more work. More editing done for 4CAT-specific plots!
  • In particular, make certain features optional, such as the "Hotspots" and some of the icons, neither of which is used by default.
  • Add metadata to the Image descriptions
  • Move the image description panel to one of the sides; it works where it is with the image filename only, but is not great/intuitive with further descriptions
  • Add presets for the TikTok and Telegram image downloaders

Member

@stijn-uva stijn-uva left a comment

Some crashes I found when testing.

@@ -24,6 +27,21 @@

csv.field_size_limit(1024 * 1024 * 1024)

if app.logger.getEffectiveLevel() == 10:
Member

Does this still need to be here? It crashes (since time is not imported)

@@ -401,10 +479,12 @@ def show_result(key):
:param key: Result key
:return: Rendered template
"""
ds_start = time.time()
Member

Likewise

# TODO: Better way to identify this edge case?
elif child.type.startswith("image-downloader"):
# Images were downloaded from this zip; this is the case for presets
for grandkid in child.get_children(instaniate_datasets=True):
Member

TypeError: get_children() got an unexpected keyword argument 'instaniate_datasets'

@stijn-uva
Member

Also, it would be nice to make this compatible with all image-generating/downloading processors, including 'Extract video frames'.

@dale-wahl
Member Author

Also, it would be nice to make this compatible with all image-generating/downloading processors, including 'Extract video frames'.

Right now, I need the cartographer to understand what to do with subfolders of image groupings. Or perhaps iterate_archive_contents doesn't, since the files are not found (possibly a mismatch between the temp_path yielded and where the file was actually extracted). I could unpack and os.walk in the cartographer, but I really hate that method, as it does not work well with amount caps: the whole (often quite large) archive is unpacked anyway.

Regardless, I think we could do better than to just splash the images semi in order. They could be automatically categorized by the subfolders they are in for example (mimicking the scene timeline).

.metadata.json with directories currently expects filename only (not subdirs), but this could be changed.
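
One way to respect an amount cap without unpacking the whole archive is to read members one at a time and keep their subfolder. The sketch below assumes a standard zip file and uses only the standard library; `iter_archived_images` and `IMAGE_SUFFIXES` are hypothetical names, and this is not how `iterate_archive_contents` is implemented.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed set of image extensions, for illustration only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def iter_archived_images(archive, max_images=0):
    """Yield (subfolder, filename, data) for images in a zip, one member at a time.

    Stops after `max_images` images; 0 means no cap. Nothing is extracted to
    disk, so a cap avoids reading the rest of a large archive.
    """
    yielded = 0
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            path = PurePosixPath(info.filename)
            # Skip directory entries and non-image files such as .metadata.json.
            if info.is_dir() or path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            yield str(path.parent), path.name, zf.read(info)
            yielded += 1
            if max_images and yielded >= max_images:
                break
```

The subfolder in each tuple (`"."` for top-level files) could double as a category when grouping images, for example by scene.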
@dale-wahl
Member Author

Also, it would be nice to make this compatible with all image-generating/downloading processors, including 'Extract video frames'.

Right now, I need the cartographer to understand what to do with subfolders of image groupings. Or perhaps iterate_archive_contents doesn't, since the files are not found (possibly a mismatch between the temp_path yielded and where the file was actually extracted). I could unpack and os.walk in the cartographer, but I really hate that method, as it does not work well with amount caps: the whole (often quite large) archive is unpacked anyway.

Regardless, I think we could do better than to just splash the images semi in order. They could be automatically categorized by the subfolders they are in for example (mimicking the scene timeline).

Sorted out subfolder handling in cartographer. Works with the video-frames processor. Currently there is no usable .metadata.json file (looks to be copied over from higher level processor and does not have image filename references for post_ids to be extracted) so images have no descriptions. I did not do anything to categorize the images by scene. That's more complex as the categories are currently decided from the metadata (which does not exist). This would be a special case.

@dale-wahl
Member Author

I merged master into cartographer again. Last week there were a few bugs. This week the only thing I saw was actually an issue with the master in that, if we allow unlimited images (e.g., max images is set to 0 in configuration), all the downloaders would use a max of 0 and thus always download all images!
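
A minimal sketch of how a per-dataset amount and a site-wide maximum could be combined when 0 means "unlimited". The function name and the rule that a requested 0 also means "no limit" are assumptions for illustration, not the fix applied in master.

```python
def effective_image_limit(requested, configured_max):
    """Return the number of images to download; 0 means no limit.

    Assumption: 0 is the 'unlimited' sentinel for both the user's requested
    amount and the configured site-wide maximum.
    """
    if configured_max == 0:
        # No site-wide cap: honour the user's request instead of the sentinel.
        return requested
    if requested == 0:
        # User asked for everything, but the site caps the amount.
        return configured_max
    return min(requested, configured_max)
```

The point of the first branch is that an unlimited configuration should not override a smaller amount the user explicitly asked for.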

There is this one super weird bug, so far only visible on TikTok datasets...
image
I am not sure what's going on there, but it is only the thumbnails so I need to figure out why that is the case.

@dale-wahl
Member Author

dale-wahl commented May 28, 2024

I merged master into cartographer again. Last week there were a few bugs. This week the only thing I saw was actually an issue with the master in that, if we allow unlimited images (e.g., max images is set to 0 in configuration), all the downloaders would use a max of 0 and thus always download all images!

There is this one super weird bug, so far only visible on TikTok datasets... image I am not sure what's going on there, but it is only the thumbnails so I need to figure out why that is the case.

ok, ok. it actually just occurs when there are very few images. PixPlot itself fails if you have fewer than 12 (and our cartographer ignores/bypasses that). it seems to have to do with the zoom not being far enough away, so that it doesn't trigger the thumbnails. no idea how to fix or address... but yeah.

@sal-uva
Collaborator

sal-uva commented Sep 11, 2024

Tried to run this, but getting some importing errors in JS:
image

@sal-uva
Collaborator

sal-uva commented Sep 11, 2024

I would also suggest renaming the processor to something more specific. "Create Image visualisation" is a bit general; something like "Display images on Web page" is already a bit more concrete.

Relatedly, it is a bit unclear to me what I can do with the output zip file. Can the description be updated so users are told what they can do with it? And maybe add a readme file to the zip results?

As I understand it, the page opened by the "View" button is what's most interesting here right? Can't the zip file contain this page as well?

Otherwise good work!!

3 participants