List of possible enhancements #84
Comments
Another to-do might be to rewrite the batch detection scripts to use a PyTorch DataLoader instead of managing image I/O manually. This would also allow batch inference instead of looping over images one by one, which should significantly improve inference performance. (Comment originally posted by patelvyom)
Updating my response to this suggestion: rather than investing time in using the PyTorch data loader, I'd like to see someone experiment with YOLOv5's native inference tools (val.py and detect.py) as a total replacement for our inference scripts. These have all the benefits of "proper" PyTorch data loading, but also have a zillion bells and whistles, especially test-time augmentation that could improve accuracy. -- That's a great suggestion, I'll add an item to the list... more specifically, though, the item is to do a performance test (which can be arbitrarily inelegant) to see what the benefit would be, with and without a GPU, and make sure results are identical. If the benefit is more than around a 25% speedup, it's probably worth it. If it's less than that, it may be preferable to keep the current approach, which is easier to debug and maintain, and keeps a much longer shared code path across PyTorch and TF. Also, I vaguely remember that images in a batch need to be the same size, which isn't guaranteed, so either the test would need to confirm that this constraint doesn't actually apply, or the implementation would need to break batches when the image size changes. (Comment originally posted by agentmorris)
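As a rough sketch of the "break batches when the image size changes" option mentioned above: the snippet below groups consecutive images of the same size into batches of bounded length. Everything here is hypothetical (the `batches_by_size` name, the `(filename, (width, height))` record layout); it is not taken from run_detector_batch.py, and it assumes image sizes are already known, e.g. from a cheap header read.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record layout: (filename, (width, height))
ImageInfo = Tuple[str, Tuple[int, int]]


def batches_by_size(images: Iterable[ImageInfo],
                    max_batch_size: int) -> Iterator[List[str]]:
    """Yield lists of filenames; each list holds consecutive same-size images."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    # groupby only merges *adjacent* items with equal keys, so the original
    # image order is preserved and a batch ends whenever the size changes.
    for _size, group in groupby(images, key=lambda item: item[1]):
        batch: List[str] = []
        for filename, _ in group:
            batch.append(filename)
            if len(batch) == max_batch_size:
                yield batch
                batch = []
        if batch:
            yield batch


example = [
    ("a.jpg", (1920, 1080)),
    ("b.jpg", (1920, 1080)),
    ("c.jpg", (1920, 1080)),
    ("d.jpg", (2048, 1536)),
    ("e.jpg", (1920, 1080)),
]
print(list(batches_by_size(example, max_batch_size=2)))
```

Because only adjacent images are grouped, a folder that interleaves cameras with different resolutions would produce many tiny batches; sorting by size first would give fuller batches at the cost of changing the processing order.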
Closed the issue re: checkpoint support for multicore inference (thanks, Alex Morling!), and the issue with passing force_cpu deeper into the call stack (easier to just do that with CUDA_VISIBLE_DEVICES).
v0 release of a script to break images up into tiles, run MD on tiles, and stitch the results back together (run_tiled_inference.py). Useful when a user has Very Large Images and Very Small Animals.
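For anyone curious what the tiling idea looks like in general, here is a minimal sketch. It is not the logic in run_tiled_inference.py; the function names, the overlap handling, and the absolute-pixel `[x, y, w, h]` box convention are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def tile_starts(image_extent: int, tile_extent: int, min_overlap: int) -> List[int]:
    """Start offsets of tiles along one axis, covering [0, image_extent)."""
    if tile_extent >= image_extent:
        return [0]  # a single tile already covers this axis
    stride = tile_extent - min_overlap
    if stride <= 0:
        raise ValueError("min_overlap must be smaller than tile_extent")
    starts = list(range(0, image_extent - tile_extent, stride))
    # The final tile is placed flush with the far edge, so it may overlap
    # its neighbor by more than min_overlap.
    starts.append(image_extent - tile_extent)
    return starts


def tile_box_to_image_box(box: Sequence[float],
                          tile_origin: Tuple[int, int]) -> List[float]:
    """Shift an absolute-pixel [x, y, w, h] box from tile to image coordinates."""
    x, y, w, h = box
    return [x + tile_origin[0], y + tile_origin[1], w, h]


# Tile origins for a 3000x2000 image with 1280x1280 tiles and 128px overlap
origins = [(x, y)
           for y in tile_starts(2000, 1280, 128)
           for x in tile_starts(3000, 1280, 128)]
print(origins)
```

A real stitching step would also have to deal with the same animal being detected in two overlapping tiles (typically some form of non-maximum suppression across tiles), which this sketch leaves out.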
Closed the issues re: merge_detections.py (a) not knowing how to merge at the level of individual detections, and (b) not having a command-line driver. Thanks, @atmorling!
Closed the issue re: saving datetime, image size, and EXIF metadata in run_detector_batch. Thanks, @atmorling!
Closed the issue re: running YOLOv5 val.py inference (run_inference_with_yolov5_val.py) on Windows, which was really just a "does this work?" issue. The answer for now is "yes, it works, but it requires admin privileges".
Closed and removed the issue re: sorting by confidence in postprocess_batch_results.py.
Closed issues related to making requirements.txt work, and therefore supporting newer versions of dependent packages, and therefore making the pip package work... better.
Closed everything related to rebooting the classifier training page; the classification page has been updated, with all new training pointed to MEWC.
Closed the issue related to creating N! category combinations in postprocess_batch_results. This has been handled now as well as it can be without making opinionated decisions about which combinations to show.
Removed the item under "random models" about supporting vehicle classification with stock YOLOv5 and YOLOv8 models; this is supported now (and fun!).
Removing items related to cleanup in process_video.py.
Closing the issue wrt pulling the sequence smoothing out of its notebook home and into proper module functions; this is done.
Handled checkpointing in run_inference_with_yolov5, removed this issue.
Sometimes folks ping us to ask how they can contribute code to the MegaDetector project, and we don't really have a place to point them right now. Combined with the fact that a couple of important open issues have been languishing for a few weeks (months?), I got motivated to create this issue as a snapshot of our internal to-do list, so I have somewhere to point folks who want to get involved. I'm making only a weak attempt at prioritization here; instead, I'm just trying to sort items into logical buckets.
If you're interested in trying your hand at any of these, email us!
Feature additions for existing scripts/tools
Good first issues
postprocess_batch_results.py currently has no support for video. When run on a .json file that points to videos, extract frames in a sensible way to generate previews.
A number of modules are the type of functionality one might want to run from the command-line, but they lack command-line drivers (camtrap_dp_to_coco, generate_crops_from_cct, labelme_to_yolo, remap_coco_categories, wi_download_csv_to_coco, yolo_output_to_md_output, yolo_to_coco, remap_detection_categories). IMO knocking all of these out would be one self-contained task.
There are a few places in the code base where something happens serially that is an obvious candidate for parallelization, using the same approach that's used elsewhere in the repo, specifically: image validation in coco_to_labelme, image processing in coco_to_yolo, image resizing in resize_coco_dataset, image download in wi_download_csv_to_coco, create_lila_test_set. IMO knocking all of these out would be one self-contained task.
There is a relatively comprehensive set of tests for both programmatic and CLI invocation of most modules, but test coverage can always be improved, and currently test coverage is not formally assessed. This doesn't have to be a monolithic task; it's straightforward to just jump in and add more tests for modules/functions/parameters that aren't currently covered.
run_detector_batch.py supports checkpointing when called from the command line, but not when it's called programmatically. Add checkpointing support when calling load_and_run_detector_batch() directly. This is a relatively minor change; it just requires passing checkpointing arguments in slightly differently.
In visualization_utils.py, optionally render boxes so the most confident box is always on top; right now boxes are rendered in the order they're supplied, so lower confidence values can obscure higher confidence values.
The json manager app currently hard-codes the expected structure, so we have to keep it up to date with minor additions to the .json format. This should only hard-code parameters it actually needs to operate on, and pass everything else through unmodified. Json.NET supports all the right things, we're just not doing those things right now.
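For the box-ordering item above (most confident box on top), the core of the change is probably just a sort before drawing. A minimal sketch, assuming each detection is a dict with a `conf` field; `order_for_rendering` is a hypothetical helper, not an existing function in visualization_utils.py:

```python
from typing import Any, Dict, List


def order_for_rendering(detections: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return detections sorted so the highest-confidence box is drawn last.

    Boxes drawn later paint over boxes drawn earlier, so ascending
    confidence puts the most confident box on top. sorted() is stable,
    so ties keep the order in which they were supplied.
    """
    return sorted(detections, key=lambda det: det["conf"])


detections = [
    {"category": "1", "conf": 0.92, "bbox": [0.10, 0.10, 0.30, 0.30]},
    {"category": "1", "conf": 0.21, "bbox": [0.12, 0.11, 0.30, 0.30]},
    {"category": "2", "conf": 0.55, "bbox": [0.50, 0.40, 0.20, 0.20]},
]
for det in order_for_rendering(detections):
    print(det["conf"])  # a real renderer would draw the box here
```

Making this optional (as the item suggests) would keep the current supplied-order behavior available for callers who rely on it.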
Somewhat more involved, but still self-contained enhancements
run_detector_batch.py currently does not support checkpointing when --use_image_queue is enabled, but the image queue is helpful when running off of a slow drive. Add checkpointing support when --use_image_queue is enabled. (This is not a significant issue when using manage_local_batch.py/.ipynb to create and run jobs; the user can just use lots of tasks in lieu of checkpointing, so the priority of this issue is pretty low.)
compare_batch_results.py supports comparing detections, but not species classifications. Add support for species classification results.
Allow run_detector_batch.py to use multiple GPUs. (This is not quite as critical as it sounds; large jobs are best run via manage_local_batch.py or manage_local_batch.ipynb anyway, and splitting jobs across multiple GPUs is handled there.)
The repeat detection elimination pipeline currently uses two main parameters to decide which detections are likely repeating false positives: (1) size and (2) number of repeats. In practice, small detections are more likely to be false positives, so it would be helpful to express varying thresholds for number of repeats based on size.
Regarding the repeat detection elimination process... currently if you run the "find repeat detections" portion of the pipeline, and decide you just want a different threshold for the number of repeats required to call a detection "suspicious", you have to run the whole process again. This is silly; you should be able to just change the threshold. In fact, even better, you should be able to specify roughly how many suspicious detections you feel like dealing with (typically around 1000, which is around 5-10 minutes of manual review), and have that threshold determined automatically.
Allow postprocess_batch_results.py to operate on sequences, rather than just images. Sample based on sequences, do precision/recall analysis based on sequences, and render sequences in a sensible way on the output page.
In repeat detection elimination and sequence-based classification smoothing, write the smoothing parameters into the output file.
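To make the "determine the threshold automatically" idea above a bit more concrete, here is a minimal sketch. It assumes the expensive step has already produced one repeat count per candidate detection; `choose_repeat_threshold` and its inputs are hypothetical and do not reflect the data structures in repeat_detections_core.py.

```python
from collections import Counter
from typing import Iterable


def choose_repeat_threshold(repeat_counts: Iterable[int], max_suspicious: int) -> int:
    """Pick a repeat-count threshold that keeps manual review manageable.

    Returns the lowest observed repeat count T such that flagging every
    candidate with count >= T yields at most max_suspicious candidates.
    If even the most-repeated group is too large, returns a value above
    every observed count, i.e. a threshold that flags nothing.
    """
    histogram = Counter(repeat_counts)
    n_flagged = 0
    threshold = max(histogram, default=0) + 1  # flags nothing
    # Walk from the most-repeated candidates downward, lowering the
    # threshold as long as the review budget is not exceeded.
    for count in sorted(histogram, reverse=True):
        if n_flagged + histogram[count] > max_suspicious:
            break
        n_flagged += histogram[count]
        threshold = count
    return threshold


counts = [2, 2, 3, 5, 5, 8, 12]
print(choose_repeat_threshold(counts, max_suspicious=3))
```

Since this only needs the stored counts, changing the review budget would not require re-running the repeat-finding step, which is the point of the item above.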
Refactoring or re-writing stuff
postprocess_batch_results.py is an absurd use of Pandas right now, and has an absurd level of duplication between the code paths (with/without ground truth, with/without classification results). This could use a re-write from scratch.
repeat_detections_core.py isn't nearly as bad, but it's not ideal, and it has some really bizarre properties right now, like the fact that when you run the main function a second time to apply a set of changes after the manual review step, it repeats all the folder-separation stuff it did the first time, which is brittle and silly. Not quite a total re-write, but a significant cleanup.
Infrastructure things
In certain Mac M1 environments, MD produces incorrect results. It is unlikely that this is specific to MD; it is more likely a corner case for YOLOv5. This does not appear to happen in the recommended Python environment, but if the user upgrades YOLOv5 and/or certain Python dependencies, bizarre things might happen. See this issue and this question on the YOLOv5 repo for details and status.
A substantial number (most?) of our users prefer R, and we're forcing them to run a bunch of Python code. It would be great to either wrap the inference process in R, or port the inference code to R. IMO it's not urgent to do this for anything other than the inference code (run_detector_batch.py) and maybe separate_detections_into_folders.py.
Update: after adding this item to the list, I discovered the animl R package, which supports R-based inference for both MDv4 and MDv5. TBD whether any more porting to R is required.
Miscellaneous things that are more exploratory
Other projects that could use your help
If you found this text because you want to work on open-source code related to conservation, and everything I just listed is either too boring or too daunting, please don't give up! Depending on your specific skill set, maybe our close collaborators who maintain EcoAssist, CamTrap Detector, Timelapse, MEWC, or any of the platforms listed here could use contributions. Or head over to the "Open Source Solutions" forum at WILDLABS, and offer your skills there!
Random models someone should train
Now I'm letting this thread really veer off into a tangent, but FWIW, people frequently ask us "can MegaDetector do [x]?", where [x] is something MegaDetector definitely can't do. But there are some values of [x] that have come up a bunch of times and feel like the right balance of "tractable" and "useful", where there's sort of the right training data in the universe, and a focused student project could really get something going. So, to finish up this long post with lots of random ideas:
A model to classify camera trap images as obscured due to fog or snow, knocked over and staring at the sky, and/or completely obscured by vegetation
A model that runs as part of postprocess_batch_results.py to pick out "fun" images (currently we do this manually from the output of postprocess_batch_results, which is fast, but it means we're only ever searching over the ~7500 images we sample for postprocessing)
"MegaDetector for snakes"
"MegaDetector for fish"
Update: after I wrote this item, someone actually did release... wait for it... MegaFishDetector. Still, plenty of work to be done in this area. List of models and data in this space here.
Lots of camera trap data was recently ingested into Hugging Face, with the hope that someone might train a super-giant species classifier for camera trap data, and/or document a nice process for training regional classifiers. AFAIK no one has done either of the above yet.
Note to self: tags for this issue
Issue cloned from Microsoft/CameraTraps, original issue.