forked from open-mmlab/mmdetection
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
c5d1fcc
commit 222487e
Showing
12 changed files
with
302 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
## Active Learning | ||
|
||
Active learning is the problem of choosing which data points to label in order to improve model performance the most. In most cases it boils down to
finding cases where the model is likely to fail, without requiring human input.
|
||
Most research focuses on offline approaches, which take a large sample of unlabelled images and rank them. Often, however, the most valuable
application of active learning is in edge deployments, where there is limited internet bandwidth to send data back to the cloud for re-training
and limited compute to run the selection algorithm on.
|
||
### Online Approaches | ||
|
||
* Frame-to-frame jitter.
* Disagreement between the constituents of an ensemble.
* Uncertainty-based (scores close to the class threshold).
* Pre-NMS internal disagreement (e.g. the same object predicted as both a truck and a person before NMS resolves the predictions by score).
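
As an illustration of the uncertainty-based approach, here is a minimal sketch that flags a frame for labelling when any detection score falls close to the class threshold. The function names, the `margin` parameter and the input layout (a mapping from frame ID to scores) are assumptions made for this example, not part of any existing API.

```python
from typing import Dict, List


def is_uncertain(scores: List[float], threshold: float, margin: float = 0.1) -> bool:
    """Return True if any detection score lies within `margin` of `threshold`."""
    return any(abs(score - threshold) <= margin for score in scores)


def select_frames(
    frame_scores: Dict[str, List[float]],
    threshold: float = 0.5,
    margin: float = 0.1,
) -> List[str]:
    """Keep only the frames that look worth sending back for labelling."""
    return [
        frame_id
        for frame_id, scores in frame_scores.items()
        if is_uncertain(scores, threshold, margin)
    ]
```

A check like this is cheap enough to run on an edge device, because it only looks at scores the detector has already produced.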
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Dataset Analysis | ||
|
||
Simple analysis of a dataset in order to come up with sensible heuristics,
e.g. the minimum and maximum object size (possibly ignoring a few outliers).
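
A minimal sketch of how such size bounds might be derived from a COCO-style annotation dictionary is shown below. Outliers are ignored by taking percentiles rather than the raw minimum and maximum. The function names and the simple index-based percentile are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def percentile(values: List[float], q: float) -> float:
    """Simple index-based percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    rank = round(q / 100 * (len(ordered) - 1))
    return ordered[rank]


def size_bounds(
    coco: dict, lower_q: float = 1, upper_q: float = 99
) -> Dict[int, Dict[str, Tuple[float, float]]]:
    """Per-category (min, max) width and height, ignoring outliers."""
    widths: Dict[int, List[float]] = {}
    heights: Dict[int, List[float]] = {}
    for ann in coco["annotations"]:
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        widths.setdefault(ann["category_id"], []).append(w)
        heights.setdefault(ann["category_id"], []).append(h)
    return {
        cat: {
            "width": (percentile(widths[cat], lower_q), percentile(widths[cat], upper_q)),
            "height": (percentile(heights[cat], lower_q), percentile(heights[cat], upper_q)),
        }
        for cat in widths
    }
```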
|
||
The same can be done for colour.
|
||
It is probably also possible to measure how similar shapes are, independent of rotation and scale.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# Heuristics | ||
|
||
## Colour Based | ||
|
||
* A colour heuristic needs to be a range at minimum, not a single value.
|
||
## Shape Based | ||
|
||
* Height | ||
* Width | ||
* Exact shape from segmentation | ||
|
||
## Composite Based | ||
|
||
* Combines several components in given proportions, e.g. roughly X percent orange and Y percent brown.
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
## Control Room | ||
|
||
A Streamlit UI that lets you edit annotation files (remap, delete, etc.)
and attach a comment to each change.
|
||
### Merge | ||
|
||
### Replace | ||
|
||
### Delete | ||
|
||
### View class distribution | ||
|
||
It could also serve as the UI for crafting and debugging heuristics.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
## Non-Max Suppression | ||
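
A minimal pure-Python sketch of greedy non-max suppression follows. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the `class_agnostic` flag is an assumption of this example: when it is set, boxes suppress each other regardless of label.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    labels: Sequence[str],
    iou_threshold: float = 0.5,
    class_agnostic: bool = False,
) -> List[int]:
    """Return the indices of the boxes that survive, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        suppressed = any(
            (class_agnostic or labels[i] == labels[j])
            and iou(boxes[i], boxes[j]) > iou_threshold
            for j in keep
        )
        if not suppressed:
            keep.append(i)
    return keep
```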
|
||
## Class Agnostic Non-Max Suppression | ||
|
||
## Weighted Box Fusion | ||
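
The core idea can be sketched as follows: instead of discarding overlapping boxes, a cluster of boxes for the same object is merged into one box whose coordinates are the score-weighted average. This simplified sketch covers only the fusion step; how boxes are clustered, and any rescaling of the fused score by the number of models, are left out. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def fuse_cluster(boxes: Sequence[Box], scores: Sequence[float]) -> Tuple[Box, float]:
    """Fuse overlapping boxes into one box using score-weighted coordinates."""
    total = sum(scores)
    fused_box = tuple(
        sum(score * box[k] for box, score in zip(boxes, scores)) / total
        for k in range(4)
    )
    fused_score = total / len(scores)  # mean score of the cluster
    return fused_box, fused_score
```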
|
||
## Class Preferential Non-Max Suppression |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
## Ensembles | ||
|
||
Ensembles for object detection are rarely deployed in production; however, they can be very useful when labelling.
|
||
The same approach will be used to combine heuristics with neural networks, a combination that would
be sufficiently resource-efficient to deploy to production.
|
||
### | ||
```python | ||
model = Ensemble( | ||
models=[ | ||
Model(), | ||
Model() | ||
], | ||
aggregation=[], | ||
) | ||
``` | ||
|
||
### Class List Ensemble | ||
|
||
This setup lets you share the preprocessing steps between all of the models in the ensemble
and cleanly combine the outputs of multiple models.
|
||
```python
model = Ensemble(
    preprocessor=Preprocessor(...),
    models=[
        Model(
            model_path='model_one.onnx',
            postprocessor=Postprocessor(...),
            class_list=['truck', 'car'],
        ),
        Model(
            model_path='model_two.onnx',
            postprocessor=Postprocessor(...),
            class_list=['person'],
        ),
    ],
    aggregation=NMS(...),
    postprocessor=Postprocessor([]),
)
```
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
## Heuristics | ||
|
||
### How to build a model out of Heuristics | ||
|
||
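```python
from dataclasses import dataclass


class ClassHeuristic:
    """Placeholder base class; the real interface is still to be designed."""


@dataclass
class AluminumCan(ClassHeuristic):
    brightness: float
    colour_ranges: list  # element type was left open in the original notes
```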
|
||
``` | ||
@dataclass | ||
class AluminumCan(CompositeHeuristic): | ||
Brightness(): | ||
``` | ||
|
||
```python
aluminum_can = Heuristic([
    ShapeFilter(
        min_width=0,
        max_width=100,
        min_height=0,
        max_height=100,
    ),
    Brightness(min='', max='') | Edginess(min='', max=''),  # `|` means OR
    Cornerness(min='', max=''),
    Circleness(min='', max=''),
    Squareness(min='', max=''),
    Reflectivity(min='', max=''),
    Transparency(min='', max=''),
])
```
|
||
```python | ||
Redirection( | ||
trigger_class='car', | ||
heuristic=ShapeFilter(...), | ||
output_class='truck' | ||
) | ||
|
||
``` | ||
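
To make the intent of a redirection rule concrete, here is a minimal sketch: when a detection of the trigger class satisfies the heuristic, its label is rewritten to the output class. The `Detection` type, the `apply` method and the use of a plain callable as the heuristic are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    label: str
    width: float
    height: float


@dataclass
class Redirection:
    trigger_class: str
    heuristic: Callable[[Detection], bool]
    output_class: str

    def apply(self, detections: List[Detection]) -> List[Detection]:
        """Relabel trigger-class detections that satisfy the heuristic."""
        return [
            Detection(self.output_class, d.width, d.height)
            if d.label == self.trigger_class and self.heuristic(d)
            else d
            for d in detections
        ]
```

For example, `Redirection('car', lambda d: d.width > 300, 'truck')` would relabel unusually wide cars as trucks and leave every other detection untouched.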
|
||
```python
FalsePositive(
    trigger_class='car',
    heuristic=[],
)
```
|
||
It is not yet clear at what point this becomes prohibitively slow to compute.
|
||
Design options:
* Any list inside a list could be treated as an OR (I think this works?).
* Use the pipe for OR. Ideally nothing should be needed for AND, because
that should be the default. It would be quite nice to use something like an arrow-based system though.
* Or define your own combinators:
|
||
```python
Any(),
Or(),  # Or differs from Any: it is one OR the other, not both, so it is only defined for two options.
All(),
Not(),
```
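
One way the pipe-based option could be realised is with Python operator overloading, sketched below. In this sketch AND is spelled out with `&` for clarity, and `~` covers negation. The `Heuristic` wrapper here is a hypothetical stand-in that takes a predicate over a dictionary of measurements; it is not the list-based `Heuristic` shown earlier.

```python
from typing import Callable


class Heuristic:
    """Wraps a predicate so that rules compose with |, & and ~."""

    def __init__(self, predicate: Callable[[dict], bool]):
        self.predicate = predicate

    def __call__(self, obj: dict) -> bool:
        return self.predicate(obj)

    def __or__(self, other: "Heuristic") -> "Heuristic":
        return Heuristic(lambda obj: self(obj) or other(obj))

    def __and__(self, other: "Heuristic") -> "Heuristic":
        return Heuristic(lambda obj: self(obj) and other(obj))

    def __invert__(self) -> "Heuristic":
        return Heuristic(lambda obj: not self(obj))


# Example rule: (bright OR edgy) AND NOT wide.
bright = Heuristic(lambda obj: obj["brightness"] > 0.8)
edgy = Heuristic(lambda obj: obj["edginess"] > 0.5)
wide = Heuristic(lambda obj: obj["width"] > 100)
rule = (bright | edgy) & ~wide
```

Because `~` binds tighter than `&`, which binds tighter than `|`, the parentheses around `bright | edgy` are required.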
|
||
|
||
* Also need to find a way to do NotX. | ||
|
||
- Measures (https://kornia.readthedocs.io/en/latest/feature.html): | ||
- kornia.feature.gftt_response | ||
- Somehow include Hu Moments (how you'd recognise a circle or square) | ||
- HOG features and SIFT features
- Should be able to define regions of 'colour space' (as you do when creating a colour in Microsoft Paint or similar) https://theconversation.com/how-rainbow-colour-maps-can-distort-data-and-be-misleading-16715 (rainbow-color-map) | ||
- This is what HSV is useful for https://stackoverflow.com/questions/42882498/what-are-the-ranges-to-recognize-different-colors-in-rgb-space | ||
- Great post on measurement of water levels https://stackoverflow.com/questions/54950777/opencv-drawing-contours-with-various-methods-on-a-poor-image | ||
- https://stackoverflow.com/questions/51927948/how-can-i-extract-image-segment-with-specific-color-in-opencv | ||
- Really great tutorial which covers a lot of what you'd want to include https://learnopencv.com/blob-detection-using-opencv-python-c/ |
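
As a small illustration of defining a region of colour space in HSV, the sketch below tests whether an RGB pixel falls inside a hue range and computes the fraction of pixels that do. It uses only the standard-library `colorsys` module. The function names and the convention of hue in degrees are assumptions for this example.

```python
import colorsys
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]  # 8-bit channels


def in_hsv_range(
    rgb: RGB,
    hue_range: Tuple[float, float],
    min_saturation: float = 0.0,
    min_value: float = 0.0,
) -> bool:
    """True if the pixel's hue (in degrees) lies within `hue_range`."""
    r, g, b = (channel / 255 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue = h * 360
    low, high = hue_range
    if low <= high:
        hue_ok = low <= hue <= high
    else:  # the range wraps around 0 degrees, as it does for red
        hue_ok = hue >= low or hue <= high
    return hue_ok and s >= min_saturation and v >= min_value


def colour_fraction(pixels: Iterable[RGB], hue_range: Tuple[float, float], **kwargs) -> float:
    """Fraction of pixels that fall inside the given hue range."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(in_hsv_range(p, hue_range, **kwargs) for p in pixels) / len(pixels)
```

A composite rule such as "roughly X percent orange" then becomes a bound on `colour_fraction` over the pixels of a detection crop.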
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
## Vision Chain | ||
|
||
Constraining artificial stupidity. | ||
|
||
It's often surprisingly difficult to train out a stubborn false positive, and these false positives can look very stupid to clients. The rule being violated can be as simple as "an object of a given colour is NEVER some class" or "an object of a given size is NEVER another". Such mistakes should not be made, yet neural networks are not good at learning absolute, deterministic rules.
|
||
Good tooling for suggesting sensible heuristics to correct these false positives does not exist. This is one problem (among many others) that VisionChain aims to solve.
|
||
The same style of heuristic is also extremely helpful for labelling data and for online active learning in resource-constrained environments (other techniques include frame-to-frame jitter and threshold-based sampling).
|
||
Similarly, there may be a very simple condition for when you should not predict at all, such as excessive blur or insufficient exposure. Some business requirements demand that your product always predicts, but many others require high specificity: when your model predicts, it must predict well (potentially because a prediction triggers a high-cost automatic intervention).
|
||
A neural network based object detector can be used to improve the rules (via analysis of its predictions), while the rules can be used to improve the neural network (by increasing the size of the dataset). | ||
|
||
Most practitioners take inspiration from systems like Tesla's Data Engine, building ever larger datasets in order to teach neural networks simple rules, but most practical machine learning problems are not this open-ended. Most involve a set of fixed cameras in which object sizes are relatively consistent. Most deployments also aim to solve a business application that is not addressed by identifying objects alone; instead it is resolved by recognising an unexpected combination of items within a certain distance of each other, the presence of one object in the absence of another, and so on.
|
||
Most applications are composite problems: to build a complete product you must, for example, recognise a region of interest and then detect all objects within it, or flag the presence of one object in the absence of another.
|
||
The main downside of such approaches is the manual effort needed to develop sensible rules, but with well-designed software this need not be the case. Rules can be suggested by analysing a COCO dataset and then accepted or rejected by a developer.
|
||
With the recent development of 'foundation models', there will be huge growth in 'training-free' deployments, where a model (like Grounding DINO) is sufficiently accurate to deploy for a problem but requires some guard rails specific to the dataset at hand. Models like GPT required LangChain; models like Grounding DINO now need VisionChain. In an environment where GPU demand looks likely to continue to outstrip supply, such techniques will be needed to continue the democratization of deep learning applications.
|
||
Below is a glimpse at the API: | ||
|
||
```python | ||
preprocessor = Preprocessor([ | ||
NoPredictFilter(Blur(max_value=0.1)), | ||
NoPredictFilter(Exposure(max_value=0.7, min_value=0.2)), | ||
]) | ||
``` | ||
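
To illustrate what a no-predict filter might compute, the sketch below measures exposure as the mean brightness of an 8-bit greyscale image and skips prediction when it falls outside a range. Treating exposure as mean brightness, the nested-list image layout and both function names are assumptions for this example; they are not the definition used by `Exposure` above.

```python
from typing import List


def mean_brightness(gray: List[List[int]]) -> float:
    """Mean pixel intensity of an 8-bit greyscale image, scaled to [0, 1]."""
    total = sum(sum(row) for row in gray)
    count = sum(len(row) for row in gray)
    return total / (255 * count)


def should_predict(
    gray: List[List[int]], min_value: float = 0.2, max_value: float = 0.7
) -> bool:
    """Skip prediction on frames that are too dark or too bright."""
    return min_value <= mean_brightness(gray) <= max_value
```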
|
||
```python
postprocessor = Postprocessor([
    Thresholding(thresholds={'person': 0.5, 'car': 0.5, 'truck': 0.5, 'road': 0.5}),
    ClassAgnosticNMS(nms_threshold=0.8),
    # `class` is a reserved word in Python, so the keyword is spelled `class_name` here.
    ShapeFilter(min_width=400, min_height=400, class_name='car'),
    ColourFilter(central_colour='XXX', range='XXX', class_name='car'),
    OnlyPredictInsideRegionFilter(region_defining_classes=['road']),
])
```
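
A postprocessor of this kind can be thought of as a chain of steps, each taking a list of detections and returning a filtered list. The sketch below shows that idea with two steps. The dictionary layout of a detection and the names `thresholding`, `shape_filter` and `run_chain` are assumptions for illustration rather than the real implementation.

```python
from typing import Any, Callable, Dict, List

Detection = Dict[str, Any]  # keys: 'label', 'score', 'bbox' as (x1, y1, x2, y2)
Step = Callable[[List[Detection]], List[Detection]]


def thresholding(thresholds: Dict[str, float]) -> Step:
    """Drop detections scoring below their class threshold."""
    def step(detections: List[Detection]) -> List[Detection]:
        return [d for d in detections if d["score"] >= thresholds.get(d["label"], 0.0)]
    return step


def shape_filter(class_name: str, min_width: float, min_height: float) -> Step:
    """Drop detections of `class_name` that are smaller than the minimum size."""
    def step(detections: List[Detection]) -> List[Detection]:
        kept = []
        for d in detections:
            x1, y1, x2, y2 = d["bbox"]
            too_small = (x2 - x1) < min_width or (y2 - y1) < min_height
            if d["label"] == class_name and too_small:
                continue
            kept.append(d)
        return kept
    return step


def run_chain(steps: List[Step], detections: List[Detection]) -> List[Detection]:
    """Apply each step in order to the surviving detections."""
    for step in steps:
        detections = step(detections)
    return detections
```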
|
||
```python | ||
model = Model( | ||
preprocessor=preprocessor, | ||
model_path='model.onnx', | ||
postprocessor=postprocessor, | ||
class_list=class_list, | ||
) | ||
``` | ||
|
||
The demo would show this tooling being used to apply foundation models to a non-COCO dataset and business problem in real time.
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
site_name: "Vision Chain"
repo_url: https://github.com/BinItAI/VisionChain

theme:
  name: "material"
  palette:
    - scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - scheme: slate
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-4
        name: Switch to light mode

nav:
  - Dataset Splitting: 'index.md'
  - Dataset Analysis:
      - Heuristics: 'analysis/discovery.md'
      - Heuristics: 'analysis/heuristics.md'
  - Utils: 'index.md'
  - Evaluation: 'index.md'
  - Ensembles:
      - Ensembles: 'ensembles/ensembles.md'
      - Aggregations: 'ensembles/aggregations.md'
  - Active Learning: 'active-learning/active-learning.md'
  - Heuristics: 'heuristics/heuristics.md'
  - Thresholding: 'thresholding/thresholding.md'
  - Common Business Problems: 'problems/problems.md'
  - Control Room: 'control_room.md'

plugins:
  - mkdocstrings:
      enabled: !ENV [ENABLE_MKDOCSTRINGS, true]
      default_handler: python
      handlers:
        python:
          options:
            show_source: true
  - search:
      lang: en

markdown_extensions:
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          format: !!python/name:pymdownx.superfences.fence_code_format
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
## Problems | ||
|
||
Many computer vision applications require a combination of models, or creative use of the outputs of one. | ||
|
||
1. License plate reading (object detection to find the plate, OCR to convert it to text).
2. Count objects within a moving region. |
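
The second problem can be sketched as follows: for each frame, a region box (for example, one predicted by a region-detecting model) is paired with the object boxes, and an object is counted when its centre falls inside the region. The box format and function names are assumptions for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def centre(box: Box) -> Tuple[float, float]:
    """Centre point of an axis-aligned box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2


def count_in_region(objects: Sequence[Box], region: Box) -> int:
    """Count objects whose centre lies inside the region box."""
    rx1, ry1, rx2, ry2 = region
    count = 0
    for box in objects:
        cx, cy = centre(box)
        if rx1 <= cx <= rx2 and ry1 <= cy <= ry2:
            count += 1
    return count


def count_per_frame(frames: Sequence[Tuple[Box, Sequence[Box]]]) -> List[int]:
    """Each frame carries its own region, so the region is free to move."""
    return [count_in_region(objects, region) for region, objects in frames]
```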
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
mkdocs serve -a 0.0.0.0:8000 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
## Thresholding | ||
|
||
A simple setup for tuning your model's thresholds to optimize a predefined metric on a validation set.
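
A minimal sketch of such a setup is given below, assuming the metric is F1 and that each validation detection has already been marked as correct or incorrect against the ground truth. Both function names are hypothetical, and a real setup would also need to account for ground-truth objects that received no detection at all.

```python
from typing import List, Optional, Sequence


def f1_at(scores: Sequence[float], correct: Sequence[bool], threshold: float) -> float:
    """F1 score when only detections scoring >= threshold are kept."""
    tp = sum(1 for s, c in zip(scores, correct) if s >= threshold and c)
    fp = sum(1 for s, c in zip(scores, correct) if s >= threshold and not c)
    fn = sum(1 for s, c in zip(scores, correct) if s < threshold and c)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_threshold(
    scores: Sequence[float],
    correct: Sequence[bool],
    candidates: Optional[List[float]] = None,
) -> float:
    """Sweep candidate thresholds and return the one with the highest F1."""
    if candidates is None:
        candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(scores, correct, t))
```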