Does RubixML have object detection? #141
I've been reading up on RubixML (and machine learning in general), and I understand that RubixML supports image recognition (it can tell you what is in an image). I've set up and run the CIFAR-10 Image Recognizer project, which works well. I also love the code quality - the developers are doing an excellent job. However, from what I can see, it doesn't appear to be able to tell you where in the image the object is. Ideally, I want to train a model to recognise objects in photos, and then store the object data for each image. Is this currently possible, and if not - is it something that's being worked on?
Replies: 1 comment 3 replies
Hey @karlbuckland, great question!
Currently, we're limited to deep feed-forward networks when it comes to computer vision tasks such as object recognition, detection, segmentation, etc. We do not yet have access to a fast n-dimensional convolve API, which would allow us to implement convolutional neural networks. Those are better suited to computer vision tasks because they work with the spatial information contained within the image. So the CIFAR-10 example is about the best we can do for the time being. However, we are working on an interface to an extension called C Array that has a number of goodies, one of which is support for n-d array operations and, eventually, the convolve API. More on this in #140.
To your question about object detection, the way I understand it is that we need to predict 3 things: an object of interest (the class) and 2 sets of numbers, one for the coordinates of one corner of the bounding box surrounding the object (the upper left, for example) and one for the opposite corner (the lower right). If that's indeed the case, then we cannot do object detection (yet) in Rubix because we only support a single target variable. That being said, it's somewhere on our roadmap after C Array integration and convnets.
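To make the "3 things" concrete, here is a minimal sketch in Python of what a single detection target might look like. This is not Rubix ML code: the `Detection` class, the `to_targets` helper, and the pixel coordinates are hypothetical names and values chosen for illustration only. It assumes the box is described by two opposite corners.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Detection:
    """One labelled object in an image (hypothetical structure, not a Rubix ML type)."""
    label: str                         # the class of the object of interest
    top_left: Tuple[float, float]      # (x, y) of one corner of the bounding box
    bottom_right: Tuple[float, float]  # (x, y) of the opposite corner


def to_targets(d: Detection) -> List[Union[str, float]]:
    """Flatten a detection into the values a model would have to predict."""
    (x1, y1), (x2, y2) = d.top_left, d.bottom_right
    return [d.label, x1, y1, x2, y2]


# A classifier predicts only the first value; a detector needs all five.
example = Detection(label="cat", top_left=(10.0, 20.0), bottom_right=(110.0, 220.0))
targets = to_targets(example)
```

A plain classifier only has to output `targets[0]`, whereas a detector has to output the class and the four coordinates together, which is why a single target variable is not enough.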