Number of images for retraining #1

Open
t-pfaff opened this issue Nov 11, 2016 · 6 comments

Comments

t-pfaff commented Nov 11, 2016

@jnaulty Great project! One question: you only used the images in tf_files/data to retrain the net? Not more?

Owner

jnaulty commented Nov 11, 2016

@t-pfaff
Great question/point. I actually had a friend scrape Google Images manually for this. I'm certainly open to PRs or better ways to pull images of cars and metermaids.
Another great improvement would be to add the images from S3 that were confirmed to be metermaids to the data pool,
as well as the false positives (cars that trigger the text message but aren't actually metermaids).
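
A minimal sketch of how confirmed detections and false positives could be folded back into the training pool. It assumes the retraining data lives in one folder per class under `tf_files/data` (the path mentioned at the top of this thread). The class folder names `metermaid` and `car` and the helper `add_to_pool` are hypothetical, not taken from the repo, and fetching the images from S3 is left out.

```python
import shutil
from pathlib import Path

# Hypothetical class folder names; the repo's actual layout may differ.
CONFIRMED_LABEL = "metermaid"
FALSE_POSITIVE_LABEL = "car"


def add_to_pool(image_path, confirmed, data_dir="tf_files/data"):
    """Copy a reviewed image into the class folder matching its verdict.

    confirmed=True  -> the image really showed a metermaid
    confirmed=False -> false positive (a car that triggered the alert)
    Returns the destination path.
    """
    label = CONFIRMED_LABEL if confirmed else FALSE_POSITIVE_LABEL
    dest_dir = Path(data_dir) / label
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(image_path).name
    shutil.copy2(image_path, dest)
    return dest
```

Copying rather than moving keeps the original file in place, so a wrong verdict can be corrected later without losing the image.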

This project actually needs a lot of cleaning up. I need to refactor the entire project so it's easier to:

  1. understand the flow (documentation might help, but I find just having clean, organized code is always a good start)
  2. package and distribute
  3. contribute

As always, it's GitHub, so feel free to fork, improve, and share. This was less than a weekend project that I did at a hackathon. That being said, I will refactor it so it's a little bit more respectable and versatile. (Cars v. metermaids was just a simple demo example that was easy to implement and had an immediate, 'graspable' utility, especially for those who don't code.)

Author

t-pfaff commented Nov 11, 2016

@jnaulty
OK, cool. So there are more images on S3 than here in the repo? If so, how many more? I'm trying to get a feeling for the number of images needed for such a project. We'd like to do a similar hackathon project as part of our Data Science meetup over here in Germany sometime in 2017.

Owner

jnaulty commented Nov 15, 2016

@t-pfaff
There are more images on S3, but I'm not rebuilding the model with them...yet.
I guess I could automate the rebuilding of the model. That wouldn't be too hard with the Twilio API: a text reply would confirm or deny that the object was identified correctly, and the model would then be rebuilt based on that info.
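
A rough sketch of the confirm/deny step, under stated assumptions: it only covers interpreting the text of a reply and deciding when to rebuild, and makes no Twilio calls (receiving the message is left out). The accepted words, the threshold of 20, and both function names are hypothetical.

```python
# Hypothetical vocabularies for a reply to the alert text.
YES_WORDS = {"y", "yes", "correct", "confirm"}
NO_WORDS = {"n", "no", "wrong", "deny"}


def parse_reply(body):
    """Map the text of a reply to True (confirmed), False (denied),
    or None when the reply is not understood."""
    word = body.strip().lower().rstrip(".!")
    if word in YES_WORDS:
        return True
    if word in NO_WORDS:
        return False
    return None


def should_rebuild(new_labels, threshold=20):
    """Rebuild only once enough newly labelled images have accumulated,
    so the model is not retrained after every single reply."""
    return len(new_labels) >= threshold
```

Batching the rebuild behind a threshold is a design choice for the sketch, since retraining per reply would be slow on small hardware.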

As for current images, I have only a few dozen. It's much easier to do this when you are within range of free wifi.

I still need to work on doing the classifying on the rpi side.

craigm26 commented Feb 21, 2017

I love this repo. Gave me confidence that I could utilize TensorFlow for a project.

I recommend this set of images: http://image-net.org/api/text/imagenet.synset.geturls?wnid=n02960352
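
A small sketch of preparing such a URL list for scraping, assuming the endpoint returns plain text with one image URL per line. It only cleans the list and picks local file names; the download itself is left out, and `parse_url_list` and `local_name` are hypothetical helpers.

```python
import os
from urllib.parse import urlparse


def parse_url_list(text):
    """Return the unique http(s) URLs from a newline-separated list,
    keeping their original order and skipping blank or malformed lines."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue
        parsed = urlparse(url)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            seen.add(url)
            urls.append(url)
    return urls


def local_name(index, url):
    """Build a zero-padded file name for the index-th URL, falling back
    to .jpg when the URL has no recognised image extension."""
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext not in (".jpg", ".jpeg", ".png"):
        ext = ".jpg"
    return f"{index:06d}{ext}"
```

Naming files by index rather than by the remote file name avoids collisions when two URLs end in the same name.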

I have some others if you're interested. I've been trying to scrape all of those for a training set. Still working on that part, but I have been following the below link to classify a set of images for different classes.

Been using this post as a reference point: Medium Post

Being able to validate in "real time" would be amazing (e.g. a top-5 result is shown as a bounding box in the frame, and you tap it to mark it as not the object and retrain).
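
As a sketch of the "top 5" part of that idea, assuming the classifier's output is available as a plain mapping from label to score (bounding boxes and the tap-to-retrain UI are out of scope here; both function names are hypothetical):

```python
def top_k(scores, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def in_top_k(scores, target, k=5):
    """True when the target label is among the k best-scoring labels."""
    return any(label == target for label, _ in top_k(scores, k))
```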

Owner

jnaulty commented Jun 8, 2017

@craigm26 Thanks for the resources!
I plan on doing a development sprint on this in July. If anyone is interested, it will run from July 10th to July 19th.
I really like the idea of streaming classification. I will take a look into that.

craigm26 commented Jun 8, 2017

@jnaulty - I'm free at the moment - let me know details of the July 10th-19th activities. I'm curious to see if I could participate.

I have a few ideas I'm touching on constantly.
