Some Neural Network suggestions #14
Ok, one more thing: does test_image = Image.open(aktfile) return values between 0 and 1? I am not sure. If not, scale your image data!
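For what it's worth, PIL's Image.open yields 8-bit pixel values in [0, 255], so scaling is indeed needed if the network expects [0, 1]. A minimal sketch (the function name is mine, not from the project):

```python
import numpy as np

def scale_to_unit(img):
    """Scale 8-bit pixel data from [0, 255] down to float32 [0, 1].

    `img` is what np.asarray(Image.open(aktfile)) would yield:
    uint8 values in [0, 255], not [0, 1].
    """
    return np.asarray(img, dtype=np.float32) / 255.0
```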
Great ideas, thanks a lot! I will keep them in mind and test / implement them if I have time or need to change something anyway. Going to black and white would break compatibility, but this should not be a real issue. At the moment I'm working on hardware improvements (external LEDs to avoid reflections). The ReLU for the inner layer got lost somehow during the restructuring of my Python files --> will fix this with the next update.
Hi, you don't need to break compatibility with your existing pipeline. Just add this layer: https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_grayscale I speak from my own experience. I had a look at your training data guideline.
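As a sketch of what that conversion computes: tf.image.rgb_to_grayscale applies the ITU-R BT.601 luma weights, and wrapping it in tf.keras.layers.Lambda at the front of the model would keep the pipeline accepting RGB while the rest of the network sees one channel. A NumPy equivalent of the conversion itself (my own sketch, not the project's code):

```python
import numpy as np

# tf.image.rgb_to_grayscale uses the ITU-R BT.601 luma weights.
# In a Keras model the conversion could sit at the front, e.g.
#   tf.keras.layers.Lambda(tf.image.rgb_to_grayscale)
# so existing RGB inputs keep working while the net sees 1 channel.
LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)

def rgb_to_gray(rgb):
    """Collapse an (..., 3) RGB array into an (..., 1) grayscale array."""
    return (rgb @ LUMA_WEIGHTS)[..., np.newaxis]
```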
The thing is, if you leave out these images, your network will never be robust against real-world perturbations like bad lighting, reflections, and so on. You need ALL real-world examples to get a good model. These days I had a look at my esp32-cam meter and noticed the classification is quite unreliable, even though the images look identical to me from one shot to the next.
As I said, training accuracy does not matter at all. By adding all these non-perfect images, the accuracy during training might be lower than you are used to. What counts is real-world accuracy AND robustness! Sorry for all the comments and suggestions without any tangible contribution. I hope to actually be able to do a lot of this stuff at the end of the year and open some pull requests, because I think this project is really cool. I'd really like to support it in the future.
Thanks for your further input; I'll take it into consideration at the next major update. Currently my focus is on some other features. Regarding the idea of the conversion via the TensorFlow function: this does not work, as it is not a layer but a separate function. It is also not supported in tflite, which I need to use for the ESP32.
@jomjol Here is a sneak peek: I changed your dataset:
I trained the tflite image reference classifier just out of curiosity. The model is way too big, and its performance was also not so nice. I trained 2 simple classifiers: an RGB and a gray version, both accepting RGB images as input. I would definitely put more time into this if you are interested in working together on it.
Hello - we can happily continue in German - it's faster for me :-). Thanks for your efforts, there are some cool ideas in there. I had already tried your idea of grayscale instead of RGB. In my network architecture, though, it only yields a minimally smaller network. The advantage could at most be on the ESP32, since the images would then only be 1/3 the size. I find the idea of replacing NaN with the softmax probabilities intriguing, effectively filtering out poorly recognized digits. However, I'm currently working on another improvement for the digits: detecting the decimal place from the position of the digit within the ROI. That would also solve the NaN problem, and you would gain another decimal place for accuracy or plausibility checks, comparable to the analog meters. Of course I'm not doing that with 100 classes (0.0, 0.1, ..., 9.9), but with 20 classes (10 each for the integer and the decimal part). The training data etc. are of course more work, but almost done. Maybe we can work on that together? Unfortunately I also have more ideas than time, and right now I'm writing a series of articles about the current approach for ct-Make - the first part comes out in early December. That ties up my capacity (it's "only a hobby" anyway).
Hi, honestly I don't see any benefit in decimal places for that much effort. The meters practically dictate the precision. My power meter, for example, has 3 decimal places that I don't need anyway, maybe the first one. My assessment is that the project needs more robustness. It is simply annoying for the user when it constantly says NaN, or a digit was supposedly recognized but it's the wrong one, the number is now higher, and the plausibility checks fail, ... There are just so many disturbances: image quality, light, noise, alignment, ... OK, I'm not 100% sure, I'd rather just take another picture right away ... Grayscale. Do you have any systematic approach to collecting data, or do you only get it on request?
Decimal places: The decimal places make a massive difference. I already have it running on my setup. The difference comes from the higher resolution. Grayscale: My first experiments with grayscale showed clearly worse network performance, so I stopped pursuing it for the time being. Collecting images: Currently I do that on request. But I'm well supplied :-)
Hi,
awesome project. Unfortunately the esp32-cam is such an unstable device.
I have some suggestions regarding your AI pipeline.
Rely on monochrome images only. There is really no need for 3 channels here. Rather spend the memory saved by having 1 channel on a more complex neural network.
Use non-linearities like relu after each CNN layer. You have 2 linear CNN layers, which together can only learn linear relations.
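The point about linear layers can be shown in a few lines: without a non-linearity between them, two stacked linear maps collapse into a single linear map, so the second layer adds no expressiveness. A toy NumPy demonstration (in Keras this would mean something like Conv2D(..., activation="relu")):

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

W1 = np.array([[1.0, -2.0], [0.5, 1.0]])
W2 = np.array([[2.0, 0.0], [1.0, 1.0]])
x = np.array([1.0, 3.0])

stacked   = W2 @ (W1 @ x)      # two linear layers ...
collapsed = (W2 @ W1) @ x      # ... equal exactly one linear layer
nonlinear = W2 @ relu(W1 @ x)  # a relu in between breaks the collapse
```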
It is tempting to use your whole dataset for training, but that is really not a good idea. You always have to remember that a good value during training tells you nothing. There are even papers suggesting way larger test sets than training sets, e.g. for error calculation including standard deviation. If I remember correctly, you have ~1500 images, which is quite OK, because augmentation can do a lot for these simple images.
I suggest using 1000 for training and 500 for testing. If you compare multiple NN models, you also need a third validation set.
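A minimal sketch of such a split for ~1500 images (the numbers come from the comment above; the function name and fixed seed are my own choices):

```python
import numpy as np

def split_dataset(images, labels, n_train=1000, seed=0):
    """Shuffle once with a fixed seed, then cut into disjoint
    train and test sets so no image appears in both."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return ((images[train_idx], labels[train_idx]),
            (images[test_idx], labels[test_idx]))
```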
Stop your training when the test error starts to increase again. Or validate the intermediate model states afterwards against the complete test set and choose then.
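In Keras this is roughly what tf.keras.callbacks.EarlyStopping with restore_best_weights=True does; the underlying rule can be sketched in plain Python (the patience value is an arbitrary choice of mine):

```python
def best_epoch_with_early_stopping(val_errors, patience=3):
    """Pick the epoch whose weights to keep: the one with the lowest
    validation error, stopping the scan once the error has not
    improved for `patience` consecutive epochs."""
    best_err, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch
```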
The last layer before the dense layer is (4, 2, 32), which might be a little tight. Maybe use one fewer maxpool.
Try elu as the non-linearity. It is more complex to compute than relu on the ESP32, but might be feasible if only monochrome data is handled. Elu does not suffer from vanishing gradients as much as relu does. It might not matter for such a small network, but I usually got better performance with elu.
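For reference, the two activations side by side; elu keeps a non-zero gradient for negative inputs, which is the property alluded to above (a NumPy sketch):

```python
import numpy as np

def relu(x):
    """Hard cutoff at zero: gradient is exactly 0 for negative inputs."""
    return np.maximum(x, 0.0)

def elu(x, alpha=1.0):
    """Identity for x > 0; smooth saturation toward -alpha for x < 0,
    so negative units keep a non-zero gradient, unlike relu."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
```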
You might google the best MNIST models to get inspiration for your network, as the task is very similar, just handwritten digits. They also use only monochrome images. This might also help streamline your code and increase SW quality.
Try the ADAM optimizer. It's my preference; usually quite robust and fast.
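In Keras this is just model.compile(optimizer="adam", ...). For intuition, a single Adam update step looks roughly like this (a NumPy sketch using the standard default hyperparameters):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and its square (v), bias-corrected, give a per-parameter step size."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction; t starts at 1
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

Note that the very first step moves each weight by roughly lr regardless of the gradient's magnitude, which is part of why Adam tends to be robust to gradient scaling.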
You could think about using the softmax array as probabilities for your digits. Sometimes the second-best guess is the correct one. Maybe you can correct a number (e.g. because it was lower than the previous reading) by replacing digits that have high second-guess values.
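A sketch of reading the top two candidates out of one 10-way softmax output (the function name is mine; a real correction rule would still need the plausibility logic described above):

```python
import numpy as np

def top_two(softmax_probs):
    """Return ((best_digit, p_best), (second_digit, p_second))
    from one softmax output over the 10 digit classes."""
    order = np.argsort(softmax_probs)[::-1]  # indices, highest prob first
    return ((int(order[0]), float(softmax_probs[order[0]])),
            (int(order[1]), float(softmax_probs[order[1]])))
```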
Again, awesome project. If I find time at the end of the year, I will try to train my own variant and give you feedback.
Thanks