Training numbers #2
Comments
You should be able to generate numbers like this: python generate.py --text="1 2 3 4 5 " --noinfo --bias=4, although the quality will probably be quite bad (too few examples in the dataset). You can add your own examples in Alternatively, if you have data with consecutive points representing how to draw numbers (with labels), you could create your own dataset. So depending on the format of your dataset, it might be easier or harder. :) |
I'm really new to this so I'm not sure how to go about creating a dataset. Do you have any articles or direction you can point me to? |
Sorry for the delay. I get the feeling you have no data, which is problematic. Could you please elaborate a little bit more on what you are trying to achieve? :) |
It's no problem, thank you for taking the time to even discuss this with me.
I found a dataset of handwritten numbers; however, it isn't set up like the current IAM dataset, which uses XML files. What I'm trying to accomplish is to use the handwriting generation, but it also has to include numbers, and currently the numbers do not come out well.
|
Ok, is this dataset publicly available? I can look into it to see if there is a way to make it compatible with my code. :) |
Awesome! Here goes:
http://yann.lecun.com/exdb/mnist/
http://archive.ics.uci.edu/ml/machine-learning-databases/semeion/
I found these two
|
Unfortunately, those datasets represent numbers as images. For handwriting generation you would need a list of consecutive points showing how each digit is written. So those datasets cannot be used here. |
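To illustrate the difference between image data and stroke data, here is a minimal sketch of what "a list of consecutive points" could look like and how it might be turned into pen offsets. The `(dx, dy, end_of_stroke)` layout is an assumption based on common handwriting-synthesis setups, not necessarily the exact format this repository expects, and `strokes_to_offsets` and the sample digit are hypothetical.

```python
def strokes_to_offsets(strokes):
    """Convert strokes (lists of absolute (x, y) pen positions) into
    (dx, dy, end_of_stroke) triples. The flag is 1 on the last point
    of each stroke, i.e. where the pen is lifted."""
    offsets = []
    prev_x, prev_y = 0.0, 0.0  # assumed starting pen position
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            end = 1 if i == len(stroke) - 1 else 0
            offsets.append((x - prev_x, y - prev_y, end))
            prev_x, prev_y = x, y
    return offsets


# Hypothetical digit "7": a top bar plus diagonal, then a short cross stroke.
seven = [[(0, 10), (6, 10), (2, 0)], [(1, 5), (5, 5)]]
offsets = strokes_to_offsets(seven)
```

An image dataset such as MNIST only stores the final pixels, so the point order and pen lifts used above cannot be read from it directly.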
Would this one work? This has the stroke data:
https://github.com/edwin-de-jong/mnist-digits-stroke-sequence-data/wiki/MNIST-digits-stroke-sequence-data
|
This one might work. :) Can you give some examples of the sequences you want to generate? I just want to figure out what kind of dataset augmentation might be needed. |
Random sequences of about 5 digits. For example:
11445
8013
1507 etc.
|
Sorry for the very late response. I tried this dataset and unfortunately it doesn't work well :/ The results are even worse than with the original IAM dataset. If by any chance I find a better dataset for this task, I will post it here. |
THANK YOU!!!!
|
Well, it's been a while, but I was kind of interested in this problem and created an MNIST handwriting dataset. If you still need to generate numbers, you may find it useful. One simple solution is to just pick the needed digits from this dataset and concatenate them together. :) |
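The "pick digits and concatenate" idea could be sketched roughly as follows. This assumes each digit is available as a list of strokes with absolute (x, y) points; the dataset's real file format is not described in this thread, so the toy digits, the `gap` parameter, and `concatenate_digits` are all hypothetical.

```python
def concatenate_digits(digit_strokes, gap=2.0):
    """Place digits side by side, left to right.

    digit_strokes: list of digits; each digit is a list of strokes;
    each stroke is a list of absolute (x, y) points.
    Returns one flat list of strokes for the whole sequence.
    """
    result = []
    cursor = 0.0  # x position where the next digit should start
    for strokes in digit_strokes:
        xs = [x for stroke in strokes for x, _ in stroke]
        min_x, max_x = min(xs), max(xs)
        shift = cursor - min_x  # move the digit's left edge to the cursor
        result.extend([[(x + shift, y) for x, y in stroke] for stroke in strokes])
        cursor += (max_x - min_x) + gap
    return result


# Toy stand-ins for samples picked from a stroke dataset.
one = [[(3, 0), (3, 10)]]
zero = [[(0, 0), (4, 0), (4, 10), (0, 10), (0, 0)]]
sequence = concatenate_digits([one, zero, one])  # strokes for "101"
```

Picking a different random sample for each repeated digit would likely make sequences such as 11445 look less mechanical.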
@Grzego THANK YOU! |
This is probably outside the scope of the "issues", but I figured I'd ask.
I notice it doesn't take numbers. Is there a way to add numbers to the XML datasets so it can also do numbers?