Adding various patterns to training data #222
Hi,
I've been working with the usaddress library and have added some patterns that I have seen fail in my datasets. This commit includes the XML files for the training set (training/dealstat_addresses_v1.xml) and the test set (measure_performance/test_data/dealstat_tests_v1.xml). The CSV files were excluded by the .gitignore file; I'm not sure whether you require them.
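For reviewers who want to spot-check the labeled data, here is a minimal sketch of how such a file could be read with the standard library. It assumes the XML follows the usual usaddress layout (an `AddressCollection` root containing `AddressString` elements whose child tags are the token labels); the inline sample and the helper name `read_labeled_addresses` are illustrative only and are not taken from the files in this PR.

```python
import xml.etree.ElementTree as ET

# Illustrative sample, NOT taken from dealstat_addresses_v1.xml.
# Assumed layout: <AddressCollection> / <AddressString> / one child tag per token.
SAMPLE = """<AddressCollection>
  <AddressString><AddressNumber>123</AddressNumber> <StreetName>Main</StreetName> <StreetNamePostType>St</StreetNamePostType></AddressString>
  <AddressString><AddressNumber>45</AddressNumber> <StreetName>Oak</StreetName> <StreetNamePostType>Ave</StreetNamePostType> <OccupancyType>Suite</OccupancyType> <OccupancyIdentifier>200</OccupancyIdentifier></AddressString>
</AddressCollection>"""


def read_labeled_addresses(xml_text):
    """Return one list of (token, label) pairs per AddressString element."""
    root = ET.fromstring(xml_text)
    addresses = []
    for address in root.iter("AddressString"):
        # Each child element's tag is the label; its text is the token.
        addresses.append([(child.text or "", child.tag) for child in address])
    return addresses


labeled = read_labeled_addresses(SAMPLE)
for tokens in labeled:
    print(tokens)
```

To inspect a real file, the same function could be called with the file's contents, for example `read_labeled_addresses(open(path).read())`, where `path` points at one of the XML files above.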
Patterns
Both the nose tests and my tests are passing. Let me know how else I can be of assistance. I'm hoping to continue to add new patterns and make pull requests as I work through my datasets.