Bootstrap version: Adding more flexibility to the DSL vocabulary #26
Comments
Great insight, Paul. I'd say it's one of the most important research areas for advancing this field. I'd love to see a paper or blog article exploring this topic in depth, outlining the key constraints for the generator and key features for the tech, e.g. scalability, modularity, and capabilities. I haven't thought this through enough to add value to the discussion.
I've written a compiler that solves this (I'll paste the code below). Basically, the compiler takes tokens from the GUI file and appends them to the web-dsl-mapping.json file using a friendly naming convention, so that a GUI of:
becomes:
The friendly DSL naming convention takes the character "." (which denotes classes) and turns it into "__", and takes "+" (which denotes additional classes) and turns it into "_". This provides much more flexibility in the DSL mappings. I have not tested it on FloydHub yet, but I will in a couple of days and then open a pull request.
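The naming convention described above can be sketched in a few lines. This is only an illustration, not the compiler mentioned in the comment: the function names and the assumption that GUI tokens are separated by whitespace, commas, and braces are hypothetical.

```python
import re


def friendly_name(token):
    """Apply the convention from the comment: "." becomes "__", "+" becomes "_"."""
    return token.replace(".", "__").replace("+", "_")


def friendly_names(gui_text):
    """Map every token found in a GUI file's text to its friendly name.

    Assumption: tokens are separated by whitespace, commas, and braces.
    """
    tokens = re.findall(r"[^\s,{}]+", gui_text)
    return {token: friendly_name(token) for token in tokens}


# Hypothetical GUI snippet using emmet-style tokens.
gui = "row {\n  quadruple.border+border-primary {\n    btn.btn-primary\n  }\n}"
names = friendly_names(gui)
```

Here `names["quadruple.border+border-primary"]` is `"quadruple__border_border-primary"`, and a token without classes such as `row` maps to itself.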
Excellent, I'm looking forward to seeing how it performs!
So I've been testing this on my larger vocabulary of 270 tokens using an updated compiler, and it seems the network doesn't perform all that well, bummer. I suspect it has to do with the one-hot encoding, which, as the paper says, does not scale well to a large vocabulary and thus restricts the number of tokens in the DSL. I'll also adjust the sliding window size (T = 48) and analyse the different outcomes.
Interesting find, Paul. What's the BLEU score (four n-gram, greedy search)? Here are some ideas off the top of my head:
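To make the scaling concern concrete, here is a minimal sketch of how a sliding window with one-hot encoding turns a token sequence into training samples. It assumes a pix2code-style setup in which the context is left-padded; the `<pad>` token and both function names are hypothetical and not taken from the repository.

```python
def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def sliding_windows(tokens, vocab, window=48, pad="<pad>"):
    """Build (context, target) pairs: `window` previous tokens predict the next one.

    Assumption: the sequence is left-padded with `pad`, which must be in `vocab`.
    """
    index = {tok: i for i, tok in enumerate(vocab)}
    padded = [pad] * window + list(tokens)
    samples = []
    for i in range(len(tokens)):
        context = padded[i:i + window]
        target = padded[i + window]
        samples.append((
            [one_hot(index[t], len(vocab)) for t in context],
            one_hot(index[target], len(vocab)),
        ))
    return samples


# Tiny example with a 4-token vocabulary and a window of 2.
vocab = ["<pad>", "row", "{", "}"]
samples = sliding_windows(["row", "{", "}"], vocab, window=2)
```

With T = 48 and a 270-token vocabulary, the context of each sample alone holds 48 × 270 = 12,960 values, nearly all of them zero, so both memory use and input width grow linearly with the vocabulary.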
Keep us in the loop!
So I'll be using MTurk to get a much larger dataset. My custom dataset performs reasonably well on my custom vocab, from a couple hundred real-world web screenshots. My BLEU score was quite low but the markup was coherent; clumsy, but coherent. It needed more epochs, I think. I haven't tweaked the network much, just a few parameters here and there, but I know more data is key. With MTurk I should have a couple thousand GUIs and be able to really test on real-world scenarios.
@PaulGwamanda Awesome, let me know how it goes. I'm curious why you didn't further develop a generator with an extended DSL and screenshots, or scrape existing websites and clean them?
The generator solves the markup problem (grids, buttons, cols, etc.), but it doesn't solve real-world examples like more complex layouts, fonts, colors, OCR, animations, etc. Using Bootstrap's documentation and web templates from around the web, we can steadily extend the DSL to include more complex tokens. What I found is that once the DSL is mapped out, labelling a GUI takes roughly 15-20 minutes, at 120 tokens on average. This example here takes 15 minutes if you're just labelling the layout markup features (grids, cols, buttons). I want to solve the layout problem first and then, as you suggest, move on to the more complex problems. Once I have a few thousand GUIs like this, I'll use GANs to further multiply them.
I'll be pushing my fork this week here: https://github.com/PaulGwamanda/Pix2code-Screenshot-to-code-dataset-builder. It will include a custom dataset (100 images), a dataset builder, training scripts, a Flask API, and a complete DSL library based on Bootstrap V4. My current dataset is 2500 images (over 286,000 training samples). If anyone is interested in the dataset, email me at [email protected] for a reasonable price :)
@PaulGwamanda sending you an email now
Sure, and the example dataset is here: You can modify the DSL to fit whatever data you need.
The DSL is simple and straightforward: one token per matching HTML snippet. However, in a real-world scenario many classes would overlap and intermix with each other, e.g.:
<div class="col-md-3">{}</div>
could be:
<div class="col-md-3 bg-primary">{}</div>
or
<div class="col-md-3 border border-primary">{}</div>
It seems the structure of the DSL requires you to have a very large vocabulary with thousands of tokens but it would still not solve the flexibility problem. How would you approach solving this problem?
Is an emmet style vocabulary like the below possible?
{
"quadruple.border+border-primary": "<div class=\"col-lg-3 border border-primary \">\n{}\n</div>\n"
}
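One way such an emmet-style vocabulary could work without enumerating every class combination is to keep one mapping entry per structural token and inject the extra classes at compile time. The sketch below is an assumption-laden illustration, not code from this project: `BASE_MAPPING`, `parse_token`, and `expand_token` are hypothetical names, and it assumes the base snippet contains a `class` attribute.

```python
import re

# Hypothetical base mapping: one entry per structural token, no styling variants.
BASE_MAPPING = {
    "quadruple": '<div class="col-lg-3">\n{}\n</div>\n',
}


def parse_token(token):
    """Split "base.class1+class2" into ("base", ["class1", "class2"])."""
    base, _, extras = token.partition(".")
    classes = [c for c in extras.split("+") if c]
    return base, classes


def expand_token(token, base_mapping=BASE_MAPPING):
    """Return the base HTML snippet with the extra classes appended."""
    base, classes = parse_token(token)
    html = base_mapping[base]
    if not classes:
        return html
    # Append the extra classes to the first class attribute in the snippet.
    return re.sub(
        r'class="([^"]*)"',
        lambda m: 'class="{}"'.format(" ".join([m.group(1)] + classes)),
        html,
        count=1,
    )


snippet = expand_token("quadruple.border+border-primary")
```

Here `snippet` is the `col-lg-3` div with `border border-primary` appended to its class list. This keeps the mapping file small, but the network still has to predict each composite token, so the vocabulary seen by the model grows with the number of distinct combinations in the dataset.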