Upgrade the HTML5 / CSS3 parser functionality #779

Open
jaytaph opened this issue Jan 26, 2025 · 0 comments
Labels
html5 parser Any issues related to the HTML5 parser

jaytaph commented Jan 26, 2025

At the moment, the HTML and CSS parsers both work, but they are fairly isolated from each other. To improve performance, the HTML5 parser should be a bit smarter:

  • Parse HTML until we find either CSS (inline or external) or a JavaScript block that must be handled immediately (not deferred or async).
  • On CSS, we can start a CSS parser for that CSS block in parallel with the current HTML parser. They do not depend on each other.
  • If there are more CSS links, we can start more CSS parsers.
  • If we find a JavaScript block or link that must be processed immediately, we need to stop the parser at that specific point and execute the JavaScript. Note that the JavaScript CAN possibly modify the DOM built up to that point.
  • A good optimization could be to let the parser continue speculatively and discard its results if the JavaScript updates the DOM. If the JavaScript made no modifications, we can use the results of the parser that ran ahead of the JavaScript.
  • If there is JavaScript that we need to execute, we MUST wait until all the parallel CSS parsers have COMPLETELY FINISHED. This is because the JavaScript can modify the CSS up to that point.
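
The steps above could be sketched roughly as follows. This is only an illustration in Python with `asyncio`, not the project's implementation: the `Token` type, the token kinds, and the `parse_css` / `parse_document` functions are all hypothetical stand-ins, and real CSS parsing is replaced by a short sleep. The speculative-parsing optimization is left out to keep the sketch small.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical stand-in for whatever the real HTML5 tokenizer emits."""
    kind: str              # "html", "css" or "script"
    name: str
    blocking: bool = True  # only used for scripts; False means defer/async


@dataclass
class ParseState:
    log: list = field(default_factory=list)        # ordered record of what happened
    css_tasks: list = field(default_factory=list)  # CSS parsers still in flight
    deferred: list = field(default_factory=list)   # defer/async scripts to run later


async def parse_css(name, state, delay):
    # Assumption: the sleep stands in for real CSS parsing work.
    await asyncio.sleep(delay)
    state.log.append(f"css-done:{name}")


async def parse_document(tokens, css_delay=0.01):
    state = ParseState()
    for tok in tokens:
        if tok.kind == "css":
            # Start a CSS parser in parallel; the HTML parser does not wait for it.
            task = asyncio.create_task(parse_css(tok.name, state, css_delay))
            state.css_tasks.append(task)
            state.log.append(f"css-start:{tok.name}")
        elif tok.kind == "script" and tok.blocking:
            # A blocking script must wait until ALL pending CSS parsers are done.
            if state.css_tasks:
                await asyncio.gather(*state.css_tasks)
                state.css_tasks.clear()
            state.log.append(f"js-run:{tok.name}")
        elif tok.kind == "script":
            # defer/async scripts do not stop the parser.
            state.deferred.append(tok.name)
            state.log.append(f"js-deferred:{tok.name}")
        else:
            state.log.append(f"html:{tok.name}")
            await asyncio.sleep(0)  # yield so CSS tasks can make progress
    # End of document: let any remaining CSS parsers finish.
    if state.css_tasks:
        await asyncio.gather(*state.css_tasks)
    return state
```

Calling `asyncio.run(parse_document(tokens))` with a list of `Token` values returns the state, whose `log` shows that every `css-done` entry precedes the `js-run` entry of the next blocking script.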
flowchart TD
    Start[Start Parsing HTML] --> ParseHTML[Parse HTML Elements Sequentially]
    
    ParseHTML --> |CSS Block/External Link Found| StartCSS[Start CSS Parsing in Parallel]
    StartCSS --> CSSQueue[CSS Parsing Queue] 
    CSSQueue -->|All CSS Parsing Tasks Completed| WaitForCSS[Wait for All CSS to Complete]
    
    ParseHTML --> |Non-Defer/Non-Async JS Found| CheckCSS[Wait for Pending CSS]
    CheckCSS --> |CSS Completed| ExecuteJS[Execute JavaScript Block or File]
    CheckCSS --> |CSS Still Parsing| WaitForCSS

    ExecuteJS --> ResumeParsing[Resume Parsing]
    ResumeParsing --> |More HTML to Parse| ParseHTML
    
    ParseHTML --> |Defer/Async JS Found| DeferJS[Defer JS Execution]
    ParseHTML --> |No JS or CSS| ContinueParsing[Continue Parsing]
    ContinueParsing --> |More HTML to Parse| ParseHTML
    ContinueParsing --> |End of HTML| End[Finished Parsing DOM]
    
    CSSQueue --> ParseHTML
    WaitForCSS --> ExecuteJS

gantt
    dateFormat X
    axisFormat %s
    section HTML5
        index.html: 0, 20
        index.html: 30, 50
        index.html: 70, 100
    section Stylesheets
        main.css: 5, 10
        bootstrap.css: 7, 15
        extern.css: 40, 60
        extern2.css: 42, 48
        extern3.css: 44, 62
    section Javascript
        main.js: 20, 30    
        more.js: 62, 70

index.html is parsed, but parsing is blocked twice by JavaScript. Note that the first two stylesheets (main.css and bootstrap.css) are parsed in parallel and do not block the HTML. When the second script (more.js) is found at t=50, the system must wait until all external stylesheets are complete (extern3.css finishes last, at t=62) before it can execute the script. Only when execution is complete can the parser continue with index.html until completion.
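
The timings in the chart can be reproduced with a small scheduling calculation. The sketch below is only a model of the rule described above (a blocking script starts once the HTML parser has reached it and all stylesheets found so far have finished). The `Html` and `Script` types, the `schedule` function, and the `GANTT_STEPS` name are hypothetical; the durations and offsets are read off the chart.

```python
from typing import NamedTuple


class Html(NamedTuple):
    duration: int
    # Each stylesheet: (name, offset into this HTML chunk, parse duration)
    stylesheets: tuple = ()


class Script(NamedTuple):
    name: str
    duration: int


def schedule(steps):
    """Return (name, start, end) entries for every HTML chunk, stylesheet and script."""
    now = 0      # current position of the HTML parser on the time axis
    css_end = 0  # time at which the last stylesheet found so far finishes
    timeline = []
    for step in steps:
        if isinstance(step, Html):
            start = now
            for name, offset, duration in step.stylesheets:
                # Stylesheets run in parallel and never delay the HTML chunk.
                timeline.append((name, start + offset, start + offset + duration))
                css_end = max(css_end, start + offset + duration)
            now = start + step.duration
            timeline.append(("html", start, now))
        else:
            # A blocking script waits for the parser AND for all pending CSS.
            start = max(now, css_end)
            now = start + step.duration
            timeline.append((step.name, start, now))
    return timeline


# Values taken from the gantt chart above.
GANTT_STEPS = [
    Html(20, (("main.css", 5, 5), ("bootstrap.css", 7, 8))),
    Script("main.js", 10),
    Html(20, (("extern.css", 10, 20), ("extern2.css", 12, 6), ("extern3.css", 14, 18))),
    Script("more.js", 8),
    Html(30),
]
```

With these inputs, `schedule(GANTT_STEPS)` places main.js at 20 to 30 (both early stylesheets are already done) and more.js at 62 to 70, because extern3.css only finishes at 62 even though the HTML parser reached the script at 50.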
