HTML export #214

@stroiman

Description

When things don't work as intended, it could be useful to have a screenshot of the current page.

However, any visual rendering is completely outside the scope of Gost-DOM core functionality.

An alternative could be to save an HTML file that you can open in a browser. Currently, you can call Window.Document().DocumentElement().OuterHTML() to render the current DOM as HTML.
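
A minimal sketch of dumping that HTML to a file, e.g., from a failing test. Only OuterHTML() is taken from the call chain above; the htmlSource interface, snapshot, and fakeRoot are hypothetical names that let the example run without a real DOM. The doctype is prepended because the root element's outer HTML does not include one, and a browser would otherwise open the file in quirks mode.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// htmlSource is a hypothetical stand-in for the element returned by
// Window.Document().DocumentElement(). Only OuterHTML() comes from the issue.
type htmlSource interface {
	OuterHTML() string
}

// snapshot returns the serialized page with a doctype in front.
func snapshot(root htmlSource) []byte {
	return []byte("<!DOCTYPE html>\n" + root.OuterHTML())
}

// fakeRoot lets the sketch run without a real DOM.
type fakeRoot string

func (f fakeRoot) OuterHTML() string { return string(f) }

func main() {
	root := fakeRoot("<html><head></head><body><h1>Hello</h1></body></html>")
	path := filepath.Join(os.TempDir(), "gost-dom-snapshot.html")
	if err := os.WriteFile(path, snapshot(root), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "snapshot failed:", err)
		os.Exit(1)
	}
	fmt.Println("wrote", path)
}
```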

There are a few issues with this approach.

  • The export should show the DOM in its current state. Any scripts in the output could run again when the page is opened in a browser, and modify that state.
  • Some state isn't reflected in the HTML-serializable part. E.g., the value of an input field is internal state, and is not stored in the value content attribute. That attribute merely provides the initial state for the internal value.

This is not necessarily an exhaustive list of issues, just the things I could think of right now.

Workaround

One possible workaround is that client code modifies the DOM before exporting, e.g.:

  • Call Remove() on every element returned by Document.GetElementsByTagName("script").
  • Call input.SetAttribute("value", input.Value()) on every HTMLInputElement object.

As the DOM is no longer in the "correct" state, any code executed after this adjustment would operate on the wrong DOM. To mitigate this, call CloneNode() on the root element first, and apply the adjustments to the clone.
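
The clone-then-adjust idea can be sketched on a toy tree. The node type below is invented for illustration and is not Gost-DOM's API; it only models what matters here: an input's internal value that is separate from its value attribute, and script children that should not reach the export.

```go
package main

import (
	"fmt"
	"html"
	"sort"
	"strings"
)

// node is a toy element type for illustration only; it is NOT Gost-DOM's API.
type node struct {
	tag      string
	attrs    map[string]string
	value    string // internal value of an <input>, not part of the markup
	text     string
	children []*node
}

// clone makes a deep copy so the live tree is left untouched.
func (n *node) clone() *node {
	c := &node{tag: n.tag, value: n.value, text: n.text, attrs: map[string]string{}}
	for k, v := range n.attrs {
		c.attrs[k] = v
	}
	for _, ch := range n.children {
		c.children = append(c.children, ch.clone())
	}
	return c
}

// sanitize drops <script> elements and copies each input's internal value
// into its value attribute. It modifies n in place, so call it on a clone.
func sanitize(n *node) {
	kept := n.children[:0]
	for _, ch := range n.children {
		if ch.tag == "script" {
			continue
		}
		sanitize(ch)
		kept = append(kept, ch)
	}
	n.children = kept
	if n.tag == "input" {
		if n.attrs == nil {
			n.attrs = map[string]string{}
		}
		n.attrs["value"] = n.value
	}
}

// outerHTML is a simplified serializer (sorted attributes, input as the only
// void element), just enough to show the effect.
func (n *node) outerHTML() string {
	var b strings.Builder
	b.WriteString("<" + n.tag)
	keys := make([]string, 0, len(n.attrs))
	for k := range n.attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(&b, ` %s="%s"`, k, html.EscapeString(n.attrs[k]))
	}
	b.WriteString(">")
	if n.tag == "input" {
		return b.String()
	}
	b.WriteString(html.EscapeString(n.text))
	for _, ch := range n.children {
		b.WriteString(ch.outerHTML())
	}
	b.WriteString("</" + n.tag + ">")
	return b.String()
}

func demoTree() *node {
	return &node{tag: "body", children: []*node{
		{tag: "input", attrs: map[string]string{"name": "q", "value": "initial"}, value: "typed by user"},
		{tag: "script", text: "alert(1)"},
	}}
}

func main() {
	live := demoTree()
	export := live.clone()
	sanitize(export)
	fmt.Println("export:", export.outerHTML())
	fmt.Println("live:  ", live.outerHTML())
}
```

The live tree keeps its script element and the original value attribute, so code running after the export still sees the unmodified DOM.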

Solution

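Create some kind of "filtered export" function, allowing the document to be written to disk. Accepting a file system abstraction rather than just an io.Writer allows the code to also write referenced files, e.g., CSS and image files. Note that the standard fs.FS interface is read-only, so the destination would probably need a different, writable abstraction than the one shown here.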

func ExportFiltered(win html.Window, w io.Writer, dest fs.FS) error

// What this should be, ... I have no idea
type DomTransformer interface {
  VisitNode(Node) Node
}

Warning

The interface only communicates an idea. It is not the right interface.
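
The destination side of the export could be sketched as follows. Because io/fs only defines read operations, this sketch assumes a small write-side interface instead. Dest, dirDest, memDest, and exportPage are all hypothetical names invented for the example, and the DOM filtering step itself is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// Dest is a hypothetical write-side counterpart to fs.FS. It is an assumption
// made for this sketch, not an existing Gost-DOM or standard library type.
type Dest interface {
	Create(name string) (io.WriteCloser, error)
}

// dirDest writes below a root directory on disk.
type dirDest struct{ root string }

func (d dirDest) Create(name string) (io.WriteCloser, error) {
	// fs.ValidPath rejects rooted paths and ".." elements.
	if !fs.ValidPath(name) {
		return nil, fmt.Errorf("invalid export path %q", name)
	}
	path := filepath.Join(d.root, filepath.FromSlash(name))
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return nil, err
	}
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	return f, nil
}

// memDest keeps files in memory, which is convenient in tests.
type memDest map[string]*bytes.Buffer

type nopCloser struct{ io.Writer }

func (nopCloser) Close() error { return nil }

func (m memDest) Create(name string) (io.WriteCloser, error) {
	if !fs.ValidPath(name) {
		return nil, fmt.Errorf("invalid export path %q", name)
	}
	m[name] = &bytes.Buffer{}
	return nopCloser{m[name]}, nil
}

// exportPage writes the page as index.html plus any referenced assets.
func exportPage(d Dest, page string, assets map[string][]byte) error {
	files := map[string][]byte{"index.html": []byte(page)}
	for name, data := range assets {
		files[name] = data
	}
	for name, data := range files {
		w, err := d.Create(name)
		if err != nil {
			return err
		}
		_, werr := w.Write(data)
		if cerr := w.Close(); werr == nil {
			werr = cerr
		}
		if werr != nil {
			return fmt.Errorf("write %s: %w", name, werr)
		}
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "gost-dom-export")
	if err != nil {
		panic(err)
	}
	assets := map[string][]byte{"css/site.css": []byte("body { margin: 0 }")}
	if err := exportPage(dirDest{root: dir}, "<!DOCTYPE html><html></html>", assets); err != nil {
		panic(err)
	}
	fmt.Println("exported to", dir)
}
```

Keeping the destination behind an interface means the same export code can target a directory in normal use and an in-memory map in tests.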

Note

It should be possible to implement this functionality outside of Gost-DOM itself, e.g., in client code, and then integrate it into Gost-DOM (or even better, a "contrib" module) once the API seems to have stabilised. Still to examine: whether client code has proper access to the http.Client and can resolve hrefs.
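
Resolving hrefs, at least, needs nothing beyond the standard library. The sketch below assumes client code knows the document's URL as a string; resolveHref is a hypothetical helper, and a base element in the document (which would change the base URL) is not handled. Fetching the resolved files is left out, since whether client code can reach the http.Client is the open question.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveHref resolves an href found in the document against the document's
// URL. Both are plain strings in this sketch.
func resolveHref(base, href string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u, err := b.Parse(href) // parses href in the context of the base URL
	if err != nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	base := "http://localhost:8080/app/login"
	hrefs := []string{
		"style.css",
		"/static/logo.png",
		"../img/bg.png",
		"https://cdn.example.com/lib.css",
	}
	for _, href := range hrefs {
		abs, err := resolveHref(base, href)
		if err != nil {
			fmt.Println(href, "error:", err)
			continue
		}
		fmt.Println(href, "=>", abs)
	}
}
```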

Given a viable (but untested) workaround, this has a somewhat low priority.
