Docling document - math equations modelling #666

Open
vitaly-d opened this issue Dec 31, 2024 · 0 comments
Labels
question Further information is requested

Comments

vitaly-d commented Dec 31, 2024

Equations are a coming-soon feature, so it might be a bit too early to discuss, but I would like to understand how equations can be represented in the Docling Document model.

Basically there are two cases:

  • Display Equations
  • Inline Equations

Display equations seem to be just TextItems, either with the sanitized representation containing the TeX formula or with some extension adding a "tex" attribute, like this:

{
      "self_ref": "#/texts/47",
      "parent": {
        "cref": "#/body"
      },
      "children": [],
      "label": "formula",
      "prov": [
		…
      ],
      "orig": "Attention( Q,K,V ) = softmax( QK T \u221a d k ) V (1)",
      "text": "Attention( Q,K,V ) = softmax( QK T \u221a d k ) V (1)",
      "tex": "\\[\\mathrm{Attention}(Q, K, V) = \\mathrm{softmax}(\\frac{QK^T}{\\sqrt{d_k}})V\\]"
    }, 
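A minimal Python sketch of how such an item could be built and serialized, using plain dictionaries rather than any Docling API. The `make_formula_item` helper and the `"tex"` key are hypothetical, and `prov` is left empty here. It mainly illustrates that a TeX string stored in JSON needs its backslashes escaped, which `json.dumps` handles automatically:

```python
import json


def make_formula_item(index: int, orig: str, tex: str) -> dict:
    """Build a formula item shaped like the JSON above.

    The "tex" key is a hypothetical extension, not a confirmed Docling field.
    """
    return {
        "self_ref": f"#/texts/{index}",
        "parent": {"cref": "#/body"},
        "children": [],
        "label": "formula",
        "prov": [],  # provenance omitted in this sketch
        "orig": orig,
        "text": orig,
        "tex": tex,
    }


item = make_formula_item(
    47,
    "Attention( Q,K,V ) = softmax( QK T \u221a d k ) V (1)",
    # Raw string: backslashes are kept as-is on the Python side.
    r"\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(\frac{QK^T}{\sqrt{d_k}})V",
)

# json.dumps escapes each backslash as "\\" in the serialized text.
serialized = json.dumps(item, ensure_ascii=False, indent=2)
restored = json.loads(serialized)
```

Loading the serialized text back yields the original TeX string unchanged, so a consumer would not need to know about the escaping.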

Modeling inline equations might require the hierarchy supported by the Docling Document: a parent paragraph TextItem containing a list of child text items that mix plain text and inline equations. E.g., the TeX paragraph

Where the projections are parameter matrices $W^Q_i \in \mathbb{R}^{\dmodel \times d_k}$, $W^K_i \in \mathbb{R}^{\dmodel \times d_k}$, $W^V_i \in \mathbb{R}^{\dmodel \times d_v}$ and $W^O \in \mathbb{R}^{hd_v \times \dmodel}$.

could then be represented with a tree like this:

- parent: paragraph
    - child text: “Where the projections are parameter matrices”
    - child inline equation: “$W^Q_i \in \mathbb{R}^{\dmodel \times d_k}$”
    - child text: “,”
    ...
    - child text: “.”
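As a rough illustration of the tree above, the following Python sketch splits a TeX paragraph into alternating text and inline-equation children. The `Child` class, the `"text"` / `"inline_equation"` kinds, and the `split_paragraph` helper are all hypothetical names, not part of Docling. The regex is deliberately simple: it assumes `$...$` delimiters only and does not handle escaped `\$` or `$$...$$`.

```python
import re
from dataclasses import dataclass


@dataclass
class Child:
    kind: str     # "text" or "inline_equation" (hypothetical labels)
    content: str


# Matches a single $...$ span with no dollar sign inside it.
INLINE_MATH = re.compile(r"\$[^$]+\$")


def split_paragraph(tex: str) -> list[Child]:
    """Split a paragraph into text and inline-equation children, in order."""
    children = []
    pos = 0
    for match in INLINE_MATH.finditer(tex):
        if match.start() > pos:
            # Plain text between the previous equation and this one.
            children.append(Child("text", tex[pos:match.start()]))
        children.append(Child("inline_equation", match.group()))
        pos = match.end()
    if pos < len(tex):
        # Trailing text after the last equation.
        children.append(Child("text", tex[pos:]))
    return children


paragraph = (
    r"Where the projections are parameter matrices "
    r"$W^Q_i \in \mathbb{R}^{\dmodel \times d_k}$, "
    r"$W^K_i \in \mathbb{R}^{\dmodel \times d_k}$, "
    r"$W^V_i \in \mathbb{R}^{\dmodel \times d_v}$ and "
    r"$W^O \in \mathbb{R}^{hd_v \times \dmodel}$."
)
children = split_paragraph(paragraph)
```

Concatenating the `content` of all children reproduces the original paragraph, so the flat Markdown-style text and the tree carry the same information; the tree only pays off if each child can also carry its own provenance.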

Questions:

  • How will inline equations be implemented in Docling (and will they be implemented at all)? For example, tree relationships within a paragraph make more sense if there is provenance information for inline equations and other child text nodes; otherwise, Markdown like the Nougat output seems more convenient in the paragraph.text field.
  • Is using tree structures for modeling inline equations within a paragraph consistent with the original design?
  • If so, how should the model be extended if needed, for instance to define which text elements require a line break (regular paragraph, display equation) and which do not (inline equation, text between inline equations)?
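To make the line-break question concrete, here is one possible sketch in Python, assuming the break behavior is derived from the node label. The `Node` class, the `BLOCK_LABELS` set, and the `"inline_formula"` label are assumptions made for illustration; only the `"formula"` label appears in the JSON above.

```python
from dataclasses import dataclass, field

# Hypothetical: labels whose content is followed by a line break when rendered.
BLOCK_LABELS = {"paragraph", "formula"}


@dataclass
class Node:
    label: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def render(nodes: list[Node]) -> str:
    """Concatenate nodes in order, breaking only after block-level ones."""
    parts = []
    for node in nodes:
        # A node with children is rendered from its children, not its own text.
        content = render(node.children) if node.children else node.text
        if node.label in BLOCK_LABELS:
            content += "\n\n"
        parts.append(content)
    return "".join(parts)


document = [
    Node("paragraph", children=[
        Node("text", "Where the projections are parameter matrices "),
        Node("inline_formula", r"$W^O \in \mathbb{R}^{hd_v \times \dmodel}$"),
        Node("text", "."),
    ]),
    Node("formula", r"\[\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(\frac{QK^T}{\sqrt{d_k}})V\]"),
]
rendered = render(document)
```

With this approach the inline children flow together on one line, while the paragraph and the display equation each end with a break. An explicit per-item flag would be an alternative to inferring the behavior from the label.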

Thank you!

vitaly-d added the question label Dec 31, 2024