Tables in pdf files are not converted properly #293

Open
kristofmulier opened this issue Jan 18, 2025 · 1 comment

kristofmulier commented Jan 18, 2025

user_manual.pdf

I converted a PDF file with lots of tables to Markdown. I expected markitdown to handle tables gracefully. For example, the following table:

Image

It should be converted into Markdown like so:

| Register name | Description                     | Offset Address |
|---------------|---------------------------------|----------------|
| FMC_ACCTRL    | Flash access control register   | 0x00           |
| FMC_KEY       | Flash key register              | 0x04           |
| FMC_OPTKEY    | Flash option key register       | 0x08           |
| FMC_STS       | Flash state register            | 0x0C           |
| FMC_CTRL      | Flash control register          | 0x10           |
| FMC_OPTCTRL   | Flash option control register   | 0x14           |

However, what I get from markitdown is this:

  Register address mapping

Table 14 FMC Register Address Mapping

Register name

Description

Offset Address

FMC_ACCTRL

Flash access control register

FMC_KEY

Flash key register

FMC_OPTKEY

Flash option key register

FMC_STS

FMC_CTRL

Flash state register

Flash control register

FMC_OPTCTRL

Flash option control register

0x00

0x04

0x08

0x0C

0x10

0x14

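The number 3.6 is missing from the title "Register address mapping". Worse, the table structure is lost: every cell ends up on its own line, rows are not kept together (FMC_STS and FMC_CTRL appear before their descriptions), and all the offset addresses are dumped at the end.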
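To make the expected output concrete, here is a minimal Python sketch that renders the rows of Table 14 as a Markdown pipe table. `to_markdown_table`, `HEADER`, and `ROWS` are hypothetical names made up for this illustration; none of this is part of markitdown's API, and it says nothing about how markitdown extracts text from the PDF. Column widths are computed from the cell contents, so the padding may differ slightly from the hand-written table above.

```python
def to_markdown_table(header, rows):
    """Render a header and rows as a padded Markdown pipe table (illustrative helper)."""
    table = [list(header)] + [list(row) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(str(row[i])) for row in table) for i in range(len(header))]

    def fmt(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    # Separator row: dashes spanning each column plus its two padding spaces.
    separator = "|" + "|".join("-" * (width + 2) for width in widths) + "|"
    return "\n".join([fmt(header), separator] + [fmt(row) for row in rows])


# Cell values copied from the table in this issue.
HEADER = ["Register name", "Description", "Offset Address"]
ROWS = [
    ["FMC_ACCTRL", "Flash access control register", "0x00"],
    ["FMC_KEY", "Flash key register", "0x04"],
    ["FMC_OPTKEY", "Flash option key register", "0x08"],
    ["FMC_STS", "Flash state register", "0x0C"],
    ["FMC_CTRL", "Flash control register", "0x10"],
    ["FMC_OPTCTRL", "Flash option control register", "0x14"],
]

print(to_markdown_table(HEADER, ROWS))
```

The point is that each row keeps its name, description, and offset together, which is exactly what gets lost in the output above.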

@numairmansur

+1
