bad content #150

Open
enoch3712 opened this issue Dec 24, 2024 Discussed in #149 · 0 comments
Labels: bug (Something isn't working)

@enoch3712 (Owner)

Discussed in #149

Originally posted by jmottishaw December 24, 2024
Azure Document Intelligence is failing on tables that have blank values in them.

The bad 'content' details are polluting my outputs, so I had to implement something like this:

    table_only_data = {
        # Tables as content
        "content": "\n".join([", ".join(row) for row in single_page_data.get("tables", [])]),
        "filename": single_page_data.get("filename"),
        "pageNo": single_page_data.get("pageNo")
    }

I tried using description fields to tell the LLM to use only the table data and not the content, but that failed miserably. Is there a better way to do this?
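
For reference, a minimal sketch of the same workaround written as a reusable helper. It assumes each page is a dict whose "tables" entry is a list of rows, each row a list of cell values, and that blank cells may come back as None; that shape is an assumption inferred from the snippet, not a confirmed output format. The name `table_only_page` is hypothetical.

```python
from typing import Any, Dict, List


def table_only_page(single_page_data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a page dict whose 'content' is built only from table rows.

    Assumes 'tables' is a list of rows (lists of cells); the real
    structure may differ.
    """
    rows = single_page_data.get("tables") or []
    lines: List[str] = []
    for row in rows:
        # Blank cells may arrive as None; render them as empty strings
        # so that str.join does not raise a TypeError.
        cells = ["" if cell is None else str(cell) for cell in row]
        lines.append(", ".join(cells))
    return {
        "content": "\n".join(lines),  # tables as content
        "filename": single_page_data.get("filename"),
        "pageNo": single_page_data.get("pageNo"),
    }
```

Called on each page before extraction, this drops the original "content" and keeps blank cells as empty fields instead of failing on them. Whether that is preferable to a fix in the loader itself is still an open question.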

@enoch3712 enoch3712 self-assigned this Dec 24, 2024
@enoch3712 enoch3712 added the bug Something isn't working label Dec 24, 2024