LLAMA_PARSE PERFORM BADLY IN THAI LANGUAGE!? or it's my mistake? #401

FRAMEEE17 · 2024-09-14T09:33:29Z

import logging
import os
import time

from llama_parse import LlamaParse

logger = logging.getLogger(__name__)


def process_pdf_files(index):
    """Parse every PDF in ./data/documents and insert the results into `index`."""
    pdf_directory = "./data/documents"
    # No api_key is passed, so LLAMA_CLOUD_API_KEY must be set in the environment
    llama_parse = LlamaParse(result_type="markdown")

    for filename in os.listdir(pdf_directory):
        if filename.endswith(".pdf"):
            file_path = os.path.join(pdf_directory, filename)

            try:
                # Parse the PDF file
                documents = llama_parse.load_data(file_path)

                # Upload the parsed content to LlamaCloud
                for doc in documents:
                    index.insert(doc)

                logger.info(f"Uploaded: {filename}")
            except Exception as e:
                logger.error(f"Error processing {filename}: {e}")

            # Add a small delay between files to avoid rate limiting
            time.sleep(1)

I tried to parse Thai PDF files with the code above and it doesn't work! This is my first time using LLAMA_PARSE, though.


hexapode · 2024-09-15T19:10:08Z

Could you share the doc you used with us so we can have a look? It's likely a font encoding issue, given the screenshot you shared.
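
In the meantime, a rough way to spot affected files before uploading them is to check how much of the parsed text actually falls in the Unicode Thai block (U+0E00 to U+0E7F). This is only a minimal sketch: `thai_ratio`, `looks_garbled`, and the 0.5 threshold are hypothetical names and values chosen for illustration, not part of LlamaParse, and a low ratio only suggests a font or encoding problem rather than proving one.

```python
import unicodedata

# Bounds of the Unicode Thai block
THAI_START, THAI_END = 0x0E00, 0x0E7F


def thai_ratio(text: str) -> float:
    """Return the fraction of letters and combining marks in `text` that are Thai."""
    # Only count letters (L*) and marks (M*); digits, spaces and punctuation are ignored
    relevant = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "M")]
    if not relevant:
        return 0.0
    thai = sum(1 for ch in relevant if THAI_START <= ord(ch) <= THAI_END)
    return thai / len(relevant)


def looks_garbled(text: str, threshold: float = 0.5) -> bool:
    """Hypothetical heuristic: flag text whose Thai share is below `threshold`."""
    return thai_ratio(text) < threshold
```

Assuming the parsed documents expose their content as `doc.text`, you could call `looks_garbled(doc.text)` inside the loop and log the filename instead of inserting the document when it returns `True`. The threshold would need tuning for documents that legitimately mix Thai with English.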

hexapode self-assigned this Sep 15, 2024

hexapode added the bug label Sep 15, 2024
