Fix for Issue #616: Handle Ingestion of Documents with Uppercase Extensions #621
Hello,
I've addressed issue #616, which concerns problems when ingesting documents with uppercase file extensions. Here's a breakdown of my changes:
Modified the load_documents function: now, files with uppercase extensions are renamed to have lowercase extensions before processing.
This ensures that documents, regardless of their original extension case, are ingested consistently and without issues.
Confirmed that all changes pass the pre-commit hooks.
By making file extensions uniformly lowercase, we should now be able to ingest documents without facing issues related to varied extension casing.
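For reviewers who want the idea at a glance, here is a minimal sketch of the renaming step. It is not the exact diff in this pull request: the helper name `lowercase_extension` is hypothetical, and I'm assuming it would be called on each file path before load_documents picks a loader for it.

```python
from pathlib import Path


def lowercase_extension(path: Path) -> Path:
    """Rename `path` so that its final suffix is lowercase.

    Hypothetical helper for illustration only. Returns the path of the
    file after the (possible) rename.
    """
    suffix = path.suffix
    if suffix == suffix.lower():
        # Already lowercase, or no extension at all: leave the file alone.
        return path

    target = path.with_suffix(suffix.lower())

    # Refuse to overwrite a different file that already has the lowercase
    # name. On case-insensitive filesystems `target` can resolve to the
    # same file as `path`, which is fine, hence the samefile() check.
    if target.exists() and not target.samefile(path):
        raise FileExistsError(f"Cannot rename {path} to {target}: target exists")

    path.rename(target)
    return target
```

A call such as `lowercase_extension(Path("docs/Report.PDF"))` would rename the file on disk and return the new path. One trade-off worth noting: this changes the user's files, so the collision check above is there to avoid silently replacing an existing document.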
Please review my pull request, and I'd appreciate any feedback or suggestions for further refinement.
Thank you for the opportunity to contribute!
Warm regards,
Afshin Karimi