Does this project support the retrieval of hundreds of millions of entries? #1051

win4r · 2024-07-12T09:08:15Z

Does this project support the retrieval of hundreds of millions of entries?

win4r · 2024-07-12T09:14:32Z

I have a collection of tens of thousands of files in various formats (PDF, CSV, etc.). I'd like to create a RAG system to enable efficient retrieval and information extraction from these documents.

ysolanky · 2024-07-15T19:48:14Z

Hey @win4r!

At a scale of hundreds of millions of entries, accuracy and efficiency will depend on the file types, the chunking strategy, and the LLM you are using.
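
To illustrate why chunking matters for the number of entries, here is a minimal sketch of fixed-size chunking with overlap. It is not this project's API: `chunk_text`, `chunk_size`, and `overlap` are hypothetical names, and the default values are arbitrary assumptions.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character chunks of at most `chunk_size`,
    where consecutive chunks share `overlap` characters.

    Hypothetical helper for illustration only, not part of any library.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Example: count how many entries one 2,500-character document produces.
document = "a" * 2500
entries = chunk_text(document)
print(len(entries), [len(c) for c in entries])
```

Smaller chunks or larger overlaps produce more entries per file, so the chunking settings directly affect how many entries a given collection turns into.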

I'd start with the tens of thousands of files you already have and go from there. Please do share your results :)
