This project will guide you through building a data retireval model by semantic search and filtering with metadata using the Timescale hosted potgreSQL database. This filtering and then perfomring sematic search techniques makes the model both fast and effeicient. To learn more about the project please refer this article.
In this project we have performed semantic search along with metadata filtering over the timescale hosted PostgreSQL using the concept of Pgvectors with Javascript. This makes the retrieval process fast, efficient and more refined due to two stage filtering of data. This model can be used to enhence the spped and user experience for different RAG Applications and chatbots.
- Efficient and Accurate
- Two stage filtering for refined results
- Scalable and high-performance retrieval system
- Fast Results
-
Clone the repository:
git clone https://github.com/vansh-khaneja/Semantic-Search-with-Filters-using-PostgreSQL-and-Pgvectors cd Semantic-Search-with-Filters-using-PostgreSQL-and-Pgvectors
-
Download Node.js and install these npm packages:
npm install pg npm install @xenova/transformers npm install ndarray npm install csv-parser
-
Set up Database:
Follow this link to create a service on Timescale to setup PostgreSQL database.
1.Once the database is setup copy the credentials user
, host
, password
, database
and port
.
2.Now replace these credentials in the following code snippet in fetch_data.js
and load_data.js
.
const client = new Client({
user: 'USER',
host: 'HOST',
password:'PASSWORD',
database: 'DATABASE',
port: PORT,
});
3.Download the Kaggle dataset from here which contains some technical queries and their resoltion along with metadata.
4.Execute the data_load.js
file by running this command in terminal.
node data_load.js
This will upload the data into the database along with the vector embeddings.
5.At last execute fetch_data.js
file by running this command.
node fetch_data.js
This will fetch the data based on the question in this code snippet present in fetch_data.js
.
var question = "My computer is overheating";
var filters = {
issue:"freezing",
os:"windows",
}
await fetch_query(question,filters.issue,filters.os)
For any questions or issues, feel free to open an issue on this repository or contact me at [email protected].
Happy coding!