nmdanny/FullTextSearchEngine

Full Text Search engine

A full text search engine implemented as part of HUJI's Web Information Retrieval course.

The engine currently supports a single dataset: Amazon product review data taken from here, in a line-oriented data format (see the .txt files under datasets for an example).
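As a rough illustration of what reading a line-oriented review file can look like, here is a minimal sketch. It assumes each review is a group of "key: value" lines and that a blank line separates reviews. The field names in the sample (product/productId, review/score, review/text) and the ReviewRecordParser class are assumptions made for illustration and are not part of this library; check the .txt files under datasets for the actual layout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the webdata package.
class ReviewRecordParser {

    // Splits the input into records of "key: value" fields.
    // Assumption: a blank line marks the end of a record.
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            if (line.isBlank()) {
                // A blank line closes the current record, if it has any fields.
                if (!current.isEmpty()) {
                    records.add(current);
                    current = new LinkedHashMap<>();
                }
                continue;
            }
            // Split on the first ": " so that values may themselves contain colons.
            int sep = line.indexOf(": ");
            if (sep > 0) {
                current.put(line.substring(0, sep), line.substring(sep + 2));
            }
            // Lines without a "key: value" shape are skipped in this sketch.
        }
        // Flush the last record when the input does not end with a blank line.
        if (!current.isEmpty()) {
            records.add(current);
        }
        return records;
    }

    public static void main(String[] args) {
        // Sample input with assumed field names.
        String sample = String.join("\n",
                "product/productId: B000000001",
                "review/score: 5.0",
                "review/text: Good quality dog food",
                "",
                "product/productId: B000000002",
                "review/score: 2.0",
                "review/text: Note: arrived late");
        for (Map<String, String> record : parse(sample)) {
            System.out.println(record);
        }
    }
}
```

Splitting on the first ": " only keeps free-text fields intact even when the review text contains a colon.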

The main classes of this library are:

  • webdata.IndexWriter for constructing the index given a dataset file

  • webdata.IndexReader for querying the index

  • webdata.ReviewSearch for performing various text search operations
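For readers new to the idea, the following toy example shows the general technique behind a full text index: an in-memory inverted index that maps each token to the sorted list of review ids containing it. This is not the webdata API and says nothing about how IndexWriter or IndexReader are implemented. ToyInvertedIndex and its methods are hypothetical names, and the tokenization rule (lowercase, split on non-alphanumeric characters) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of an inverted index; hypothetical, not part of the webdata package.
class ToyInvertedIndex {
    // token -> ascending list of ids of the reviews that contain it
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    private int nextId = 1;

    // Indexes one review text and returns its id (ids start at 1 in this sketch).
    int add(String text) {
        int id = nextId++;
        // Assumed tokenization: lowercase, split on runs of non-alphanumeric characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            List<Integer> ids = postings.computeIfAbsent(token, t -> new ArrayList<>());
            // Ids are handed out in increasing order, so appending keeps the list sorted;
            // the check avoids duplicates when a token repeats within one review.
            if (ids.isEmpty() || ids.get(ids.size() - 1) != id) {
                ids.add(id);
            }
        }
        return id;
    }

    // Returns the ids of all reviews containing the token, in ascending order.
    List<Integer> reviewsContaining(String token) {
        List<Integer> ids = postings.get(token.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.emptyList() : Collections.unmodifiableList(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Great dog food, great price");
        index.add("The dog did not like it");
        System.out.println(index.reviewsContaining("dog"));
    }
}
```

A real engine would keep such postings on disk in a compressed form rather than in a TreeMap; the index structure document linked in the Documentation section describes what this project actually does.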

Documentation

  • Click here for an explanation and visualization of the index structure, as well as theoretical runtime analysis of index operations.

  • Click here for various benchmarks of index construction and querying.

  • Click here for an explanation of a custom product ranking function I've implemented for product search.

Most classes and methods are also documented; see the build instructions below for how to generate the Javadoc.

Build instructions

Requires Java 11+ and Maven.

  • Type mvn package to compile, test and package this library, and generate docs. The resulting jars will be located in the target directory.

    Documentation can be found at target/apidocs/index.html

    (Skip testing by adding -Dmaven.test.skip=true)

About

A full text search engine implemented in Java
