Search

Apache Tika detects and extracts metadata and text from over a thousand different file types. In addition to a Java library, it is available as a server and as a command-line tool.
- Users: Alfresco CMS, maybe Hippo CMS (see plans at https://www.bloomreach.com/en/blog/2010/04/Metadata+extraction+with+Apache+Tika.html)
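
As a rough illustration of the server edition, the sketch below builds (but does not send) two HTTP requests for a Tika server: one asking for extracted text and one for metadata. It assumes a server running locally on port 9998 and uses the `/tika` and `/meta` endpoints; the class name `TikaServerRequests` and its helper methods are hypothetical names chosen for this example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class TikaServerRequests {
    // Assumption: a Tika server is running locally on port 9998.
    static final String BASE_URL = "http://localhost:9998";

    /** Builds a request asking the server for the plain text of a document. */
    static HttpRequest extractText(byte[] document) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/tika"))
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
                .build();
    }

    /** Builds a request asking the server for the document's metadata as JSON. */
    static HttpRequest extractMetadata(byte[] document) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/meta"))
                .header("Accept", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real caller would read a file instead.
        byte[] sample = "Hello, Tika".getBytes(StandardCharsets.UTF_8);
        HttpRequest text = extractText(sample);
        HttpRequest meta = extractMetadata(sample);
        System.out.println(text.method() + " " + text.uri());
        System.out.println(meta.method() + " " + meta.uri());
    }
}
```

To actually run the extraction, pass either request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` while a server is listening.
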
Elasticsearch provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. It is designed for use as part of an integrated solution, referred to as the “Elastic Stack” (formerly the “ELK stack”), which also contains:
- Logstash, a data-collection and log-parsing engine
- Kibana, an analytics and visualisation platform
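
To make the "HTTP interface with JSON documents" point concrete, here is a minimal sketch that builds (but does not send) a request to index a JSON document and a request to run a full-text `match` query. It assumes a node reachable at `http://localhost:9200`; the class name `ElasticsearchRequests`, the index name `articles`, and the field `title` are hypothetical and exist only for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ElasticsearchRequests {
    // Assumption: a single Elasticsearch node on localhost, port 9200.
    static final String BASE_URL = "http://localhost:9200";

    /** Builds a request that stores a JSON document under the given index and id. */
    static HttpRequest indexDocument(String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** Builds a full-text match query; field and text are not JSON-escaped here. */
    static HttpRequest matchQuery(String index, String field, String text) {
        String body = "{\"query\":{\"match\":{\"" + field + "\":\"" + text + "\"}}}";
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest index = indexDocument("articles", "1", "{\"title\":\"Metadata extraction\"}");
        HttpRequest search = matchQuery("articles", "title", "metadata");
        System.out.println(index.method() + " " + index.uri());
        System.out.println(search.method() + " " + search.uri());
    }
}
```

No schema has to be declared before indexing, which is what "schema-free" refers to. Production code would more likely use an official client library and a proper JSON serializer than hand-built strings.
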
Apache Solr is an open-source search platform built on Apache Lucene.

Other libraries worth looking into

A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more.

Apache Jackrabbit™ is a fully conforming implementation of the Content Repository for Java Technology API (JCR, specified in JSR 170 and JSR 283).
Jackrabbit Oak is a complementary implementation of the JCR specification. It is an effort to implement a scalable and performant hierarchical content repository for use as the foundation of modern world-class web sites and other demanding content applications. See the Jackrabbit Oak website for more information.
