surfmuggle edited this page Apr 15, 2018 · 2 revisions
Apache Tika detects and extracts metadata and text from over a thousand different file types. As well as a Java library, it is available in server and command-line editions.
- Users: Alfresco CMS, and possibly Hippo CMS (see the plans at https://www.bloomreach.com/en/blog/2010/04/Metadata+extraction+with+Apache+Tika.html)
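As a rough illustration of the server edition, the sketch below builds the HTTP requests a client could send to a running Tika server. It assumes a server listening on `localhost:9998` and uses the `/tika` (text) and `/meta` (metadata) endpoints; the helper names `build_tika_request` and `extract_text` are made up for this example and are not part of Tika.

```python
import urllib.request

# Assumption: a Tika server is running locally on port 9998.
TIKA_URL = "http://localhost:9998"


def build_tika_request(data: bytes, endpoint: str = "/tika",
                       accept: str = "text/plain") -> urllib.request.Request:
    """Build (but do not send) a PUT request that uploads raw file bytes.

    endpoint="/tika" asks for extracted text, endpoint="/meta" for metadata.
    """
    return urllib.request.Request(
        TIKA_URL + endpoint,
        data=data,
        method="PUT",
        headers={"Accept": accept},
    )


def extract_text(path: str) -> str:
    """Send a file to the server and return the extracted plain text."""
    with open(path, "rb") as f:
        request = build_tika_request(f.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `extract_text("report.pdf")` (a hypothetical file name) would only work while the server is up; `build_tika_request(data, "/meta", "application/json")` prepares the corresponding metadata request.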
Elasticsearch provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. It is designed for use as part of an integrated solution, the “Elastic Stack” (formerly the “ELK stack”), which also contains:
- Logstash, a data-collection and log-parsing engine, and
- Kibana, an analytics and visualisation platform
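- Apache Solr, a separate search platform built on Apache Lucene; it is an alternative to Elasticsearch rather than a part of the Elastic Stack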
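To make "HTTP web interface and schema-free JSON documents" concrete, here is a minimal sketch that prepares an index request and a full-text search request for Elasticsearch. It assumes a node on `localhost:9200` and the `_doc` and `_search` REST endpoints of recent Elasticsearch versions; the index name `pages` and the helper functions are illustrative only.

```python
import json
import urllib.request

# Assumption: an Elasticsearch node is reachable on the local machine.
ES_URL = "http://localhost:9200"
JSON_HEADERS = {"Content-Type": "application/json"}


def index_request(index: str, doc_id: str, document: dict) -> urllib.request.Request:
    """Build a PUT request that stores one JSON document under the given id."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/{index}/_doc/{doc_id}",
        data=body, method="PUT", headers=JSON_HEADERS,
    )


def search_request(index: str, field: str, text: str) -> urllib.request.Request:
    """Build a POST request that runs a full-text `match` query on one field."""
    query = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        f"{ES_URL}/{index}/_search",
        data=json.dumps(query).encode("utf-8"),
        method="POST", headers=JSON_HEADERS,
    )


# No schema is declared up front: the document is just a JSON object.
store = index_request("pages", "1", {"title": "Apache Tika", "tags": ["search"]})
find = search_request("pages", "title", "tika")
```

Passing `store` or `find` to `urllib.request.urlopen` would send them, provided a node is actually running.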
A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more.
- Apache Jackrabbit™, a fully conforming implementation of the Content Repository for Java Technology API (JCR, specified in JSR 170 and JSR 283).
- Jackrabbit Oak is a complementary implementation of the JCR specification. It is an effort to implement a scalable and performant hierarchical content repository for use as the foundation of modern world-class web sites and other demanding content applications. See the Jackrabbit Oak website for more information.
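The idea of a hierarchical content store can be pictured with a toy in-memory model: nodes addressed by path, each carrying arbitrary properties. This is not the JCR API and not how Jackrabbit or Oak are implemented; the `Node` class and its methods are hypothetical, and versioning, transactions, search and observation are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One item in the hierarchy, holding properties and named child nodes."""
    name: str
    properties: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)

    def add(self, name: str, **properties) -> "Node":
        """Create a child node with the given properties and return it."""
        child = Node(name, dict(properties))
        self.children[name] = child
        return child

    def get(self, path: str) -> "Node":
        """Resolve a slash-separated path relative to this node."""
        node = self
        for part in path.strip("/").split("/"):
            if part:  # skip empty segments, so "/" resolves to this node
                node = node.children[part]
        return node


root = Node("")
site = root.add("site")
site.add("about", title="About us", body="Unstructured text content")
```

With this model, `root.get("/site/about").properties["title"]` looks up a property by path, which is roughly the addressing style a content repository exposes.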