This is a JAX-RS Tika server for https://issues.apache.org/jira/browse/TIKA-593
Building
--------
Please build and install the latest Tika snapshot from SVN trunk first.
Test data files are not available right now (I need some time to check them and remove private data), so skip the tests when building:
mvn -Dmaven.test.skip=true install
Running
-------
java -jar target/tikaserver-1.5-SNAPSHOT.jar
Usage
-----
Usage examples from the command line with the curl utility:
1) Extract text:
curl -T price.xls http://localhost:9998/tika
2) Extract text with mime-type hint:
curl -v -H "Content-type: application/vnd.openxmlformats-officedocument.wordprocessingml.document" -T document.docx http://localhost:9998/tika
3) Get all document attachments as ZIP-file:
curl -v -T Doc1_ole.doc http://localhost:9998/unpacker > /var/tmp/x.zip
4) Extract metadata to CSV format:
curl -T price.xls http://localhost:9998/meta
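Examples 1) and 2) differ only in the optional Content-type header, so they can be folded into one small shell function. This is a minimal sketch, not part of the server: the function name tika_text and the TIKA_URL variable are hypothetical, and the default URL simply reuses the host and port from the examples above.

```shell
# Base URL of the running server (assumption: same host and port as in the examples above).
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# tika_text FILE [MIME_TYPE]
# Hypothetical helper: uploads FILE to /tika and prints the response body.
# If MIME_TYPE is given, it is sent as a Content-type hint.
tika_text() {
    file=$1
    mime=${2:-}
    if [ -n "$mime" ]; then
        curl -s -H "Content-type: $mime" -T "$file" "$TIKA_URL/tika"
    else
        curl -s -T "$file" "$TIKA_URL/tika"
    fi
}
```

With the server running, "tika_text price.xls" corresponds to example 1), and passing a mime-type as the second argument corresponds to example 2).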
HTTP Codes
----------
200 - OK
204 - No content (for example, when unpacking a file without attachments)
415 - Unknown file type
422 - Unparsable document of a known type (password-protected documents and unsupported versions such as BIFF5 Excel)
500 - Internal error
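A script that calls the server can branch on these codes instead of inspecting the response body. The sketch below is one possible way to do that with curl's -w '%{http_code}' write-out option; the function names tika_status_message and tika_unpack and the TIKA_URL variable are hypothetical and not part of the server.

```shell
# tika_status_message CODE
# Prints a short description of an HTTP status code, following the list above.
tika_status_message() {
    case "$1" in
        200) echo "OK" ;;
        204) echo "No content" ;;
        415) echo "Unknown file type" ;;
        422) echo "Unparsable document of known type" ;;
        500) echo "Internal error" ;;
        *)   echo "Unexpected status $1" ;;
    esac
}

# tika_unpack FILE OUT_ZIP
# Hypothetical helper: uploads FILE to /unpacker, writes the response body to
# OUT_ZIP and reports the status on stderr. Succeeds only when the code is 200.
tika_unpack() {
    code=$(curl -s -o "$2" -w '%{http_code}' -T "$1" \
        "${TIKA_URL:-http://localhost:9998}/unpacker")
    echo "$1: $code $(tika_status_message "$code")" >&2
    [ "$code" = "200" ]
}
```

For example, "tika_unpack Doc1_ole.doc /var/tmp/x.zip || echo 'nothing unpacked'" treats 204 (no attachments) the same as an error; a caller that wants to distinguish the two can check the code printed on stderr.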