Update README.md
Florents-Tselai authored Nov 14, 2024
1 parent 51cdf8a commit d711c94
Showing 1 changed file with 32 additions and 44 deletions.
76 changes: 32 additions & 44 deletions README.md
@@ -72,6 +72,9 @@ should work as usual:

```tsql
SELECT 'Below is the PDF we received ' || '/tmp/pgintro.pdf'::pdf;
```

```tsql
SELECT upper('/tmp/pgintro.pdf'::pdf::text);
```

@@ -110,28 +113,33 @@ SELECT '/tmp/pgintro.pdf'::pdf::text @@ to_tsquery('oracle');
### Document similarity with `pg_trgm`

You can use [pg_trgm](https://postgresql.org/docs/17/interactive/pgtrgm.html)
to get the similarity between to documents:
to get the similarity between two documents:

```tsql
CREATE EXTENSION pg_trgm;
SELECT similarity('/tmp/pgintro.pdf'::pdf::text, '/tmp/sample.pdf'::pdf::text);
```

### Content
### Metadata

```tsql
SELECT '/tmp/pgintro.pdf'::pdf;
```
The following functions are available:

```tsql
pdf
----------------------------------------------------------------------------------
PostgreSQL Introduction +
Digoal.Zhou +
7/20/2011Catalog +
 PostgreSQL Origin
```
- `pdf_title(pdf) → text`
- `pdf_author(pdf) → text`
- `pdf_num_pages(pdf) → integer`

Total number of pages in the document
- `pdf_page(pdf, integer) → text`

Get the i-th page as text
- `pdf_creator(pdf) → text`
- `pdf_keywords(pdf) → text`
- `pdf_metadata(pdf) → text`
- `pdf_version(pdf) → text`
- `pdf_subject(pdf) → text`
- `pdf_creation(pdf) → timestamp`
- `pdf_modification(pdf) → timestamp`
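
As a minimal sketch, the functions listed above can be combined to read several fields from one document in a single query. Only functions named in the list are used; the column aliases and the subquery alias are illustrative assumptions, and the path is the example file used elsewhere in this README.

```tsql
-- Sketch: collect several metadata fields for one PDF in a single row.
-- The aliases (title, author, num_pages, created_at, doc, d) are illustrative.
SELECT pdf_title(doc)     AS title,
       pdf_author(doc)    AS author,
       pdf_num_pages(doc) AS num_pages,
       pdf_creation(doc)  AS created_at
FROM (SELECT '/tmp/pgintro.pdf'::pdf AS doc) AS d;
```

Casting the path to `pdf` once in the subquery keeps the file reference in one place, so the same pattern should carry over to a table column of type `pdf`.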

```tsql
SELECT pdf_title('/tmp/pgintro.pdf');
@@ -144,6 +152,17 @@ SELECT pdf_title('/tmp/pgintro.pdf');
(1 row)
```

```tsql
SELECT pdf_author('/tmp/pgintro.pdf');
```

```tsql
pdf_author
------------
周正中
(1 row)
```

Getting a subset of pages

```tsql
@@ -184,37 +203,6 @@ SELECT pdf_subject('/tmp/pgintro.pdf');
(1 row)
```

### Metadata

The following functions are available:

- `pdf_title(pdf) → text`
- `pdf_author(pdf) → text`
- `pdf_num_pages(pdf) → integer`

Total number of pages in the document
- `pdf_page(pdf, integer) → text`

Get the i-th page as text
- `pdf_creator(pdf) → text`
- `pdf_keywords(pdf) → text`
- `pdf_metadata(pdf) → text`
- `pdf_version(pdf) → text`
- `pdf_subject(pdf) → text`
- `pdf_creation(pdf) → timestamp`
- `pdf_modification(pdf) → timestamp`

```tsql
SELECT pdf_author('/tmp/pgintro.pdf');
```

```tsql
pdf_author
------------
周正中
(1 row)
```

```tsql
SELECT pdf_creation('/tmp/pgintro.pdf');
```