Merge pull request pramsey#118 from Florents-Tselai/curlopt-userAgent
Allow setting CURLOPT_USERAGENT
pramsey authored Jan 2, 2021
2 parents 8384406 + fb022dd commit 6cd4f51
Showing 2 changed files with 28 additions and 3 deletions.
24 changes: 24 additions & 0 deletions README.md
@@ -155,6 +155,7 @@ FROM
--------+-----------------------------------------------------------
302 | http://www.google.ch/?gfe_rd=cr&ei=ACESWLy_KuvI8zeghL64Ag
```

## Concepts

Every HTTP call is made up of an `http_request` and an `http_response`.
@@ -231,6 +232,8 @@ Select [CURL options](https://curl.haxx.se/libcurl/c/curl_easy_setopt.html) are
* [CURLOPT_TCP_KEEPALIVE](https://curl.haxx.se/libcurl/c/CURLOPT_TCP_KEEPALIVE.html)
* [CURLOPT_TCP_KEEPIDLE](https://curl.haxx.se/libcurl/c/CURLOPT_TCP_KEEPIDLE.html)
* [CURLOPT_CONNECTTIMEOUT](https://curl.haxx.se/libcurl/c/CURLOPT_CONNECTTIMEOUT.html)
* [CURLOPT_USERAGENT](https://curl.haxx.se/libcurl/c/CURLOPT_USERAGENT.html)



For example,
@@ -245,6 +248,27 @@ SELECT * FROM http_list_curlopt();

Will set the proxy port option for the lifetime of the database connection. You can reset all CURL options to their defaults using the `http_reset_curlopt()` function.

Using this extension as an unsupervised, automated background process (e.g. as a trigger) may have unintended consequences for other servers.
It is considered best practice to include contact information with your requests,
so that administrators can reach you if your HTTP calls get out of control.

Certain API policies (e.g. [Wikimedia User-Agent policy](https://meta.wikimedia.org/wiki/User-Agent_policy)) may even require specific contact information
with each request. Others may disallow (via `robots.txt`) agents they don't recognize.

For such cases you can set the `CURLOPT_USERAGENT` option:

```sql
SELECT http_set_curlopt('CURLOPT_USERAGENT',
'Examplebot/2.1 (+http://www.example.com/bot.html) Contact [email protected]');

SELECT status, content::json ->> 'user-agent' AS user_agent FROM http_get('http://httpbin.org/user-agent');
```
```
status | user_agent
--------+-----------------------------------------------------------
200 | Examplebot/2.1 (+http://www.example.com/bot.html) Contact [email protected]
```

## Keep-Alive & Timeouts

*The `http_reset_curlopt()` approach described above is recommended. The global variables below will be deprecated and removed over time.*
7 changes: 4 additions & 3 deletions http.c
@@ -129,6 +129,7 @@ static http_curlopt settable_curlopts[] = {
{ "CURLOPT_TIMEOUT", NULL, CURLOPT_TIMEOUT, CURLOPT_LONG, false },
{ "CURLOPT_TIMEOUT_MS", NULL, CURLOPT_TIMEOUT_MS, CURLOPT_LONG, false },
{ "CURLOPT_CONNECTTIMEOUT", NULL, CURLOPT_CONNECTTIMEOUT, CURLOPT_LONG, false },
{ "CURLOPT_USERAGENT", NULL, CURLOPT_USERAGENT, CURLOPT_STRING, false },
{ "CURLOPT_IPRESOLVE", NULL, CURLOPT_IPRESOLVE, CURLOPT_LONG, false },
#if LIBCURL_VERSION_NUM >= 0x070903 /* 7.9.3 */
{ "CURLOPT_SSLCERTTYPE", NULL, CURLOPT_SSLCERTTYPE, CURLOPT_STRING, false },
@@ -766,6 +767,9 @@ http_get_handle()
curl_easy_setopt(handle, CURLOPT_CONNECTTIMEOUT, 1);
curl_easy_setopt(handle, CURLOPT_TIMEOUT_MS, 5000);

/* Set the default user agent to PG_VERSION_STR; users may override it via CURLOPT_USERAGENT */
curl_easy_setopt(handle, CURLOPT_USERAGENT, PG_VERSION_STR);

if (!handle)
ereport(ERROR, (errmsg("Unable to initialize CURL")));

@@ -1013,9 +1017,6 @@ Datum http_request(PG_FUNCTION_ARGS)
/* Set the target URL */
CURL_SETOPT(g_http_handle, CURLOPT_URL, uri);

/* Set the user agent */
CURL_SETOPT(g_http_handle, CURLOPT_USERAGENT, PG_VERSION_STR);

/* Restrict to just http/https. Leaving unrestricted */
/* opens possibility of users requesting file:/// urls */
/* locally */
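
One plausible reading of the two `http.c` hunks is that the default user agent moves from per-request setup into handle creation, so that a value set through `CURLOPT_USERAGENT` is not overwritten by the default on every call. The sketch below illustrates that ordering in plain C without libcurl. Everything in it is a hypothetical stand-in and not the extension's code: `fake_handle`, `set_user_agent`, `handle_init`, `request_agent`, `request_agent_old`, and the `DEFAULT_AGENT` placeholder for `PG_VERSION_STR` are all assumptions made for illustration.

```c
#include <stdio.h>

/* Placeholder for PG_VERSION_STR; the real value comes from PostgreSQL headers. */
#define DEFAULT_AGENT "PostgreSQL (placeholder version string)"

/* Hypothetical stand-in for a CURL easy handle; it only tracks the user agent. */
typedef struct {
    char user_agent[128];
} fake_handle;

/* Stand-in for curl_easy_setopt(handle, CURLOPT_USERAGENT, ua). */
void set_user_agent(fake_handle *h, const char *ua)
{
    snprintf(h->user_agent, sizeof h->user_agent, "%s", ua);
}

/* Mirrors the new placement: the default is applied once, at handle creation. */
void handle_init(fake_handle *h)
{
    set_user_agent(h, DEFAULT_AGENT);
}

/* Per-request setup after the change: the user agent is left untouched. */
const char *request_agent(const fake_handle *h)
{
    return h->user_agent;
}

/* Per-request setup before the change: the default is re-applied every time. */
const char *request_agent_old(fake_handle *h)
{
    set_user_agent(h, DEFAULT_AGENT);
    return h->user_agent;
}
```

In this model, calling `set_user_agent` after `handle_init` survives any number of `request_agent` calls, whereas `request_agent_old` resets the value to the default on each request. Whether the real extension behaved exactly this way before the change depends on the order in which `http_request` applies user-set options, which this excerpt does not show.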