- `Search()` and `Search_uri()` gain new parameter `ignore_unavailable` to determine what happens if an index name does not exist (#273)
- `connect()` gains new parameter `ignore_version`. Internally, `elastic` sometimes checks the Elasticsearch version that the user is connected to in order to determine what to do. This parameter may be useful when it's not possible to check the Elasticsearch version, e.g., when it's not possible to ping the root route of the API (#275)
- all docs bulk functions gain parameter `digits` that is passed down to `jsonlite::toJSON()` used internally. Thus, `digits` will control the number of decimal digits used in the JSON the package creates to be bulk loaded into Elasticsearch (#279)
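As a rough illustration of what `digits` controls, the sketch below calls `jsonlite::toJSON()` directly. It does not go through the bulk functions themselves, and the document contents are made up for the example.

```r
library(jsonlite)

# A made-up document with a value that has many decimal places.
doc <- list(name = "pi", value = 3.14159265358979)

# Fewer digits: the value is rounded in the generated JSON.
json_short <- as.character(toJSON(doc, auto_unbox = TRUE, digits = 2))

# More digits: more of the value survives serialization.
json_long <- as.character(toJSON(doc, auto_unbox = TRUE, digits = 10))

print(json_short)
print(json_long)
```

If numeric precision matters for the data being loaded, it is worth comparing the generated JSON against the source values before indexing.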
- fix README instructions on installing Elasticsearch from docker; there's no latest tag, so use a specific version (#277) thanks @ColinFay
- types were deprecated in Elasticsearch v7 and greater, and will be removed in Elasticsearch v8 and greater. this version makes type optional in all/most functions so that users with older Elasticsearch versions can still use them, but users with v7 or v8 installations don't have to use them (#251) (#270)
- gains new method `index_shrink()` for index shrinking (#192)
- through fixing functionality in `docs_bulk()` to allow pipeline attachments to work, all `docs_bulk` methods that do http requests (i.e., not prep fxns) gain the parameter `query` to pass through query parameters to the http request, including for example `pipeline`, `_source` etc. (#253)
- `Search()` and `Search_uri()` gain the parameter `track_total_hits` (default: `TRUE`) (#262) thanks @orenov
- the `warn` parameter in `connect()` was not being used across the entire package; now all methods should capture any warnings returned in the Elasticsearch HTTP API headers (#261)
- clarify in docs that `connect()` does not create a DBI like connection object (#265)
- fix warning in `index_analyze()` function where as is method `I()` should only be applied if the input parameter is not `NULL` - to avoid a warning (#269)
- fix to `docs_bulk_update()`: subsetting data.frame's was not working correctly when data.frame's had only 1 column; fixed (#260)
- fix to internal method `es_ver()` in the `Elasticsearch` class to be more flexible in capturing Elasticsearch version (#268)
- require newest `crul` version, helps fix a problem with passing along authentication details (#267)
(#87) The `connect()` function is essentially the same, with some changes, but now you pass the connection object to each function call. This indeed will break code. That's why this is a major version bump.
There is one very big downside to this: it breaks existing code. That's the big one. I do apologize for this, but I believe that is outweighed by the upsides: passing the connection object matches behavior in similar R packages (e.g., all the SQL database clients); you can now manage as many different connection objects as you like in the same R session; having the connection object as an R6 class allows us to have some simple methods on that object to ping the server, etc. In addition, all functions will error with an informative message if you don't pass the connection object as the first thing.
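A minimal sketch of the new calling pattern follows. The host, ports, and the `shakespeare` index name are assumptions for illustration only; the server calls are guarded so the script still runs when no Elasticsearch instance is reachable.

```r
library(elastic)

# Build two independent connection objects (host and ports are assumptions).
local_es <- connect(host = "127.0.0.1", port = 9200)
other_es <- connect(host = "127.0.0.1", port = 9201)

# Methods such as ping() live on the connection object itself.
server_up <- tryCatch({
  local_es$ping()
  TRUE
}, error = function(e) FALSE)

# The connection object is passed as the first argument of each function.
# "shakespeare" is a hypothetical index name.
if (server_up) {
  res <- tryCatch(
    Search(local_es, index = "shakespeare", size = 1),
    error = function(e) NULL
  )
}
```

Keeping each connection in its own object is what allows several clusters to be used side by side in one R session.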
- gains new ingest functions `pipeline_create`, `pipeline_delete`, `pipeline_get`, `pipeline_simulate`, and `pipeline_attachment()` (#191) (#226)
- gains new function `docs_delete_by_query()` and `docs_update_by_query()` to delete or update multiple documents at once, respectively; and new function `reindex()` to reindex all documents from one index to another (#237) (#195)
- now using `crul` for HTTP requests. this only should matter with respect to passing in curl options (#168)
- recent versions of Elasticsearch are starting to include warnings in response headers for deprecations and other things. These can now be turned on or off with `connect()` (#241)
- gains new functions for the bulk API: `docs_bulk_create()`, `docs_bulk_delete()`, `docs_bulk_index()`. each of which are tailored to doing the operation in the function name: creating docs, deleting docs, or indexing docs (#183)
- gains new function `type_remover()` as a utility function to help users remove types from their files to use for bulk loading; could be used on example files in this package or user supplied files (#180)
- gains function `alias_rename()` to rename aliases
- fixed `scroll()` example that wasn't working (#228)
- rework `alias_create()` (#230)
- move initialize Elasticsearch connection section of README higher up to emphasize it in the right place (#231) thanks @mbannert
- whether you want "simple" or "complete" errors no longer sets env vars internally, but is passed through the internal error checker so that choices about type of errors for different connection objects do not affect one another (#242)
- `docs_get` gains new parameters `source_includes` and `source_excludes` to include or exclude certain fields in the returned document (#246) thanks @Jensxy
- added more examples to `index_create()` (#211)
- add examples to `Search()` and `Search_uri()` docs of how to use profiles (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html) (#194)
- additional example added to `docs_bulk_prep()` for doing a mix of actions (i.e., delete, create, etc.)
- improved examples throughout package docs so that examples are more self-contained
- add `include_type_name` param in mappings fxns (#250)
- `docs_bulk_update()` was not handling boolean values correctly. now fixed (#239) (#240) thanks to @dpmccabe
- the `info()` method has been moved inside of the connection object. after calling `x = connect()` you can call `x$info()`
- the `ping()` method has been marked as deprecated; instead, call `ping()` on the connection object created by a call to `connect()`
- Gains new function `docs_bulk_update()` to do bulk updates to documents (#169)
- Vignettes weren't showing up on CRAN, fixed (#205)
- Added an example of using WKT in a query (#215)
- using markdown docs (#209)
- `id` is now optional in `docs_create()` - if you don't pass a document identifier Elasticsearch generates one for you (#216) thanks @jbrant
- `docs_bulk()` gains new parameter `quiet` to optionally turn off the progress bar (#202)
- Fix to `docs_bulk()` for encoding in different locales (#223) (#224) thanks @Lchiffon
- Fix for `index_get()`: you can now only pass in one value to the `features` parameter (one of settings, mappings, or aliases) (#218) thanks @happyshows
- Fix to `index_create()` to handle a list body, in addition to a JSON body (#214) thanks @emillykkejensen
- Fix to `docs_bulk()` for document IDs as factors (#212) thanks @AMR-KELEG
- Temporary files created when using `docs_bulk()` (and taking up disk space) are cleaned up now (deleted), though if you pass in your own file paths you have to clean them up (#208) thanks @emillykkejensen
- changed to S3 setup, with methods for `character` and `list`.
- first parameter of `scroll()` and `scroll_clear()` is now `x`, should only matter if you specified the parameter name for the first parameter
- `scroll` parameter in `scroll()` function is now `time_scroll`
- Added `asdf` (for "as data.frame") to `scroll()` to give back a data.frame (#163)
- streaming option added to `scroll()`, see parameter `stream_opts` in the docs and examples (#160)
- general docs improvements (#182)
- New functions `tasks` and `tasks_cancel` for the tasks API (#145)
- streaming option added to `Search()`, see parameter `stream_opts` in the docs and examples. `scroll` parameter in `Search()` is now `time_scroll` (#160)
- New function `field_caps` (for field capabilities) - in ES v5.4 and greater
- New function `reindex` for the reindex ES API (#134)
- New functions `index_template_get`, `index_template_put`, `index_template_exists`, and `index_template_delete` for the indices templates ES API (#133)
- New function `index_forcemerge` for the ES index `_forcemerge` route (#176)
- Added examples to docs for `Search` and `Search_uri` for how to show progress bar (#162)
- Small docs fix to `docs_bulk` to clarify what's allowed as first parameter input (#173)
- `docs_bulk` change to internal JSON preparation to use `na = "null"` and `auto_unbox = TRUE` in the `jsonlite::toJSON` call. This means that `NA`'s in R become `null` in the JSON and atomic vectors are unboxed (#174) thanks @pieterprovoost
- `mapping_create` gains `update_all_types` parameter; and new man file to explain how to enable fielddata if sorting needed (#164)
- `suggest` is used through query DSL instead of a route, added example to `Search` (#102)
- Now caching internal `ping()` calls - so that after the first one we use the cached version if called again within the same R session. Should help speed up some code with respect to http calls (#184) thanks @henfiber
- Fixes to percolate functions and docs for differences in percolate functionality pre v5 and post v5 (#176)
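To make the `na = "null"` and `auto_unbox = TRUE` change to `docs_bulk` concrete, here is a small sketch that calls `jsonlite::toJSON()` directly on a made-up document. It only illustrates the two arguments and is not the package's internal code.

```r
library(jsonlite)

# A made-up document with a missing numeric value and length-one vectors.
doc <- list(id = 1, score = NA_real_, tags = "a")

# With na = "null" the missing value is written as JSON null, and with
# auto_unbox = TRUE length-one vectors become scalars instead of arrays.
json <- as.character(toJSON(doc, na = "null", auto_unbox = TRUE))

print(json)
```

Writing missing values as `null` rather than as a string keeps numeric fields numeric on the Elasticsearch side.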
- All http requests now contain `content-type` headers, for the most part `application/json` (#197), though functions that work with the bulk API use `application/x-ndjson` (#186)
- docs fix to `mapping_create` egs (#199)
- README now includes example of how to connect when your ES is using X-pack (#185) thanks @ugosan
- fixes for normalizing url paths (#181)
- fix to `type_exists` to work on ES versions less than and greater than v5 (#189)
- fix to `field_stats` to indicate that it's no longer available in ES v5.4 and above - and that the `fields` parameter in ES >= v5 is gone (#190)
- New function `docs_update()` to do partial document updates (#152)
- New function `docs_bulk_prep()` to prepare bulk format files that you can use to load into Elasticsearch with this package, on the command line, or in any other context (Python, Ruby, etc.) (#154)
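For readers unfamiliar with the bulk format that such files use, the sketch below builds the newline-delimited action/document pairs by hand with `jsonlite`. The helper `to_bulk_lines()`, the index name `test`, and the use of row numbers as ids are all hypothetical; this is not how `docs_bulk_prep()` is implemented.

```r
library(jsonlite)

# Hypothetical helper: turn each data.frame row into two lines,
# an "index" action line followed by the document itself.
to_bulk_lines <- function(df, index) {
  lines <- character(0)
  for (i in seq_len(nrow(df))) {
    action <- list(index = list(`_index` = index, `_id` = i))
    source <- as.list(df[i, , drop = FALSE])
    lines <- c(
      lines,
      as.character(toJSON(action, auto_unbox = TRUE)),
      as.character(toJSON(source, auto_unbox = TRUE, na = "null"))
    )
  }
  lines
}

df <- data.frame(name = c("a", "b"), n = c(1, 2), stringsAsFactors = FALSE)
bulk <- to_bulk_lines(df, "test")

# writeLines() ends the file with a newline, which the bulk API expects.
bulk_file <- tempfile(fileext = ".json")
writeLines(bulk, bulk_file)
```

Because the result is a plain text file, it can be loaded by any client that speaks the bulk API, not only this package.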
- We're no longer running a check that your ES server is up before every request to the server. This makes requests faster, but may lead to less informative errors when your server is down or in some other state than fully operational (#149)
- Tweaks here and there to make sure `elastic` works with Elasticsearch v5. Note that not all v5 features are included here yet. (#153)
- `docs_bulk()` was not working on single column data.frame's. now is working. (#151) thanks @gustavobio
- `docs_*` functions now support ids with whitespace in them. (#155)
- fixes to `docs_mget()` to fix requesting certain fields back.
- Allow usage of `es_base` parameter in `connect()` - Now, instead of `stop()` on `es_base` usage, we use its value for `es_host`. Only pass in one or the other of `es_base` and `es_host`, not both. (#146) thanks @MarcinKosinski
- package gains new set of functions for working with search templates: `Search_template()`, `Search_template_register()`, `Search_template_get()`, `Search_template_delete()`, and `Search_template_render()` (#101)
- Improved documentation for `docs_delete`, `docs_get` and `docs_create` to list correctly that numeric and character values are accepted for the id parameter - before stated that numeric values allowed only (#144) thanks @dominoFire
- Added tests for illegal characters in index names.
- Fixed bug introduced into `Search` and related functions where wildcards in indices didn't work. Turned out we url escaped twice unintentionally. Fixed now, and more tests added for wildcards. (#143) thanks @martijnvanbeers
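The effect of escaping twice can be reproduced with base R alone. The sketch below uses `utils::URLencode()` as a stand-in (an assumption; it is not the escaping function the package uses) and a made-up index pattern.

```r
# A wildcard index pattern, as a user might pass it to a search function.
index <- "logstash-*"

# Escaping once turns the wildcard into a percent-encoded character.
once <- utils::URLencode(index, reserved = TRUE)

# Escaping the already-escaped value also encodes the "%" sign itself.
twice <- utils::URLencode(once, reserved = TRUE, repeated = TRUE)

# Decoding once, as a server would, recovers the wildcard only when the
# value was escaped a single time.
print(utils::URLdecode(once))
print(utils::URLdecode(twice))
```

After a double escape the server sees a literal percent-encoded string rather than a wildcard, so no index matches.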
- Changed `docs_bulk()` to always return a list, whether it's given a file, data.frame, or list. For a file, a named list is returned, while for a data.frame or list an unnamed list is returned as many chunks can be processed and we don't attempt to wrangle the list output. Inputs of data.frame and list used to return `NULL` as we didn't return anything from the internal for loop. You can wrap `docs_bulk` in `invisible()` if you don't want the list printed (#142)
- Fixed bug in `docs_bulk()` and `msearch()` in which base URL construction was not done correctly (#141) thanks @steeled !
- New function `scroll_clear()` to clear search contexts created when using `scroll()` (#140)
- New function `ping()` to ping an Elasticsearch server to see if it is up (#138)
- `connect()` gains new parameter `es_path` to specify a context path, e.g., the `bar` in `http://foo.com/bar` (#137)
- Change all `httr::content()` calls to parse to plain text and UTF-8 encoding (#118)
- Added note to docs that when using `scroll()` all scores are zero b/c scores are not calculated/tracked (#127)
- `connect()` no longer pings the ES server when run, but can now be done separately with `ping()` (#139)
- Let http request headers be sent with all requests - set with `connect()` (#129)
- Added `transport_schema` param to `connect()` to specify http or https (#130)
- By default use UUIDs with bulk API with `docs_bulk()` (#125)
- Fix to fail well on empty body sent by user (#119)
- Fix to `docs_bulk()` function so that user supplied `doc_ids` are not changed at all now (#123)
Compatibility for many Elasticsearch versions has improved. We've tested on ES versions
from the current (`v2.1.1`) back to `v1.0.0`, and `elastic` works with all versions.
There are some functions that stop with a message with some ES versions simply
because older versions may not have had particular ES features. Please do let us
know if you have problems with older versions of ES, so we can improve compatibility.
- Added `index_settings_update()` function to allow updating index settings (#66)
- All errors from the Elasticsearch server are now given back as `JSON`. Error parsing has thus changed in `elastic`. We now have two levels of error behavior: 'simple' and 'complete'. These can be set in `connect()` with the `errors` parameter. Simple errors give back often just that there was an error, sometimes a message with explanation is supplied. Complete errors give more explanation and even the ES stack trace if supplied in the ES error response (#92) (#93)
- New function `msearch()` to do multi-searches. This works by defining queries in a file, much like is done for a file to be used in bulk loading. (#103)
- New function `validate()` to validate a search. (#105)
- New suite of functions to work with the percolator service: `percolate_count()`, `percolate_delete()`, `percolate_list()`, `percolate_match()`, `percolate_register()`. The percolator works by first storing queries into an index and then you define documents in order to retrieve these queries. (#106)
- New function `field_stats()` to find statistical properties of a field without executing a search (#107)
- Added a Code of Conduct
- New function `cat_nodeattrs()`
- New function `index_recreate()` as a convenience function that detects if an index exists, and if so, deletes it first, then creates it again.
- `docs_bulk()` now supports passing in document ids (to the `_id` field) via the parameter `doc_ids` for each input data.frame or list & supports using ids already in data.frame's or lists (#83)
- `cat_*()` functions cleaned up. previously, some functions had parameters that were essentially silently ignored. Those parameters dropped now from the functions. (#96)
- Elasticsearch had for a while 'search exists' functionality (via `/_search/exists`), but have removed that in favor of using regular `_search` with `size=0` and `terminate_after=1` instead. (#104)
- New parameter `lenient` in `Search()` and `Search_uri` to allow format based failures to be ignored, or not ignored.
- Better error handling for `docs_get()` when the document isn't found
- Fixed problems in `docs_bulk()` in the use case where users use the function in a for loop, for example, and indexing started over, replacing documents with the same id (#83)
- Fixed bug in `cat_()` functions in which they sometimes failed when `parse=TRUE` (#88)
- Fixed bug in `docs_bulk()` in which user supplied document IDs weren't being passed correctly internally (#90)
- Fixed bug in `Search()` and `Search_uri()` where multiple indices weren't supported, whereas they should have been - supported now (#115)
- The following functions are now defunct: `mlt()`, `nodes_shutdown()`, `index_status()`, and `mapping_delete()` (#94) (#98) (#99) (#110)
- Added `index_settings_update()` function to allow updating index settings (#66)
- Replace `RCurl::curlEscape()` with `curl::curl_escape()` (#81)
- Explicitly import non-base R functions (#80)
- Fixed problems introduced with `v1` of `httr`
- New function `Search_uri()` where the search is defined entirely in the URL itself. Especially useful for cases in which `POST` requests are forbidden, e.g., on a server that prevents `POST` requests (which the function `Search()` uses). (#58)
- New function `nodes_shutdown()` (#23)
- `docs_bulk()` gains ability to push data into Elasticsearch via the bulk http API from data.frame or list objects. Previously, this function only would accept a file formatted correctly. In addition, gains new parameters: `index` - The index name to use. `type` - The type name to use. `chunk_size` - Size of each chunk. (#60) (#67) (#68)
- `cat_*()` functions gain new parameters: `h` to specify what fields to return; `help` to output available columns, and their meanings; `bytes` to give numbers back machine friendly; `parse` to parse to a data.frame or not
- `cat_*()` functions can now optionally capture data returned in to a data.frame (#64)
- `Search()` gains new parameter `search_path` to set the path that is used for searching. The default is `_search`, but sometimes in your configuration you've setup so that you don't need that path, or it's a different path. (023d28762e7e1028fcb0ad17867f08b5e2c92f93)
- In `docs_mget()` added internal checker to make sure user passes in the right combination of `index`, `type`, and `id` parameters, or `index` and `type_id`, or just `index_type_id` (#42)
- Made `index`, `type`, and `id` parameters required in the function `docs_get()` (#43)
- Fixed bug in `scroll()` to allow long `scroll_id`'s by passing scroll ids in the body instead of as query parameter (#44)
- In `Search()` function, in the `error_parser()` error parser function, check to see if `error` element returned in response body from Elasticsearch, and if so, parse error, if not, pass on body (likely empty) (#45)
- In `Search()` function, added helper function to check size and from parameter values passed in to make sure they are numbers. (#46)
- Across all functions where `index` and `type` parameters used, now using `RCurl::curlEscape()` to URL escape. Other parameters passed in go through `httr` CRUD methods, which do URL escaping for us. (#49)
- Fixed links to development repo in DESCRIPTION file
First version to go to CRAN.
- Added a function `scroll()` and a `scroll` parameter to the `Search()` function (#36)
- Added the function `explain()` to easily get at explanation of search results.
- Added a help file to help explain time and distance units. See `?units-time` and `?units-distance`
- New help file added to list and explain the various search functions. See `?searchapis`
- New function `tokenizer_set()` to set tokenizers
- `connect()` run on package load to set default base url of `localhost` and port of `9200` - you can override this by running that fxn yourself, or storing `es_base`, `es_port`, etc. in your `.Rprofile` file.
- Made CouchDB river plugin functions not exported for now, may bring back later.
- Added vignettes for an intro and for search details and examples (#2)
- `es_search()` changed to `Search()`.
- More datasets included in the package for bulk data load (#16)
- All examples wrapped in `\dontrun` instead of `\donttest` so they don't fail on CRAN checks.
- `es_search_body()` removed - body based queries using the query DSL moved to the `Search()` function, passed into the `body` parameter.
- Reworked package API. Almost all functions have new names. Sorry for this major change but it needed to be done. This brings `elastic` more in line with the official Elasticsearch Python client (http://elasticsearch-py.readthedocs.org/en/master/).
- Similar functions are grouped together in the same manual file now to make finding related functions easier. For example, all functions that work with indices are on the `index` manual page, and all functions prefixed with `index_()`. Thematic manual files are: `index`, `cat`, `cluster`, `alias`, `cdbriver`, `connect`, `documents`, `mapping`, `nodes`, and `search`.
- Note that the function `es_cat()` was changed to `cat_()` - we avoided `cat()` because as you know there is already a widely used function in base R, see `base::cat()`.
- We changed `cat` functions to separate functions for each command, instead of passing the command in as an argument. For example, `cat('aliases')` becomes `cat_aliases()`.
- The `es_` prefix remains only for `es_search()`, as we have to avoid conflict with `base::search()`.
- Removed `assertthat` package import, using `stopifnot()` instead (#14)
- First version.