Joins for Elasticsearch 2.x! Announcing SIREn Join 2.x

SIREn Join is a plugin for Elasticsearch that adds new search actions and a filter query parser, which together make it possible to perform a “Filter Join” between two sets of documents (in the same index or in different indices).
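To make the idea concrete, here is a minimal conceptual sketch in plain Python. It is not the plugin's API or implementation: it assumes a filter join behaves like a semi-join, where the join values collected from one document set are used to filter the other. The function name, field names and sample documents are all hypothetical.

```python
def filter_join(source_docs, source_field, target_docs, target_field,
                source_predicate=lambda doc: True):
    """Keep the target documents whose join field matches a value
    found in the (optionally pre-filtered) source documents."""
    # Step 1: collect the join values from the matching source documents.
    terms = set()
    for doc in source_docs:
        if source_predicate(doc) and doc.get(source_field) is not None:
            terms.add(doc[source_field])
    # Step 2: use the collected values as a filter on the target set.
    return [doc for doc in target_docs if doc.get(target_field) in terms]


# Hypothetical data: two "indices" held as lists of dicts.
companies = [
    {"id": 1, "name": "Acme", "country": "IE"},
    {"id": 2, "name": "Globex", "country": "US"},
]
articles = [
    {"title": "Acme raises funds", "mentions": 1},
    {"title": "Globex expands", "mentions": 2},
    {"title": "Unrelated news", "mentions": 9},
]

# Articles that mention a company based in Ireland.
irish_articles = filter_join(
    companies, "id", articles, "mentions",
    source_predicate=lambda c: c["country"] == "IE",
)
```

In the plugin, the two sets can live in different indices and the collected values have to travel between nodes, which is why the way those values are encoded and cached matters for performance.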

SIREn Join is at the heart of the Kibi data intelligence platform, our friendly Kibana fork, but it can also be used standalone to write applications with the amazing power of cross-index joins 🙂. Please refer to our previous blog post on the topic, which includes ample examples.

Today we are delighted to announce the release of SIREn Join 2.1.2, which is compatible with Elasticsearch 2.1.2 and will power the upcoming Kibi 0.3, based on Kibana 4.4.

You can install it using the following command:

bin/plugin install solutions.siren/siren-join/2.1.2

The focus of this release is compatibility with Elasticsearch 2.x. One of the major changes in Elasticsearch 2.x was the merging of queries and filters. As the Filter Join was originally implemented as a filter, we had to port it to the new query API. One drawback of this change is the introduction of a new query cache policy, which is currently too restrictive for certain SIREn Join scenarios.

The issue is that, as things stand now in Elasticsearch, queries are never cached for small segments, which can hurt performance in certain SIREn Join scenarios. We are currently discussing with Elasticsearch how to improve this. For the moment, the fallback solution is to disable the new query cache policy and activate the old one with the setting index.queries.cache.everything: true.
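As an illustration, the sketch below builds a request body that carries this setting. It assumes the setting is supplied as an index-level setting when the index is created; the index name and the helper function are hypothetical, and you should check the documentation for your Elasticsearch version before relying on this.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX_NAME = "articles"


def cache_everything_body():
    """Build an index-creation body that enables the fallback cache policy.

    The setting name comes from this post; passing it at index creation
    time is an assumption.
    """
    return {"settings": {"index.queries.cache.everything": True}}


if __name__ == "__main__":
    # The printed JSON could be used as the body of a create-index
    # request for INDEX_NAME.
    print(json.dumps(cache_everything_body(), indent=2))
```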

Besides this, v2.1.2 includes enhancements such as support for an alternative terms encoding and a configurable node-level filter join cache. The alternative terms encoding is based on variable-length integers and significantly improves performance if your data is compatible.
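For readers unfamiliar with variable-length integer encoding, here is a minimal sketch of the general technique: 7 bits of payload per byte, with the high bit marking that more bytes follow. It is only an illustration of why small integer values become cheap to store and transfer. The plugin's actual encoding may differ, and the function names are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer using 7 payload bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
        n >>= 7
    out.append(n)  # last byte has the high bit clear
    return bytes(out)


def encode_terms(values):
    """Concatenate the varint encodings of a sequence of integers."""
    return b"".join(encode_varint(v) for v in values)


def decode_terms(data):
    """Decode a byte string produced by encode_terms."""
    values, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation byte: keep accumulating
        else:
            values.append(current)
            current, shift = 0, 0
    return values


encoded = encode_terms([5, 300, 70000])
assert decode_terms(encoded) == [5, 300, 70000]
```

With this scheme a value below 128 takes a single byte instead of a fixed 4 or 8, which is presumably what "compatible" data benefits from: join values that are (small) integers rather than arbitrary strings.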

We are also hard at work on the upcoming release. We will introduce an additional terms encoding based on Bloom filters and a circuit breaker to protect against out-of-memory errors, and many more improvements are on the roadmap, with unlimited scalability as the goal.

You might want to sign up to our mailing list (see the box in our footer) and to the SIREn User Group to find out more about all of this, to be informed about updates and to participate in discussions about structured document search.

Renaud Delbru
