Elasticsearch’s Java QueryBuilder
Julian Setiawan
Posted on June 25, 2019
In Solace PubSub+ Cloud, we began storing metrics early on in anticipation of accounting and billing. The problem was that we weren’t quite sure which metrics would be used or what sort of queries would be needed to support our accounting and billing needs.
We chose Elasticsearch for storage as we trusted its powerful search capabilities and scalability. However, one aspect that we grossly undervalued was its fantastic Java API. Although it is generally a facade for Elasticsearch’s REST API, a particularly clever feature has been helping us build our metrics microservice with great velocity and flexibility without compromising robustness.
When we first started using Elasticsearch, we built queries in a pretty straightforward way:
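import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Every clause must match: metric identity, organization, and time window.
BoolQueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery("metricName", "Host"))
    .must(QueryBuilders.termQuery("metricType", "DiskSpace"))
    .must(QueryBuilders.termQuery("organizationId", organizationId))
    .must(QueryBuilders.rangeQuery("startTime").gte(startTime))
    .must(QueryBuilders.rangeQuery("endTime").lte(endTime));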
We eventually realized that nearly every query needed an organization’s ID and a time range tacked on, so we abstracted those clauses out and required callers to supply only the metric-specific part of the query.
This worked at first, but we didn’t want to have to edit code every time we needed to calculate a new metric or slightly change an existing one. This is when we discovered Elasticsearch’s Wrapper Query.
On the surface, this is simple functionality: you hand QueryBuilders a query written as a JSON string. Something like this:
{
  "bool" : {
    "must" : [
      { "terms" : { "metricName" : ["Host"] } },
      { "terms" : { "metricType" : ["DiskSpace"] } }
    ]
  }
}
You then pass that string to QueryBuilders like this:
QueryBuilders.wrapperQuery(json);
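For anyone curious what that call corresponds to on the wire: in the REST query DSL, a wrapper query is an object named `wrapper` whose `query` field carries the wrapped query as a Base64-encoded string. The sketch below reproduces that shape using only the JDK. The class and method names are made up for illustration, and this is not the client's own serialization code, just a way to see the equivalent REST form.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Elasticsearch client.
class WrapperQueryJson {

    // Builds the REST form of a wrapper query: {"wrapper":{"query":"<base64>"}}.
    // The Base64 alphabet contains no quotes or backslashes, so the encoded
    // value can be embedded in a JSON string without further escaping.
    static String toWrapperJson(String queryJson) {
        String encoded = Base64.getEncoder()
                .encodeToString(queryJson.getBytes(StandardCharsets.UTF_8));
        return "{\"wrapper\":{\"query\":\"" + encoded + "\"}}";
    }

    public static void main(String[] args) {
        String json = "{\"terms\":{\"metricName\":[\"Host\"]}}";
        // Prints a wrapper object that can be used as a query clause
        // in a REST search request body.
        System.out.println(toWrapperJson(json));
    }
}
```

Using the printed object as a query clause in a REST search request should behave like the original JSON, which can be handy when comparing what the Java client sends against a hand-written request.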
The next question is how to augment the query so that it is restricted to a given organization ID and time period. A gut reaction could be to add a token somewhere in the JSON string to be replaced, but this is where the Elasticsearch API shines.
You may have noticed that the Wrapper Query is just another QueryBuilder, which means it can be nested inside other builders like any hand-built query. Instead of editing the JSON, we wrap it in a bool query and add the remaining clauses in Java. This let us re-use most of the abstractions we already had for adding organization ID and time periods to our metric queries:
// The JSON-defined metric query becomes one more clause of the bool query.
BoolQueryBuilder query = QueryBuilders.boolQuery()
    .must(QueryBuilders.wrapperQuery(json))
    .must(QueryBuilders.termQuery("organizationId", organizationId))
    .must(QueryBuilders.rangeQuery("startTime").gte(startTime))
    .must(QueryBuilders.rangeQuery("endTime").lte(endTime));
And with this, we had our solution. We were able to churn out new Elasticsearch queries easily or update existing ones without any code changes while re-using our well-tested abstractions for specifying well-known search parameters. Another awesome benefit was being able to directly use our JSON files as queries to Elasticsearch’s REST API for easier testing and validation.
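To illustrate the "no code changes" part, here is a minimal sketch of how query files could be loaded at runtime. It assumes one JSON file per metric in a single directory; the class name, the file-naming scheme, and the key format are all hypothetical and not necessarily how our service does it. It uses only the JDK, and the string it returns is what you would pass to QueryBuilders.wrapperQuery.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical loader: maps a metric key to <queryDir>/<metricKey>.json.
class MetricQueryFiles {
    private final Path queryDir;

    MetricQueryFiles(Path queryDir) {
        this.queryDir = queryDir;
    }

    // Returns the raw JSON text of the metric-specific query.
    String load(String metricKey) throws IOException {
        // Only allow simple keys so a caller cannot escape the query directory.
        if (!metricKey.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid metric key: " + metricKey);
        }
        Path file = queryDir.resolve(metricKey + ".json");
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```

With something like this in place, adding a metric means dropping a new JSON file into the directory, while the organization ID and time range clauses stay in the shared Java code.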
What do you think of this solution? Are there any other Elasticsearch API features we should have used instead? We are still learning and love hearing about new features and use cases.