I’ve been wanting a good comparison between ES and Solr for a while, since the question often comes up, and I don’t have enough Solr experience to address it well. The start of this series looks really promising.

Some things from this post that make ES so great for our use cases:

  • Multiple types of documents with different structures (think about indexing posts and comments from a WordPress blog in the same index)
  • Pretty much all settings and document mappings can be changed on the fly without restarting the cluster (though it does take some forethought to ensure you can use them all)
  • Routing lets you limit a search to a single shard, which is particularly useful for faceted search.
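
To make the routing point concrete, here is a minimal sketch of a routed faceted search request. It assumes a hypothetical `blog` index whose documents were indexed with a per-blog routing value. The index name, the `tags` field, and the helper function are made up for illustration, and the body uses the `facets` syntax of the ElasticSearch releases from around the time of this post. Nothing is sent over the network; the code only assembles the request.

```python
import json
from urllib.parse import urlencode


def build_routed_facet_search(index, routing_value, text, facet_field):
    """Return (path, body) for a search limited to one routing value.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    # The `routing` query-string parameter restricts the search to the
    # shard(s) that the given routing value maps to.
    path = "/{}/_search?{}".format(index, urlencode({"routing": routing_value}))
    body = {
        "query": {"query_string": {"query": text}},
        # Terms facet: counts of the most common values of facet_field.
        "facets": {facet_field: {"terms": {"field": facet_field, "size": 10}}},
    }
    return path, body


path, body = build_routed_facet_search("blog", "blog-42", "elasticsearch", "tags")
print(path)
print(json.dumps(body, indent=2))
```

The same routing value has to be supplied when the documents are indexed, otherwise the search may look at a shard that does not hold them.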

[Note: for those of you who don’t have the time or inclination to go through all the technical details, here’s a high-level, up-to-date (2015) Solr vs. Elasticsearch overview]


Good Solr vs. ElasticSearch coverage is long overdue. At Sematext we make good use of our own Search Analytics and pay attention to what people search for. Not surprisingly, lots of people are wondering when to choose Solr and when to choose ElasticSearch, and this SolrCloud vs. ElasticSearch question is something we regularly address in our search consulting engagements.

As the Apache Lucene 4.0 release approaches, and with it the Solr 4.0 release, we thought it would be beneficial to take a deeper look and compare the two leading open source search engines built on top of Lucene – Apache Solr and ElasticSearch. Because the topic is very wide and can go deep, we are publishing our research as…


Quickly Build Faceted Search with ElasticSearch and Backbone.js

I’ve been working with ElasticSearch off and on for the past year, and recently I’ve done a lot of work using Backbone.js to build interactive elements for web pages. It was time to combine the two into a modular library: es-backbone.js.

Faceted search is one of the more powerful aspects of ElasticSearch. For es-backbone I was inspired by Karel Minarik’s very cool data visualization example. I started from his implementation, then veered away from Protovis to use jQuery Flot for the charting when I realized my JavaScript graphics abilities were not up to making use of Protovis. Flot makes it very fast to build and customize charts.

But managing all the data that comes back with your faceted search results, displaying it for the user, and allowing them to interact with the data and filter it further is also a bit of a headache. Using Backbone to keep a model of the current query the user is doing and another model of the search results helps keep the data well-organized and easy to update. By creating highly modular Backbone Views I can quickly customize a search page depending on what fields the data contains and how we want to display it.
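
The idea of a query model that views listen to can be sketched independently of Backbone. The following is not the es-backbone code, just a minimal illustration in Python of the same pattern: a model holds the query text and the active filters, calls its listeners whenever a filter changes, and can turn itself into a request body. The class name, method names, and field names are all hypothetical, and the body uses the `filtered` query and `facets` syntax of ElasticSearch releases from that period.

```python
class QueryModel:
    """Holds the user's current query and notifies listeners on change."""

    def __init__(self, text="*"):
        self.text = text
        self.filters = {}     # field name -> list of selected terms
        self._listeners = []  # callbacks invoked after every change

    def on_change(self, callback):
        self._listeners.append(callback)

    def add_filter(self, field, term):
        terms = self.filters.setdefault(field, [])
        if term not in terms:
            terms.append(term)
            self._notify()

    def remove_filter(self, field, term):
        terms = self.filters.get(field, [])
        if term in terms:
            terms.remove(term)
            if not terms:
                del self.filters[field]
            self._notify()

    def _notify(self):
        for callback in self._listeners:
            callback(self)

    def to_request_body(self, facet_fields):
        """Build a search body with one terms facet per requested field."""
        query = {"query_string": {"query": self.text}}
        if self.filters:
            # Combine one terms filter per field so every filter must match.
            term_filters = [
                {"terms": {field: terms}}
                for field, terms in sorted(self.filters.items())
            ]
            query = {"filtered": {"query": query, "filter": {"and": term_filters}}}
        facets = {field: {"terms": {"field": field}} for field in facet_fields}
        return {"query": query, "facets": facets}


# Usage: a "view" registers a callback and is told about each filter change.
changes = []
model = QueryModel("wordpress")
model.on_change(lambda m: changes.append(len(m.filters)))
model.add_filter("author", "alice")
body = model.to_request_body(["author", "tags"])
```

In a real page the callback would re-run the search and hand the response to a second model holding the results, which the chart and list views render from.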

I don’t have any public ElasticSearch data for a good demo, so a screenshot will have to do:

Each part of the page is a separate Backbone View, which allows you to customize the page very quickly based on what data you have. Think that pie chart of authors is too busy? Replacing it with a list of the top sites (similar to the tags list) takes only a few lines of code. Currently the library has Views for displaying facets as:

  • A pie chart of terms or ranges
  • A list of terms (with counts and percentages)
  • A timeline of dates (which automatically switches scales between months, weeks, and days)
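
As an illustration of how a timeline could switch scales, here is a minimal sketch that picks a `date_histogram` facet interval from the date span being displayed. It is not taken from the library: the function names, the `date` field, and the cut-offs of 60 and 365 days are assumptions chosen only to show the idea.

```python
from datetime import date


def pick_interval(start, end):
    """Choose a date_histogram interval from the span of the results.

    The cut-offs (60 and 365 days) are arbitrary illustrative values.
    """
    span_days = (end - start).days
    if span_days <= 60:
        return "day"
    if span_days <= 365:
        return "week"
    return "month"


def timeline_facet(field, start, end):
    """Build a date_histogram facet using the interval chosen above."""
    return {
        "date_histogram": {"field": field, "interval": pick_interval(start, end)}
    }


print(timeline_facet("date", date(2012, 1, 1), date(2012, 1, 31)))
```

When the user drills down into a narrower date range, recomputing the interval from the new range is what makes the chart move from months to weeks to days.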

All of these views will re-filter your results when you click on the pie chart, timeline, or list, so you can drill down into your search results. I also used Select2.js in tagging mode to make it easy to add and remove filters. I think the interaction is pretty cool, and wish I had some live public data to show it off on.

The library makes it very fast to build new sites. Once the data is indexed, you can build a site in less than an hour (my last one took 35 minutes). I’ve released it on GitHub in the hopes that others will find it useful (patches welcome). I’ve included a simple example which you can grep through for “TODO” to find the parts you need to edit to customize for your own application.

One of many great recent posts from Sematext on ElasticSearch performance.

We’ve been doing a ton of work with ElasticSearch. Not long ago, we had a few situations where ElasticSearch would “eat” all the JVM heap memory we gave it. It was so hungry, we could not feed it enough memory to keep it happy. It was insatiable. After some troubleshooting and looking at SPM for ElasticSearch (by the way, we released a new version of the SPM agent earlier this week, so if you don’t have it, go grab agent v1.5.0), we figured out the cause – the default ElasticSearch field cache setting was not quite right for our deployment. In this post we’ll share our experience on this topic, explain why this was happening and how to minimize the negative effect of large field caches.

ElasticSearch Cache Types

There are two types of caches in ElasticSearch whose behaviors you can control. The first cache is the filter cache. This cache is responsible for…
