We are going to look at some powerful Solr features and how Solr differs from MySQL.
What Is Apache Solr?
Apache Solr is an open source enterprise search server. It stores information in a way that makes searching very fast. In a nutshell, it is also a storage system, like SQL and NoSQL databases.
Solr is written in Java and uses the Lucene search library for its core functionality. You don’t need to know Java to work with Solr.
How Is It Different from MySQL?
If you’re new to Solr, the best way to understand the internals of Solr is to compare it with MySQL.
- MySQL stores information in tables and rows, whereas Solr stores information as documents in an index. A schema defines the structure of the documents, which are commonly submitted to Solr as XML or JSON.
- You can have multiple tables in MySQL; similarly, you can have multiple schemas in Solr, one per core.
- Columns define the structure of a table; similarly, fields define the structure of a schema in Solr.
- In MySQL you store data as rows, whereas in Solr you store data as documents.
- In MySQL, when a column is indexed, the rows are arranged in a tree-like structure (typically a B-tree). In Solr, when a field is indexed, its contents are arranged into an inverted index.
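To make the row-versus-document comparison concrete, here is a minimal Python sketch. The `articles` table, its columns, and the `tags` field are hypothetical and chosen only for illustration; the point is that one Solr document can hold what MySQL would spread across a row and a join table.

```python
import json

# Hypothetical MySQL row from an `articles` table: one value per column.
row = {"id": 42, "title": "Getting started with Solr", "body": "Solr is a search server."}

# Hypothetical tags that MySQL would keep in a separate join table.
tags = ["search", "solr"]


def row_to_document(row, tags):
    """Flatten a row and its related tags into one Solr-style document."""
    return {
        "id": str(row["id"]),   # the unique key is commonly a string field
        "title": row["title"],
        "body": row["body"],
        "tags": list(tags),     # a multi-valued field replaces the join table
    }


document = row_to_document(row, tags)

# Solr's JSON update handler accepts a list of documents as the request body.
payload = json.dumps([document], indent=2)
print(payload)
```

The sketch only builds the payload; sending it to a running Solr core is left out because the URL and core name depend on your installation.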
What Makes It Fast for Search?
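Solr uses an inverted index to search for words in documents. For each word, the index lists the documents that contain it, and for a multi-word query Solr intersects those lists to produce the final result. An ordinary column index in a relational database is not organized this way, which is why it is poorly suited to finding words inside text.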
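The idea can be sketched in a few lines of Python. This is a toy illustration, not Solr's or Lucene's actual implementation: the tokenizer is deliberately naive, and the function names and sample documents are made up for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


documents = {
    1: "Solr is a search server",
    2: "MySQL is a relational database",
    3: "Solr uses the Lucene search library",
}

index = build_inverted_index(documents)
print(sorted(search(index, "solr search")))    # [1, 3]
print(sorted(search(index, "lucene server")))  # []
```

The index is built once, when documents are added. A query then touches only the lists for the words it contains, rather than the text of every document.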
What Are Other Features of Solr?
Solr offers many other features, such as spell correction, faceting, highlighting, result grouping, and auto-completion. Implementing these features on your website will help it stand out from the crowd. They provide a better user experience and new ways to access content on your website.
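As a rough sketch of how two of these features are requested, the Python snippet below builds a Solr search URL that turns on faceting and highlighting through the standard `facet`, `facet.field`, `hl`, and `hl.fl` query parameters. The host, the `articles` core, and the `category` and `body` field names are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical Solr location and core name; adjust to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/articles/select"


def build_search_url(user_query, facet_field="category", highlight_field="body"):
    """Build a select URL that asks for facet counts and highlighted snippets."""
    params = [
        ("q", user_query),              # the user's search terms
        ("wt", "json"),                 # ask for a JSON response
        ("facet", "true"),              # enable faceting
        ("facet.field", facet_field),   # count matches per value of this field
        ("hl", "true"),                 # enable highlighting
        ("hl.fl", highlight_field),     # highlight matches in this field
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_search_url("apache solr")
print(url)
```

The function only assembles the request. Fetching the URL and reading the facet counts and highlighted snippets out of the response is left to whichever HTTP client you already use.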
Why Should You Integrate Your Website or E-commerce Site with Solr?
When the number of articles or news items on your site increases, search backed by MySQL starts to slow down. This is because a search built on pattern matching makes MySQL loop through every article and use regular expressions to match the search terms, which is very CPU-intensive. Sometimes users get request timeout errors because the PHP script hits its execution time limit. If there are 10,000 articles, news items, or products, then for every search query MySQL has to examine all 10,000 of them, which is a very expensive task and will slow down your website.
Solr, by contrast, can search 10,000 documents in a couple of seconds, because it looks up the search terms in its index instead of scanning every document. If you have a medium-sized blog, a single Solr instance is enough to power search across all posts.
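The difference in work can be illustrated with a small Python sketch. It does not measure MySQL or Solr themselves: the articles are synthetic, and the scan and the lookup are simplified stand-ins for the two approaches.

```python
import re


def scan_search(articles, term):
    """Check every article with a regular expression, as a full scan would."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    examined = 0
    matches = []
    for article_id, text in articles.items():
        examined += 1
        if pattern.search(text):
            matches.append(article_id)
    return matches, examined


# Synthetic data: 10,000 articles, only three of which mention "Solr".
articles = {i: f"Article number {i} about databases" for i in range(10_000)}
for i in (7, 4_200, 9_999):
    articles[i] += " and Solr"

matches, examined = scan_search(articles, "solr")
print(matches)    # [7, 4200, 9999]
print(examined)   # 10000

# An index built once at write time answers the same question with one lookup.
index = {}
for article_id, text in articles.items():
    for word in text.lower().split():
        index.setdefault(word, []).append(article_id)

print(index.get("solr", []))  # [7, 4200, 9999]
```

The scan examines all 10,000 articles on every query, while the index pays that cost once when the articles are stored.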
Several inherent challenges complicate full-text search. First, there is currently no way to guarantee that the searcher will find the “best” results, because there is often no agreement on what the “best” result is for a particular search; evaluating results can be very subjective.
Also, users generally enter only a few terms into a search engine, and there is no way for the search system to understand the user’s intention for a search. In fact, if the user is doing an initial exploration of a topic area, the user may not even be aware of his or her intention.
A system that understands natural language (that is, the way people speak or write) perfectly is usually considered the ultimate goal in search engine technology, in that it would do as good a job as a person in finding answers. But even that is not perfect, as variations in human communication and comprehension mean that even a person is not guaranteed to find the “right” answer, especially in situations where there may not even be a single “right” answer for a particular question.
Search engines and full-text search providers try to solve, or at least mitigate, these challenges.
In Drupal, Solr search can be used as a replacement for the core content search, and it offers both extra features and better performance.
Features
- Faceted Search
  - Faceted search is supported if you use the Facet API module. Facets are available for anything from content author to taxonomy to arbitrary fields.
- More Like This
  - Relevant content blocks (“More like this” blocks) can be added to any node page. The block shows nodes that are relevant or similar to the one your site visitor is viewing. The analysis happens in real time in Solr.
- Search Pages
  - Multiple search pages, each with optionally customized search results, layout, and other settings.
- Search Environments
  - Add multiple Solr search cores and query whichever one suits a given search. Ideal for maintaining multiple facet configurations.
- Range Queries
  - These query types, in combination with Facet API Slider, deliver a very rich faceting experience to the end user.
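To show what range queries and range facets compute, here is a small Python sketch. It imitates the behaviour in plain Python and is not how Solr or the Facet API module implements it; the prices and function names are hypothetical. In Solr's query syntax, an inclusive range is written as `field:[low TO high]`.

```python
def range_facet(values, start, end, gap):
    """Count how many values fall into each [lower, lower + gap) bucket."""
    counts = {lower: 0 for lower in range(start, end, gap)}
    for value in values:
        if start <= value < end:
            lower = start + ((value - start) // gap) * gap
            counts[lower] += 1
    return counts


def range_filter(values, low, high):
    """Keep values in the inclusive range [low, high], like field:[low TO high]."""
    return [v for v in values if low <= v <= high]


# Hypothetical product prices.
prices = [5, 12, 18, 25, 25, 40, 49, 75]

print(range_facet(prices, start=0, end=60, gap=20))
# {0: 3, 20: 2, 40: 2}
print(range_filter(prices, 20, 50))
# [25, 25, 40, 49]
```

A slider widget can use the bucket counts to show how many results each part of the range holds, then apply the range filter once the visitor picks the two endpoints.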
Public Websites using Solr and Drupal:
- http://www.whitehouse.gov/ – Uses Solr via Drupal for site search with highlighting and faceting
- TheBigJobs.com: a job portal built using the Drupal CMS. Uses Solr to index and search jobs posted on the website.
- Scintilla: search and MoreLikeThis
- MAME Reviews: faceted search
- Peel Sessions: search
- Akademika.no: a Norwegian online bookstore focused on students and school books. It is built with Drupal, and Solr powers the search through the Drupal Solr plugin.