Solr Search is our full-text search application of choice. In this blog post, we introduce it.
What is Solr Search?
Solr Search is a ready-made search platform built on Apache Lucene™ that offers a wide range of features. We have integrated this open-source application into our stack and use it for a variety of purposes.
Solr is a standalone application that runs in our own Docker instance and is hosted internally on our own server. This means that no data from our databases is sent to external third-party providers; everything stays with us. This also lets us protect the privacy of personal data.
Once the instance is set up, the databases of different projects can be linked to it. These databases form the basis of the search, because the data originates from them. Within the databases, a wide variety of objects can be defined (e.g. people, contracts, documents). The application is remarkably flexible here: practically any data that you want to make searchable can be shown as a search result. In addition, the information shown in a search result can be defined per object type. On a website, for example, a "Contract" object will highlight different information than a "Person" object, so each object type can have its own layout.
What features does Solr offer?
Solr works like an advanced full-text search, which searches the linked data for the words you are looking for. The following features come with Solr out-of-the-box and can be used as desired.
Out-of-the-box
- Ranking of search results by how often and how exactly the search term occurs
- Singular and plural detection (for example, a search for "words" also returns results for "word")
- Umlaut detection (for example, a search for "Aenderung" still returns results containing "Änderung")
- Tolerance for incomplete or misspelled search entries
- Stopword handling: irrelevant words are filtered out of the query. If you search for "how can I find a topic", the system recognizes that the words "how can I" are not relevant and excludes them from the search.
- Fast indexing of data, so that new content quickly becomes searchable
- An optimized search that is extremely fast, even with many simultaneous search requests (traffic)
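To make the ideas behind stopwords, umlaut detection and singular/plural detection more tangible, here is a minimal Python sketch of such a text analysis chain. It is an illustration only and not how Solr implements these steps internally; the stopword list, the umlaut table and the very crude stemming rule are assumptions made for this example.

```python
import re

# Hypothetical stopword list for this example; Solr uses its own
# language-specific lists.
STOPWORDS = {"how", "can", "i", "a", "the"}

# Map German umlauts to their ASCII transcription so that both
# spellings end up as the same token.
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def fold_umlauts(token: str) -> str:
    """Replace umlauts with their ASCII transcription."""
    for umlaut, ascii_form in UMLAUT_MAP.items():
        token = token.replace(umlaut, ascii_form)
    return token


def naive_stem(token: str) -> str:
    """Deliberately crude stemming: strip one trailing "s" from longer words."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text: str) -> list[str]:
    """Lowercase, split into words, drop stopwords, fold umlauts, stem."""
    tokens = re.findall(r"\w+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [naive_stem(fold_umlauts(t)) for t in tokens]
```

With this sketch, `analyze("how can I find a topic")` keeps only "find" and "topic", and "Änderung" and "Aenderung" are reduced to the same token. A search engine applies the same chain to the indexed text and to the query, which is why both spellings match.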
Own development
- Prioritization of search results based on the title (a result ranks higher if the search term occurs in the title of the object)
- Language separation (the search can be restricted by language, so that only results in the matching language are shown)
- Display of search results depending on the user's authorization role
This list is not exhaustive, as we can add further features in each custom development.
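As a rough idea of how title prioritization and language separation can be expressed in a Solr query, the following Python sketch builds the request parameters. It uses Solr's extended DisMax query parser (`defType=edismax`), field boosts in `qf` and a filter query in `fq`. The field names `title`, `body` and `language`, the boost factor and the list of languages are assumptions for this example, not our actual schema.

```python
from urllib.parse import urlencode

# Assumption: the languages that exist in the index.
SUPPORTED_LANGUAGES = {"de", "fr", "it", "en"}


def build_search_params(term: str, language: str) -> list[tuple[str, str]]:
    """Build Solr query parameters with a title boost and a language filter."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    return [
        ("q", term),
        ("defType", "edismax"),           # extended DisMax query parser
        ("qf", "title^5 body"),           # matches in the title count 5x (hypothetical fields)
        ("fq", f"language:{language}"),   # only results in the requested language
    ]


# URL-encoded query string that could be appended to a Solr /select URL.
query_string = urlencode(build_search_params("hiking trail", "de"))
```

The filter query in `fq` restricts the result set without influencing the ranking, which makes it a natural fit for language separation.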
Integration of the Tika PDF parser
We also use an additional service: the Tika PDF parser. It extracts the text from documents so that the contents of the documents become searchable.
This feature is especially useful when you publish forms and documents and want users to find the information in them through the search.
The Tika PDF parser can process documents in different formats, for example Word, Excel and PDF.
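How text extraction can look in practice is sketched below in Python, assuming Tika runs as a separate Tika Server on its default port 9998. The server accepts a document via HTTP PUT on its `/tika` endpoint and returns the extracted text when `Accept: text/plain` is requested. The URL and the helper name `build_tika_request` are assumptions for this example and do not describe our actual setup.

```python
from urllib.request import Request

# Assumption: a Tika Server running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(document: bytes) -> Request:
    """Prepare a PUT request that asks Tika for the plain text of a document."""
    return Request(
        TIKA_URL,
        data=document,
        method="PUT",
        headers={"Accept": "text/plain"},
    )
```

Sending the request, for example with `urllib.request.urlopen(build_tika_request(pdf_bytes))`, returns the extracted text in the response body. That text can then be stored in a searchable field of the corresponding document in the index.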
How do we use Solr Search?
We have several projects with different requirements where we use Solr Search.
Swiss Hiking Trails
On the Swiss Hiking Trails page, the search function is used for the entire website.
Two things are very handy here. First, content that should not be displayed yet is excluded from the search. For example, documents that have not been published anywhere, because the page is not supposed to be public yet, are neither searched nor displayed. Second, search results can be shown or hidden depending on the permission role (i.e. whether or not you are logged in as a member). This way, data leaks can be avoided.
Administration software
We also use the search in the administration software of an internationally active network of service clubs. Authorization plays a role here as well: members are assigned to different clubs and should only see the information belonging to their own club.
Intranet projects
In our self-developed Luna intranet, the search is also integrated and delivered out-of-the-box.
What difficulties does Solr Search bring with it?
Permissions were not easy to solve, because there are many different cases to cover.
We store the permission roles in Solr and have built dynamic logic around them. This means that when you search for something, the system checks directly whether the object may be displayed to the user or not. For example, a user who is logged in to an intranet is not allowed to see all contracts, but may see the contracts they created themselves. Such cases are rather difficult to cover.
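One possible way to express such a rule is a Solr filter query that only lets through objects the user created or objects released for one of the user's roles. The following Python sketch builds such a filter. The field names `owner_id` and `allowed_roles` are hypothetical, and the sketch covers only this one case, not the full permission logic described above.

```python
def escape_phrase(value: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted Solr phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def permission_filter(user_id: str, roles: list[str]) -> str:
    """Build a filter query: objects owned by the user OR released for a role."""
    clauses = [f'owner_id:"{escape_phrase(user_id)}"']
    if roles:
        role_terms = " OR ".join(f'"{escape_phrase(r)}"' for r in roles)
        clauses.append(f"allowed_roles:({role_terms})")
    return " OR ".join(clauses)
```

For a user with the ID "42" and the roles "member" and "editor", this yields `owner_id:"42" OR allowed_roles:("member" OR "editor")`, which can be passed to Solr as an `fq` parameter. Because the filter is applied inside the search itself, objects without permission never reach the result list.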
What are the advantages of Solr Search?
- The search and its behavior can be configured and customized to a high degree, so that it can be tuned precisely to the needs of each use case.
- The system is extremely performant. Other systems offer additional features that Solr deliberately leaves out. Solr focuses on an optimized search and therefore indexes the data for the search much faster than other search functions would.
- The features described above in the section "What features does Solr offer?" would take a lot of effort to develop yourself. With Solr, these development-heavy features are included out-of-the-box.
- There are no additional hosting or maintenance costs for the customer, as Solr is an open-source system and is hosted on our own servers.
Interested?
Do you have a project in which you would like to use an advanced search function, but do not know exactly how to proceed? Contact us at.