search
See the ElasticSearch documentation for configuration and Salt installation.
Check
The is_elasticsearch method checks the ELASTICSEARCH_HOST setting:
from search.models import is_elasticsearch
if is_elasticsearch():
    # do something...
    pass
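The real implementation lives in search.models and is not shown here. As a rough sketch of what a check like this might do, assuming it only tests whether ELASTICSEARCH_HOST is set to a non-empty value (the helper name has_elasticsearch_host and the settings stand-ins are hypothetical):

```python
from types import SimpleNamespace


def has_elasticsearch_host(settings):
    """Return True when ``settings`` has a non-empty ELASTICSEARCH_HOST.

    Hypothetical helper: the real ``is_elasticsearch`` may behave differently.
    """
    return bool(getattr(settings, "ELASTICSEARCH_HOST", None))


# Stand-ins for a Django settings module (assumption, for illustration only).
with_host = SimpleNamespace(ELASTICSEARCH_HOST="localhost")
without_host = SimpleNamespace()
```

Guarding search code with a check like this lets a project run without ElasticSearch configured, for example in a local development environment.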
Index
The search app has a SearchIndex class in search/search.py.
This class does the hard work of searching.
As a developer, you just need to create a new class for each of your indexes. Sample index classes are here:
ContactSearch in https://gitlab.com/kb/contact/blob/master/contact/search.py
TicketSearch in https://gitlab.com/kb/crm/blob/master/crm/search.py
ElasticSearch 8.x
Changes required for this version (using ContactSearch as an example):
Remove the document_type property (look in the Mixin class).
Rename _settings to settings.
Rename configuration to mappings and make it a method (not a property).
Update mappings to return just the properties, i.e. remove the settings, mappings and document_type keys.
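To illustrate the last step, the sketch below pulls just the properties out of an old-style configuration dictionary. The old layout (settings plus mappings keyed by document type) is assumed from the mapping fragment later in this document. The extract_properties helper and the sample data are hypothetical and not part of the search app.

```python
def extract_properties(configuration, document_type):
    """Return only the ``properties`` from an old-style configuration dict."""
    return configuration["mappings"][document_type]["properties"]


# Assumed pre-8.x shape: settings, then mappings keyed by the document type.
old_configuration = {
    "settings": {"analysis": {}},
    "mappings": {
        "part": {
            "properties": {
                "part": {"type": "text", "analyzer": "autocomplete"},
            }
        }
    },
}

# What a new-style ``mappings`` method would return under this assumption.
new_mappings = extract_properties(old_configuration, "part")
```

The result contains only the field definitions, with no settings, mappings or document type wrapper around them.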
For sample code, see:
Delete
We have updated our strategy to include deleted rows in the index, with the option to include them in a search.
For more information on our delete strategy, see the following links:
View
The search app also has SearchViewMixin, which can be used to search the
indexes, e.g. dash.SearchView in
https://gitlab.com/kb/kbsoftware_couk/blob/master/dash/views.py#L66
which uses ContactSearch and TicketSearch (defined above).
Rebuild and Refresh
To rebuild your index, create a task e.g. rebuild_contact_index in
https://gitlab.com/kb/contact/blob/master/contact/tasks.py
Create a management command to call the rebuild_contact_index task:
e.g. rebuild_contact_index.py
To refresh your index, use the same idea, but call refresh instead e.g:
from contact.search import ContactIndex
from search.search import SearchIndex
index = SearchIndex(ContactIndex())
count = index.refresh()
Search
Issues are much easier to diagnose if you can run a simple management command to perform a search query:
Create a management command called search_contact_index
e.g. search_contact_index.py
Update
Create an index update function, e.g. update_contact_index in
https://gitlab.com/kb/contact/blob/master/contact/tasks.py
In your create and update views, call the update task e.g:
from django.db import transaction
from contact.tasks import update_contact_index
transaction.on_commit(lambda: update_contact_index.delay(self.object.pk))
For a simple UpdateView, the minimum viable code is as follows:
def form_valid(self, form):
    result = super().form_valid(form)
    transaction.on_commit(lambda: update_contact_index.delay(self.object.pk))
    return result
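The reason for wrapping the call in transaction.on_commit is that the task should only be queued once the row has actually been saved. The toy model below mimics that behaviour without Django or Celery. ToyTransaction and queued_ids are hypothetical names used only for illustration; this is not Django's implementation.

```python
class ToyTransaction:
    """Toy stand-in for a database transaction (illustration only)."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, func):
        # Remember the callback; do not run it yet.
        self._callbacks.append(func)

    def commit(self):
        # Run the deferred callbacks once the data is safely committed.
        callbacks, self._callbacks = self._callbacks, []
        for func in callbacks:
            func()

    def rollback(self):
        # Discard the callbacks: nothing was saved, so nothing to index.
        self._callbacks.clear()


queued_ids = []  # stands in for the Celery queue

saved = ToyTransaction()
saved.on_commit(lambda: queued_ids.append(1))
queued_before_commit = list(queued_ids)  # still empty at this point
saved.commit()

failed = ToyTransaction()
failed.on_commit(lambda: queued_ids.append(2))
failed.rollback()
```

Only the committed transaction queues an update, so a worker should not be asked to index a row that was never saved.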
ElasticSearch
Prerequisites
Setup Celery (using Redis)…
Install
Follow the ElasticSearch - Getting Started instructions…
In your requirements/base.txt, add the following:
elasticsearch
Tip
Find the version number in Requirements
In settings/production.py (after CELERY_DEFAULT_QUEUE):
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
    'rebuild_contact_index': {
        'task': 'contact.tasks.rebuild_contact_index',
        'schedule': crontab(minute='10', hour='1'),
    },
}
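The crontab(minute='10', hour='1') schedule asks Celery beat to run the task once a day at 01:10. As a rough check of when that is, the sketch below computes the next 01:10 after a given time using only the standard library. next_daily_run is a hypothetical helper, and it ignores timezone configuration, which Celery handles itself.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour=1, minute=10):
    """Return the next datetime strictly after ``now`` at ``hour``:``minute``."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

For example, calling next_daily_run(datetime(2019, 1, 7, 13, 20, 46)) gives 01:10 on the following day.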
Note
Remember to use the correct pattern for transactions when queuing search index updates. For details, see Transactions
Diagnostics
Analyze
To understand how your field is being analyzed (this example is from the Contact app):
import json

from contact.search import ContactIndex
from search.search import SearchIndex

index = SearchIndex(ContactIndex())
index.drop_create()
print(json.dumps(index.analyze('autocomplete', 'EX2 2AB'), indent=4))
print(json.dumps(index.analyze('autocomplete_search', 'EX2 2AB'), indent=4))
Note
This example uses the ContactSearch index and the autocomplete
analyzers:
https://gitlab.com/kb/contact/blob/master/contact/search.py#L61
For other diagnostics, see Diagnostics…
Explain
To understand the score for search results:
Make sure DEBUG is set to True in your settings.
Add explain to your call to the search method.
e.g:
result = search_index.search(
    criteria,
    explain=True,
)
Tip
You could add this to the SearchViewMixin class
in search/views.py.
The _explain method in search/search.py will write a time-stamped file
containing the results e.g. elastic-explain-2019-01-07-13-20-46.json.
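The file name in the example looks like a fixed prefix followed by a year-month-day-hour-minute-second timestamp. A minimal sketch of producing a name in that shape, assuming that format (explain_file_name is a hypothetical helper, not the code in search/search.py):

```python
from datetime import datetime


def explain_file_name(moment):
    """Build a time-stamped file name in the shape shown in the example."""
    return moment.strftime("elastic-explain-%Y-%m-%d-%H-%M-%S.json")


example = explain_file_name(datetime(2019, 1, 7, 13, 20, 46))
```

Because the timestamp runs from the largest unit to the smallest, sorting the files by name also sorts them by time.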
Maintenance
To manually update the index run the management command created earlier (see rebuild_contact_index.py).
Flushing an index writes its in-memory data to index storage and clears the internal transaction log, which frees memory:
curl localhost:9200/_flush
Test
To check the install:
curl -X GET 'http://localhost:9200/?pretty'
Query
Note
Replace hatherleigh_info with your site name.
Install httpie:
pip install httpie
Create a json file containing your query e.g. query.json:
{
    "query": {
        "match": {
            "part": "B020"
        }
    }
}
In this example, we are searching for B020.
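If you would rather generate the file from a script than edit it by hand, the sketch below writes the same match query using the standard library. match_query and write_query are hypothetical helper names; the field and value come from the example above.

```python
import json
import os
import tempfile


def match_query(field, value):
    """Build the body of a simple Elasticsearch ``match`` query."""
    return {"query": {"match": {field: value}}}


def write_query(path, field, value):
    """Write the query to ``path`` as indented JSON and return the body."""
    body = match_query(field, value)
    with open(path, "w") as handle:
        json.dump(body, handle, indent=4)
    return body


# Written to a temporary folder here; use "query.json" in a real session.
query_path = os.path.join(tempfile.mkdtemp(), "query.json")
write_query(query_path, "part", "B020")
```

Building the body with json.dump avoids quoting mistakes when the search value contains special characters.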
Run the query:
http GET http://localhost:9200/hatherleigh_info/_search < query.json
Explain
To explain the query above:
http GET "http://localhost:9200/hatherleigh_info/part/_validate/query?explain" < query.json
The part is the document type (DOC_TYPE in the index mappings
below):
es.indices.create(
    SEARCH_INDEX,
    {
        'mappings': {
            self.DOC_TYPE: {
                "properties": {
                    "part": {
                        "type": "string",
                        "analyzer": "autocomplete",
                    },
Analyze
See Diagnostics above… or:
Create a json file containing your query e.g. analyze.json:
{
    "analyzer": "autocomplete",
    "text": "quick brown"
}
Run the analysis:
http GET http://localhost:9200/hatherleigh_info/_analyze < analyze.json
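The _analyze API responds with a tokens list, where each entry includes the token text. To make the output easier to scan, a small helper can reduce the response to the token strings. token_strings is a hypothetical helper, and the sample response is hand-written to show the shape only; it is not real output from the autocomplete analyzer.

```python
def token_strings(response):
    """Return the token text from an Elasticsearch ``_analyze`` response."""
    return [item["token"] for item in response.get("tokens", [])]


# Hand-written sample in the shape of an _analyze response (not real output).
sample_response = {
    "tokens": [
        {"token": "quick", "start_offset": 0, "end_offset": 5, "position": 0},
        {"token": "brown", "start_offset": 6, "end_offset": 11, "position": 1},
    ]
}
tokens = token_strings(sample_response)
```

Comparing the token lists from the index analyzer and the search analyzer is often the quickest way to see why a query does or does not match.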