search¶
See the dev-elasticsearch documentation for configuration and Salt installation.
Index¶
The search app has a SearchIndex class in search/search.py. This class does the hard work of searching. As a developer, you just need to create a new class for each of your indexes. Sample index classes are here:
ContactSearch in https://gitlab.com/kb/contact/blob/master/contact/search.py
TicketSearch in https://gitlab.com/kb/crm/blob/master/crm/search.py
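The exact interface an index class must provide is defined by SearchIndex in search/search.py, so check ContactSearch for a working example. As a rough, hypothetical sketch (all names below are illustrative, not the real API), an index class bundles an index name, a field mapping and a way to turn model rows into documents:

```python
# Hypothetical sketch only -- the real interface is defined by
# ``SearchIndex`` in ``search/search.py``.  All names are illustrative.


class ContactIndexSketch:
    """Describe one index: its name, field mapping and documents."""

    index_name = "contact"

    def get_mapping(self):
        # Field types and analyzers for this index.
        return {
            "properties": {
                "name": {"type": "string", "analyzer": "autocomplete"},
            }
        }

    def get_documents(self, rows):
        # Convert model rows into documents keyed on primary key.
        return [{"_id": row["pk"], "name": row["name"]} for row in rows]
```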
Delete¶
We have updated our strategy to include deleted rows in the index, with the option to include them in a search.
For more information on our delete strategy, see the following links:
View¶
The search app also has a SearchViewMixin which can be used to search the indexes, e.g. dash.SearchView in https://gitlab.com/kb/kbsoftware_couk/blob/master/dash/views.py#L66, which uses ContactSearch and TicketSearch (defined above).
Rebuild and Refresh¶
To rebuild your index, create a task e.g. rebuild_contact_index in https://gitlab.com/kb/contact/blob/master/contact/tasks.py
Create a management command to call the rebuild_contact_index task, e.g. rebuild_contact_index.py
To refresh your index, use the same idea, but call refresh instead, e.g.:
index = SearchIndex(ContactIndex())
count = index.refresh()
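The difference between the two operations is an assumption based on the method names (check search/search.py for the real behaviour): a rebuild drops and re-creates the index before re-indexing every row, while a refresh pushes current rows into the existing index without dropping it. A toy in-memory model:

```python
# Toy in-memory model of rebuild vs. refresh -- an assumption based on
# the method names, not the real implementation in ``search/search.py``.


class ToyIndex:
    def __init__(self):
        self.docs = {}

    def drop_create(self):
        # A rebuild starts from an empty index.
        self.docs = {}

    def index_rows(self, rows):
        self.docs.update(rows)
        return len(rows)


def rebuild(index, rows):
    index.drop_create()  # wipe first, then re-index everything
    return index.index_rows(rows)


def refresh(index, rows):
    return index.index_rows(rows)  # update in place, keep existing docs


index = ToyIndex()
rebuild(index, {1: "old row"})
refresh(index, {2: "new row"})
print(sorted(index.docs))  # [1, 2] -- the refresh kept document 1
```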
Search¶
Issues are much easier to diagnose if you can run a simple management command to perform a search query.
Create a management command called search_contact_index, e.g. search_contact_index.py
Update¶
Create an index update function e.g. update_contact_index in https://gitlab.com/kb/contact/blob/master/contact/tasks.py
In your create and update views, call the update task, e.g.:
from django.db import transaction
from contact.tasks import update_contact_index
transaction.on_commit(lambda: update_contact_index.delay(self.object.pk))
For a simple UpdateView, the minimum viable code is as follows:
def form_valid(self, form):
result = super().form_valid(form)
transaction.on_commit(lambda: update_contact_index.delay(self.object.pk))
return result
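The reason for transaction.on_commit rather than calling .delay() directly: if the task is queued inside the transaction, the Celery worker can query the database before the commit lands and fail to find the row. A stdlib-only toy of the ordering guarantee (FakeTransaction is invented for illustration; the real hook is django.db.transaction.on_commit):

```python
# Toy model of the on_commit ordering -- stdlib only, no Django.
# Callbacks registered during a transaction run only after commit,
# so a worker triggered by the callback always sees the saved row.


class FakeTransaction:
    def __init__(self):
        self.committed_rows = []
        self._pending = []
        self._callbacks = []

    def save(self, row):
        self._pending.append(row)         # not yet visible to workers

    def on_commit(self, callback):
        self._callbacks.append(callback)  # deferred until commit

    def commit(self):
        self.committed_rows.extend(self._pending)
        self._pending.clear()
        for callback in self._callbacks:
            callback()                    # the row is now visible


events = []
txn = FakeTransaction()
txn.save({"pk": 1})
txn.on_commit(lambda: events.append(list(txn.committed_rows)))
txn.commit()
print(events)  # [[{'pk': 1}]] -- the "task" ran after the row was visible
```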
ElasticSearch¶
Prerequisites¶
Set up Celery (using Redis)…
Install¶
Follow the ElasticSearch - Getting Started instructions…
In your requirements/base.txt
, add the following:
elasticsearch
Tip
Find the version number in Requirements
In settings/production.py
(after CELERY_DEFAULT_QUEUE
):
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
'rebuild_contact_index': {
'task': 'contact.tasks.rebuild_contact_index',
'schedule': crontab(minute='10', hour='1'),
},
}
Note
Remember to use the correct pattern for transactions when queuing search index updates. For details, see Transactions
Diagnostics¶
Analyze¶
To understand how your field is being analyzed (this example is from the Contact app):
from search.search import SearchIndex
from contact.search import ContactIndex
index = SearchIndex(ContactIndex())
index.drop_create()
import json
print(json.dumps(index.analyze('autocomplete', 'EX2 2AB'), indent=4))
print(json.dumps(index.analyze('autocomplete_search', 'EX2 2AB'), indent=4))
Note
This example uses the ContactSearch
index and the autocomplete
analyzers:
https://gitlab.com/kb/contact/blob/master/contact/search.py#L61
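Autocomplete analyzers are usually built on edge n-grams: at index time a value is stored as all of its prefixes, while the search analyzer keeps the typed term whole, so a partial term like EX matches. A stdlib sketch of the index-time tokens (the real analyzer definitions are in contact/search.py; the min/max gram sizes here are assumptions):

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Index-time tokens for an edge n-gram autocomplete filter (sketch)."""
    token = text.lower()
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


# "EX2" is indexed as every prefix, so typing "EX" already matches.
print(edge_ngrams("EX2"))  # ['e', 'ex', 'ex2']
```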
For other diagnostics, see Diagnostics…
Explain¶
To understand the score for search results:
Make sure DEBUG is set to True in your settings.
Add explain to your call to the search method, e.g.:
result = search_index.search(
criteria,
explain=True,
)
Tip
You could add this to the SearchViewMixin class in search/views.py.
The _explain method in search/search.py will write a time-stamped file containing the results, e.g. elastic-explain-2019-01-07-13-20-46.json.
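The timestamp embedded in that file name is just the current date and time; a sketch of generating one (the exact format string used by _explain is an assumption inferred from the example name):

```python
from datetime import datetime


def explain_file_name(now=None):
    """Build a time-stamped explain file name (format is an assumption)."""
    now = now or datetime.now()
    return now.strftime("elastic-explain-%Y-%m-%d-%H-%M-%S.json")


print(explain_file_name(datetime(2019, 1, 7, 13, 20, 46)))
# elastic-explain-2019-01-07-13-20-46.json
```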
Maintenance¶
To manually update the index run the management command created earlier (see rebuild_contact_index.py).
Flushing an index frees memory by writing any data held in the transaction log out to the index storage:
curl localhost:9200/_flush
Query¶
Note
Replace hatherleigh_info
with your site name.
Install httpie
:
pip install httpie
Create a json file containing your query, e.g. query.json:
{
"query": {
"match": {
"part": "B020"
}
}
}
In this example, we are searching for B020
.
Run the query:
http GET http://localhost:9200/hatherleigh_info/_search < query.json
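The same query can be driven from Python with only the standard library; this sketch just builds the request without sending it (it assumes Elasticsearch on localhost:9200 and the hatherleigh_info index from above):

```python
import json
from urllib import request

# The same match query as query.json above.
query = {"query": {"match": {"part": "B020"}}}

req = request.Request(
    "http://localhost:9200/hatherleigh_info/_search",
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="GET",
)
# Sending would be: request.urlopen(req) -- requires a running cluster.
print(req.get_full_url(), req.get_method())
```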
Explain¶
To explain the query above:
http GET "http://localhost:9200/hatherleigh_info/part/_validate/query?explain" < query.json
The part is the document type (DOC_TYPE in the index mappings below):
es.indices.create(
    SEARCH_INDEX,
    {
        'mappings': {
            self.DOC_TYPE: {
                "properties": {
                    "part": {
                        "type": "string",
                        "analyzer": "autocomplete",
                    },
                },
            },
        },
    },
)
Analyze¶
See Diagnostics above… or:
Create a json file containing your query, e.g. analyze.json:
{
"analyzer": "autocomplete",
"text": "quick brown"
}
Run the analysis:
http GET http://localhost:9200/hatherleigh_info/_analyze < analyze.json