Elasticsearch Notes
how to secure
Tricky. Use an app proxy in front to be safe. For now: local access only. Elasticsearch was not designed with security in mind.
added to the end of /etc/elasticsearch/elasticsearch.yml:
script.disable_dynamic: true
quick stuff
HEAD is no longer used; instead use Kibana, which is its own service.
elasticsearch-head Elasticsearch plugin ( https://github.com/mobz/elasticsearch-head )
_search?search_type=count
{ "aggs" : { "all_users": { "terms": { "field": "screen_name" } } } }
list indexes and summary:
curl 'localhost:9200/_cat/indices?v'
show health
curl 'localhost:9200/_cat/health?v'
list nodes:
curl 'localhost:9200/_cat/nodes?v'
delete an index
curl -XDELETE 'http://localhost:9200/twitterindex_v2/'
create an index with mappings from a file:
curl -XPUT localhost:9200/twitterindex_v2 -T 'mapping.1'
get the mappings for an index
curl -XGET "http://localhost:9200/test-index/_mapping" | jsonlint > mapping
pattern of data import
- import data
- dump mapping
- edit mapping
- create new index with new mapping
- import data again.
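The "edit mapping" step can be sketched in Python; the index and field names below are made up for illustration, the point is that the dump is keyed by the old index name and that key has to go before the body can be re-used:

```python
import json

# pretend this is the mapping dumped from the old index; the dump is
# keyed by the old index name
old_dump = {
    "twitterindex_v1": {
        "mappings": {
            "tweet": {
                "properties": {
                    "postDate": {"type": "string"}
                }
            }
        }
    }
}

# edit: fix the field type, and drop the old index name key so the
# body can be PUT as-is when creating the new index
mapping = old_dump["twitterindex_v1"]
mapping["mappings"]["tweet"]["properties"]["postDate"]["type"] = "date"

with open("mapping.1", "w") as f:
    json.dump(mapping, f, indent=2)
```

then create the new index from the edited file as above: curl -XPUT localhost:9200/twitterindex_v2 -T 'mapping.1'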
Explicitly mapping date fields
from: http://joelabrahamsson.com/dynamic-mappings-and-dates-in-elasticsearch/
curl -XPUT "http://localhost:9200/myindex" -d'
{
  "mappings": {
    "tweet": {
      "date_detection": false,
      "properties": {
        "postDate": { "type": "date" }
      }
    }
  }
}'
curl -XPUT 'https://search-myiotcatcher-eq4tipuq24ltctdtgz5hydwvb4.us-east-1.es.amazonaws.com/iotworld_v4' -H 'Content-Type: application/json' -d'
{
  "container" : { "_timestamp" : {"enabled": true, "type":"date", "format": "epoch_second", "store":true, "path" : "timestamp"} },
  "mappings": {
    "sensordata": {
      "properties": {
        "temperature": { "type": "float" },
        "humidity": { "type": "float" },
        "timestamp": { "type": "date" }
      }
    }
  }
}'

curl -XGET 'https://search-myiotcatcher-eq4tipuq24ltctdtgz5hydwvb4.us-east-1.es.amazonaws.com/iotworld_v4/_mapping' | python -m json.tool

curl -XGET 'https://search-myiotcatcher-eq4tipuq24ltctdtgz5hydwvb4.us-east-1.es.amazonaws.com/iotworld_v4/sensordata/_search' | python -m json.tool
Changing mappings
so you don't like the data mapping and you want to change it:
first dump the existing mapping to a file:
curl -XGET 'http://localhost:9200/fitstat_v1/_mapping' | python -m json.tool > fitstat_v1_mapping
then copy that mapping to the new version:
cp fitstat_v1_mapping fitstat_v2_mapping
edit the new mapping, for example adding "type": "nested", to your nested objects.
then create a new index specifying the new mapping:
curl -XPUT 'http://localhost:9200/fitstat_v2' -d @fitstat_v2_mapping
next: extracting from the old index, putting into the new, and nuking the old.
... FIXME
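A minimal sketch of that missing step, assuming the standard _search and _bulk APIs: the helper below turns the hits.hits list of a search response into an NDJSON _bulk body for the new index (index names and documents here are examples; older Elasticsearch versions also want a "_type" in the action line):

```python
import json

def hits_to_bulk(hits, dest_index):
    """Turn the hits.hits list of a _search response into an NDJSON
    body for the _bulk API: one action line plus one source line per
    document, with a trailing newline as _bulk requires."""
    lines = []
    for hit in hits:
        lines.append(json.dumps({"index": {"_index": dest_index, "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
    return "\n".join(lines) + "\n"

# made-up hits as they would appear under hits.hits
hits = [
    {"_id": "1", "_source": {"screen_name": "alice"}},
    {"_id": "2", "_source": {"screen_name": "bob"}},
]
payload = hits_to_bulk(hits, "fitstat_v2")
print(payload)
```

POST the payload with curl -XPOST 'http://localhost:9200/_bulk' --data-binary @bulkfile, and once the new index looks right, nuke the old one with -XDELETE as above. For anything big, page through with the scroll API (or just use elasticdump, below) instead of one _search.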
backup
from: https://www.elastic.co/guide/en/elasticsearch/guide/current/backing-up-your-cluster.html
add to the end of /etc/elasticsearch/elasticsearch.yml :
path.repo: ["/mnt/freenas/dataset_elasticsearch/backup"]
root@keres /mnt/freenas/dataset_elasticsearch/backup # curl -XPUT "http://localhost:9200/_snapshot/freenas_backup" -d' { "type": "fs", "settings": { "location": "/mnt/freenas/dataset_elasticsearch/backup" } }'
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
example searches
{ "query": { "match_all": {} } }
{ "query": { "match": { "filter_level": "low" } } }
{ "query": { "match": { "source": "iPad" } }, "_source": [ "source" , "text"] }
{ "size": 0, "aggs": { "group_by_state": { "terms": { "field": "source" } } } }
"size": 0 returns the aggregation only and no hits. PERFORMANCE!!
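For reference, a terms aggregation comes back under aggregations.&lt;name&gt;.buckets; pulling the counts out in Python looks like this (the response values are made up, the shape is the standard terms-aggregation format):

```python
# trimmed response for the group_by_state terms aggregation above,
# with "size": 0 so hits.hits comes back empty
response = {
    "hits": {"total": 1200, "hits": []},
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "iPad", "doc_count": 700},
                {"key": "Android", "doc_count": 500},
            ]
        }
    },
}

counts = {b["key"]: b["doc_count"]
          for b in response["aggregations"]["group_by_state"]["buckets"]}
print(counts)
```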
{ "fields": [], "sort": [ { "zkb.totalValue": { "order": "asc" } }, "_score" ], "query": { "range": { "zkb.totalValue": { "lt": 200000000 } } } }
{ "fields" : [ "victim.shipTypeID" , "victim.corporationName", "victim.characterID" , "victim.characterName"], "sort" : [ { "zkb.totalValue" : {"order" : "asc"}}, "_score" ], "query": { "range": { "zkb.totalValue": { "lt": 200000000 } } } }
changing-mapping-with-zero-downtime
https://www.elastic.co/blog/changing-mapping-with-zero-downtime
aggregates
moving data between indexes
Use ElasticDump ( https://www.npmjs.com/package/elasticdump )
1) yum install epel-release
2) yum install nodejs
3) yum install nodejs npm
4) npm install elasticdump
5) cd node_modules/elasticdump/bin
6)
./elasticdump \
  --input=http://192.168.1.1:9200/original \
  --output=http://192.168.1.2:9200/newCopy \
  --type=data
elasticdump \
  --input=http://localhost:9700/.kibana \
  --output=http://localhost:9700/.kibana_read_only \
  --type=mapping

elasticdump \
  --input=http://localhost:9700/.kibana \
  --output=http://localhost:9700/.kibana_read_only \
  --type=data
Dumping to a file
In this example I dump my AWS Elasticsearch cluster to a file.
it's one index with 20k records, not huge.
time /home/david/node_modules/.bin/elasticdump \
  --input=https://search-myiotcatcher-eq4tipuq24ltctdtgz5hydwvb4.us-east-1.es.amazonaws.com/iotworld_v5 \
  --output=/mnt/freenas/dataset_elasticsearch/iotworld_v5/iotworld_v5_mapping.json \
  --type=mapping

time /home/david/node_modules/.bin/elasticdump \
  --input=https://search-myiotcatcher-eq4tipuq24ltctdtgz5hydwvb4.us-east-1.es.amazonaws.com/iotworld_v5 \
  --output=/mnt/freenas/dataset_elasticsearch/iotworld_v5/iotworld_v5.json \
  --type=data
Disk full -> readonly lock
If the disk fills up, the indexes will go into "read-only" mode.
reset it like this:
curl -X PUT http://${HOST}:9200/.kibana/_settings -d ' { "index": { "blocks": { "read_only_allow_delete": "false" } } }' -H'Content-Type: application/json'
if it worked, you will get back:
{"acknowledged":true}
clean up old indexes
#!/bin/sh
# keep the newest 31 logstash indexes, delete the rest
# (remove the echo in front of the delete curl to actually delete)
export HOST=es.staging.thecarrotlab.com
for i in `curl -s -XGET "http://${HOST}:9200/_cat/indices?v" | grep logsta | sort -k 3 -n -r | awk '{print $3}' | tail -n +32`
do
    echo $i
    echo curl -XDELETE "http://${HOST}:9200/$i"
done
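The keep-the-newest-31 selection in that script (sort newest first, skip 31 via tail -n +32) can be checked in Python; the logstash-style daily index names below are made up:

```python
def indexes_to_delete(names, keep=31):
    """Sort date-suffixed index names newest first and return those
    past the newest `keep` (same idea as sort -r | tail -n +32)."""
    return sorted(names, reverse=True)[keep:]

names = (["logstash-2016.01.%02d" % d for d in range(1, 32)]
         + ["logstash-2016.02.%02d" % d for d in range(1, 10)])
old = indexes_to_delete(names)
print(old)  # the 9 oldest dailies, newest of those first
```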
serverconfig notes
stuff I've added to my default config:
# for backups
path.repo: ["/mnt/freenas/dataset_elasticsearch/backup"]
# to disallow remote code execution
script.disable_dynamic: true
/etc/sysconfig/elasticsearch ( grep -v ^# )
DATA_DIR=/data/elasticsearch/data
LOG_DIR=/data/elasticsearch/log
WORK_DIR=/data/elasticsearch/tmp
ES_HEAP_SIZE=2g
ES_GC_LOG_FILE=/data/elasticsearch/log/gc.log
Network
/etc/services updated:
$ grep 9200 /etc/services
elasticsearch-rest      9200/tcp    # elasticsearch-restful api
#wap-wsp                9200/tcp    # WAP connectionless session service
wap-wsp                 9200/udp    # WAP connectionless session service
$ grep 9300 /etc/services
elasticsearch-transport 9300/tcp    # elasticsearch-transport
# vrace                 9300/tcp    # Virtual Racing Service
vrace                   9300/udp    # Virtual Racing Service
python es
sudo pip install elasticsearch
sudo pip install certifi
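A minimal query sketch with the client installed above; the host, index, and field names are assumptions, and the connection lines are commented out so the snippet runs without a cluster:

```python
# build the query body; this is the same match query used in the
# example searches earlier
query = {
    "query": {"match": {"source": "iPad"}},
    "_source": ["source", "text"],
}

# from elasticsearch import Elasticsearch
# es = Elasticsearch(["http://localhost:9200"])
# for hit in es.search(index="twitterindex_v2", body=query)["hits"]["hits"]:
#     print(hit["_source"])
print(query)
```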
to read
- https://www.elastic.co/blog/data-visualization-with-elasticsearch-and-protovis
- https://greg.blog/2012/08/20/quickly-build-faceted-search-with-elasticsearch-and-backbone-js/
- https://www.elastic.co/blog/elasticsearch-storage-the-true-story