This document provides an overview and introduction to Elasticsearch. It discusses the speaker's experience and community involvement. It then covers how to set up Elasticsearch and Kibana locally. The rest of the document describes various Elasticsearch concepts and features like clusters, nodes, indexes, documents, shards, replicas, and building search-based applications. It also discusses using Elasticsearch for big data, different search capabilities, and text analysis.
In this presentation, we are going to discuss how Elasticsearch handles various operations such as insert, update, and delete. We will also cover what an inverted index is and how segment merging works.
Deep Dive on ElasticSearch Meetup event on 23rd May '15 at www.meetup.com/abctalks
Agenda:
1) Introduction to NoSQL
2) What is ElasticSearch and why is it required
3) ElasticSearch architecture
4) Installation of ElasticSearch
5) Hands on session on ElasticSearch
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers who are used to building PHP/MySQL apps to broaden their horizons when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
Elasticsearch is a free and open source distributed search and analytics engine. It allows documents to be indexed and searched quickly and at scale. Elasticsearch is built on Apache Lucene and uses RESTful APIs. Documents are stored in JSON format across distributed shards and replicas for fault tolerance and scalability. Elasticsearch is used by many large companies due to its ability to easily scale with data growth and handle advanced search functions.
Visualize some of Austin's open source data using Elasticsearch with Kibana. ObjectRocket's Steve Croce presented this talk on 10/13/17 at the DBaaS event in Austin, TX.
Elasticsearch is a distributed, open source search and analytics engine built on Apache Lucene. It allows storing and searching of documents of any schema in JSON format. Documents are organized into indexes which can have multiple shards and replicas for scalability and high availability. Elasticsearch provides a RESTful API and can be easily extended with plugins. It is widely used for full-text search, structured search, analytics and more in applications requiring real-time search and analytics of large volumes of data.
What I learnt: Elasticsearch & Kibana: introduction, installation & configuration (Rahul K Chauhan)
This document provides an overview of the ELK stack components Elasticsearch, Logstash, and Kibana. It describes what each component is used for at a high level: Elasticsearch is a search and analytics engine, Logstash is used for data collection and normalization, and Kibana is a data visualization platform. It also provides basic instructions for installing and running Elasticsearch and Kibana.
A brief presentation outlining the basics of Elasticsearch for beginners. It can be used to deliver a seminar on Elasticsearch (P.S. I used it for one). The presenter is advised to experiment with Elasticsearch beforehand.
Centralized Log Management with Elastic Stack (Rich Lee)
Centralized log management is implemented using the Elastic Stack including Filebeat, Logstash, Elasticsearch, and Kibana. Filebeat ships logs to Logstash which transforms and indexes the data into Elasticsearch. Logs can then be queried and visualized in Kibana. For large volumes of logs, Kafka may be used as a buffer between the shipper and indexer. Backups are performed using Elasticsearch snapshots to a shared file system or cloud storage. Logs are indexed into time-based indices and a cron job deletes old indices to control storage usage.
The talk covers how Elasticsearch, Lucene and to some extent search engines in general actually work under the hood. We'll start at the "bottom" (or close enough!) of the many abstraction levels, and gradually move upwards towards the user-visible layers, studying the various internal data structures and behaviors as we ascend. Elasticsearch provides APIs that are very easy to use, and it will get you started and take you far without much effort. However, to get the most of it, it helps to have some knowledge about the underlying algorithms and data structures. This understanding enables you to make full use of its substantial set of features such that you can improve your users search experiences, while at the same time keep your systems performant, reliable and updated in (near) real time.
Log Management
Log Monitoring
Log Analysis
Need for Log Analysis
Problem with Log Analysis
Some of Log Management Tool
What is ELK Stack
ELK Stack Working
Beats
Different Types of Server Logs
Example of Winlog beat, Packetbeat, Apache2 and Nginx Server log analysis
Mimikatz
Malicious File Detection using ELK
Practical Setup
Conclusion
So, what is the ELK Stack? "ELK" is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch.
The document provides an introduction to the ELK stack, which is a collection of three open source products: Elasticsearch, Logstash, and Kibana. It describes each component, including that Elasticsearch is a search and analytics engine, Logstash is used to collect, parse, and store logs, and Kibana is used to visualize data with charts and graphs. It also provides examples of how each component works together in processing and analyzing log data.
The document discusses various components of the ELK stack including Elasticsearch, Logstash, Kibana, and how they work together. It provides descriptions of each component, what they are used for, and key features of Kibana such as its user interface, visualization capabilities, and why it is used.
ELK Stack workshop covers real-world use cases and works with the participants to - implement them. This includes Elastic overview, Logstash configuration, creation of dashboards in Kibana, guidelines and tips on processing custom log formats, designing a system to scale, choosing hardware, and managing the lifecycle of your logs.
The document introduces the ELK stack, which consists of Elasticsearch, Logstash, Kibana, and Beats. Beats ship log and operational data to Elasticsearch. Logstash ingests, transforms, and sends data to Elasticsearch. Elasticsearch stores and indexes the data. Kibana allows users to visualize and interact with data stored in Elasticsearch. The document provides descriptions of each component and their roles. It also includes configuration examples and demonstrates how to access Elasticsearch via REST.
ElasticSearch is an open source, distributed, RESTful search and analytics engine. It allows storage and search of documents in near real-time. Documents are indexed and stored across multiple nodes in a cluster. The documents can be queried using a RESTful API or client libraries. ElasticSearch is built on top of Lucene and provides scalability, reliability and availability.
This document discusses the ELK stack, which consists of Elasticsearch, Logstash, and Kibana. It provides an overview of each component, including that Elasticsearch is a search and analytics engine, Logstash is a data collection engine, and Kibana is a data visualization platform. The document then discusses setting up an ELK stack to index and visualize application logs.
This document provides an overview of using Elasticsearch with .NET, including the Elasticsearch.NET and NEST clients. It discusses connecting to Elasticsearch, mapping types, indexing, searching, updating, deleting, and aggregation. The Elasticsearch.NET client exposes low-level APIs while NEST provides a higher-level fluent API. Mapping can be done automatically, with attributes, or fluently. Searching supports structured, unstructured, and combined queries, while aggregations return averaged, summed, or counted results.
Scaling the Content Repository with Elasticsearch (Nuxeo)
This talk will explain how to leverage Elasticsearch capabilities to make your content repository scale to the sky while still relying on standard SQL based technologies and ensuring data security and integrity. The design choices behind this hybrid Elasticsearch / PgSQL architecture will be discussed and the technical integration with Elasticsearch will be demonstrated.
Watch the recorded webinar: http://www.nuxeo.com/resources/scaling-the-document-repository-with-elasticsearch/
This document provides an overview of Elasticsearch and how to use it with .NET. It discusses what Elasticsearch is, how to install it, how Elasticsearch provides scalability through its architecture of clusters, nodes, shards and replicas. It also covers topics like indexing and querying data through the REST API or NEST client for .NET, performing searches, aggregations, highlighting hits, handling human language through analyzers, and using suggesters.
Elasticsearch is a search engine based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is a free and open source distributed inverted index: a repository of indexed documents that supports fast, incisive search against large volumes of data, with direct access to the denormalized document storage. It is, in general, a distributable and highly scalable database.
Elasticsearch, a distributed search engine with real-time analytics (Tiziano Fagni)
An overview of Elasticsearch: main features, architecture, limitations. It includes also a description on how to query data both using REST API and using elastic4s library, with also a specific interest into integration of the search engine with Apache Spark.
Deep dive to ElasticSearch - an introduction to the Elastic search tool (Ehsan Asgarian)
These slides cover the following topics:
an introduction to non-SQL (NoSQL) databases and the fundamentals of search engines;
then an introduction to the Elastic search tool, its applications, overall architecture, and a comparison with similar tools;
adding a text analyzer, and finally linking it with .NET.
(BDA402) Deep Dive: Log Analytics with Amazon Elasticsearch Service (Amazon Web Services)
Everything generates logs. Applications, infrastructure, security ... everything. Keeping track of the flood of log data is a big challenge, yet critical to your ability to understand your systems and troubleshoot (or prevent) issues. In this session, we will use both Amazon CloudWatch and application logs to show you how to build an end-to-end log analytics solution. First, we cover how to configure an Amazon Elasticsearch Service domain and ingest data into it using Amazon Kinesis Firehose, demonstrating how easy it is to transform data with Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data and configure a secure analytics environment. We demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we dive deep into the Elasticsearch query DSL and review approaches for generating custom, ad-hoc reports.
(BDT209) Launch: Amazon Elasticsearch for Real-Time Data Analytics (Amazon Web Services)
Organizations are collecting an ever-increasing amount of data from numerous sources such as log systems, click streams, and connected devices. Launched in 2009, Elasticsearch —an open-source analytics and search engine— has emerged as a popular tool for real-time analytics and visualization of data. Some of the most common use cases include risk assessment, error detection, and sentiment analysis. However, as data volumes and applications grow, managing Elasticsearch clusters can consume significant IT resources while adding little or no differentiated value to the organization. Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Amazon ES offers the benefits of a managed service, including cluster provisioning, easy configuration, replication for high availability, scaling options, data durability, security, and node monitoring. This session presents a technical deep dive on Amazon ES. Attendees learn: Common challenges with real-time data analytics and visualization and how to address them; the benefits, reference architecture, and best practices for using Amazon ES; and data ingestion options with Amazon DynamoDB, AWS Lambda, and Amazon Kinesis.
Elasticsearch has several key advantages: it is built on Lucene which provides powerful full-text search capabilities, it stores complex JSON documents and indexes all fields by default for high performance, and it can store large quantities of semi-structured data in a distributed fashion while automatically detecting data structure. It supports full-text searches across documents and returns matching results, and has a RESTful API accessible through plugins like Sense for querying.
This document provides an overview of searching and Apache Lucene. It discusses what a search engine is and how it builds an index and answers queries. It then describes Apache Lucene as a high-performance Java-based search engine library. Key features of Lucene like its powerful query syntax, relevance ranking, and flexibility are outlined. Examples of indexing and searching code in Lucene are also provided. The document concludes with a discussion of Lucene's scalability and how it can handle increasing query rates, index sizes, and update rates.
Using ElasticSearch as a fast, flexible, and scalable solution to search occurrence records (kristgen)
Elasticsearch is an open source search engine that provides fast, flexible, and scalable search of occurrence records and checklists. It allows adding and querying data through a REST API or Java API. Data can be imported from databases or other sources using rivers. Mappings customize indexing and querying. Elasticsearch has been used at Canadensys to index vascular plant names with filters for autocompletion, genus filtering, and epithet hierarchy. It is also used at GBIF France to search biodiversity data from MongoDB with filters and calculate statistics with facets.
Elastic Agent is a single, unified way to add monitoring to systems and services through integrations. It is managed through Fleet, which provides a centralized UI for defining Elastic Agent policies that specify which integrations to run on which hosts. Fleet Server connects Elastic Agents to Fleet and handles distributing policies and collecting states. The Elastic Package Registry hosts integrations that can be used by Elastic Agent.
Elastic Ingest Manager is one of the exciting new features; let us master it together before the next release:
- Beats overview
- Elastic-Agent overview
- Integrations
- Data Streams
- Q & A
If you are using APIs to build your solutions then join us to discuss how you can log requests/responses with the following agenda:
- Overview
- WHY
- HOW
- CONSIDERATIONS
- ELASTICSEARCH CLUSTER PATTERNS
- INDEX PATTERNS
- TECHNIQUES
WSO2 Identity Server is an API-driven, open-source, cloud-native IAM product. In this Get-Started session you will gain high-level knowledge of WSO2 IS features and why you should start working with WSO2 Identity Server.
Kubernetes can be used to deploy an Elasticsearch cluster. Kubernetes runs workloads by placing containers into pods to run on nodes. Pods are the smallest deployable units and can contain one or more containers that share resources. For stateful applications like Elasticsearch, a StatefulSet should be used instead of a Deployment since StatefulSets ensure ordered deployment and termination of pods as well as unique identifiers. PersistentVolumes are used to provide storage for Elasticsearch data and ensure it is not lost on pod restart.
In the age of microservices you need end-to-end observability for all components, so that you can get answers to all of your questions during development and even in production; join us in this session to learn how to do that using ELK.
This document discusses Elastic data streams and the Elastic Agent. It provides an overview of data streams, how they handle time series data and indexing. It also covers configuring the Elastic Agent, installing integrations like Filebeat and Metricbeat, and how data streams structure the data from integrations.
1 - Which tools are used to collect logs in the Elastic Stack
2 - Log types
3 - Log sources
4 - How to enrich the logs using Elastic Stack tools
https://www.youtube.com/watch?v=O-qGdHiDhvM
4. Session Preparation
Download Elasticsearch 8.3.3
Download Kibana 8.3.3
Run elasticsearch or elasticsearch.bat
Copy the elastic user password
Copy the Kibana enrollment token
Run kibana or kibana.bat
Open the Kibana URL and enter the Kibana token
Done
11. Lucene
Apache Lucene is an open source project available for free
Lucene is a Java library
Elasticsearch is built on top of Lucene and provides a JSON-based REST API that exposes Lucene features
Elasticsearch provides a distributed system on top of Lucene
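As a minimal sketch of that REST API, a document can be indexed with a single HTTP call; the index name `products` and the fields here are hypothetical. Sent as `PUT /products/_doc/1` with this JSON body:

```json
{
  "name": "Wireless keyboard",
  "price": 29.99,
  "in_stock": true
}
```

Elasticsearch hands the JSON fields to Lucene for indexing, and the document becomes searchable via `GET /products/_search`.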
18. Shards
Each shard is in itself a fully-functional and independent "index" that can be hosted on any node in the cluster
(Diagram: an index divided into shard 1, shard 2, and shard 3)
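The number of primary shards is fixed when the index is created. A sketch of creating a three-shard index with one replica per shard (the index name is hypothetical), sent as `PUT /my-index`:

```json
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```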
36. Big Data - Ingestion (cont.)
(Diagram: Logstash sends data to an ingest node, which distributes it to the data nodes)
37. Big Data - Query
(Diagram: a coordinating node distributes the query across the ingest and data nodes)
38. Big Data - Multi Cluster
(Diagram: two clusters, Cluster 1 and Cluster 2, each containing Node 1, Node 2, and Node 3)
39. Big Data - Features
Rollup jobs: Summarize and store historical data in a smaller index for future analysis
Transforms: Use transforms to pivot existing Elasticsearch indices into summarized entity-centric indices, or to create an indexed view of the latest documents for fast access
ILM: Makes it easier to manage indices in hot-warm-cold architectures, which are common when you're working with time series data such as logs and metrics
Data streams: A data stream lets you store append-only time series data across multiple indices while giving you a single named resource for requests
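A data stream is typically set up through an index template. A hedged sketch (the template name, index pattern, and ILM policy name are all hypothetical), sent as `PUT /_index_template/logs-demo-template`:

```json
{
  "index_patterns": ["logs-demo-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-demo-policy"
    }
  }
}
```

Afterwards, appending a document to a matching name such as `logs-demo-app` auto-creates the data stream and its backing indices, which ILM then manages.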
43. Aggregations
An aggregation summarizes your data as metrics, statistics, or other analytics
Metric: Aggregations that calculate metrics, such as a sum or average, from field values
Bucket: Aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria
Pipeline: Aggregations that take input from other aggregations instead of documents or fields
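These kinds can be nested in a single request. A sketch with a terms bucket per category and an average metric inside each bucket (the index and field names are hypothetical), sent as `POST /sales/_search`:

```json
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "category.keyword" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
```

Setting "size" to 0 suppresses the search hits so the response contains only the aggregation results.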
44. Highlighting
Highlighters enable you to get highlighted snippets from one or more fields in your search results, so you can show users where the query matches are
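A sketch of requesting highlighted snippets (the index and field names are hypothetical), sent as `POST /articles/_search`:

```json
{
  "query": { "match": { "content": "elasticsearch" } },
  "highlight": {
    "fields": { "content": {} }
  }
}
```

By default, the matched terms in the returned snippets are wrapped in `<em>` tags.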
45. Paginate search results
By default, searches return the top 10 matching hits
The from parameter defines the number of hits to skip, defaulting to 0
The size parameter is the maximum number of hits to return
Search after: the search_after parameter allows efficient paging through deep result sets by using the sort values of the previous page's last hit
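A sketch of fetching the second page of ten hits with from/size (the index name is hypothetical), sent as `POST /products/_search`:

```json
{
  "from": 10,
  "size": 10,
  "query": { "match_all": {} }
}
```

For deep paging, search_after combined with a sort is preferred over large from values, which become expensive.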
46. Geospatial search
Elasticsearch supports two types of geo data: geo_point fields, which support lat/lon pairs, and geo_shape fields, which support points, lines, circles, polygons, multi-polygons, etc.
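A sketch of a geo_distance query against a geo_point field (the index, field name, and coordinates are hypothetical), sent as `POST /places/_search`:

```json
{
  "query": {
    "geo_distance": {
      "distance": "10km",
      "location": { "lat": 40.71, "lon": -74.0 }
    }
  }
}
```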
48. Sort search results
Allows you to add one or more sorts on specific fields
The sort is defined on a per-field level, with the special field names _score to sort by score and _doc to sort by index order
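A sketch combining a field sort with the special field names (the index and field names are hypothetical), sent as `POST /products/_search`:

```json
{
  "query": { "match": { "name": "keyboard" } },
  "sort": [
    { "price": { "order": "asc" } },
    "_score",
    "_doc"
  ]
}
```

Sorts are applied in order: price first, then score and index order as tie-breakers.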
49. Elasticsearch SQL
Elasticsearch SQL aims to provide a powerful yet lightweight SQL interface to Elasticsearch
Elasticsearch SQL is built from the ground up for Elasticsearch
No need for additional hardware, processes, runtimes or libraries to query Elasticsearch
Elasticsearch’s SQL jdbc driver is a rich, fully featured JDBC driver for Elasticsearch
Elasticsearch SQL ODBC Driver is a 3.80 compliant ODBC driver for Elasticsearch
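A sketch of running SQL over an index through the REST interface (the index and column names are hypothetical), sent as `POST /_sql?format=txt`:

```json
{
  "query": "SELECT name, price FROM products WHERE price > 10 ORDER BY price DESC LIMIT 5"
}
```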
50. Scripting
With scripting, you can evaluate custom expressions in Elasticsearch
You can use a script to return a computed value as a field or evaluate a custom score for a query
Supported scripting languages include painless (the default), expression, and mustache
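A sketch of a Painless script returning a computed value as a field (the index, field name, and tax rate are hypothetical), sent as `POST /products/_search`:

```json
{
  "query": { "match_all": {} },
  "script_fields": {
    "price_with_tax": {
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * 1.2"
      }
    }
  }
}
```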
51. Text analysis
Text analysis enables Elasticsearch to perform full-text search, where the search returns all relevant
results rather than just exact matches
Tokenization: Breaking a text down into smaller chunks, called tokens. In most cases, these tokens are individual words
Normalization: Processing tokens into a standard form. This allows you to match tokens that are not exactly the same as the search terms, but similar enough to still be relevant
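Tokenization and normalization can be observed directly with the _analyze API. A sketch using the built-in standard analyzer, sent as `POST /_analyze`:

```json
{
  "analyzer": "standard",
  "text": "The QUICK brown foxes"
}
```

The response lists the produced tokens, lowercased by the analyzer's normalization step.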
52. Pinned Query
Promotes selected documents to rank higher than those matching a given query
This feature is typically used to guide searchers to curated documents that are promoted over and above any "organic" matches for a search
The promoted or "pinned" documents are identified using the document IDs stored in the _id field
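A sketch of a pinned query (the document IDs, index, and organic query are hypothetical), sent as `POST /products/_search`:

```json
{
  "query": {
    "pinned": {
      "ids": ["1", "4", "100"],
      "organic": {
        "match": { "description": "brown shoes" }
      }
    }
  }
}
```

The listed documents are returned first, in the given order, ahead of the organic matches.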