Seven rules for OpenSearch sizing
dejanualex
Posted on July 30, 2024
OpenSearch splits indices into shards. Each shard stores a subset of all documents in an index.
Shard sizes should be between 10 and 50 GB per shard: 10 to 30 GB for workloads prioritizing low latency (e.g. search workloads), or 30 to 50 GB for write-heavy workloads (e.g. logs workloads).
Estimate the total size of the data you plan to store in the index, decide on a shard size based on the rule above, and calculate the number of primary shards as ingested_data_size / shard_size.
The number and size of shards you set for an index should correspond to the size of the index. OpenSearch defaults to one primary and one replica shard, for a total of two shards per index.
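As a rough illustration of the calculation above, here is a minimal Python sketch. The function names (primary_shard_count, index_settings) are hypothetical helpers, not part of OpenSearch, and rounding the result up to a whole shard is an assumption the text does not spell out. The settings dictionary uses the standard number_of_shards and number_of_replicas index settings.

```python
import math

# Shard size bounds in GB, taken from the 10 to 50 GB guideline above.
MIN_SHARD_GB = 10
MAX_SHARD_GB = 50


def primary_shard_count(ingested_data_gb: float, shard_size_gb: float) -> int:
    """Compute ingested_data_size / shard_size, rounded up (assumption)."""
    if ingested_data_gb <= 0:
        raise ValueError("ingested_data_gb must be positive")
    if not MIN_SHARD_GB <= shard_size_gb <= MAX_SHARD_GB:
        raise ValueError("shard_size_gb should be between 10 and 50")
    # Always keep at least one primary shard, even for small indices.
    return max(1, math.ceil(ingested_data_gb / shard_size_gb))


def index_settings(primary_shards: int, replicas: int = 1) -> dict:
    """Build a request body for index creation with explicit shard settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical example: 300 GB of search data at 30 GB per shard.
    shards = primary_shard_count(300, 30)
    print(shards, index_settings(shards))
```

The body returned by index_settings could be sent when creating the index. Note that the formula covers primary shards only; each replica adds another full copy of every primary.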
Shard count is secondary to shard size.
Shard size impacts both search latency and write performance: too many small shards will exhaust memory (the JVM heap), and too few large shards prevent OpenSearch from properly distributing requests.
The JVM heap size should be based on the available RAM: set Xms and Xmx to the same value, and to no more than 50% of your total memory.
For fast indexing (ingestion), you need as many shards as possible; for fast searching, it is better to have as few shards as possible.
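The heap rule above (Xms and Xmx set to the same value, at most 50% of total memory) can be sketched as follows. The helper names are hypothetical, and rounding down to whole gigabytes is an assumption made to keep the flags simple. The sketch applies only the 50% rule from this post and no other limits.

```python
from typing import List


def heap_size_gb(total_ram_gb: int) -> int:
    """Return half of the total RAM, rounded down to whole GB."""
    if total_ram_gb < 2:
        raise ValueError("need at least 2 GB of RAM for a 1 GB heap")
    return total_ram_gb // 2


def jvm_heap_options(total_ram_gb: int) -> List[str]:
    """Return matching -Xms/-Xmx flags so the heap never resizes."""
    heap = heap_size_gb(total_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


if __name__ == "__main__":
    # Hypothetical node with 16 GB of RAM.
    for option in jvm_heap_options(16):
        print(option)
```

The two resulting lines are the kind of entries that would go into the node's JVM options.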
Number of shards per index: you should have at least one shard per data node. Ideally, make the index shard count a multiple of the data node count, so that shards spread evenly across the nodes.
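To illustrate the data node rule above, a minimal sketch that rounds a shard count up to the nearest multiple of the data node count. The function name align_to_data_nodes is hypothetical, and rounding up (rather than down) is an assumption chosen so that every data node gets at least one shard.

```python
import math


def align_to_data_nodes(shard_count: int, data_nodes: int) -> int:
    """Round shard_count up to the nearest multiple of data_nodes."""
    if shard_count < 1 or data_nodes < 1:
        raise ValueError("shard_count and data_nodes must be at least 1")
    return math.ceil(shard_count / data_nodes) * data_nodes


if __name__ == "__main__":
    # Hypothetical cluster: 10 calculated shards on 3 data nodes.
    print(align_to_data_nodes(10, 3))
```

Rounding up makes each shard smaller, so it is worth checking that the resulting shard size still falls within the 10 to 50 GB range, since shard count is secondary to shard size.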
Last, here's a simple OpenSearch calculator for shard sizing.