Seven rules for OpenSearch sizing

dejanualex

Posted on July 30, 2024


OpenSearch splits indices into shards. Each shard stores a subset of all documents in an index.

  1. Keep shard sizes between 10 and 50 GB: aim for 10–30 GB for workloads that prioritize low latency (e.g. search workloads) and 30–50 GB for write-heavy workloads (e.g. logs workloads).

  2. Estimate the total amount of data you plan to store in the index, pick a target shard size based on the rule above, and calculate the number of primary shards: ingested_data_size / shard_size (see the sizing sketch after this list).

  3. The number and size of the shards you configure determine the total size of an index. OpenSearch defaults to one primary and one replica shard, for a total of two shards per index.

  4. Shard count is secondary to shard size: settle on a target shard size first, then derive the count from it.

  5. Shard size impacts both search latency and write performance: too many small shards exhaust memory (the JVM heap), while too few large shards prevent OpenSearch from distributing requests properly. Size the JVM heap based on the available RAM: set Xms and Xmx to the same value, and to no more than 50% of total memory (see the jvm.options example after this list).

  6. For fast indexing (ingestion), more shards are better, since writes are spread across more of the cluster; for fast searching, fewer shards are better, since each query fans out to fewer shards.

  7. Number of shards per index: have at least one shard per data node, and ideally make the index shard count an even multiple of the data node count (see the sizing sketch below).
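The arithmetic behind rules 2 and 7 is easy to script. Here is a minimal Python sketch (the function and variable names are my own, not an OpenSearch API) that turns a projected data size and a target shard size into a primary shard count, then rounds it up to a multiple of the data node count:

```python
import math

def primary_shard_count(total_data_gb: float, target_shard_gb: float, data_nodes: int) -> int:
    """Rule 2: primary shards ā‰ˆ ingested_data_size / shard_size.
    Rule 7: at least one shard per data node, rounded up to a
    multiple of the data node count."""
    shards = math.ceil(total_data_gb / target_shard_gb)
    # At least one shard per data node.
    shards = max(shards, data_nodes)
    # Round up to a multiple of the data node count.
    return math.ceil(shards / data_nodes) * data_nodes

# Example: ~600 GB of logs, 40 GB target shards, 6 data nodes -> 18 primaries.
print(primary_shard_count(600, 40, 6))
```

Rounding up rather than down keeps the actual shard size at or below the target.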
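For rule 5, the heap is set in OpenSearch's config/jvm.options file (or via the OPENSEARCH_JAVA_OPTS environment variable). As an illustration only, a data node with 32 GB of RAM (an assumed value) would get a 16 GB heap:

```
# config/jvm.options — assumes a node with 32 GB of RAM
-Xms16g
-Xmx16g
```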

šŸ› ļø Last, here's a simple OpenSearch calculator for shard sizing.
