What is Rally?
Rally is the benchmarking tool for Elasticsearch that Elastic officially releases.
Introducing Rally: a benchmarking tool for Elasticsearch | Elastic
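Why do this
We have started handling servers with a wide range of specs,
so we benchmark at the middleware layer as well as the OS layer and use the results as a reference.
Note: for reasons I can't go into, the benchmark scores from our real environment, and the details of which items we look at and why, are withheld.
The installation steps are available as an Ansible playbook, so give it a try if you're interested.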
Benchmark environment
Environment
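Cloud: AWS
Instance: m3.2xlarge
AMI: CentOS 7 (x86_64) - with Updates HVM, from AWS Marketplace
(I first tried CentOS 6, but its gcc was too old and esrally failed at startup with a bzip2-related error)
Disk: about 3 GB is used under /home/user/.rally, where Rally downloads its test data.
Number of nodes: 1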
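As a rough pre-flight check for the disk requirement above, something like the following could confirm that enough space is free before running Rally. This is only a sketch: the function name `has_free_space` is hypothetical, and checking `$HOME` with a 3 GB threshold is an assumption based on the figure quoted above, not something Rally or the playbook does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rally): succeeds only if the filesystem
# that holds DIR has at least NEED_GB gigabytes available.
has_free_space() {
    dir=$1
    need_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumption: Rally's data directory lives under the current user's home.
if has_free_space "$HOME" 3; then
    echo "enough space for Rally test data"
else
    echo "less than 3 GB free under $HOME" >&2
fi
```

The threshold is passed as an argument so that it can be raised if a larger track is used.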
Ansible Playbook
The playbook uses Ansible to deploy to the local machine itself.
github.com
Installation steps
# means run as root
## marks a comment
$ means run as a regular user
## Install Ansible
# yum install -y epel-release
# yum install -y ansible git
# ssh-keygen
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# git clone https://github.com/st1t/setup-rally.git
# cd setup-rally

## Check connectivity to the Ansible deployment target (localhost)
# ansible -i hosts rally -m ping
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is 6f:bb:62:2c:72:c6:fe:40:41:80:80:fe:09:3f:e4:24.
Are you sure you want to continue connecting (yes/no)? yes
localhost | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

## Install the Oracle JDK manually
## http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

## Check which tasks Ansible will run and on which hosts
# ansible-playbook -i hosts rally.yml --list-tasks --list-hosts

playbook: rally.yml

  play #1 (rally): rally TAGS: []
    pattern: [u'rally']
    hosts (1):
      localhost
    tasks:
      elasticsearch : please install Oracle JDK manually TAGS: [setup_elasticsearch]
      elasticsearch : add rpm gpg key TAGS: [setup_elasticsearch]
      elasticsearch : add elasticsearch.repo TAGS: [setup_elasticsearch]
      elasticsearch : yum install elasticsearch TAGS: [setup_elasticsearch]
      elasticsearch : template files TAGS: [setup_elasticsearch, template_files]
      rally : yum install packages TAGS: [setup_rally]
      rally : download python-3.5.3 TAGS: [setup_python, setup_rally]
      rally : unarchive python-3.5.3 TAGS: [setup_python, setup_rally]
      rally : configure python-3.5.3 TAGS: [setup_python, setup_rally]
      rally : make install python-3.5.3 TAGS: [setup_python, setup_rally]
      rally : ensure pip TAGS: [setup_python, setup_rally]
      rally : download git-2.11.1 TAGS: [setup_git, setup_rally]
      rally : unarchive git-2.11.1 TAGS: [setup_git, setup_rally]
      rally : configure git-2.11.1 TAGS: [setup_git, setup_rally]
      rally : make install git-2.11.1 TAGS: [setup_git, setup_rally]
      rally : pip3.5 install esrally TAGS: [setup_esrally, setup_rally]
#

## Run the deployment
# ansible-playbook -i hosts rally.yml

PLAY [rally] *******************************************************************

TASK [setup] *******************************************************************
ok: [localhost]

TASK [elasticsearch : please install Oracle JDK manually] **********************
changed: [localhost]

TASK [elasticsearch : add rpm gpg key] *****************************************
changed: [localhost]

TASK [elasticsearch : add elasticsearch.repo] **********************************
changed: [localhost]

TASK [elasticsearch : yum install elasticsearch] *******************************
changed: [localhost]

TASK [elasticsearch : template files] ******************************************
changed: [localhost] => (item=etc/elasticsearch/elasticsearch.yml)
changed: [localhost] => (item=etc/elasticsearch/jvm.options)

TASK [rally : yum install packages] ********************************************
changed: [localhost] => (item=[u'gcc', u'openssl-devel', u'bzip2-devel', u'perl-ExtUtils-MakeMaker', u'curl-devel'])

TASK [rally : download python-3.5.3] *******************************************
changed: [localhost -> localhost]

TASK [rally : unarchive python-3.5.3] ******************************************
changed: [localhost]

TASK [rally : configure python-3.5.3] ******************************************
changed: [localhost]

TASK [rally : make install python-3.5.3] ***************************************
changed: [localhost]

TASK [rally : ensure pip] ******************************************************
changed: [localhost]

TASK [rally : download git-2.11.1] *********************************************
changed: [localhost -> localhost]

TASK [rally : unarchive git-2.11.1] ********************************************
changed: [localhost]

TASK [rally : configure git-2.11.1] ********************************************
changed: [localhost]

TASK [rally : make install git-2.11.1] *****************************************
changed: [localhost]

TASK [rally : pip3.5 install esrally] ******************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=17   changed=16   unreachable=0    failed=0

#

## Start Elasticsearch and check that it responds
# systemctl start elasticsearch
# curl localhost:9200
{
  "name" : "ip-172-64-1-146",
  "cluster_name" : "my-application",
  "cluster_uuid" : "Jjl-G0Y9R2WyqvI6opQYHw",
  "version" : {
    "number" : "5.2.0",
    "build_hash" : "24e05b9",
    "build_date" : "2017-01-24T19:52:35.800Z",
    "build_snapshot" : false,
    "lucene_version" : "6.4.0"
  },
  "tagline" : "You Know, for Search"
}
#

## Configure esrally
## esrally refuses to run as root, so switch to a non-root user (centos in this case)
# su - centos

## Set the JDK root directory path
$ esrally configure

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

Running simple configuration. Run the advanced configuration with:

  esrally configure --advanced-config

INFO:rally.config:Running simple configuration routine.
[✓] Autodetecting available third-party software
which: no gradle in (/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/centos/.local/bin:/home/centos/bin)
  git    : [✓]
  gradle : [✕]
  JDK 8  : [✕] (You cannot benchmark Elasticsearch 5.x without a JDK 8 installation)

**********************************************************************************
You don't have the necessary software to benchmark source builds of Elasticsearch.

You can still benchmark binary distributions with e.g.:

  esrally --distribution-version=5.0.0

**********************************************************************************

[✓] Setting up benchmark data directory in [/home/centos/.rally/benchmarks] (needs several GB).
Enter the JDK 8 root directory:: /usr/java/jdk1.8.0_121

[✓] Configuration successfully written to [/home/centos/.rally/rally.ini]. Happy benchmarking!

To benchmark Elasticsearch 5.0.0 with the default benchmark run:

  esrally --distribution-version=5.0.0

For help, type esrally --help or see the user documentation at https://esrally.readthedocs.io/en/0.4.8/
$

## Run esrally
## It may be worth running [dstat -tlcmgdn --socket --tcp --io] in a separate window and watching it while esrally runs
$ esrally --pipeline=from-distribution --distribution-version=5.2.0

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

[INFO] Writing logs to /home/centos/.rally/benchmarks/races/2017-02-14-14-47-08/local/logs/rally_out.log
Running index-append [100% done]
Running force-merge [100% done]
Running index-stats [100% done]
Running node-stats [100% done]
Running default [100% done]
Running term [100% done]
Running phrase [100% done]
Running country_agg_uncached [100% done]
Running country_agg_cached [100% done]
Running scroll [100% done]
Running expression [100% done]
Running painless_static [100% done]
Running painless_dynamic [100% done]
[INFO] Racing on track [geonames], challenge [append-no-conflicts] and car [defaults]
[INFO] Rally will delete the benchmark candidate after the benchmark

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

| Lap | Metric | Operation | Value | Unit |
|------:|-------------------------------:|---------------------:|----------:|-------:|
| All | Indexing time | | 34.2623 | min |
| All | Merge time | | 9.54963 | min |
| All | Refresh time | | 3.23123 | min |
| All | Flush time | | 0.27265 | min |
| All | Merge throttle time | | 0.92085 | min |
| All | Median CPU usage | | 397.8 | % |
| All | Total Young Gen GC | | 19.179 | s |
| All | Total Old Gen GC | | 1.538 | s |
| All | Index size | | 2.57505 | GB |
| All | Totally written | | 14.7806 | GB |
| All | Heap used for segments | | 14.036 | MB |
| All | Heap used for doc values | | 0.110653 | MB |
| All | Heap used for terms | | 12.9623 | MB |
| All | Heap used for norms | | 0.0703735 | MB |
| All | Heap used for points | | 0.247314 | MB |
| All | Heap used for stored fields | | 0.645348 | MB |
| All | Segment count | | 93 | |
| All | Min Throughput | index-append | 26655.1 | docs/s |
| All | Median Throughput | index-append | 27401.3 | docs/s |
| All | Max Throughput | index-append | 27994.1 | docs/s |
| All | 50.0th percentile latency | index-append | 1306.33 | ms |
| All | 90.0th percentile latency | index-append | 1706.22 | ms |
| All | 99.0th percentile latency | index-append | 4132.19 | ms |
| All | 99.9th percentile latency | index-append | 7894.34 | ms |
| All | 100th percentile latency | index-append | 8290.49 | ms |
| All | 50.0th percentile service time | index-append | 1306.33 | ms |
| All | 90.0th percentile service time | index-append | 1706.22 | ms |
| All | 99.0th percentile service time | index-append | 4132.19 | ms |
| All | 99.9th percentile service time | index-append | 7894.34 | ms |
| All | 100th percentile service time | index-append | 8290.49 | ms |
| All | Min Throughput | force-merge | 0.387803 | ops/s |
| All | Median Throughput | force-merge | 0.387803 | ops/s |
| All | Max Throughput | force-merge | 0.387803 | ops/s |
| All | 100th percentile latency | force-merge | 2578.61 | ms |
| All | 100th percentile service time | force-merge | 2578.61 | ms |
| All | Min Throughput | index-stats | 100.058 | ops/s |
| All | Median Throughput | index-stats | 100.082 | ops/s |
| All | Max Throughput | index-stats | 100.134 | ops/s |
| All | 50.0th percentile latency | index-stats | 1.96378 | ms |
| All | 90.0th percentile latency | index-stats | 2.08593 | ms |
| All | 99.0th percentile latency | index-stats | 2.52601 | ms |
| All | 99.9th percentile latency | index-stats | 16.778 | ms |
| All | 100th percentile latency | index-stats | 17.6823 | ms |
| All | 50.0th percentile service time | index-stats | 1.83976 | ms |
| All | 90.0th percentile service time | index-stats | 1.96208 | ms |
| All | 99.0th percentile service time | index-stats | 2.38526 | ms |
| All | 99.9th percentile service time | index-stats | 10.8649 | ms |
| All | 100th percentile service time | index-stats | 14.0383 | ms |
| All | Min Throughput | node-stats | 99.8284 | ops/s |
| All | Median Throughput | node-stats | 100.124 | ops/s |
| All | Max Throughput | node-stats | 100.788 | ops/s |
| All | 50.0th percentile latency | node-stats | 1.98919 | ms |
| All | 90.0th percentile latency | node-stats | 2.10209 | ms |
| All | 99.0th percentile latency | node-stats | 3.51366 | ms |
| All | 99.9th percentile latency | node-stats | 16.0786 | ms |
| All | 100th percentile latency | node-stats | 18.5991 | ms |
| All | 50.0th percentile service time | node-stats | 1.8646 | ms |
| All | 90.0th percentile service time | node-stats | 1.97835 | ms |
| All | 99.0th percentile service time | node-stats | 3.36162 | ms |
| All | 99.9th percentile service time | node-stats | 7.46198 | ms |
| All | 100th percentile service time | node-stats | 18.4724 | ms |
| All | Min Throughput | default | 50.006 | ops/s |
| All | Median Throughput | default | 50.0084 | ops/s |
| All | Max Throughput | default | 50.024 | ops/s |
| All | 50.0th percentile latency | default | 16.7353 | ms |
| All | 90.0th percentile latency | default | 16.9511 | ms |
| All | 99.0th percentile latency | default | 20.5111 | ms |
| All | 99.9th percentile latency | default | 26.1668 | ms |
| All | 100th percentile latency | default | 26.4457 | ms |
| All | 50.0th percentile service time | default | 16.6098 | ms |
| All | 90.0th percentile service time | default | 16.8254 | ms |
| All | 99.0th percentile service time | default | 17.7306 | ms |
| All | 99.9th percentile service time | default | 26.0405 | ms |
| All | 100th percentile service time | default | 26.3195 | ms |
| All | Min Throughput | term | 200.108 | ops/s |
| All | Median Throughput | term | 200.152 | ops/s |
| All | Max Throughput | term | 200.253 | ops/s |
| All | 50.0th percentile latency | term | 1.2319 | ms |
| All | 90.0th percentile latency | term | 1.28896 | ms |
| All | 99.0th percentile latency | term | 1.85857 | ms |
| All | 99.9th percentile latency | term | 6.55233 | ms |
| All | 100th percentile latency | term | 6.96907 | ms |
| All | 50.0th percentile service time | term | 1.10788 | ms |
| All | 90.0th percentile service time | term | 1.16574 | ms |
| All | 99.0th percentile service time | term | 1.49258 | ms |
| All | 99.9th percentile service time | term | 5.63515 | ms |
| All | 100th percentile service time | term | 6.84347 | ms |
| All | Min Throughput | phrase | 200.097 | ops/s |
| All | Median Throughput | phrase | 200.136 | ops/s |
| All | Max Throughput | phrase | 200.225 | ops/s |
| All | 50.0th percentile latency | phrase | 1.6242 | ms |
| All | 90.0th percentile latency | phrase | 1.68859 | ms |
| All | 99.0th percentile latency | phrase | 2.19295 | ms |
| All | 99.9th percentile latency | phrase | 7.07005 | ms |
| All | 100th percentile latency | phrase | 8.55214 | ms |
| All | 50.0th percentile service time | phrase | 1.49897 | ms |
| All | 90.0th percentile service time | phrase | 1.5647 | ms |
| All | 99.0th percentile service time | phrase | 1.94413 | ms |
| All | 99.9th percentile service time | phrase | 5.39353 | ms |
| All | 100th percentile service time | phrase | 6.94662 | ms |
| All | Min Throughput | country_agg_uncached | 5.45032 | ops/s |
| All | Median Throughput | country_agg_uncached | 5.46468 | ops/s |
| All | Max Throughput | country_agg_uncached | 5.46946 | ops/s |
| All | 50.0th percentile latency | country_agg_uncached | 162963 | ms |
| All | 90.0th percentile latency | country_agg_uncached | 228239 | ms |
| All | 99.0th percentile latency | country_agg_uncached | 242750 | ms |
| All | 99.9th percentile latency | country_agg_uncached | 244180 | ms |
| All | 100th percentile latency | country_agg_uncached | 244330 | ms |
| All | 50.0th percentile service time | country_agg_uncached | 182.442 | ms |
| All | 90.0th percentile service time | country_agg_uncached | 189.22 | ms |
| All | 99.0th percentile service time | country_agg_uncached | 206.477 | ms |
| All | 99.9th percentile service time | country_agg_uncached | 227.438 | ms |
| All | 100th percentile service time | country_agg_uncached | 235.705 | ms |
| All | Min Throughput | country_agg_cached | 100.062 | ops/s |
| All | Median Throughput | country_agg_cached | 100.092 | ops/s |
| All | Max Throughput | country_agg_cached | 100.174 | ops/s |
| All | 50.0th percentile latency | country_agg_cached | 1.28296 | ms |
| All | 90.0th percentile latency | country_agg_cached | 1.33427 | ms |
| All | 99.0th percentile latency | country_agg_cached | 1.62056 | ms |
| All | 99.9th percentile latency | country_agg_cached | 4.7723 | ms |
| All | 100th percentile latency | country_agg_cached | 4.79509 | ms |
| All | 50.0th percentile service time | country_agg_cached | 1.1585 | ms |
| All | 90.0th percentile service time | country_agg_cached | 1.21304 | ms |
| All | 99.0th percentile service time | country_agg_cached | 1.49657 | ms |
| All | 99.9th percentile service time | country_agg_cached | 4.64836 | ms |
| All | 100th percentile service time | country_agg_cached | 4.66964 | ms |
| All | Min Throughput | scroll | 60.6234 | ops/s |
| All | Median Throughput | scroll | 60.7549 | ops/s |
| All | Max Throughput | scroll | 60.782 | ops/s |
| All | 50.0th percentile latency | scroll | 391710 | ms |
| All | 90.0th percentile latency | scroll | 547911 | ms |
| All | 99.0th percentile latency | scroll | 583086 | ms |
| All | 99.9th percentile latency | scroll | 586607 | ms |
| All | 100th percentile latency | scroll | 586989 | ms |
| All | 50.0th percentile service time | scroll | 411.999 | ms |
| All | 90.0th percentile service time | scroll | 415.801 | ms |
| All | 99.0th percentile service time | scroll | 419.931 | ms |
| All | 99.9th percentile service time | scroll | 425.589 | ms |
| All | 100th percentile service time | scroll | 425.727 | ms |
| All | Min Throughput | expression | 2.89118 | ops/s |
| All | Median Throughput | expression | 2.90642 | ops/s |
| All | Max Throughput | expression | 2.91998 | ops/s |
| All | 50.0th percentile latency | expression | 294333 | ms |
| All | 90.0th percentile latency | expression | 413862 | ms |
| All | 99.0th percentile latency | expression | 440851 | ms |
| All | 99.9th percentile latency | expression | 443586 | ms |
| All | 100th percentile latency | expression | 443885 | ms |
| All | 50.0th percentile service time | expression | 347.375 | ms |
| All | 90.0th percentile service time | expression | 360.064 | ms |
| All | 99.0th percentile service time | expression | 372.304 | ms |
| All | 99.9th percentile service time | expression | 379.48 | ms |
| All | 100th percentile service time | expression | 386.623 | ms |
| All | Min Throughput | painless_static | 1.94561 | ops/s |
| All | Median Throughput | painless_static | 1.9468 | ops/s |
| All | Max Throughput | painless_static | 1.94792 | ops/s |
| All | 50.0th percentile latency | painless_static | 463873 | ms |
| All | 90.0th percentile latency | painless_static | 649285 | ms |
| All | 99.0th percentile latency | painless_static | 691026 | ms |
| All | 99.9th percentile latency | painless_static | 695246 | ms |
| All | 100th percentile latency | painless_static | 695709 | ms |
| All | 50.0th percentile service time | painless_static | 512.775 | ms |
| All | 90.0th percentile service time | painless_static | 539.99 | ms |
| All | 99.0th percentile service time | painless_static | 566.923 | ms |
| All | 99.9th percentile service time | painless_static | 577.931 | ms |
| All | 100th percentile service time | painless_static | 578.523 | ms |
| All | Min Throughput | painless_dynamic | 1.99401 | ops/s |
| All | Median Throughput | painless_dynamic | 1.99582 | ops/s |
| All | Max Throughput | painless_dynamic | 1.99775 | ops/s |
| All | 50.0th percentile latency | painless_dynamic | 451470 | ms |
| All | 90.0th percentile latency | painless_dynamic | 630959 | ms |
| All | 99.0th percentile latency | painless_dynamic | 671680 | ms |
| All | 99.9th percentile latency | painless_dynamic | 675685 | ms |
| All | 100th percentile latency | painless_dynamic | 676132 | ms |
| All | 50.0th percentile service time | painless_dynamic | 500.268 | ms |
| All | 90.0th percentile service time | painless_dynamic | 523.014 | ms |
| All | 99.0th percentile service time | painless_dynamic | 543.797 | ms |
| All | 99.9th percentile service time | painless_dynamic | 559.719 | ms |
| All | 100th percentile service time | painless_dynamic | 564.137 | ms |

[INFO] Archiving logs in /home/centos/.rally/benchmarks/races/2017-02-14-14-47-08/local/logs-geonames-append-no-conflicts-defaults.zip

----------------------------------
[INFO] SUCCESS (took 3377 seconds)
----------------------------------
$
Impressions after trying it
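Deciding what a benchmark is for and how to run it is quite hard, and it always gives me a headache.
This time too, I think simply running the tool does not easily yield numbers that mean something to me.
Still, the results should serve as a relative comparison between instances, and I was personally glad just to confirm that multiple cores are used without any special configuration.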
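For a relative comparison between instances, one option is to pull a single metric out of each run's summary table and put the values side by side. The sketch below assumes the table has been saved to a text file in the pipe-delimited layout Rally prints to the terminal. The function name `extract_metric` and the temporary sample file are hypothetical and not part of Rally; the two sample rows are taken from the run shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rally): print "operation value unit" for
# every row of a saved summary table whose Metric column equals METRIC.
extract_metric() {
    metric=$1
    report=$2
    awk -F'|' -v m="$metric" '
        {
            # After splitting on "|": 2=Lap, 3=Metric, 4=Operation, 5=Value, 6=Unit.
            # Strip the padding spaces around each cell before comparing.
            for (i = 2; i <= 6; i++) gsub(/^ +| +$/, "", $i)
            if ($3 == m) print $4, $5, $6
        }
    ' "$report"
}

# Usage with two rows copied from the run above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
| All | Median Throughput | index-append | 27401.3 | docs/s |
| All | Median Throughput | term | 200.152 | ops/s |
EOF
extract_metric "Median Throughput" "$sample"
# prints:
# index-append 27401.3 docs/s
# term 200.152 ops/s
rm -f "$sample"
```

Running the same function against the saved table from each instance type gives one line per operation, which is easier to diff than the full report.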
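Above all, because the benchmark tool is provided officially, you can ask on
Discuss the Elastic Stack when you get stuck, and I think it matters a lot that it will keep being maintained in step with Elastic's release pace.
I'm also looking forward to Rally's future roadmap!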
In closing
The official site below seems to offer a place where Elastic topics can be discussed in Japanese as well,
so I'd be happy to see it get lively 😃
discuss.elastic.co