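When you create a configuration file and deploy Ceph with mkcephfs, Ceph generates a default CRUSH map for your configuration. The default CRUSH map is fine for a Ceph sandbox environment. However, when you deploy a large-scale data cluster, you should give serious consideration to developing a custom CRUSH map, because it helps you manage your Ceph cluster, improve performance, and keep your data safe.
For example, if an OSD goes down, the CRUSH map can help you locate the physical data center, room, row, and rack of the host that holds the failed OSD, which matters when you need onsite support or a hardware replacement.
Similarly, CRUSH can help you identify faults more quickly. For example, if all the OSDs in a particular rack go down at the same time, the fault probably lies with a network switch, or with the power supply to the rack or the switch, rather than with the OSDs themselves.
A custom CRUSH map can also help you identify the physical locations where Ceph stores redundant copies of data when the placement groups associated with a failed host are in a degraded state.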
1. Get the compiled (binary) CRUSH map
ceph osd getcrushmap -o {compiled-crushmap-filename}
# ceph osd getcrushmap -o crushmap.map
2. Decompile the binary file into a text file
crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
# crushtool -d crushmap.map -o crushmap.txt
3. View the CRUSH map
# vim crushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host node2 {
    id -2    # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.0 weight 0.018
    item osd.5 weight 0.018
    item osd.6 weight 0.009
}
host node3 {
    id -3    # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.1 weight 0.018
    item osd.7 weight 0.018
    item osd.8 weight 0.009
}
host node1 {
    id -4    # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.2 weight 0.018
    item osd.3 weight 0.018
    item osd.4 weight 0.009
}
root default {
    id -1    # do not change unnecessarily
    # weight 0.137
    alg straw
    hash 0    # rjenkins1
    item node2 weight 0.046
    item node3 weight 0.046
    item node1 weight 0.046
}

# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
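The bucket section of the decompiled map is regular enough to inspect with a few lines of script. The following is a minimal Python sketch, not part of any Ceph tooling: the function name parse_buckets and the embedded excerpt are illustrative assumptions, and the parser only understands the simple "item NAME weight W" lines shown in this map.

```python
import re

# Excerpt of the decompiled map above (two buckets kept for brevity).
CRUSHMAP_TXT = """
host node2 {
    id -2    # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.0 weight 0.018
    item osd.5 weight 0.018
    item osd.6 weight 0.009
}
root default {
    id -1    # do not change unnecessarily
    # weight 0.137
    alg straw
    hash 0    # rjenkins1
    item node2 weight 0.046
    item node3 weight 0.046
    item node1 weight 0.046
}
"""


def parse_buckets(text):
    """Return {bucket_name: {"type": bucket_type, "items": {item_name: weight}}}.

    Hypothetical helper: handles only the plain layout shown in this post.
    """
    buckets = {}
    current = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        header = re.fullmatch(r"(\w+)\s+(\S+)\s*\{", line)
        if header:
            # "rule ... {" blocks are not buckets, so they are skipped.
            if header.group(1) != "rule":
                current = {"type": header.group(1), "items": {}}
                buckets[header.group(2)] = current
        elif line == "}":
            current = None
        elif current is not None and line.startswith("item "):
            _, name, _, weight = line.split()
            current["items"][name] = float(weight)
    return buckets


buckets = parse_buckets(CRUSHMAP_TXT)
for name, bucket in buckets.items():
    total = sum(bucket["items"].values())
    print(f"{bucket['type']} {name}: items sum to {total:.3f}")
```

A bucket's weight is the sum of its items' weights. The sums computed here (0.045 and 0.138) differ slightly from the commented totals in the map (0.046 and 0.137) only because the text form rounds every weight to three decimals.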
You can also view the CRUSH tree with the following command:
# ceph osd crush tree
[
    {
        "id": -1,
        "name": "default",
        "type": "root",
        "type_id": 10,
        "items": [
            {
                "id": -2,
                "name": "node2",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 0, "name": "osd.0", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 5, "name": "osd.5", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 6, "name": "osd.6", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            },
            {
                "id": -3,
                "name": "node3",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 1, "name": "osd.1", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 7, "name": "osd.7", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 8, "name": "osd.8", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            },
            {
                "id": -4,
                "name": "node1",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 2, "name": "osd.2", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 3, "name": "osd.3", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 4, "name": "osd.4", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            }
        ]
    }
]
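Because the tree is plain JSON, finding where a failed OSD sits in the hierarchy is a short recursive walk. The sketch below is only an illustration in Python: the locate function is a hypothetical helper, and the embedded JSON is a trimmed copy of the output above (only the id, name, type, and items fields are kept). On a live cluster you would feed it the JSON printed by the command instead.

```python
import json

# Trimmed copy of the `ceph osd crush tree` output shown above.
CRUSH_TREE_JSON = """
[{"id": -1, "name": "default", "type": "root", "items": [
  {"id": -2, "name": "node2", "type": "host", "items": [
    {"id": 0, "name": "osd.0", "type": "osd"},
    {"id": 5, "name": "osd.5", "type": "osd"},
    {"id": 6, "name": "osd.6", "type": "osd"}]},
  {"id": -3, "name": "node3", "type": "host", "items": [
    {"id": 1, "name": "osd.1", "type": "osd"},
    {"id": 7, "name": "osd.7", "type": "osd"},
    {"id": 8, "name": "osd.8", "type": "osd"}]},
  {"id": -4, "name": "node1", "type": "host", "items": [
    {"id": 2, "name": "osd.2", "type": "osd"},
    {"id": 3, "name": "osd.3", "type": "osd"},
    {"id": 4, "name": "osd.4", "type": "osd"}]}]}]
"""


def locate(nodes, osd_name, path=()):
    """Return the chain of (type, name) pairs from the root down to osd_name.

    Returns None when the OSD is not present in the tree.
    """
    for node in nodes:
        here = path + ((node["type"], node["name"]),)
        if node["name"] == osd_name:
            return list(here)
        found = locate(node.get("items", []), osd_name, here)
        if found:
            return found
    return None


tree = json.loads(CRUSH_TREE_JSON)
print(locate(tree, "osd.7"))
# [('root', 'default'), ('host', 'node3'), ('osd', 'osd.7')]
```

In this small cluster the chain contains only a root and a host. In a custom map that also defines racks, rows, or rooms, the same walk would return those levels too, which is what makes the physical location of a failed OSD easy to look up.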
Use the following command to view the devices, buckets, and rulesets:
# ceph osd crush dump
{
    "devices": [
        { "id": 0, "name": "osd.0" },
        { "id": 1, "name": "osd.1" },
        { "id": 2, "name": "osd.2" },
        { "id": 3, "name": "osd.3" },
        { "id": 4, "name": "osd.4" },
        { "id": 5, "name": "osd.5" },
        { "id": 6, "name": "osd.6" },
        { "id": 7, "name": "osd.7" },
        { "id": 8, "name": "osd.8" }
    ],
    "types": [
        { "type_id": 0, "name": "osd" },
        { "type_id": 1, "name": "host" },
        { "type_id": 2, "name": "chassis" },
        { "type_id": 3, "name": "rack" },
        { "type_id": 4, "name": "row" },
        { "type_id": 5, "name": "pdu" },
        { "type_id": 6, "name": "pod" },
        { "type_id": 7, "name": "room" },
        { "type_id": 8, "name": "datacenter" },
        { "type_id": 9, "name": "region" },
        { "type_id": 10, "name": "root" }
    ],
    "buckets": [
        {
            "id": -1,
            "name": "default",
            "type_id": 10,
            "type_name": "root",
            "weight": 9000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": -2, "weight": 3000, "pos": 0 },
                { "id": -3, "weight": 3000, "pos": 1 },
                { "id": -4, "weight": 3000, "pos": 2 }
            ]
        },
        {
            "id": -2,
            "name": "node2",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 0, "weight": 1212, "pos": 0 },
                { "id": 5, "weight": 1212, "pos": 1 },
                { "id": 6, "weight": 576, "pos": 2 }
            ]
        },
        {
            "id": -3,
            "name": "node3",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 1, "weight": 1212, "pos": 0 },
                { "id": 7, "weight": 1212, "pos": 1 },
                { "id": 8, "weight": 576, "pos": 2 }
            ]
        },
        {
            "id": -4,
            "name": "node1",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 2, "weight": 1212, "pos": 0 },
                { "id": 3, "weight": 1212, "pos": 1 },
                { "id": 4, "weight": 576, "pos": 2 }
            ]
        }
    ],
    "rules": [
        {
            "rule_id": 0,
            "rule_name": "replicated_ruleset",
            "ruleset": 0,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                { "op": "take", "item": -1, "item_name": "default" },
                { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
                { "op": "emit" }
            ]
        }
    ],
    "tunables": {
        "choose_local_tries": 0,
        "choose_local_fallback_tries": 0,
        "choose_total_tries": 50,
        "chooseleaf_descend_once": 1,
        "chooseleaf_vary_r": 0,
        "straw_calc_version": 1,
        "allowed_bucket_algs": 22,
        "profile": "unknown",
        "optimal_tunables": 0,
        "legacy_tunables": 0,
        "require_feature_tunables": 1,
        "require_feature_tunables2": 1,
        "require_feature_tunables3": 0,
        "has_v2_rules": 0,
        "has_v3_rules": 0,
        "has_v4_buckets": 0
    }
}
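Introduction to CRUSH Maps
For a detailed introduction to CRUSH maps and their rules in Ceph, see the official documentation, as well as the description of the CRUSH algorithm.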
TODO