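Before explaining how to pre-split HBase regions, one concept needs introducing: the region hotspot. What is a region hotspot? For a large table with many regions, the regions are usually not spread evenly across the cluster: some regionservers host few regions while others host many. Even if data is written to the table in a random pattern, the server hosting more regions carries a heavier load than the other regionservers, and that creates a region hotspot.
The following shows how to pre-split a table at creation time using the HBase Shell: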
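The load argument above can be made concrete with a little arithmetic. The sketch below assumes writes are spread uniformly over all regions of the table, so a server's expected share of writes is simply its share of regions. The server names and region counts are hypothetical, chosen only for illustration.

```python
def expected_write_share(regions_per_server):
    """Expected fraction of writes per server, assuming uniform writes per region."""
    total = sum(regions_per_server.values())
    if total == 0:
        raise ValueError("at least one region is required")
    return {server: count / total for server, count in regions_per_server.items()}


# Hypothetical layout: rs3 hosts four times as many regions as rs1 or rs2.
layout = {"rs1": 10, "rs2": 10, "rs3": 40}

if __name__ == "__main__":
    for server, share in expected_write_share(layout).items():
        print(f"{server}: {share:.0%} of writes")
```

With this layout, rs3 would be expected to absorb about two thirds of the writes even though the writes themselves are random.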
hbase(main):001:0> create 'test_split_tbl', 'colfam1', {SPLITS => ['ROW-100','ROW-200','ROW-300','ROW-400']}
0 row(s) in 0.4820 seconds
=> Hbase::Table - test_split_tbl
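Four split keys produce five regions, each covering a half-open key range with an inclusive start key and an exclusive end key. As a rough sketch of how a row key maps to one of those regions, the snippet below does a sorted lookup over the same split keys. The helper names region_ranges and region_index are made up for this illustration and are not part of any HBase API.

```python
from bisect import bisect_right

# Split keys copied from the create statement; HBase compares keys byte-wise.
SPLITS = [b"ROW-100", b"ROW-200", b"ROW-300", b"ROW-400"]


def region_ranges(splits):
    """Return (start_key, end_key) pairs; b'' marks an open-ended boundary."""
    bounds = [b""] + list(splits) + [b""]
    return list(zip(bounds[:-1], bounds[1:]))


def region_index(row_key, splits):
    """Index of the region whose [start, end) range contains row_key."""
    # bisect_right sends a key equal to a split key to the region that starts there.
    return bisect_right(splits, row_key)


if __name__ == "__main__":
    ranges = region_ranges(SPLITS)
    for key in (b"ROW-050", b"ROW-100", b"ROW-1000", b"ROW-250", b"ROW-999"):
        start, end = ranges[region_index(key, SPLITS)]
        print(key, "->", (start, end))
```

Because the comparison is lexicographic rather than numeric, ROW-1000 lands in the region that starts at ROW-100, not in the last region. This is worth keeping in mind when choosing split keys for non-padded numeric row keys.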
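Open the HBase Master Web UI to view the corresponding regions.
Open the HBase RegionServer Web UI to view the corresponding regions.
Besides specifying SPLITS in the HBase Shell create statement, you can also use the RegionSplitter utility. Running it without arguments prints its usage: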
[root@cent-1 bin]# ./hbase org.apache.hadoop.hbase.util.RegionSplitter
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
usage: RegionSplitter
SPLITALGORITHM is a java class name of a class
implementing SplitAlgorithm, or one of the special
strings HexStringSplit or UniformSplit, which are
built-in split algorithms. HexStringSplit treats
keys as hexadecimal ASCII, and UniformSplit treats
keys as arbitrary bytes.
 -c           Create a new table with a pre-split number of
              regions
 -D           Override HBase Configuration Settings
 -f           Column Families to create with new table.
              Required with -c
 --firstrow   First Row in Table for Split Algorithm
 -h           Print this usage help
 --lastrow    Last Row in Table for Split Algorithm
 -o           Max outstanding splits that have unfinished
              major compactions
 -r           Perform a rolling split of an existing region
 --risky      Skip verification steps to complete
              quickly. STRONGLY DISCOURAGED for production
              systems.
For example, to create a table named test with 10 regions and three column families, using HexStringSplit as the split algorithm:
[root@cent-1 bin]# ./hbase org.apache.hadoop.hbase.util.RegionSplitter test HexStringSplit -c 10 -f colfam1:colfam2:colfam3
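To get a feel for what -c 10 means when keys are treated as hexadecimal ASCII (the HexStringSplit case described in the usage text), the sketch below computes evenly spaced split keys over a hex key range. It is an approximation of the idea, not the RegionSplitter source: the function name hex_split_points is hypothetical, and the 8-digit range 00000000 to FFFFFFFF is an assumption made for illustration (the --firstrow and --lastrow options exist to set the real boundaries).

```python
def hex_split_points(num_regions, first_row="00000000", last_row="FFFFFFFF"):
    """Evenly spaced split keys over a hexadecimal key range.

    first_row and last_row default to an assumed 8-digit range; n regions
    need n - 1 split keys, and any remainder ends up in the last region.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    first = int(first_row, 16)
    last = int(last_row, 16)
    if last <= first:
        raise ValueError("last_row must sort after first_row")
    step = (last - first + 1) // num_regions
    width = len(last_row)
    return [format(first + i * step, f"0{width}x") for i in range(1, num_regions)]


if __name__ == "__main__":
    for key in hex_split_points(10):
        print(key)
```

A scheme like this only spreads load evenly if the actual row keys are distributed uniformly over the hex range, for example when they are prefixed with a hash. For keys shaped like ROW-100, explicit SPLITS in the create statement are the better fit.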