Phoenix快速入门|Phoenix快速入门 - BigData工具篇
Phoenix是HBase的一个开源SQL外壳。您可以使用标准的JDBC api而不是常规的HBase客户机api来创建表、插入数据和查询HBase数据。
使用Phoenix,可以编写更少的代码,得到更好的处理性能:
- 将SQL查询编译为本机HBase扫描
- 确定您的扫描键的最佳开始和结束位置
- 调度并行执行扫描
- 为数据提供运算能力
- 将where子句中的条件推到服务器端过滤器
- 通过服务器端钩子(称为协处理器)执行聚合查询
- 用于改进非行键列查询性能的二级索引
- 统计数据收集以改进并行化,并指导优化之间的选择
- 跳过扫描过滤器来优化IN,LIKE和OR查询
- 可选的行键加盐以均匀分配写负载
1、环境搭建参考:DataX+Phoenix+Hbase大数据分析平台整合
2、创建自己的SQL脚本并使用命令行工具
$PhoenixPaht/bin/psql.py
执行脚本。首先,创建
us_population.sql
文件,包含表定义[admin@mvxl2429 examples]$ pwd
/mnt/yst/phoenix-5.0.0/examples
[admin@mvxl2429 examples]$ cat us_population.sql
CREATE TABLE IF NOT EXISTS us_population (
state CHAR(2) NOT NULL,
city VARCHAR NOT NULL,
population BIGINT
CONSTRAINT my_pk PRIMARY KEY (state, city));
然后,创建一个
us_population.csv
文件,包含要存入该表的数据[admin@mvxl2429 examples]$ cat us_population.csv
NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332
最后,创建一个
us_population_queries.sql
文件,包含要运行的查询语句[admin@mvxl2429 examples]$ cat us_population_queries.sql
SELECT state as "State",count(city) as "City Count",sum(population) as "Population Sum"
FROM us_population
GROUP BY state
ORDER BY sum(population) DESC;
从命令终端执行以下命令:
[admin@mvxl2429 examples]$ ../bin/sqlline.py localhost:2182 us_population.sql us_population.csv us_population_queries.sql
usage: sqlline.py [-h] [-v VERBOSE] [-c COLOR] [-fc FASTCONNECT]
[zookeepers] [sqlfile]
sqlline.py: error: unrecognized arguments: us_population.csv us_population_queries.sql
[admin@mvxl2429 examples]$ ../bin/psql.py localhost:2182 us_population.sql us_population.csv us_population_queries.sql
19/08/08 14:51:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
no rows upserted
Time: 1.508 sec(s)csv columns from database.
CSV Upsert complete. 10 rows upserted
Time: 0.117 sec(s)StCity CountPopulation Sum
-- ---------------------------------------- ----------------------------------------
NY18143197
CA36012701
TX34486916
IL12842518
PA11463281
AZ11461575
Time: 0.024 sec(s)
3、查看
bin/performance.py
脚本,为您想到的任何模式创建任意多的行,并对其运行定时查询。【Phoenix快速入门|Phoenix快速入门 - BigData工具篇】性能测试的数量跟内存大小密切相关,基于机器配置不高,这里运行1000条数据作为测试。测试海量数据需要加大内存,避免OutOfMemoryError异常。
[admin@mvxl2429 bin]$ ./performance.py localhost:2182 1000
Phoenix Performance Evaluation Script 1.0
-----------------------------------------Creating performance table...
19/08/08 16:04:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
no rows upserted
Time: 0.943 sec(s)Query # 1 - Count - SELECT COUNT(1) FROM PERFORMANCE_1000;
Query # 2 - Group By First PK - SELECT HOST FROM PERFORMANCE_1000 GROUP BY HOST;
Query # 3 - Group By Second PK - SELECT DOMAIN FROM PERFORMANCE_1000 GROUP BY DOMAIN;
Query # 4 - Truncate + Group By - SELECT TRUNC(DATE,'DAY') DAY FROM PERFORMANCE_1000 GROUP BY TRUNC(DATE,'DAY');
Query # 5 - Filter + Count - SELECT COUNT(1) FROM PERFORMANCE_1000 WHERE CORE<10;
Generating and upserting data.../mnt/yst/phoenix-5.0.0/bin/../phoenix-core-5.0.0-HBase-2.0-tests.jar
.19/08/08 16:04:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
csv columns from database.
CSV Upsert complete. 1000 rows upserted
Time: 0.492 sec(s)COUNT(1)
----------------------------------------
1000
Time: 0.034 sec(s)HO
--
CS
EU
NA
Time: 0.018 sec(s)DOMAIN
----------------------------------------
Apple.com
Google.com
Salesforce.com
Time: 0.024 sec(s)DAY
-----------------------
2019-08-08 00:00:00.000
Time: 0.028 sec(s)COUNT(1)
----------------------------------------
19
Time: 0.019 sec(s)[admin@mvxl2429 bin]$ ./sqlline.py localhost:2182
Setting property: [incremental, false]
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost:2182 none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost:2182
19/08/08 15:21:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 5.0)
Driver: PhoenixEmbeddedDriver (version 5.0)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
146/146 (100%) Done
Done
sqlline version 1.2.0
0: jdbc:phoenix:localhost:2182> !tables
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+--+
| TABLE_CAT| TABLE_SCHEM|TABLE_NAME|TABLE_TYPE| REMARKS| TYPE_NAME| SELF_REFERENCING_COL_NAME| REF_GENERATION| INDEX_STATE||
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+--+
|| SYSTEM| CATALOG| SYSTEM TABLE|||||||
|| SYSTEM| FUNCTION| SYSTEM TABLE|||||||
|| SYSTEM| LOG| SYSTEM TABLE|||||||
|| SYSTEM| SEQUENCE| SYSTEM TABLE|||||||
|| SYSTEM| STATS| SYSTEM TABLE|||||||
||| PERFORMANCE_1000| TABLE|||||||
||| PERSON| TABLE|||||||
||| US_POPULATION| TABLE|||||||
+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+--+
0: jdbc:phoenix:localhost:2182> select * from PERFORMANCE_1000 limit 3;
+-------+------------+------------+--------------------------+-------+-------+-----------------+
| HOST|DOMAIN|FEATURE|DATE| CORE|DB| ACTIVE_VISITOR|
+-------+------------+------------+--------------------------+-------+-------+-----------------+
| CS| Apple.com| Dashboard| 2019-08-08 16:04:38.000| 373| 658| 3847|
| CS| Apple.com| Dashboard| 2019-08-08 16:04:47.000| 465| 29| 4551|
| CS| Apple.com| Dashboard| 2019-08-08 16:05:05.000| 298| 1463| 142|
+-------+------------+------------+--------------------------+-------+-------+-----------------+
3 rows selected (0.041 seconds)
推荐阅读
- java|记一次Spring boot 和Vue前后端分离的入门培训
- 如何以最快速度将Vue接入在线客服系统()
- 详解SpringMVC中拦截器的概念及入门案例
- 5 分钟,快速入门 Python JWT 接口认证
- ES6数组去重、快速去重排序
- 渗透测试自学篇|[ 渗透入门篇 ] 渗透行业必备术语大集合(一)
- 微商快速加人方法/怎么样才能让别人主动加我!
- 《Python编程(从入门到实践》第十八章笔记:Django最基本用法笔记)
- 从零开始搭建Vue2.0项目(一)之快速开始
- 书法入门(不用纸墨也能轻松练毛笔字(!))