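Preface
This article is part of the column "Big Data Installation and Deployment" (《大数据安装部署》), which is the author's original work. Please credit the source when quoting, and point out any omissions or errors in the comments. Thank you!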
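Preparation
Download the following two packages:
- Apache Doris 1.1.2 FE package
- Apache Doris 1.1.2 BE package
It also helps to have a MySQL client installed beforehand; see the author's post "CentOS 7 安装 MySQL 5.7" (Installing MySQL 5.7 on CentOS 7).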
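Deploy FE
First, copy the FE package to the designated node.
Configure FE
The FE configuration file is conf/fe.conf.
Note: meta_dir is where metadata is stored. The default is ${DORIS_HOME}/doris-meta; if the directory does not exist, create it manually (it exists by default after the package is unpacked).
JAVA_OPTS in fe.conf sets the maximum Java heap size to 4 GB by default; for production, raise it to 8 GB or more.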
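As a minimal sketch of the heap adjustment described above, the helper below rewrites the -Xmx value (the standard JVM maximum-heap flag) in every line it reads from standard input. The function name set_xmx and the value 8192m are illustrative assumptions, not something shipped with Doris.

```shell
#!/usr/bin/env bash
# set_xmx is a hypothetical helper, not part of Doris.
# It reads text on stdin and replaces the first -Xmx<size> on each line
# with -Xmx<new size>, leaving all other JVM options untouched.
set_xmx() {
  local new_heap="$1"   # e.g. 8192m or 8g
  sed -E "s/-Xmx[0-9]+[kKmMgG]?/-Xmx${new_heap}/"
}
```

For example, `set_xmx 8192m < conf/fe.conf > conf/fe.conf.new` writes an adjusted copy. Every line containing -Xmx is changed, so compare the two files before replacing the original.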
The author's fe.conf is shown below for reference. In most cases the default file works as is, unless a port conflict occurs.
[Image: the author's fe.conf]
Start FE
Run the following command from the FE home directory (the author's FE home directory is /opt/bigdata/doris/apache-doris-fe-1.1.2-bin/fe):
bin/start_fe.sh --daemon
The FE process starts and runs in the background.
By default, logs are stored in ${DORIS_HOME}/log.
If startup fails, check log/fe.log or log/fe.out for error messages.
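Deploy BE
Copy the BE package to every node that will run a BE.
Modify the configuration on every BE
Edit be/conf/be.conf. The main setting is storage_root_path, the data storage directory. The default is be/storage; if the directory does not exist, create it manually (it exists by default after the package is unpacked). To specify multiple directories, separate them with a semicolon (;) and do not add one after the last directory.
The author's be.conf is shown below for reference. If a port conflicts, change it yourself; the following command shows whether a port is already in use:
netstat -anp | grep <port>
[Image: the author's be.conf]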
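The semicolon rule for storage_root_path is easy to get wrong, so a small pre-flight check can help. The sketch below is a hypothetical helper (check_storage_root_path is not a Doris tool) that only inspects the string itself; the directory names in the usage note are made up for illustration.

```shell
#!/usr/bin/env bash
# check_storage_root_path is a hypothetical helper, not part of Doris.
# It checks the separator rule only: entries are split by ';' and
# no ';' may follow the last entry. It does not touch the filesystem.
check_storage_root_path() {
  local value="$1"
  case "$value" in
    "")     echo "empty value"; return 1 ;;
    *";")   echo "trailing semicolon"; return 1 ;;
    *";;"*) echo "empty entry"; return 1 ;;
  esac
  echo "ok"
}
```

Calling `check_storage_root_path '/data1/doris;/data2/doris'` prints ok, while the same value with a trailing semicolon is rejected.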
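Add all BE nodes to the FE
A BE node must be added in the FE before it can join the cluster.
Use mysql-client (download MySQL 5.7) to connect to the FE.
If you are going to use Apache Doris, it is worth installing MySQL, since Doris speaks the MySQL protocol and uses MySQL-style syntax. For installation steps, see the author's post "CentOS 7 安装 MySQL 5.7" (Installing MySQL 5.7 on CentOS 7).
./mysql-client -h fe_host -P query_port -uroot
- fe_host: the IP of the node where the FE runs;
- query_port: the query_port value in fe/conf/fe.conf;
- -u: the root account is used by default and can log in without a password.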
[root@node1 ~]# mysql -h node1 -P 9030 -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 0
Server version: 5.7.37 Doris version 1.1.2-rc05-a8323dae4

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
After logging in, run the following command once for each BE:
ALTER SYSTEM ADD BACKEND "be_host:heartbeat_service_port";
- be_host: the IP of the node where the BE runs;
- heartbeat_service_port: the heartbeat_service_port value in be/conf/be.conf.
mysql> ALTER SYSTEM ADD BACKEND "node1:9050";
Query OK, 0 rows affected (0.16 sec)
mysql> ALTER SYSTEM ADD BACKEND "node2:9050";
Query OK, 0 rows affected (0.01 sec)
mysql> ALTER SYSTEM ADD BACKEND "node3:9050";
Query OK, 0 rows affected (0.00 sec)
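With more than a few BEs, typing each statement by hand gets tedious. A minimal sketch, assuming every BE uses the same heartbeat_service_port: the hypothetical helper gen_add_backend_sql below only prints the statements, so they can be reviewed before being sent to the FE.

```shell
#!/usr/bin/env bash
# gen_add_backend_sql is a hypothetical helper, not part of Doris.
# Usage: gen_add_backend_sql <heartbeat_service_port> <be_host>...
# Prints one ALTER SYSTEM ADD BACKEND statement per host.
gen_add_backend_sql() {
  local port="$1"
  shift
  local host
  for host in "$@"; do
    printf 'ALTER SYSTEM ADD BACKEND "%s:%s";\n' "$host" "$port"
  done
}
```

In this cluster, `gen_add_backend_sql 9050 node1 node2 node3 | mysql -h node1 -P 9030 -uroot` would issue the same three statements shown above.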
Start BE
Run the following command from the BE home directory on every node (the author's BE home directory is /opt/bigdata/doris/apache-doris-be-1.1.2-bin-x86_64/be):
bin/start_be.sh --daemon
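The BE process starts and runs in the background.
By default, logs are stored in the be/log directory.
If startup fails, check be/log/be.log or be/log/be.out for error messages.
At this point the installation itself is complete, but we still need to verify that nothing went wrong and that the Apache Doris cluster works properly.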
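Verification
Check BE status
Connect to the FE with the MySQL client and run the following command to view the BE status:
SHOW PROC '/backends';
If everything is working, the Alive column should be true, as shown below:
mysql> SHOW PROC '/backends';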
+-----------+-----------------+---------------+----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| BackendId | Cluster         | IP            | HostName | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | Tag                      | ErrMsg | Version              | Status                                                                                                                        |
+-----------+-----------------+---------------+----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| 10002     | default_cluster | 192.168.10.11 | node1    | 9050          | 9060   | 8040     | 8060     | 2022-09-26 09:59:55 | 2022-09-26 10:13:58 | true  | false                | false                 | 0         | 0.000            | 12.509 GB     | 16.986 GB     | 26.36 % | 26.36 %        | {"location" : "default"} |        | 1.1.2-rc05-a8323dae4 | {"lastSuccessReportTabletsTime":"2022-09-26 10:13:44","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
| 10003     | default_cluster | 192.168.10.12 | node2    | 9050          | 9060   | 8040     | 8060     | 2022-09-26 10:13:51 | 2022-09-26 10:13:58 | true  | false                | false                 | 0         | 0.000            | 14.046 GB     | 16.986 GB     | 17.31 % | 17.31 %        | {"location" : "default"} |        | 1.1.2-rc05-a8323dae4 | {"lastSuccessReportTabletsTime":"2022-09-26 10:13:55","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
| 10004     | default_cluster | 192.168.10.13 | node3    | 9050          | 9060   | 8040     | 8060     | 2022-09-26 10:13:10 | 2022-09-26 10:13:58 | true  | false                | false                 | 0         | 0.000            | 14.261 GB     | 16.986 GB     | 16.04 % | 16.04 %        | {"location" : "default"} |        | 1.1.2-rc05-a8323dae4 | {"lastSuccessReportTabletsTime":"2022-09-26 10:13:15","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
+-----------+-----------------+---------------+----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.01 sec)
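Reading the Alive column by eye works for three nodes but not for thirty. The sketch below assumes the result is fetched in the mysql client's batch mode (-B), which prints tab-separated rows with a header line; dead_backends is a hypothetical helper that locates the Alive and IP columns by name, so it does not depend on the column order.

```shell
#!/usr/bin/env bash
# dead_backends is a hypothetical helper, not part of Doris.
# It reads tab-separated output with a header row on stdin and prints
# the IP of every backend whose Alive column is not "true".
dead_backends() {
  awk -F'\t' '
    NR == 1 {
      # Find the columns by header name rather than by fixed position.
      for (i = 1; i <= NF; i++) {
        if ($i == "Alive") alive = i
        if ($i == "IP")    ip = i
      }
      next
    }
    alive && ip && $alive != "true" { print $ip }
  '
}
```

A possible invocation is `mysql -h node1 -P 9030 -uroot -B -e "SHOW PROC '/backends'" | dead_backends`; empty output means every listed BE reported Alive as true.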
Check the FE log
In the FE log directory, search for the keyword thrift server started. Output like the following indicates that the FE is working properly.
[root@node1 log]# grep -nR -A 10 "thrift server started" .
./fe.log:36:2022-09-26 09:07:12,649 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [FeServer.start():48] thrift server started.
./fe.log-37-2022-09-26 09:07:12,890 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [RestartApplicationListener.onApplicationStartingEvent():93] Restart disabled due to System property 'spring.devtools.restart.enabled' being set to false
./fe.log-38-2022-09-26 09:07:13,617 INFO (background-preinit|108) [Version.():27] HV000001: Hibernate Validator 5.1.0.Final
./fe.log-39-2022-09-26 09:07:14,188 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [StartupInfoLogger.logStarting():55] Starting PaloFe v1.0-SNAPSHOT using Java 1.8.0_261 on node1 with PID 20366 (/opt/bigdata/doris/apache-doris-fe-1.1.2-bin/fe/lib/doris-fe.jar started by root in /opt/bigdata/doris/apache-doris-fe-1.1.2-bin/fe)
./fe.log-40-2022-09-26 09:07:14,194 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [SpringApplication.logStartupProfileInfo():634] No active profile set, falling back to 1 default profile: "default"
./fe.log-41-2022-09-26 09:07:14,310 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [DeferredLog.logTo():255] For additional web related logging consider setting the 'logging.level.web' property to 'DEBUG'
./fe.log-42-2022-09-26 09:07:15,651 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [RepositoryConfigurationDelegate.registerRepositoriesIn():132] Bootstrapping Spring Data LDAP repositories in DEFAULT mode.
./fe.log-43-2022-09-26 09:07:15,682 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [RepositoryConfigurationDelegate.registerRepositoriesIn():201] Finished Spring Data repository scanning in 19 ms. Found 0 LDAP repository interfaces.
./fe.log-44-2022-09-26 09:07:17,021 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [JettyServletWebServerFactory.getWebServer():166] Server initialized with port: 8030
./fe.log-45-2022-09-26 09:07:17,104 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [ServletWebServerApplicationContext.prepareWebApplicationContext():292] Root WebApplicationContext: initialization completed in 2791 ms
./fe.log-46-2022-09-26 09:07:18,362 INFO (UNKNOWN 192.168.10.11_9010_1664154423601(-1)|1) [WelcomePageHandlerMapping.():53] Adding welcome page: class path resource [static/index.html]
You have new mail in /var/spool/mail/root
RESTful API checks
Check FE
http://192.168.10.11:8030/api/bootstrap
[Image: response]
Check BE
http://192.168.10.11:8040/api/health
[Image: response]
http://192.168.10.12:8040/api/health
[Image: response]
http://192.168.10.13:8040/api/health
[Image: response]
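The BE health URLs above differ only in the host, so the check can be scripted. This is a rough sketch with hypothetical helper names (be_health_url, check_be_health); it assumes curl is installed and only tests whether the request succeeds over HTTP, without interpreting the response body.

```shell
#!/usr/bin/env bash
# Both functions are hypothetical helpers, not part of Doris.

# Build the health-check URL for one BE; the port defaults to 8040,
# the BE HttpPort used in this cluster.
be_health_url() {
  printf 'http://%s:%s/api/health\n' "$1" "${2:-8040}"
}

# Request the health URL of each host and report the outcome.
# Returns non-zero if any request fails or times out after 5 seconds.
check_be_health() {
  local host rc=0
  for host in "$@"; do
    if curl -sf --max-time 5 "$(be_health_url "$host")" >/dev/null; then
      echo "$host: reachable"
    else
      echo "$host: FAILED"
      rc=1
    fi
  done
  return "$rc"
}
```

For this cluster the call would be `check_be_health 192.168.10.11 192.168.10.12 192.168.10.13`.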
MySQL client checks
Check FE with the MySQL client
mysql> show frontends\G;
*************************** 1. row ***************************
Name: 192.168.10.11_9010_1664154423601
IP: 192.168.10.11
EditLogPort: 9010
HttpPort: 8030
QueryPort: 9030
RpcPort: 9020
Role: FOLLOWER
IsMaster: true
ClusterId: 1565676480
Join: true
Alive: true
ReplayedJournalId: 1829
LastHeartbeat: 2022-09-26 10:48:15
IsHelper: true
ErrMsg:
Version: 1.1.2-rc05-a8323dae4
CurrentConnected: Yes
1 row in set (0.16 sec)
Check BE with the MySQL client
mysql> SHOW BACKENDS\G;
*************************** 1. row ***************************
BackendId: 10002
Cluster: default_cluster
IP: 192.168.10.11
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2022-09-26 10:37:47
LastHeartbeat: 2022-09-26 10:49:10
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 0
DataUsedCapacity: 0.000
AvailCapacity: 12.503 GB
TotalCapacity: 16.986 GB
UsedPct: 26.39 %
MaxDiskUsedPct: 26.39 %
Tag: {"location" : "default"}
ErrMsg:
Version: 1.1.2-rc05-a8323dae4
Status: {"lastSuccessReportTabletsTime":"2022-09-26 10:48:23","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
*************************** 2. row ***************************
BackendId: 10003
Cluster: default_cluster
IP: 192.168.10.12
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2022-09-26 10:13:51
LastHeartbeat: 2022-09-26 10:49:10
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 0
DataUsedCapacity: 0.000
AvailCapacity: 14.046 GB
TotalCapacity: 16.986 GB
UsedPct: 17.31 %
MaxDiskUsedPct: 17.31 %
Tag: {"location" : "default"}
ErrMsg:
Version: 1.1.2-rc05-a8323dae4
Status: {"lastSuccessReportTabletsTime":"2022-09-26 10:48:28","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
*************************** 3. row ***************************
BackendId: 10004
Cluster: default_cluster
IP: 192.168.10.13
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2022-09-26 10:13:10
LastHeartbeat: 2022-09-26 10:49:10
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 0
DataUsedCapacity: 0.000
AvailCapacity: 14.261 GB
TotalCapacity: 16.986 GB
UsedPct: 16.04 %
MaxDiskUsedPct: 16.04 %
Tag: {"location" : "default"}
ErrMsg:
Version: 1.1.2-rc05-a8323dae4
Status: {"lastSuccessReportTabletsTime":"2022-09-26 10:48:55","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
3 rows in set (0.01 sec)
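Web UI
Log in to the Web UI (served on the FE HTTP port, 8030 in this setup) as user root with an empty password.
After logging in, the home page is displayed.
In the Playground you can run SQL commands against the Apache Doris cluster.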
First use
Create a database
create database demo;
Create a table
use demo;
CREATE TABLE IF NOT EXISTS demo.example_tbl
(
`user_id` LARGEINT NOT NULL COMMENT "user id",
`date` DATE NOT NULL COMMENT "",
`city` VARCHAR(20) COMMENT "",
`age` SMALLINT COMMENT "",
`sex` TINYINT COMMENT "",
`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "",
`cost` BIGINT SUM DEFAULT "0" COMMENT "",
`max_dwell_time` INT MAX DEFAULT "0" COMMENT "",
`min_dwell_time` INT MIN DEFAULT "99999" COMMENT ""
)
AGGREGATE KEY(`user_id`, `date`, `city`, `age`, `sex`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
Sample data
10000,2017-10-01,beijing,20,0,2017-10-01 06:00:00,20,10,10
10006,2017-10-01,beijing,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,beijing,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,shanghai,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,guangzhou,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,shenzhen,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,shenzhen,35,0,2017-10-03 10:20:22,11,6,6
Save the data above to a file named test.csv.
Load data
Here we load the data saved in the file above into the table we just created, using Stream Load (_stream_load).
[root@node1 ~]# curl --location-trusted -u root: -T test.csv -H "column_separator:," http://node1:8030/api/demo/example_tbl/_stream_load
{
"TxnId": 2,
"Label": "c6780496-ac8f-4784-b059-013dec735511",
"TwoPhaseCommit": "false",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 7,
"NumberLoadedRows": 7,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 411,
"LoadTimeMs": 399,
"BeginTxnTimeMs": 54,
"StreamLoadPutTimeMs": 229,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 18,
"CommitAndPublishTimeMs": 93
}
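When the load runs from a script, the JSON result has to be checked programmatically, because curl exits successfully even if the load itself did not. The sketch below uses a hypothetical helper, stream_load_ok, that looks for "Status": "Success" in the response; it deliberately avoids a JSON parser so it has no dependencies, and it treats any other status as something to investigate.

```shell
#!/usr/bin/env bash
# stream_load_ok is a hypothetical helper, not part of Doris.
# It reads a Stream Load JSON response on stdin and exits 0 only if
# the response contains "Status": "Success".
stream_load_ok() {
  grep -Eq '"Status"[[:space:]]*:[[:space:]]*"Success"'
}
```

A script could pipe the curl command above into it, for example `curl ... | stream_load_ok || echo "load needs attention"`.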
Query data
mysql> select * from example_tbl;
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
| user_id | date       | city      | age  | sex  | last_visit_date     | cost | max_dwell_time | min_dwell_time |
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
| 10000   | 2017-10-01 | beijing   |   20 |    0 | 2017-10-01 06:00:00 |   20 |             10 |             10 |
| 10001   | 2017-10-01 | beijing   |   30 |    1 | 2017-10-01 17:05:45 |    2 |             22 |             22 |
| 10002   | 2017-10-02 | shanghai  |   20 |    1 | 2017-10-02 12:59:12 |  200 |              5 |              5 |
| 10003   | 2017-10-02 | guangzhou |   32 |    0 | 2017-10-02 11:20:00 |   30 |             11 |             11 |
| 10004   | 2017-10-01 | shenzhen  |   35 |    0 | 2017-10-01 10:00:15 |  100 |              3 |              3 |
| 10004   | 2017-10-03 | shenzhen  |   35 |    0 | 2017-10-03 10:20:22 |   11 |              6 |              6 |
| 10006   | 2017-10-01 | beijing   |   20 |    0 | 2017-10-01 07:00:00 |   15 |              2 |              2 |
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
7 rows in set (0.17 sec)

mysql> select * from example_tbl where city='shanghai';
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
| user_id | date       | city     | age  | sex  | last_visit_date     | cost | max_dwell_time | min_dwell_time |
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
| 10002   | 2017-10-02 | shanghai |   20 |    1 | 2017-10-02 12:59:12 |  200 |              5 |              5 |
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
1 row in set (0.14 sec)

mysql> select city, sum(cost) as total_cost from example_tbl group by city;
+-----------+------------+
| city      | total_cost |
+-----------+------------+
| beijing   |         37 |
| shenzhen  |        111 |
| guangzhou |         30 |
| shanghai  |        200 |
+-----------+------------+
4 rows in set (0.16 sec)
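One way to confirm that the load was complete is to recompute the last aggregate directly from test.csv and compare it with what Doris returned. The sketch below is an assumption-labeled cross-check, not a Doris feature: cost_by_city is a hypothetical helper that sums the seventh CSV field (cost) per third field (city), following the column order of example_tbl.

```shell
#!/usr/bin/env bash
# cost_by_city is a hypothetical helper, not part of Doris.
# Usage: cost_by_city <csv file>
# Sums field 7 (cost) grouped by field 3 (city) and prints
# "<city> <total>" lines sorted by city name.
cost_by_city() {
  awk -F',' '
    { total[$3] += $7 }
    END { for (city in total) print city, total[city] }
  ' "$1" | LC_ALL=C sort
}
```

Running `cost_by_city test.csv` on the sample file should list the same four totals as the query above, only in alphabetical order by city.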