Configuring Object Storage on CDH 6.3.0

CM, HDFS service. UFile: the jar still has to be added manually.

S3: the jar ships with CDH.

OSS: CDH 6.3.0 needs no extra download; CDH 5 does.
Add the following to the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml:

fs.oss.endpoint oss-eu-west-1.aliyuncs.com  # public (internet) endpoint of OSS
fs.oss.accessKeyId
fs.oss.accessKeySecret
fs.oss.impl org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
fs.oss.buffer.dir /tmp/oss
fs.oss.connection.secure.enabled false  # whether to enable HTTPS; set as needed, enabling HTTPS costs some performance
fs.oss.connection.maximum 10000
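The safety valve can take these settings as XML `<property>` elements. The following is a minimal sketch that prints the list above in that form so it can be pasted in. The helper names `emit_property` and `emit_oss_properties`, the environment variables `OSS_ACCESS_KEY_ID` and `OSS_ACCESS_KEY_SECRET`, and the placeholder credential values are assumptions made for illustration; the property names and the other values are the ones listed above.

```shell
#!/bin/sh
# Print one Hadoop <property> element.
# Values are written as-is; none of the values used here need XML escaping.
emit_property() {
  # $1 = property name, $2 = property value
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Placeholder credentials (assumption): supply your own through the environment.
OSS_ACCESS_KEY_ID="${OSS_ACCESS_KEY_ID:-YOUR_ACCESS_KEY_ID}"
OSS_ACCESS_KEY_SECRET="${OSS_ACCESS_KEY_SECRET:-YOUR_ACCESS_KEY_SECRET}"

# Emit the seven OSS properties from the list above.
emit_oss_properties() {
  emit_property fs.oss.endpoint oss-eu-west-1.aliyuncs.com
  emit_property fs.oss.accessKeyId "$OSS_ACCESS_KEY_ID"
  emit_property fs.oss.accessKeySecret "$OSS_ACCESS_KEY_SECRET"
  emit_property fs.oss.impl org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
  emit_property fs.oss.buffer.dir /tmp/oss
  emit_property fs.oss.connection.secure.enabled false
  emit_property fs.oss.connection.maximum 10000
}

emit_oss_properties
```

Redirect the output to a file if you prefer to review it before pasting it into CM.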
# Default jar locations:
/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/jars/aliyun-sdk-oss-2.8.3.jar
/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/jars/hadoop-aliyun-3.0.0-cdh6.3.0.jar
To be tested:
cp jindofs-sdk-2.3.0.jar /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/jars/
Reference: https://github.com/aliyun/alibabacloud-jindodata/blob/master/docs/jindofs_sdk_how_to_hadoop_cdh.md
Back up and remove the original aliyun-sdk-oss-2.8.3.jar, restart the HDFS service, and check whether the new jar works.

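# On CDH this has to be configured through CM (needs testing)
Changes to hadoop-env.sh
Open the file: vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Add the following line in the appropriate place:
export HADOOP_OPTIONAL_TOOLS="hadoop-aliyun"
After the change, restart the Hadoop cluster.
Note: on CDH, after restarting the component you also have to deploy the client configuration.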
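On a plain Hadoop install (not one managed by CM), the manual edit could be scripted so that repeated runs do not add the line twice. This is only a sketch: the helper `ensure_line` and the `ENV_FILE` variable are hypothetical names, and the demonstration writes to a scratch file rather than the real hadoop-env.sh.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
ensure_line() {
  # $1 = file, $2 = line to ensure
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Scratch file by default (assumption); set ENV_FILE to
# $HADOOP_HOME/etc/hadoop/hadoop-env.sh once you have checked the result.
ENV_FILE="${ENV_FILE:-$(mktemp)}"

ensure_line "$ENV_FILE" 'export HADOOP_OPTIONAL_TOOLS="hadoop-aliyun"'
# The second call finds the line already there and changes nothing.
ensure_line "$ENV_FILE" 'export HADOOP_OPTIONAL_TOOLS="hadoop-aliyun"'

# Show how many lines mention the variable.
grep -c 'HADOOP_OPTIONAL_TOOLS' "$ENV_FILE"
```

If the file already sets HADOOP_OPTIONAL_TOOLS to a different value, the appended line takes precedence because it comes later, so check the file by hand in that case.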

# Verify
hdfs dfs -ls oss://dbbigdata/

Hive CREATE TABLE statement
create external table if not exists dim_sony_dev_list_oss (
`ymd` string comment 'day',
`uuid` string comment 'Dangbei user device uuid',
`chanel` string comment 'channel',
`brand` string comment 'brand',
`packagename` string comment 'package name',
`unit_type` string comment 'model number',
`model` string comment 'Sony device model',
`vcode` string comment 'version code',
`vname` string comment 'version name',
`sony_user_id` string comment 'Sony user ID',
`user_id` string comment 'Dangbei user ID',
`ip` string comment 'user IP',
`province` string comment 'province',
`city` string comment 'city',
`region` string comment 'district',
`add_time` string comment 'time the device was added (yyyy-MM-dd HH:mm:ss)',
`mac` string comment 'mac',
`cause1` string comment 'device from before Feb 6',
`cause2` string comment 'more than 3 devices on the same IP',
`cause3` string comment 'in-store display: reported com.sony.dtv.multiscreendemo (demo app)',
`cause4` string comment 'reported a channel other than sonyos_sonyos',
`cause5` string comment 'foreign IP',
`cause6` string comment 'abnormal version: lower than 1.0.1',
`cause7` string comment 'more than 3 uuids under one MAC',
`cause8` string comment 'user who logged in before Feb 6',
`cause9` string comment 'device not in the model list',
`ifblacklist` string comment 'whether in the blacklist, 1: yes, 0: no'
)
PARTITIONED BY (pt STRING)
row format delimited
fields terminated by '\001'
lines terminated by '\n'
STORED AS TEXTFILE
location 'oss://dbbigdata/hangwenping/dim_sony_dev_list_oss';

Insert statement

set mapreduce.map.memory.mb=3072;
set mapreduce.reduce.memory.mb=3072;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.exec.max.created.files=10000;

insert overwrite table dim_sony_dev_list_oss partition(pt) select * from dim_sony_dev_list;
The insert fails with:
Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed due to: Class org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem not found
Fix:
Open the core-site.xml configuration of the Hive service:

Create symlinks for Spark, Impala, and Hive
Run the following commands on every Hive, Impala, and Spark node; Ansible can be used for this. Note: ln -s should be given absolute paths, which I have not written out here.
Impala symlinks
Go to Impala's lib directory:
cd /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/impala/lib

ln -s ../../../jars/hadoop-aliyun-3.0.0-cdh6.3.0.jar hadoop-aliyun-3.0.0-cdh6.3.0.jar
ln -s ../../../jars/aliyun-sdk-oss-2.8.3.jar aliyun-sdk-oss-2.8.3.jar
ln -s ../../../jars/jdom-1.1.jar jdom-1.1.jar



Go to Spark's jars directory:
cd /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars

ln -s ../../../jars/hadoop-aliyun-3.0.0-cdh6.3.0.jar hadoop-aliyun-3.0.0-cdh6.3.0.jar
ln -s ../../../jars/aliyun-sdk-oss-2.8.3.jar aliyun-sdk-oss-2.8.3.jar
ln -s ../../../jars/jdom-1.1.jar jdom-1.1.jar



Go to Hive's lib directory and run:
cd /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/hive/lib/

ln -s ../../../jars/hadoop-aliyun-3.0.0-cdh6.3.0.jar hadoop-aliyun-3.0.0-cdh6.3.0.jar
ln -s ../../../jars/aliyun-sdk-oss-2.8.3.jar aliyun-sdk-oss-2.8.3.jar
ln -s ../../../jars/jdom-1.1.jar jdom-1.1.jar
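The three groups of commands differ only in the target directory, so they can be folded into one loop that also uses absolute link targets. This is a sketch under assumptions: the function name `link_oss_jars` and the `PARCEL` variable are made up for illustration, and the parcel path, directories, and jar names are the ones used in this article. Adjust them if your parcel version differs.

```shell
#!/bin/sh
# Link the OSS-related jars into a component directory using absolute targets.
link_oss_jars() {
  # $1 = parcel root (absolute path), $2 = component lib directory
  for jar in hadoop-aliyun-3.0.0-cdh6.3.0.jar aliyun-sdk-oss-2.8.3.jar jdom-1.1.jar; do
    if [ -e "$1/jars/$jar" ]; then
      # -f replaces an existing link, -n avoids following an existing link
      ln -sfn "$1/jars/$jar" "$2/$jar"
    else
      echo "missing: $1/jars/$jar" >&2
    fi
  done
}

PARCEL=/opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813

# Only touch component directories that exist on this node.
for dir in lib/impala/lib lib/spark/jars lib/hive/lib; do
  if [ -d "$PARCEL/$dir" ]; then
    link_oss_jars "$PARCEL" "$PARCEL/$dir"
  fi
done
```

Because the script skips directories that are absent, the same file can be pushed to every node, for example with Ansible, regardless of which roles a node runs.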


[Figure: the resulting symlinks]

Query statement
select * from dim_sony_dev_list_oss;
[Figure: querying the OSS table from Hive]

[Figure: querying the table from Impala]

[Figure: the data in OSS]

