Row-to-column transformation in SparkSQL
Connect to the Spark Thrift server with beeline:
beeline -u "jdbc:hive2://172.16.12.46:10015" -n spark -p spark -d org.apache.hive.jdbc.HiveDriver --color=true --silent=false --fastConnect=false --verbose=true
Run the query:
select x.data_number,
       concat_ws(',', collect_list(cast(x.data_day_max as string))),
       concat_ws(',', collect_list(cast(x.data_day_hour as string)))
from (select data_number, data_day_max, data_day_hour
      from jt.data_day
      where data_type = 'SCT'
        and data_version in ('SCTS301', 'SCTS302')
        and data_number in ('SCTS301-D', 'SCTS302-D')
        and data_day_date = '2013-11-01'
      order by data_day_hour) x
group by data_number;
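The query above can be sketched in plain Python to show what `concat_ws(',', collect_list(...))` computes per group: rows are bucketed by `data_number`, each column's values are cast to strings, and the group's values are joined with commas. The sample rows below are hypothetical, not from the original post.

```python
from collections import defaultdict

# Hypothetical rows: (data_number, data_day_max, data_day_hour),
# already ordered by data_day_hour, as in the subquery.
rows = [
    ("SCTS301-D", 10.5, 0),
    ("SCTS301-D", 12.0, 1),
    ("SCTS302-D", 9.8, 0),
    ("SCTS302-D", 11.3, 1),
]

maxes = defaultdict(list)
hours = defaultdict(list)
for number, day_max, hour in rows:
    maxes[number].append(str(day_max))  # cast(... as string) + collect_list
    hours[number].append(str(hour))

# concat_ws(',', ...) per group key
result = {n: (",".join(maxes[n]), ",".join(hours[n])) for n in maxes}
print(result["SCTS301-D"])  # ('10.5,12.0', '0,1')
```

The `order by data_day_hour` in the subquery is what makes the comma-joined values line up by hour; without it, Spark gives no ordering guarantee inside `collect_list`.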
To deduplicate the collected values, replace collect_list with collect_set.
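The difference between the two collectors can be sketched in plain Python (hypothetical values; note that in Spark, collect_set also makes no ordering guarantee):

```python
values = ["a", "b", "a", "c", "b"]

# collect_list keeps every occurrence
as_list = ",".join(values)
# collect_set-like dedup; dict.fromkeys preserves insertion order here,
# which real collect_set does not promise
as_set = ",".join(dict.fromkeys(values))

print(as_list)  # a,b,a,c,b
print(as_set)   # a,b,c
```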
[Row-to-column transformation in SparkSQL] Reposted from: https://www.cnblogs.com/zeppelin/p/6124751.html