MySQL开窗函数

一,开窗函数: 开窗函数:为将要被操作的行的集合定义一个窗口,它对一组值进行操作,不需要使用GROUP BY子句对数据进行分组,能够在同一行中同时返回基础行的列和聚合列。
开窗函数又称OLAP函数(Online Analytical Processing),MySql在8.0的版本(2016年)增加了对开窗函数的支持。
注:在Oracle中称为分析函数。在MySQL中称为开窗函数。
二,开窗函数分类 1,开窗函数大体分两类:

a,聚合开窗函数:(sum,avg,count,max,min) 聚合函数()+over() ,聚合之后返回多行。 b,专用开窗函数:(Rank,Dense_Rank,Row_Number 等)注意:聚合开窗函数只能使用PARTITION BY子句,ORDER BY不能与聚合开窗函数一同使用。

2,专用开窗函数(按功能划分)
-- 序号函数/排名函数: ROW_NUMBER()排序:1,2,3 RANK()排序:1,1,3 DENSE_RANK()排序:1,1,2-- 分布函数: PERCENT_RANK()(rank-1)/(rows-1) CUME_DIST()<=当前rank值的函数/总函数-- 前后函数: LAG(expr,n)返回当前行的前n行的expr值; LEAD(expr,n)返回当前行的后n行的expr值;-- 头尾函数: FIRST_VALUE(expr)返回第一个expr值; LAST_VALUE(expr)返回最后个expr值;-- 其他函数: NTH_VALU(expr,n)返回第n个expr值; NTILE(n)将有序数据分为n个桶,记录等级数。返回每一行在第几桶。

3,窗口函数和普通聚合函数的区别
普通的聚合函数,聚合的行集是组,开窗函数聚合的行集是窗口。因此,普通的聚合函数每组(Group by)只返回一个值,而开窗函数则可为窗口中的每行都返回一个值。简单理解,就是对查询的结果多出一列,这一列可以是聚合值,也可以是排序值。
窗口函数和普通聚合函数的区别:
1,聚合函数是将多条记录聚合为一条;窗口函数是每条记录都会执行,有几条记录执行完还是几条。 2,聚合函数也可以用于窗口函数。

4,row_number,cume_dist,percent_rank 的区别
  1. row_number():显示行号
  2. cume_dist:计算某个值在一组有序的数据中累计的分布
    计算结果为相对位置/总行数,返回值为(0,1] 注意:对于重复值,计算的时候,取重复值的最后一行的位置

  3. percent_rank:和cume_dist 的不同点在于计算分布结果的方法
    计算方法为(相对位置-1)/(总行数-1)
    注意:对于重复值,计算的时候,取重复值的第一行的位

三,开窗函数的语法结构
# Key word :Partiton by & order by <开窗函数> over ([PARTITION by <列清单>] Order by <排序用列清单>) # 注意:[]中的内容可以不用写。

四,应用举例 1,准备数据:
创建表 student_scores: mysql> create table student_scores( -> id int, -> studentId int, -> language int, -> math int, -> english int, -> classId varchar(30), -> departmentId varchar(30) -> ); 插入数据: mysql> insert into student_scores values ->(1,111,68,69,90,'class1','department1'), ->(2,112,73,80,96,'class1','department1'), ->(3,113,90,74,75,'class1','department1'), ->(4,114,89,94,93,'class1','department1'), ->(5,115,99,93,89,'class1','department1'), ->(6,121,96,74,79,'class2','department1'), ->(7,122,89,86,85,'class2','department1'), ->(8,123,70,78,61,'class2','department1'), ->(9,124,76,70,76,'class2','department1'), ->(10,211,89,93,60,'class1','department2'), ->(11,212,76,83,75,'class1','department2'), ->(12,213,71,94,90,'class1','department2'), ->(13,214,94,94,66,'class1','department2'), ->(14,215,84,82,73,'class1','department2'), ->(15,216,85,74,93,'class1','department2'), ->(16,221,77,99,61,'class2','department2'), ->(17,222,80,78,96,'class2','department2'), ->(18,223,79,74,96,'class2','department2'), ->(19,224,75,80,78,'class2','department2'), ->(20,225,82,85,63,'class2','department2'); -- mysql> select *from student_scores; +------+-----------+----------+------+---------+---------+--------------+ | id| studentId | language | math | english | classId | departmentId | +------+-----------+----------+------+---------+---------+--------------+ |1 |111 |68 |69 |90 | class1| department1| |2 |112 |73 |80 |96 | class1| department1| |3 |113 |90 |74 |75 | class1| department1| |4 |114 |89 |94 |93 | class1| department1| |5 |115 |99 |93 |89 | class1| department1| |6 |121 |96 |74 |79 | class2| department1| |7 |122 |89 |86 |85 | class2| department1| |8 |123 |70 |78 |61 | class2| department1| |9 |124 |76 |70 |76 | class2| department1| |10 |211 |89 |93 |60 | class1| department2| |11 |212 |76 |83 |75 | class1| department2| |12 |213 |71 |94 |90 | class1| department2| |13 |214 |94 |94 |66 | class1| department2| |14 |215 |84 |82 |73 | class1| department2| |15 |216 |85 |74 |93 | class1| department2| |16 |221 |77 |99 |61 | class2| department2| |17 |222 |80 |78 |96 | class2| department2| |18 |223 |79 |74 |96 | class2| department2| |19 |224 |75 |80 |78 | class2| department2| |20 |225 |82 |85 |63 | class2| department2| +------+-----------+----------+------+---------+---------+--------------+

2,聚合类开窗函数
  1. count() 开窗函数
-- 查询: select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 count(math) over() as count1, -- 以按classId分组的所有行作为窗口 count(math) over(partition by classId) as count2, -- 以按classId分组、按math排序的所有行作为窗口 count(math) over(partition by classId order by math) as count3, -- 以按classId分组、按math排序、按 当前行+往前1行+往后2行的行作为窗口 count(math) over(partition by classId order by math rows between 1 preceding and 2 following) as count4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+--------+--------+--------+--------+ | studentId | math | departmentId | classId | count1 | count2 | count3 | count4 | +-----------+------+--------------+---------+--------+--------+--------+--------+ |111 |69 | department1| class1|9 |5 |1 |3 | |113 |74 | department1| class1|9 |5 |2 |4 | |112 |80 | department1| class1|9 |5 |3 |4 | |115 |93 | department1| class1|9 |5 |4 |3 | |114 |94 | department1| class1|9 |5 |5 |2 | |124 |70 | department1| class2|9 |4 |1 |3 | |121 |74 | department1| class2|9 |4 |2 |4 | |123 |78 | department1| class2|9 |4 |3 |3 | |122 |86 | department1| class2|9 |4 |4 |2 | +-----------+------+--------------+---------+--------+--------+--------+--------+/* 结果解释: studentid=115,count1为所有的行数9,count2为分区class1中的行数5,count3为分区class1中math值<=93的行数4,count4为分区class1中math值向前+1行向后+2行(实际只有1行)的总行数3。 */

  1. sum() 开窗函数
# sum开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 sum(math) over() as sum1, -- 以按classId分组的所有行作为窗口 sum(math) over(partition by classId) as sum2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 sum(math) over(partition by classId order by math) as sum3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 sum(math) over(partition by classId order by math rows between 1 preceding and 2 following) as sum4 from student_scores where departmentId='department1'; # 结果 +-----------+------+--------------+---------+------+------+------+------+ | studentId | math | departmentId | classId | sum1 | sum2 | sum3 | sum4 | +-----------+------+--------------+---------+------+------+------+------+ |111 |69 | department1| class1|718 |410 |69 |223 | |113 |74 | department1| class1|718 |410 |143 |316 | |112 |80 | department1| class1|718 |410 |223 |341 | |115 |93 | department1| class1|718 |410 |316 |267 | |114 |94 | department1| class1|718 |410 |410 |187 | |124 |70 | department1| class2|718 |308 |70 |222 | |121 |74 | department1| class2|718 |308 |144 |308 | |123 |78 | department1| class2|718 |308 |222 |238 | |122 |86 | department1| class2|718 |308 |308 |164 | +-----------+------+--------------+---------+------+------+------+------+# 结果解释:同count开窗函数

  1. min() 开窗函数
-- min 开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 min(math) over() as min1, -- 以按classId分组的所有行作为窗口 min(math) over(partition by classId) as min2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 min(math) over(partition by classId order by math) as min3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 min(math) over(partition by classId order by math rows between 1 preceding and 2 following) as min4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+------+------+------+------+ | studentId | math | departmentId | classId | min1 | min2 | min3 | min4 | +-----------+------+--------------+---------+------+------+------+------+ |111 |69 | department1| class1|69 |69 |69 |69 | |113 |74 | department1| class1|69 |69 |69 |69 | |112 |80 | department1| class1|69 |69 |69 |74 | |115 |93 | department1| class1|69 |69 |69 |80 | |114 |94 | department1| class1|69 |69 |69 |93 | |124 |70 | department1| class2|69 |70 |70 |70 | |121 |74 | department1| class2|69 |70 |70 |70 | |123 |78 | department1| class2|69 |70 |70 |74 | |122 |86 | department1| class2|69 |70 |70 |78 | +-----------+------+--------------+---------+------+------+------+------+# 结果解释:同count开窗函数

  1. max() 开窗函数
-- min 开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 min(math) over() as min1, -- 以按classId分组的所有行作为窗口 min(math) over(partition by classId) as min2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 min(math) over(partition by classId order by math) as min3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 min(math) over(partition by classId order by math rows between 1 preceding and 2 following) as min4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+------+------+------+------+ | studentId | math | departmentId | classId | max1 | max2 | max3 | max4 | +-----------+------+--------------+---------+------+------+------+------+ |111 |69 | department1| class1|94 |94 |69 |80 | |113 |74 | department1| class1|94 |94 |74 |93 | |112 |80 | department1| class1|94 |94 |80 |94 | |115 |93 | department1| class1|94 |94 |93 |94 | |114 |94 | department1| class1|94 |94 |94 |94 | |124 |70 | department1| class2|94 |86 |70 |78 | |121 |74 | department1| class2|94 |86 |74 |86 | |123 |78 | department1| class2|94 |86 |78 |86 | |122 |86 | department1| class2|94 |86 |86 |86 | +-----------+------+--------------+---------+------+------+------+------+# 结果解释:同count开窗函数

  1. avg() 开窗函数
-- avg 开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 avg(math) over() as avg1, -- 以按classId分组的所有行作为窗口 avg(math) over(partition by classId) as avg2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 avg(math) over(partition by classId order by math) as avg3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 avg(math) over(partition by classId order by math rows between 1 preceding and 2 following) as avg4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+---------+---------+---------+---------+ | studentId | math | departmentId | classId | avg1| avg2| avg3| avg4| +-----------+------+--------------+---------+---------+---------+---------+---------+ |111 |69 | department1| class1| 79.7778 | 82.0000 | 69.0000 | 74.3333 | |113 |74 | department1| class1| 79.7778 | 82.0000 | 71.5000 | 79.0000 | |112 |80 | department1| class1| 79.7778 | 82.0000 | 74.3333 | 85.2500 | |115 |93 | department1| class1| 79.7778 | 82.0000 | 79.0000 | 89.0000 | |114 |94 | department1| class1| 79.7778 | 82.0000 | 82.0000 | 93.5000 | |124 |70 | department1| class2| 79.7778 | 77.0000 | 70.0000 | 74.0000 | |121 |74 | department1| class2| 79.7778 | 77.0000 | 72.0000 | 77.0000 | |123 |78 | department1| class2| 79.7778 | 77.0000 | 74.0000 | 79.3333 | |122 |86 | department1| class2| 79.7778 | 77.0000 | 77.0000 | 82.0000 | +-----------+------+--------------+---------+---------+---------+---------+---------+# 结果解释:同count开窗函数

3,专用开窗函数
排序开窗函数
  1. rank() 开窗函数
    rank 开窗函数基于 over 子句中的 order by 确定一组值中一个值的排名。如果存在partition by ,则为每个分区组中的每个值排名。排名可能不是连续的。例如,如果两个行的排名为 1,则下一个排名为 3( 即:1、2、2、4)。
-- rank 开窗函数select *, -- 对全部学生按数学分数排序 rank() over(order by math) as rank1, -- 对院系 按数学分数排序 rank() over(partition by departmentId order by math) as rank2, -- 对每个院系每个班级 按数学分数排序 rank() over(partition by departmentId,classId order by math) as rank3 from student_scores; -- 结果 +------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+ | id| studentId | language | math | english | classId | departmentId | rank1 | rank2 | rank3 | +------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+ |1 |111 |68 |69 |90 | class1| department1|1 |1 |1 | |3 |113 |90 |74 |75 | class1| department1|3 |3 |2 | |2 |112 |73 |80 |96 | class1| department1|9 |6 |3 | |5 |115 |99 |93 |89 | class1| department1|15 |8 |4 | |4 |114 |89 |94 |93 | class1| department1|17 |9 |5 | |9 |124 |76 |70 |76 | class2| department1|2 |2 |1 | |6 |121 |96 |74 |79 | class2| department1|3 |3 |2 | |8 |123 |70 |78 |61 | class2| department1|7 |5 |3 | |7 |122 |89 |86 |85 | class2| department1|14 |7 |4 | |15 |216 |85 |74 |93 | class1| department2|3 |1 |1 | |14 |215 |84 |82 |73 | class1| department2|11 |5 |2 | |11 |212 |76 |83 |75 | class1| department2|12 |6 |3 | |10 |211 |89 |93 |60 | class1| department2|15 |8 |4 | |12 |213 |71 |94 |90 | class1| department2|17 |9 |5 | |13 |214 |94 |94 |66 | class1| department2|17 |9 |5 | |18 |223 |79 |74 |96 | class2| department2|3 |1 |1 | |17 |222 |80 |78 |96 | class2| department2|7 |3 |2 | |19 |224 |75 |80 |78 | class2| department2|9 |4 |3 | |20 |225 |82 |85 |63 | class2| department2|13 |7 |4 | |16 |221 |77 |99 |61 | class2| department2|20 |11 |5 | +------+-----------+----------+------+---------+---------+--------------+-------+-------+-------+# 结果解释

  1. dense_rank() 开窗函数
dense_rank与rank有一点不同,当排名一样的时候,接下来的行是连续的。如两个行的排名为 1,则下一个排名为 2( 即:1、2、2、3)。
-- dense_rank 开窗函数select *, -- 对全部学生按数学分数排序 dense_rank() over(order by math) as den_r1, -- 对院系 按数学分数排序 dense_rank() over(partition by departmentId order by math) as den_r2, -- 对每个院系每个班级 按数学分数排序 dense_rank() over(partition by departmentId,classId order by math) as den_r3 from student_scores; -- 结果 +------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+ | id| studentId | language | math | english | classId | departmentId | den_r1 | den_r2 | den_r3 | +------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+ |1 |111 |68 |69 |90 | class1| department1|1 |1 |1 | |3 |113 |90 |74 |75 | class1| department1|3 |3 |2 | |2 |112 |73 |80 |96 | class1| department1|5 |5 |3 | |5 |115 |99 |93 |89 | class1| department1|10 |7 |4 | |4 |114 |89 |94 |93 | class1| department1|11 |8 |5 | |9 |124 |76 |70 |76 | class2| department1|2 |2 |1 | |6 |121 |96 |74 |79 | class2| department1|3 |3 |2 | |8 |123 |70 |78 |61 | class2| department1|4 |4 |3 | |7 |122 |89 |86 |85 | class2| department1|9 |6 |4 | |15 |216 |85 |74 |93 | class1| department2|3 |1 |1 | |14 |215 |84 |82 |73 | class1| department2|6 |4 |2 | |11 |212 |76 |83 |75 | class1| department2|7 |5 |3 | |10 |211 |89 |93 |60 | class1| department2|10 |7 |4 | |12 |213 |71 |94 |90 | class1| department2|11 |8 |5 | |13 |214 |94 |94 |66 | class1| department2|11 |8 |5 | |18 |223 |79 |74 |96 | class2| department2|3 |1 |1 | |17 |222 |80 |78 |96 | class2| department2|4 |2 |2 | |19 |224 |75 |80 |78 | class2| department2|5 |3 |3 | |20 |225 |82 |85 |63 | class2| department2|8 |6 |4 | |16 |221 |77 |99 |61 | class2| department2|12 |9 |5 | +------+-----------+----------+------+---------+---------+--------------+--------+--------+--------+# 结果解释:同count开窗函数

  1. row_number() 开窗函数
从1开始对分区内的数据排序。
-- row_number 开窗函数select studentid,departmentid,classid,math, -- 对分区departmentid,classid内的数据按math排序 row_number() over(partition by departmentid,classid order by math) as row_num from student_scores; -- 结果 +-----------+--------------+---------+------+---------+ | studentid | departmentid | classid | math | row_num | +-----------+--------------+---------+------+---------+ |111 | department1| class1|69 |1 | |113 | department1| class1|74 |2 | |112 | department1| class1|80 |3 | |115 | department1| class1|93 |4 | |114 | department1| class1|94 |5 | |124 | department1| class2|70 |1 | |121 | department1| class2|74 |2 | |123 | department1| class2|78 |3 | |122 | department1| class2|86 |4 | |216 | department2| class1|74 |1 | |215 | department2| class1|82 |2 | |212 | department2| class1|83 |3 | |211 | department2| class1|93 |4 | |213 | department2| class1|94 |5 | |214 | department2| class1|94 |6 | |223 | department2| class2|74 |1 | |222 | department2| class2|78 |2 | |224 | department2| class2|80 |3 | |225 | department2| class2|85 |4 | |221 | department2| class2|99 |5 | +-----------+--------------+---------+------+---------+# 结果解释:注意参数不要与关键字 row_number()重名。

分布函数:
  1. percent_rank() 开窗函数
    计算给定行的百分比排名。可以用来计算超过了百分之多少的人。如360小助手开机速度超过了百分之多少的人。
    (当前行的rank值-1)/(分组内的总行数-1)
-- percent_rank 开窗函数select studentid,departmentid,classid,math, row_number() over(partition by departmentid,classid order by math) as row_num, percent_rank() over(partition by departmentid,classid order by math) as per_rank from student_scores; -- 结果 +-----------+--------------+---------+------+---------+--------------------+ | studentid | departmentid | classid | math | row_num | per_rank| +-----------+--------------+---------+------+---------+--------------------+ |111 | department1| class1|69 |1 |0 | |113 | department1| class1|74 |2 |0.25 | |112 | department1| class1|80 |3 |0.5 | |115 | department1| class1|93 |4 |0.75 | |114 | department1| class1|94 |5 |1 | |124 | department1| class2|70 |1 |0 | |121 | department1| class2|74 |2 | 0.3333333333333333 | |123 | department1| class2|78 |3 | 0.6666666666666666 | |122 | department1| class2|86 |4 |1 | |216 | department2| class1|74 |1 |0 | |215 | department2| class1|82 |2 |0.2 | |212 | department2| class1|83 |3 |0.4 | |211 | department2| class1|93 |4 |0.6 | |213 | department2| class1|94 |5 |0.8 | |214 | department2| class1|94 |6 |0.8 | |223 | department2| class2|74 |1 |0 | |222 | department2| class2|78 |2 |0.25 | |224 | department2| class2|80 |3 |0.5 | |225 | department2| class2|85 |4 |0.75 | |221 | department2| class2|99 |5 |1 | +-----------+--------------+---------+------+---------+--------------------+# 结果解释:

  1. 【MySQL开窗函数】cume_dist() 开窗函数
    计算某个窗口或分区中某个值的累积分布。假定升序排序,则使用以下公式确定累积分布:
    小于等于当前值x的行数 / 窗口或partition分区内的总行数。其中,x 等于 order by 子句中指定的列的当前行中的值。
-- cume_dist 开窗函数select studentId,math,departmentId,classId, -- 统计小于等于当前分数的人数占总人数的比例 cume_dist() over(order by math) as cume_dist1, -- 统计大于等于当前分数的人数占总人数的比例 cume_dist() over(order by math desc) as cume_dist2, -- 统计分区内小于等于当前分数的人数占总人数的比例 cume_dist() over(partition by classId order by math) as cume_dist3 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+--------------------+--------------------+------------+ | studentId | math | departmentId | classId | cume_dist1| cume_dist2| cume_dist3 | +-----------+------+--------------+---------+--------------------+--------------------+------------+ |111 |69 | department1| class1| 0.1111111111111111 |1 |0.2 | |113 |74 | department1| class1| 0.4444444444444444 | 0.7777777777777778 |0.4 | |112 |80 | department1| class1| 0.6666666666666666 | 0.4444444444444444 |0.6 | |115 |93 | department1| class1| 0.8888888888888888 | 0.2222222222222222 |0.8 | |114 |94 | department1| class1|1 | 0.1111111111111111 |1 | |124 |70 | department1| class2| 0.2222222222222222 | 0.8888888888888888 |0.25 | |121 |74 | department1| class2| 0.4444444444444444 | 0.7777777777777778 |0.5 | |123 |78 | department1| class2| 0.5555555555555556 | 0.5555555555555556 |0.75 | |122 |86 | department1| class2| 0.7777777777777778 | 0.3333333333333333 |1 | +-----------+------+--------------+---------+--------------------+--------------------+------------+# 结果解释:

前后函数:
  1. lag(expr,n) 开窗函数
    lag(col,n,default) 用于统计窗口内往上第n个值。
    col:列名
    n:往上第n行
    default:往上第n行为NULL时候,取默认值,不指定则取NULL
-- lag 开窗函数select studentId,math,departmentId,classId, -- 窗口内 往上取第二个 取不到时赋默认值60 lag(math,2,60) over(partition by classId order by math) as lag1, -- 窗口内 往上取第二个 取不到时赋默认值NULL lag(math,2) over(partition by classId order by math) as lag2 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+------+------+ | studentId | math | departmentId | classId | lag1 | lag2 | +-----------+------+--------------+---------+------+------+ |111 |69 | department1| class1|60 | NULL | |113 |74 | department1| class1|60 | NULL | |112 |80 | department1| class1|69 |69 | |115 |93 | department1| class1|74 |74 | |114 |94 | department1| class1|80 |80 | |124 |70 | department1| class2|60 | NULL | |121 |74 | department1| class2|60 | NULL | |123 |78 | department1| class2|70 |70 | |122 |86 | department1| class2|74 |74 | +-----------+------+--------------+---------+------+------+/* 结果解释: 第3行(studentId=112),lag1:窗口内(69 74 80) 当前行80 向上取第二个值为69 倒数第3行(studentId=121),lag2:窗口内(70 74) 当前行74 向上取第二个值为NULL */

  1. lead(expr,n) 开窗函数
    lead(col,n,default) 用于统计窗口内往下第n个值。
    col:列名
    n:往下第n行
    default:往下第n行为NULL时候,取默认值,不指定则取NULL
-- lead开窗函数select studentId,math,departmentId,classId, -- 窗口内 往下取第二个 取不到时赋默认值60 lead(math,2,60) over(partition by classId order by math) as lead1, -- 窗口内 往下取第二个 取不到时赋默认值NULL lead(math,2) over(partition by classId order by math) as lead2 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+-------+-------+ | studentId | math | departmentId | classId | lead1 | lead2 | +-----------+------+--------------+---------+-------+-------+ |111 |69 | department1| class1|80 |80 | |113 |74 | department1| class1|93 |93 | |112 |80 | department1| class1|94 |94 | |115 |93 | department1| class1|60 |NULL | |114 |94 | department1| class1|60 |NULL | |124 |70 | department1| class2|78 |78 | |121 |74 | department1| class2|86 |86 | |123 |78 | department1| class2|60 |NULL | |122 |86 | department1| class2|60 |NULL | +-----------+------+--------------+---------+-------+-------+# 结果解释:第4行lead1 窗口内向下第二个值为空,赋值60

头尾函数:
  1. first_value(expr) 开窗函数
    返回分区中的第一个值。
-- first_value 开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 first_value(math) over() as f_v1, -- 以按classId分组的所有行作为窗口 first_value(math) over(partition by classId) as f_v2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 first_value(math) over(partition by classId order by math) as f_v3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 first_value(math) over(partition by classId order by math rows between 1 preceding and 2 following) as f_v4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+------+------+------+------+ | studentId | math | departmentId | classId | f_v1 | f_v2 | f_v3 | f_v4 | +-----------+------+--------------+---------+------+------+------+------+ |111 |69 | department1| class1|69 |69 |69 |69 | |113 |74 | department1| class1|69 |69 |69 |69 | |112 |80 | department1| class1|69 |69 |69 |74 | |115 |93 | department1| class1|69 |69 |69 |80 | |114 |94 | department1| class1|69 |69 |69 |93 | |124 |70 | department1| class2|69 |74 |70 |70 | |121 |74 | department1| class2|69 |74 |70 |70 | |123 |78 | department1| class2|69 |74 |70 |74 | |122 |86 | department1| class2|69 |74 |70 |78 | +-----------+------+--------------+---------+------+------+------+------+/* 结果解释: 1,studentid=124 first_value1:第一个值是69,first_value2:classId=class1分区 math的第一个值是69。 2,studentId=124行,为什么 f_v2=74? 因为:f_v2,只是分组,并没有排序,而 studentId=121 行,在class2中是第一个出现的,所以f_v2就取了studentId=121行,math的值,正是,74。 student_scores 表单数据,参考 数据准备的数据输出。 3,注意:order by之后, rows between num1 preceding and num2 followin 的使用。 // order by math rows between 1 preceding and 2 followin */

  1. last_value(expr) 开窗函数
    返回分区中的最后一个值。
-- last_value 开窗函数select studentId,math,departmentId,classId, -- 以符合条件的所有行作为窗口 last_value(math) over() as last_v1, -- 以按classId分组的所有行作为窗口 last_value(math) over(partition by classId) as last_v2, -- 以按classId分组、按math排序后、按到当前行(含当前行)的所有行作为窗口 last_value(math) over(partition by classId order by math) as last_v3, -- 以按classId分组、按math排序后、按当前行+往前1行+往后2行的行作为窗口 last_value(math) over(partition by classId order by math rows between 1 preceding and 2 following) as last_v4 from student_scores where departmentId='department1'; -- 结果 +-----------+------+--------------+---------+---------+---------+---------+---------+ | studentId | math | departmentId | classId | last_v1 | last_v2 | last_v3 | last_v4 | +-----------+------+--------------+---------+---------+---------+---------+---------+ |111 |69 | department1| class1|70 |93 |69 |80 | |113 |74 | department1| class1|70 |93 |74 |93 | |112 |80 | department1| class1|70 |93 |80 |94 | |115 |93 | department1| class1|70 |93 |93 |94 | |114 |94 | department1| class1|70 |93 |94 |94 | |124 |70 | department1| class2|70 |70 |70 |78 | |121 |74 | department1| class2|70 |70 |74 |86 | |123 |78 | department1| class2|70 |70 |78 |86 | |122 |86 | department1| class2|70 |70 |86 |86 | +-----------+------+--------------+---------+---------+---------+---------+---------+/* 结果解释: studentId=111 行,last_v1=70 参考student_scores表单原始数据。 studentId=111 行,last_v2=93 参考student_scores表单原始数据。 */

其他函数:
  1. ntile(n) 开窗函数
    将分区中已排序的行划分为大小尽可能相等的指定数量的排名的组,并返回给定行所在的组的排名。
    即:尽可能的平均分租,返回所在的组是第几组。
-- ntile 开窗函数select studentId,math,classId,departmentId, -- 对分区内的数据分成两组 ntile(2) over(partition by departmentid order by math) as n_1, -- 对分区内的数据分成三组 ntile(3) over(partition by departmentid order by math) as n_2 from student_scores; -- 结果 +-----------+------+---------+--------------+------+------+ | studentId | math | classId | departmentId | n_1| n_2| +-----------+------+---------+--------------+------+------+ |111 |69 | class1| department1|1 |1 | |124 |70 | class2| department1|1 |1 | |113 |74 | class1| department1|1 |1 | |121 |74 | class2| department1|1 |2 | |123 |78 | class2| department1|1 |2 | |112 |80 | class1| department1|2 |2 | |122 |86 | class2| department1|2 |3 | |115 |93 | class1| department1|2 |3 | |114 |94 | class1| department1|2 |3 | |216 |74 | class1| department2|1 |1 | |223 |74 | class2| department2|1 |1 | |222 |78 | class2| department2|1 |1 | |224 |80 | class2| department2|1 |1 | |215 |82 | class1| department2|1 |2 | |212 |83 | class1| department2|1 |2 | |225 |85 | class2| department2|2 |2 | |211 |93 | class1| department2|2 |2 | |213 |94 | class1| department2|2 |3 | |214 |94 | class1| department2|2 |3 | |221 |99 | class2| department2|2 |3 | +-----------+------+---------+--------------+------+------+/* 结果解释: n_1:department1 被分为2组,第一组5个,第二组:4个。department2 被分为2组,第一组6个,第二组5个。 n_2:department1 被分为3组,第一组3个,第二组:3个,第三组3个。department2 被分为3组,第一组4个,第二组4个,第三组3个。 */

  1. NTH_VALU(expr,n) 开窗函数
-- NTH_VALU 开窗函数SELECTstudentId ,math , classId ,departmentId, -- 按照departmentid分区,math降序,取第二个的成绩。 nth_value(math, 2)OVER (PARTITION BY departmentid ORDER BY math DESC)AS n_1 , -- 按照departmentid分区,math升序,取第一个的成绩。 nth_value(math, 1)OVER (PARTITION BY departmentid ORDER BY math asc)AS n_2 FROM student_scores; -- 结果 +-----------+------+---------+--------------+------+------+ | studentId | math | classId | departmentId | n_1| n_2| +-----------+------+---------+--------------+------+------+ |111 |69 | class1| department1|93 |69 | |124 |70 | class2| department1|93 |69 | |113 |74 | class1| department1|93 |69 | |121 |74 | class2| department1|93 |69 | |123 |78 | class2| department1|93 |69 | |112 |80 | class1| department1|93 |69 | |122 |86 | class2| department1|93 |69 | |115 |93 | class1| department1|93 |69 | |114 |94 | class1| department1| NULL |69 | |216 |74 | class1| department2|94 |74 | |223 |74 | class2| department2|94 |74 | |222 |78 | class2| department2|94 |74 | |224 |80 | class2| department2|94 |74 | |215 |82 | class1| department2|94 |74 | |212 |83 | class1| department2|94 |74 | |225 |85 | class2| department2|94 |74 | |211 |93 | class1| department2|94 |74 | |213 |94 | class1| department2|94 |74 | |214 |94 | class1| department2|94 |74 | |221 |99 | class2| department2| NULL |74 | +-----------+------+---------+--------------+------+------+# 结果解释:为什么会出现 NULL?

五,常见报错 1,ERROR 1064
MySQL ERROR 1064 (42000): -- 1,使用了系统保留的关键字,如果非要使用,对关键字家反斜杠:`keyw0rd` -- 2,没有权限; -- 3,在给列设置string类型的时候,如果指定列的类型为varchar则需要指定该类型的最大‘容量值‘,若为char类型则不必指定。

2,创建表字符串类型:
-- MySQL 声明字符串: name varchar(10) -- hive 声明字符串: name string

3,插入数据:
MySQL insert into student_scores values -- 注意:insert into table student_scores values//不能带 table。

4,关键字重名
-- 报错: row_number()over(partition by departmentid,classId order by math) as row_number -- 正确: row_number() over(partition by departmentId,classId order by math) as row_num -- 原因:as row_number 与 row_number()属于关键字row_number重名;

5,注释不规范
MySQL语句中,做注释,#、-- 和注释内容之间需要加空格。否则报错。
ERROR 1064 (42000): xxxxx; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘默认60 -- 注释

克己勤勉。

    推荐阅读