SQL|mysql牛客刷题(SQL大厂面试真题)


文章目录

  • 某音短视频
    • SQL1 各个视频的平均完播率
    • SQL2 平均播放进度大于60%的视频类别
    • SQL3 每类视频近一个月的转发量/率
    • SQL4 每个创作者每月的涨粉率及截止当前的总粉丝量
  • 用户增长场景(某度信息流)
    • SQL7 2021年11月每天的人均浏览文章时长
    • SQL8 每篇文章同一时刻最大在看人数
    • SQL9 2021年11月每天新用户的次日留存率
  • 电商场景(某东商城)
    • SQL13 计算商城中2021年每月的GMV
    • SQL14 统计2021年10月每个退货率不大于0.5的商品的各项指标
    • SQL15 某店铺的各商品毛利率及店铺整体毛利率
    • SQL16 零食类商品中复购率top3高的商品
  • 出行场景(某滴打车)
    • SQL19 2021年国庆在北京接单3次及以上的司机统计
    • SQL20 有取消订单记录的司机平均评分
  • 某宝店铺分析(电商模式)
    • SQL25 某宝店铺的SPU数量
    • SQL26 某宝店铺的实际销售额与客单价
  • 牛客直播课分析(在线教育行业)
    • SQL30 牛客直播转换率
  • 某乎问答(内容行业)
    • SQL35 某乎问答11月份日人均回答量

某音短视频 SQL1 各个视频的平均完播率
先理解题目意思:有播放记录的视频的id号和其完播率。条件是:2021年和根据其完播率降序排序。 拆解复杂指标:这里需要拆解的是完播率这个指标

SQL|mysql牛客刷题(SQL大厂面试真题)
文章图片

注意:记得加上题目的限制条件
SELECT a.video_id , round(sum(if(end_time - start_time >= duration, 1, 0))/count(start_time ),3) as avg_comp_play_rate FROM tb_user_video_log a LEFT JOIN tb_video_info b on a.video_id = b. video_id WHERE year(start_time) = 2021 GROUP BY a.video_id ORDER BY avg_comp_play_rate DESC;

SQL2 平均播放进度大于60%的视频类别 平均播放进度大于60%的视频类别
明确题意:
计算各类视频的平均播放进度,将进度大于60%的类别输出
问题分解:
关联用户-视频互动记录和短视频信息表:JOIN tb_video_info USING(video_id);
按视频类别分组:GROUP BY tag
计算每个类别的平均播放进度:
播放进度=播放时长÷视频时长*100%
播放时长=TIMESTAMPDIFF(SECOND, start_time, end_time);特殊情况:播放时长大于视频时长时,播放进度为100%(加个IF判断)
平均进度=AVG(每个进度)
结果保留2位小数:ROUND(x, 2)
百分比格式化:CONCAT(x, ‘%’)
计算公式:
ROUND(AVG(
IF(TIMESTAMPDIFF(SECOND, start_time, end_time) > duration, 1,
TIMESTAMPDIFF(SECOND, start_time, end_time) / duration)
) * 100, 2) as avg_play_progress
筛选播放进度>60%的视频类别:HAVING avg_play_progress > 60
细节问题:
表头重命名:as
按播放进度倒序排序:ORDER BY avg_play_progress DESC;
select tag,concat(avg_play_progress,"%") as avg_play_progress from( select tag, round(avg(if(timestampdiff(second,start_time,end_time)>duration,1, timestampdiff(second,start_time,end_time)/duration))*100, 2) as avg_play_progress from tb_user_video_log join tb_video_info using(video_id) group by tag having avg_play_progress>60 order by avg_play_progress desc )as t_progress

SQL3 每类视频近一个月的转发量/率 首先是题意,视频播放日期start_time和整体的日期最大值之差小于30的,叫做“视频在有用户互动的最近一个月”。 第二,mark一下,上面要写成(SELECT MAX(start_time) FROM tb_user_video_log),而不能直接写MAX(start_time)。where后面不能跟集函数。顺便说一句,having后面倒是常跟集函数。
1.有交互的视频的最近一个月(就是有播放量的视频的最大日期 减去 30天)
2.每类视频在有用户互动的最近一个月(并不是对于各类视频计算最大值,而是整体的日期最大值)
select tag,sum(if_retweet),round(sum(if_retweet)/count(*),3) retweet_rate from tb_user_video_log a left join tb_video_info b on a.video_id =b.video_id where timestampdiff(day,start_time,(select max(start_time) from tb_user_video_log))<30 group by tag order by retweet_rate desc

SQL4 每个创作者每月的涨粉率及截止当前的总粉丝量 第一个需要解决的地方是每个月状态2相当于掉粉,状态1相当于涨粉,这里不能用if去做,因为有多个状态,所以用case when去处理。
第二个需要解决的就是,我们需要计算每个月截止当前的粉丝量,这里自然地想到用窗口函数去处理即可,这里窗口函数需要去partition by author如果这样处理的话,如果有多个作者就会混乱。
SELECT author, date_format(start_time,'%Y-%m') month, round(sum(case when if_follow=1 then 1 when if_follow=2 then -1 else 0 end)/count(author),3) fans_growth_rate, sum(sum(case when if_follow=1 then 1 when if_follow=2 then -1 else 0 end)) over(partition by author order by date_format(start_time,'%Y-%m')) total_fans FROM tb_user_video_log log left join tb_video_info info on log.video_id=info.video_id where year(start_time)=2021 group by author,month order by author,total_fans

请问这个为啥要sum两次呢?只在窗口函数中放sum为啥不行?
因为一个是单月统计 一个是截止到当前的汇总
窗口函数是把单月新增累加一起
用户增长场景(某度信息流) SQL7 2021年11月每天的人均浏览文章时长
select date(in_time) dt, round(sum(timestampdiff(second,in_time,out_time))/count(distinct uid),1) avg_lensec from tb_user_log where date_format(in_time,"%Y-%m") ="2021-11" and artical_id!=0 group by dt order by avg_lensec

SQL8 每篇文章同一时刻最大在看人数 参考题解
select artical_id, max(cnt) as ax from (select artical_id, sum(Mark) over (partition by artical_id order by Time, Mark desc) as cnt from (select artical_id, in_time as Time , 1 Mark from tb_user_log where artical_id != 0 union all select artical_id, out_time as Time , -1 Mark from tb_user_log where artical_id != 0) as a )as b group by artical_id order by ax desc

SQL9 2021年11月每天新用户的次日留存率 流程:
先查询出每个用户第一次登陆时间(最小登陆时间)–每天新用户表
因为涉及到跨天活跃,所以要进行并集操作,将登录时间和登出时间取并集,这里union会去重–用户活跃表
将每天新用户表和用户活跃表左连接,只有是同一用户并且该用户第2天依旧登陆才会保留整个记录,否则右表记录为空
得到每天新用户第二天是否登陆表后,开始计算每天的次日留存率:根据日期分组计算,次日活跃用户个数/当天新用户个数
select t1.dt,round(count(t2.uid)/count(t1.uid),2) uv_left_rate from( select uid, min(date(in_time)) dt from tb_user_log group by uid) t1 -- 每天用户表 left join (select uid,date(in_time) dt from tb_user_log union select uid,date(out_time) dt from tb_user_log) t2 -- 用户活跃表 on t1.uid = t2.uid and t1.dt = date_sub(t2.dt,INTERVAL 1 day) where date_format(t1.dt,"%Y-%m") ='2021-11' group by t1.dt order by t1.dt

电商场景(某东商城) SQL13 计算商城中2021年每月的GMV 明确题意:
统计GMV大于10w的每月GMV
问题分解:
筛选满足条件的记录:
退款的金额不算(付款的记录还在,已算过一次):where status != 2
2021年的记录:and YEAR(event_time) = 2021
按月份分组:group by DATE_FORMAT(event_time, “%Y-%m”)
计算GMV:(sum(total_amount) as GMV
保留整数:ROUND(x, 0)
筛选GMV大于10w的分组:having GMV > 100000
另外有一个问题
group by 执行顺序不是在select 之前吗,为什么能直接用month啊
答:
mysql做了查询增强,别的不可以
select date_format(event_time,"%Y-%m")as month, sum(total_amount) as GMV from tb_order_overall where year(event_time) =2021 and (status!=2) group by month having GMV>100000 order by GMV

SQL14 统计2021年10月每个退货率不大于0.5的商品的各项指标 明确题意:
统计2021年10月每个有展示记录的退货率不大于0.5商品的各项指标:
商品点展比=点击数÷展示数;加购率=加购数÷点击数;成单率=付款数÷加购数;退货率=退款数÷付款数,当分母为0时整体结果记为0。
结果中各项指标保留3位小数,并按商品ID升序排序。
问题分解:
计算各个维度的计数(生成子表t_product_index_cnt)
筛选时间窗内的记录:where DATE_FORMAT(event_time, ‘%Y%m’) = ‘202110’
按商品ID分组:group by product_id
统计各种计数:
展示数(每条记录就是一次展示):COUNT(1) as show_cnt
点击数:sum(if_click) as click_cnt
加购数:sum(if_cart) as cart_cnt
付款数:sum(if_payment) as payment_cnt
退款数:sum(if_refund) as refund_cnt
计算各种指标率(除了展示数其他均可能为0,要特殊处理!):
点击率:click_cnt/show_cnt as ctr
加购率:IF(click_cnt>0, cart_cnt/click_cnt, 0) as cart_rate
付款率:IF(cart_cnt>0, payment_cnt/cart_cnt, 0) as payment_rate
退款率:IF(payment_cnt>0, refund_cnt/payment_cnt, 0) as refund_rate
都保留3位小数:ROUND(x, 3)
筛选退款率不大于0.5的商品,需注意分母可能为0:where payment_cnt = 0 or refund_rate <= 0.5
select product_id,round(click_cnt/show_cnt,3) as ctr, round(if(click_cnt>0,cart_cnt/click_cnt,0),3) as cart_rate, round(if(cart_cnt>0,payment_cnt/cart_cnt,0),3) as payment_rate, round(if(payment_cnt>0,refund_cnt/payment_cnt,0),3) as refund_rate from( select product_id,count(1) as show_cnt, sum(if_click) as click_cnt, sum(if_cart) as cart_cnt, sum(if_payment) as payment_cnt, sum(if_refund) as refund_cnt from tb_user_event where date_format(event_time,'%Y-%m') ='2021-10' group by product_id )as tb_user_e where payment_cnt =0 or refund_cnt/payment_cnt<=0.5 order by product_id

SQL15 某店铺的各商品毛利率及店铺整体毛利率 参考题解链接
with rollup用法
select product_id,concat(profit_rate,"%") as profit_rate from( select ifnull(product_id,'店铺汇总') as product_id, round(100*(1-sum(in_price*cnt)/sum(price*cnt)),1) as profit_rate from( select product_id,price,cnt,in_price from tb_order_detail join tb_order_overall using(order_id) join tb_product_info using(product_id) where shop_id =901 and date(event_time) >="2021-10-01" )as t_product_in_each_order group by product_id with rollup having profit_rate>24.9 or product_id is null order by product_id )as t1

SQL16 零食类商品中复购率top3高的商品 参考链接
date_sub用法
select product_id, round(sum(repurchase)/count(repurchase),3) as repurchase_rate from( select uid,product_id,if(count(event_time)>1,1,0) as repurchase from tb_order_detail join tb_order_overall using(order_id) join tb_product_info using(product_id) where tag = "零食" and event_time>= ( select date_sub(max(event_time),INTERVAL 89 DAY) from tb_order_overall ) group by uid,product_id )as tb group by product_id order by repurchase_rate desc,product_id limit 3

出行场景(某滴打车) SQL19 2021年国庆在北京接单3次及以上的司机统计
select "北京" as city,round(avg(order_num),3) as avg_order_num, round(avg(income),3) as avg_income from( select driver_id,count(order_id) as order_num, sum(fare) as income from tb_get_car_order join tb_get_car_record using(order_id) where city ="北京" and date_format(order_time,"%Y-%m-%d") between '2021-10-01' and '2021-10-07' group by driver_id having count(order_id)>=3 ) as t

SQL20 有取消订单记录的司机平均评分
select ifnull(driver_id,"总体") as driver_id, round(avg(grade),1) as avg_grade from tb_get_car_order where driver_id in ( select driver_id from tb_get_car_order where date_format(order_time,"%Y-%m")='2021-10' and isnull(fare) ) and not isnull(grade) group by driver_id with rollup

某宝店铺分析(电商模式) SQL25 某宝店铺的SPU数量
select style_id, count(distinct item_id) SPU_num from product_tb group by style_id order by SPU_num desc

SQL26 某宝店铺的实际销售额与客单价 【SQL|mysql牛客刷题(SQL大厂面试真题)】销售额:SUM(sales_price)
客单价:SUM(sales_price)/COUNT(DISTINCT user_id)
select sum(sales_price) sales_total, round(sum(sales_price)/count(distinct user_id),2) pre_trans from sales_tb

牛客直播课分析(在线教育行业) SQL30 牛客直播转换率
SELECT course_id, course_name, round(sum(if_sign)/ sum(if_vw)*100,2) as "sign_rate(%)" from course_tb join behavior_tb using(course_id) group by course_id,course_name order by course_id

某乎问答(内容行业) SQL35 某乎问答11月份日人均回答量
select answer_date, round(count(issue_id)/count(distinct author_id),2) per_num from answer_tb where month(answer_date) = 11 group by answer_date

    推荐阅读