09.Hive必刷50题--41-45题

    科技2022-08-04  126

    – 41、查询不同课程成绩相同的学生的学生编号、课程编号、学生成绩:

    SELECT sc1.s_id,sc1.c_id,sc2.s_id,sc2.c_id,sc1.s_score FROM score sc1 JOIN score sc2 ON sc1.s_id!=sc2.s_id AND sc1.c_id!=sc2.c_id WHERE sc1.s_score=sc2.s_score ORDER BY sc1.s_id

    报错:Error: Error while compiling statement: FAILED: SemanticException [Error 10017]: Line 2:33 Both left and right aliases encountered in JOIN 'c_id' (state=42000,code=10017)

    错误原因:两个表join的时候,不支持两个表的字段的非相等操作。

     

    修改:

    SELECT sc1.s_id,sc1.c_id,sc2.s_id,sc2.c_id,sc1.s_score FROM score sc1 JOIN score sc2 WHERE sc1.s_score=sc2.s_score AND sc1.s_id!=sc2.s_id AND sc1.c_id!=sc2.c_id ORDER BY sc1.s_id

    – 42、查询每门课程成绩最好的前三名:

    错误解:

    SELECT tmp.* FROM (SELECT s_id,c_id,s_score,RANK() over(ORDER BY s_score DESC) ranking FROM score sc GROUP BY c_id,s_id,s_score )tmp WHERE tmp.ranking=1 OR tmp.ranking=2 OR tmp.ranking=3

    ranking进行了全排名,和预想结果对每组cid进行排名不符。

    SELECT * FROM score WHERE c_id='01' ORDER BY s_score DESC LIMIT 3 UNION SELECT * FROM score WHERE c_id='02' ORDER BY s_score DESC LIMIT 3 UNION SELECT * FROM score WHERE c_id='03' ORDER BY s_score DESC LIMIT 3

    union和order by一起使用的方法:https://blog.csdn.net/ooooooobh/article/details/81335191

    报错:Incorrect usage of UNION and ORDER BY

    原因:因为union在没有括号的情况下只能使用一个order by

    修改:

    (SELECT * FROM score WHERE c_id='01' ORDER BY s_score DESC LIMIT 3) UNION (SELECT * FROM score WHERE c_id='02' ORDER BY s_score DESC LIMIT 3) UNION (SELECT * FROM score WHERE c_id='03' ORDER BY s_score DESC LIMIT 3)

    mysql中可以执行,但是这种方式的目的是为了让结果集先分别order by,然后再对两个结果集进行union。但是你会发现这种方式虽然不报错了,但是两个order by并没有效果,所以应该改成如下:order by不能直接出现在union的子句中,但是可以出现在子句的子句中。

    但是hive中不可,报错:

    Error: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near '(' 'SELECT' '*' (state=42000,code=40000)

     

    修改:在子查询的基础上加上查询子查询的语句

    方法一:

    SELECT tmp1.* FROM (SELECT * FROM score WHERE c_id='01' ORDER BY s_score DESC LIMIT 3) tmp1 UNION SELECT tmp2.* FROM (SELECT * FROM score WHERE c_id='02' ORDER BY s_score DESC LIMIT 3 ) tmp2 UNION SELECT tmp3.* FROM (SELECT * FROM score WHERE c_id='03' ORDER BY s_score DESC LIMIT 3 ) tmp3

    方法二:

    UNION: 默认按照sid排序

    SELECT tmp1.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='01' ) tmp1 WHERE tmp1.ranking<=3 UNION SELECT tmp2.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='02' ) tmp2 WHERE tmp2.ranking<=3 UNION SELECT tmp3.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='03' ) tmp3 WHERE tmp3.ranking<=3

     

    UNION ALL:不排序

    SELECT tmp1.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='01' ) tmp1 WHERE tmp1.ranking<=3 UNION ALL SELECT tmp2.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='02' ) tmp2 WHERE tmp2.ranking<=3 UNION ALL SELECT tmp3.* FROM (SELECT *,ROW_NUMBER() over(ORDER BY s_score DESC) ranking FROM score WHERE c_id='03' ) tmp3 WHERE tmp3.ranking<=3

    Union和Union All的区别:

    Union:对两个结果集进行并集操作,不包括重复行,同时进行默认规则的排序;

    Union All:对两个结果集进行并集操作,包括重复行,不进行排序;

     

    – 43、统计每门课程的学生选修人数(超过5人的课程才统计): – 要求输出课程号和选修人数,查询结果按人数降序排列,若人数相同,按课程号升序排列

    SELECT c_id,COUNT(s_id) count FROM score GROUP BY c_id HAVING COUNT(s_id)>5 ORDER BY count DESC,c_id

    – 44、检索至少选修两门课程的学生学号:

    SELECT s_id,COUNT(c_id) count FROM score GROUP BY s_id HAVING COUNT(c_id)>=2

    – 45、查询选修了全部课程的学生信息:

    SELECT s.s_id,s_birth,s_name,s_sex FROM score sc JOIN student s ON s.s_id=sc.s_id GROUP BY s.s_id,s_birth,s_name,s_sex HAVING COUNT(sc.c_id)=3

    后续部分参见:

    00.Hive必刷50题--建表,插入数据

    01.Hive必刷50题--1-5题

    02.Hive必刷50题--6-10题

    03.Hive必刷50题--11-15题

    04.Hive必刷50题--16-20题

    05.Hive必刷50题--21-25题

    06.Hive必刷50题--26-30题

    07.Hive必刷50题--31-35题

    08.Hive必刷50题--36-40题

    09.Hive必刷50题--41-45题

    010.Hive必刷50题--46-50题

    Processed: 0.016, SQL: 8