1.join on and 和join on where

关于join on后面使用and时的情况，例如a left join b on a.id=b.id，会将左边表的列全部显示，如果添加a left join b on a.id=b.id and b.bonus<1000，条件and b.bonus<1000会对右表进行过滤，符合and b.bonus<1000条件的会与左表连接，显示具体的值，如果不符合and b.bonus<1000的条件，这行数据仍然显示，但值显示null

on 条件是在生成临时表时使用的条件，它不管 on 中的条件是否为真，都会返回左边表中的记录。而where是对生成后的临时表进行过滤，真正的过滤后就不显示

选出所有 bonus < 1000 的员工的 name 及其 bonus。

Employee 表单

+-------+--------+-----------+--------+
| empId |  name  | supervisor| salary |
+-------+--------+-----------+--------+
|   1   | John   |  3        | 1000   |
|   2   | Dan    |  3        | 2000   |
|   3   | Brad   |  null     | 4000   |
|   4   | Thomas |  3        | 4000   |
+-------+--------+-----------+--------+

empId 是这张表单的主关键字
Bonus 表单

+-------+-------+
| empId | bonus |
+-------+-------+
| 2     | 500   |
| 4     | 2000  |
+-------+-------+

empId 是这张表单的主关键字
输出示例：

+-------+-------+
| name  | bonus |
+-------+-------+
| John  | null  |
| Dan   | 500   |
| Brad  | null  |
+-------+-------+

select e.name, b.bonus from Employee e left join Bonus b on e.empId = b.empId where b.bonus < 1000 or b.bonus is null

2.Cross join

cross join为笛卡尔积，将两张表拼接，记得和含有条件的左右链接区分
要求写一段 SQL 语句，查询出每个学生参加每一门科目测试的次数，结果按 student_id 和 subject_name 排序。

示例

查询结构格式如下所示：

Students table:

+------------+--------------+
| student_id | student_name |
+------------+--------------+
| 1          | Alice        |
| 2          | Bob          |
| 13         | John         |
| 6          | Alex         |
+------------+--------------+

Subjects table:

+--------------+
| subject_name |
+--------------+
| Math         |
| Physics      |
| Programming  |
+--------------+

Examinations table:

+------------+--------------+
| student_id | subject_name |
+------------+--------------+
| 1          | Math         |
| 1          | Physics      |
| 1          | Programming  |
| 2          | Programming  |
| 1          | Physics      |
| 1          | Math         |
| 13         | Math         |
| 13         | Programming  |
| 13         | Physics      |
| 2          | Math         |
| 1          | Math         |
+------------+--------------+

Result table:

+------------+--------------+--------------+----------------+
| student_id | student_name | subject_name | attended_exams |
+------------+--------------+--------------+----------------+
| 1          | Alice        | Math         | 3              |
| 1          | Alice        | Physics      | 2              |
| 1          | Alice        | Programming  | 1              |
| 2          | Bob          | Math         | 1              |
| 2          | Bob          | Physics      | 0              |
| 2          | Bob          | Programming  | 1              |
| 6          | Alex         | Math         | 0              |
| 6          | Alex         | Physics      | 0              |
| 6          | Alex         | Programming  | 0              |
| 13         | John         | Math         | 1              |
| 13         | John         | Physics      | 1              |
| 13         | John         | Programming  | 1              |
+------------+--------------+--------------+----------------+

结果表需包含所有学生和所有科目（即便测试次数为0）：
Alice 参加了 3 次数学测试, 2 次物理测试，以及 1 次编程测试；
Bob 参加了 1 次数学测试, 1 次编程测试，没有参加物理测试；
Alex 啥测试都没参加；
John 参加了数学、物理、编程测试各 1 次。

select s.student_id, s.student_name, su.subject_name, count(e.subject_name) attended_exams from Students s cross join Subjects su left join Examinations e on s.student_id = e.student_id and su.subject_name = e.subject_name group by s.student_id, su.subject_name order by s.student_id, su.subject_name

3.Case when

简单函数

CASE [col_name] WHEN [value1] THEN [result1]…ELSE [default] END

搜索函数

CASE WHEN [expr] THEN [result1]…ELSE [default] END

MYSQL 中使用case when then 判断 null的方法

错误方法:
CASE columnName WHEN null THEN 0 ELSE columnName END
正确方法:
CASE WHEN columnName is null THEN 0 ELSE columnName END

示例

用户的确认率是 ‘confirmed’ 消息的数量除以请求的确认消息的总数。没有请求任何确认消息的用户的确认率为 0 。确认率四舍五入到小数点后两位。

编写一个SQL查询来查找每个用户的确认率。

以任意顺序返回结果表。

查询结果格式如下所示。

输入：
Signups 表:

+---------+---------------------+
| user_id | time_stamp          |
+---------+---------------------+
| 3       | 2020-03-21 10:16:13 |
| 7       | 2020-01-04 13:57:59 |
| 2       | 2020-07-29 23:09:44 |
| 6       | 2020-12-09 10:39:37 |
+---------+---------------------+

Confirmations 表:

+---------+---------------------+-----------+
| user_id | time_stamp          | action    |
+---------+---------------------+-----------+
| 3       | 2021-01-06 03:30:46 | timeout   |
| 3       | 2021-07-14 14:00:00 | timeout   |
| 7       | 2021-06-12 11:57:29 | confirmed |
| 7       | 2021-06-13 12:58:28 | confirmed |
| 7       | 2021-06-14 13:59:27 | confirmed |
| 2       | 2021-01-22 00:00:00 | confirmed |
| 2       | 2021-02-28 23:59:59 | timeout   |
+---------+---------------------+-----------+

输出:

+---------+-------------------+
| user_id | confirmation_rate |
+---------+-------------------+
| 6       | 0.00              |
| 3       | 0.00              |
| 7       | 1.00              |
| 2       | 0.50              |
+---------+-------------------+

解释:
用户 6 没有请求任何确认消息。确认率为 0。
用户 3 进行了 2 次请求，都超时了。确认率为 0。
用户 7 提出了 3 个请求，所有请求都得到了确认。确认率为 1。
用户 2 做了 2 个请求，其中一个被确认，另一个超时。确认率为 1 / 2 = 0.5。

select a.user_id, round(sum(a.confirmation)/count(a.confirmation), 2) confirmation_rate from  
(select s.user_id, case c.action when 'confirmed' then 1 when c.action is null then 0  else 0 end confirmation
from Signups s left join Confirmations c on s.user_id = c.user_id) a group by a.user_id

4.IF和聚合函数

IF函数能通过判断条件来返回特定值（类似case when），它的语法如下：

IF(expr,result_true,result_false)

常用的聚合函数包括count、sum、avg、min、max
使用如下：
1、不加条件查询

sum(money)

2.加条件查询

sum(if(status='0',money,0))         
sum(if(status in ('1','6'),money,0))         
sum(if(status >1,money,0))

avg（条件）相当于sum（if（条件，1，0））/count(全体)

示例

将查询结果的质量 quality 定义为：

各查询结果的评分与其位置之间比率的平均值。

将劣质查询百分比 poor_query_percentage 为：

评分小于 3 的查询结果占全部查询结果的百分比。

编写一组 SQL 来查找每次查询的名称(query_name)、质量(quality) 和劣质查询百分比(poor_query_percentage)。

质量(quality) 和劣质查询百分比(poor_query_percentage) 都应四舍五入到小数点后两位。

查询结果格式如下所示：

Queries table:

+------------+-------------------+----------+--------+
| query_name | result            | position | rating |
+------------+-------------------+----------+--------+
| Dog        | Golden Retriever  | 1        | 5      |
| Dog        | German Shepherd   | 2        | 5      |
| Dog        | Mule              | 200      | 1      |
| Cat        | Shirazi           | 5        | 2      |
| Cat        | Siamese           | 3        | 3      |
| Cat        | Sphynx            | 7        | 4      |
+------------+-------------------+----------+--------+

Result table:

+------------+---------+-----------------------+
| query_name | quality | poor_query_percentage |
+------------+---------+-----------------------+
| Dog        | 2.50    | 33.33                 |
| Cat        | 0.66    | 33.33                 |
+------------+---------+-----------------------+

Dog 查询结果的质量为 ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog 查询结果的劣质查询百分比为 (1 / 3) * 100 = 33.33

Cat 查询结果的质量为 ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat 查询结果的劣质查询百分比为 (1 / 3) * 100 = 33.33

select query_name, round(avg(rating/position), 2) quality, round(avg(rating < 3)*100, 2) poor_query_percentage from Queries group by query_name

5.IFNULL（可用于聚合函数）

常用于将null变成0，写法如下：

IFNULL(SUM(price),0)

5.group by和order by的顺序问题

在一般的sql中，执行顺序是先group by再order by
如果想先order by再group by可以使用下面子查询的写法：

select * from (select * from Delivery having 1 order by order_date asc) d group by customer_id

tips：
1.其中有一个having 1 ，因为子查询中使用排序不会生效，所以需要having 1强制生效。
2.子查询（表连接）需要给一个别名，否则会报错；结果作为输入不需要别名。

6.使用min函数来排序

示例：
配送表: Delivery

+-----------------------------+---------+
| Column Name                 | Type    |
+-----------------------------+---------+
| delivery_id                 | int     |
| customer_id                 | int     |
| order_date                  | date    |
| customer_pref_delivery_date | date    |
+-----------------------------+---------+

delivery_id 是表的主键。
该表保存着顾客的食物配送信息，顾客在某个日期下了订单，并指定了一个期望的配送日期（和下单日期相同或者在那之后）。

如果顾客期望的配送日期和下单日期相同，则该订单称为「即时订单」，否则称为「计划订单」。

「首次订单」是顾客最早创建的订单。我们保证一个顾客只会有一个「首次订单」。

写一条 SQL 查询语句获取即时订单在所有用户的首次订单中的比例。保留两位小数。

查询结果如下所示：

Delivery 表：

+-------------+-------------+------------+-----------------------------+
| delivery_id | customer_id | order_date | customer_pref_delivery_date |
+-------------+-------------+------------+-----------------------------+
| 1           | 1           | 2019-08-01 | 2019-08-02                  |
| 2           | 2           | 2019-08-02 | 2019-08-02                  |
| 3           | 1           | 2019-08-11 | 2019-08-12                  |
| 4           | 3           | 2019-08-24 | 2019-08-24                  |
| 5           | 3           | 2019-08-21 | 2019-08-22                  |
| 6           | 2           | 2019-08-11 | 2019-08-13                  |
| 7           | 4           | 2019-08-09 | 2019-08-09                  |
+-------------+-------------+------------+-----------------------------+

Result 表：

+----------------------+
| immediate_percentage |
+----------------------+
| 50.00                |
+----------------------+

1 号顾客的 1 号订单是首次订单，并且是计划订单。
2 号顾客的 2 号订单是首次订单，并且是即时订单。
3 号顾客的 5 号订单是首次订单，并且是计划订单。
4 号顾客的 7 号订单是首次订单，并且是即时订单。
因此，一半顾客的首次订单是即时的。

select round(avg(if(order_date = customer_pref_delivery_date, 1, 0)) * 100, 2) immediate_percentage from (select delivery_id,customer_id,min(order_date) order_date,customer_pref_delivery_date  from  Delivery group by customer_id) t

7.inner join 和 in

在做子查询后过滤的时候（有时需要把表B结果作为表A条件，其中表A和表B一样），可以使用inner join 和 in，其中in的用法为：

(col1, col2) in select col1, col2......

示例：
编写一个 SQL 查询，报告在首次登录的第二天再次登录的玩家的比率，四舍五入到小数点后两位。换句话说，您需要计算从首次登录日期开始至少连续两天登录的玩家的数量，然后除以玩家总数。

查询结果格式如下所示：

Activity table:

+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-03-02 | 6            |
| 2         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
+-----------+-----------+------------+--------------+

Result table:

+-----------+
| fraction  |
+-----------+
| 0.33      |
+-----------+

只有 ID 为 1 的玩家在第一天登录后才重新登录，所以答案是 1/3 = 0.33

select round(count(distinct player_id)/(select count(distinct player_id) from Activity), 2) fraction from Activity 
where (player_id, event_date) in (select player_id, DATE_ADD(min(event_date), interval 1 day) event_date from Activity group by player_id )

8.having 和 where

where是在group by分组之前进行条件筛选，后面不可以跟聚合函数。

having是在group by分组之后进行条件筛选，后面可以直接跟聚合函数。
示例：
MyNumbers 表：

+-------------+------+
| Column Name | Type |
+-------------+------+
| num         | int  |
+-------------+------+

这张表没有主键。可能包含重复数字。
这张表的每一行都含有一个整数。

单一数字是在 MyNumbers 表中只出现一次的数字。

请你编写一个 SQL 查询来报告最大的单一数字。如果不存在单一数字，查询需报告 null 。

select max(t.num) num from (select * from MyNumbers group by num having count(num) = 1) t

tips：此时子查询的结果将作为一张表数据，所以需要别名t

SQL相关

1.join on and 和join on where

选出所有 bonus < 1000 的员工的 name 及其 bonus。

2.Cross join

示例

3.Case when

示例

4.IF和聚合函数

示例

5.IFNULL（可用于聚合函数）

5.group by和order by的顺序问题

6.使用min函数来排序

7.inner join 和 in

8.having 和 where