MySQL Interview Topics: Indexes - CFANZ Programming Community

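I. Why does MySQL use B+ trees to store indexes?

(I) From the I/O perspective

(II) Divide and conquer

(III) Data block size

(IV) Data format

(V) Data storage

(VI) Data structures

1. Hash table

2. Binary tree, BST, AVL tree, red-black tree

3. B-tree

4. B+ tree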

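II. How are indexes classified?

(I) By data structure

(II) By physical storage

1. Clustered index

2. Non-clustered index

(III) By logical role

(IV) Table lookups, covering indexes, the leftmost-prefix rule, and index condition pushdown

1. Table lookup (going back to the clustered index for the full row)

2. Covering index

(1) SQL performance tuning case

① Before optimization

[Figure: before optimization, the query takes nearly two minutes]

② After optimization

[Figure: after optimization, the query takes under 11 s]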
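The SQL from the tuning case above is not reproduced in the text, so the following is only a generic sketch of how a covering index removes table lookups. The table `orders_demo`, its columns, and the index name are hypothetical, and the `Extra` values in the comments are what one would typically expect on InnoDB rather than guaranteed output.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE orders_demo (
  id          BIGINT PRIMARY KEY AUTO_INCREMENT,
  customer_id INT NOT NULL,
  status      TINYINT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  note        VARCHAR(255),
  INDEX idx_customer_status (customer_id, status)
) ENGINE = InnoDB;

-- `amount` is not stored in idx_customer_status, so each matching index
-- entry requires a lookup in the clustered index (a table lookup).
EXPLAIN SELECT customer_id, status, amount
FROM orders_demo
WHERE customer_id = 42;

-- Every selected column is available in the secondary index (InnoDB
-- secondary indexes also carry the primary key `id`), so the query can be
-- answered from the index alone; Extra is expected to show "Using index".
EXPLAIN SELECT id, customer_id, status
FROM orders_demo
WHERE customer_id = 42;
```

Narrowing the select list, or widening the index to include the extra columns a hot query needs, are the two usual ways to reach the second form.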

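(2) SQL commands

① show processlist: view connection status

[Figure: viewing connection status with the command]

② show profiles: view the execution time of each statement

[Figure: viewing the execution time of each operation]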
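A minimal session using the two commands above might look like the following sketch. It assumes a MySQL version in which statement profiling is still available (it is deprecated in favor of the Performance Schema), and the profiled query is only a placeholder.

```sql
-- List the current connections and the statement each one is running.
SHOW PROCESSLIST;

-- Enable statement profiling for this session only.
SET profiling = 1;

-- Placeholder query; substitute the statement you want to time.
SELECT COUNT(*) FROM information_schema.tables;

-- One row per profiled statement, with its Query_ID and duration.
SHOW PROFILES;

-- Per-stage timing for one statement; replace 1 with a Query_ID
-- taken from the SHOW PROFILES output.
SHOW PROFILE FOR QUERY 1;

SET profiling = 0;
```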

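3. Leftmost-prefix rule

-- Assume a table with the columns id, name, age, gender, and address, where id is the primary key and (name, age) is a composite index (`table` below is a placeholder name). Consider the following queries:
1. select * from table where name = 'zhangsan' and age = 10;
2. select * from table where name = 'zhangsan';
3. select * from table where age = 10;
4. select * from table where age = 10 and name = 'zhangsan';
-- Queries 1, 2, and 4 can use the index; query 3 cannot, because its where clause skips name, the leftmost column of the composite index. Query 4 lists its conditions in a different order from the index definition, but the order of AND-ed conditions in a where clause does not change the result, so the MySQL optimizer matches them to the index columns and the query still uses the index.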
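The four queries above can be checked with EXPLAIN. This is a sketch under assumptions: the table name `person_demo`, the column types, and the index name `idx_name_age` are invented for illustration, and the plans noted in the comments are what is normally expected rather than guaranteed, since the optimizer also considers table size and statistics.

```sql
-- Hypothetical table matching the description above.
CREATE TABLE person_demo (
  id      INT PRIMARY KEY,
  name    VARCHAR(50),
  age     INT,
  gender  CHAR(1),
  address VARCHAR(100),
  INDEX idx_name_age (name, age)
) ENGINE = InnoDB;

-- Query 1: both index columns are constrained; key is expected to be idx_name_age.
EXPLAIN SELECT * FROM person_demo WHERE name = 'zhangsan' AND age = 10;

-- Query 2: only the leftmost column is constrained; the index prefix is still usable.
EXPLAIN SELECT * FROM person_demo WHERE name = 'zhangsan';

-- Query 3: the leftmost column is missing; expect key = NULL and type = ALL.
EXPLAIN SELECT * FROM person_demo WHERE age = 10;

-- Query 4: same conditions as query 1 in a different order; same plan expected.
EXPLAIN SELECT * FROM person_demo WHERE age = 10 AND name = 'zhangsan';
```

Comparing the `key` and `key_len` columns of queries 1 and 2 shows how many columns of the composite index each query actually uses.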

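(1) MySQL architecture

4. Index condition pushdown (ICP)

(1) When ICP applies

① ICP is used for the range, ref, eq_ref, and ref_or_null access methods when full table rows need to be read

② ICP can be used for InnoDB and MyISAM tables, including partitioned InnoDB and MyISAM tables

③ For InnoDB tables, ICP is used only for secondary indexes. The goal of ICP is to reduce the number of full-row reads and thereby reduce I/O

④ ICP is not supported for secondary indexes created on virtual generated columns

⑤ Conditions that refer to subqueries cannot be pushed down

⑥ Conditions that refer to stored functions cannot be pushed down

⑦ Triggered conditions cannot be pushed down

⑧ Conditions cannot be pushed down to derived tables that contain references to system variables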

SET optimizer_switch = 'index_condition_pushdown=off';
SET optimizer_switch = 'index_condition_pushdown=on';
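To see the effect of the switch, one can compare the plan of the same query with ICP on and off. The sketch below uses a hypothetical table `people_demo`; the `Extra` values in the comments are the typical outcome and may vary with version and data.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE people_demo (
  id       INT PRIMARY KEY AUTO_INCREMENT,
  zipcode  VARCHAR(10) NOT NULL,
  lastname VARCHAR(50) NOT NULL,
  address  VARCHAR(100),
  INDEX idx_zip_last (zipcode, lastname)
) ENGINE = InnoDB;

-- zipcode = '95054' selects the index range. The leading-wildcard LIKE on
-- lastname cannot narrow that range, but lastname is part of the index,
-- so the condition can be checked on index entries before rows are read.
SET optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM people_demo
WHERE zipcode = '95054' AND lastname LIKE '%etrunia%';
-- Extra is expected to show "Using index condition".

-- With ICP off, every row in the zipcode range is read in full and the
-- lastname condition is evaluated afterwards by the server layer.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM people_demo
WHERE zipcode = '95054' AND lastname LIKE '%etrunia%';
-- Extra is expected to show "Using where".

-- Restore the default.
SET optimizer_switch = 'index_condition_pushdown=on';
```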

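III. How do you design a good index?

(I) The less space the indexed column takes, the better

(II) Compute the selectivity of a candidate column with count(distinct(column_name))/count(*); the larger the value (the closer to 1), the better suited the column is for an index

(III) Index the order by columns that follow a where clause: if the query has a where condition, create one composite index over the where columns together with the order by columns; if it has none, index only the order by columns

(IV) Index the columns used in join ... on conditions

(V) Too many indexes increase index maintenance cost

(VI) Frequently updated columns are poor index candidates, because every update adds index maintenance cost

(VII) Random, unordered values such as ID card numbers or UUIDs are poor index candidates

(VIII) Prefer to define indexed columns as not null

(IX) An index can be built on a column prefix (compute the selectivity of the first few characters to decide how long the prefix should be)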
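Points (II) and (IX) can be combined in one query: compute the selectivity of the full column and of several prefix lengths, then pick the shortest prefix whose selectivity is close to that of the full column. The table `customer_demo`, its rows, and the chosen prefix length are hypothetical.

```sql
-- Hypothetical table and sample rows, for illustration only.
CREATE TABLE customer_demo (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(100) NOT NULL
) ENGINE = InnoDB;

INSERT INTO customer_demo (email) VALUES
  ('alice@example.com'), ('alina@example.com'),
  ('bob@example.com'),   ('bobby@example.org');

-- Selectivity = distinct values / total rows, for the full column
-- and for prefixes of increasing length.
SELECT COUNT(DISTINCT email) / COUNT(*)          AS full_selectivity,
       COUNT(DISTINCT LEFT(email, 3)) / COUNT(*) AS prefix3_selectivity,
       COUNT(DISTINCT LEFT(email, 5)) / COUNT(*) AS prefix5_selectivity
FROM customer_demo;

-- Once a prefix length is chosen (5 here is an arbitrary example),
-- create the prefix index.
ALTER TABLE customer_demo ADD INDEX idx_email_prefix (email(5));
```

A prefix index saves space, but it cannot act as a covering index for the column and cannot be used for order by or group by on it, so the trade-off is worth checking per query.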

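IV. When does an index fail to be used?

(I) Applying a function (replace/substr/concat/sum/count/avg) or an expression to the indexed column

(II) Data type mismatch: the type of the value in the query condition differs from the type of the indexed column, which forces an implicit conversion

(III) A like pattern that starts with %

(IV) A composite index whose leftmost-prefix rule is not satisfied

(V) Using is not null, which often matches so many rows that the optimizer skips the index

(VI) The MySQL optimizer estimates that a full table scan is cheaper than using the index

(VII) Using or can cause the index to be skipped, typically when one of the or-ed columns has no index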
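Several of these cases can be reproduced side by side with an index-friendly rewrite. The sketch below uses a hypothetical table `account_demo`; the comments describe the usual outcome on a populated table, and the actual plan still depends on data volume and statistics.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE account_demo (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  phone      VARCHAR(20) NOT NULL,
  name       VARCHAR(50) NOT NULL,
  created_at DATETIME NOT NULL,
  INDEX idx_phone (phone),
  INDEX idx_name (name),
  INDEX idx_created (created_at)
) ENGINE = InnoDB;

-- (I) Function on the indexed column: idx_created cannot be used for the lookup.
EXPLAIN SELECT * FROM account_demo WHERE DATE(created_at) = '2024-01-01';
-- Rewrite as a range on the bare column so the index is usable.
EXPLAIN SELECT * FROM account_demo
WHERE created_at >= '2024-01-01' AND created_at < '2024-01-02';

-- (II) Type mismatch: a string column compared with a number makes MySQL
-- convert the column values, so idx_phone cannot be used for the lookup.
EXPLAIN SELECT * FROM account_demo WHERE phone = 13800000000;
-- Compare with a string literal instead.
EXPLAIN SELECT * FROM account_demo WHERE phone = '13800000000';

-- (III) Leading wildcard: no index range can be derived from the pattern.
EXPLAIN SELECT * FROM account_demo WHERE name LIKE '%san';
-- A pattern with a fixed prefix can use idx_name as a range scan.
EXPLAIN SELECT * FROM account_demo WHERE name LIKE 'zhang%';
```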

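V. Why is an auto-increment primary key recommended?

VI. How do you check whether a SQL statement uses an index?

(I) View the execution plan

(II) Analyze the execution plan

1. id

(1) Rows with the same id are executed from top to bottom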

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal;

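[Figure: rows with the same id execute from top to bottom]

(2) If the ids differ, as with subqueries, the id increases with nesting depth; the larger the id, the higher the priority and the earlier that row is executed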

explain select * from emp where ename not in (select ename from emp where ename like '%S%') ;

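[Figure: the larger the id, the higher the execution priority]

(3) If some ids are the same and others differ, rows with the same id form a group that is executed from top to bottom, and groups with larger ids have higher priority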

explain Select dept.*,person_num,avg_sal from dept,(select count(*) person_num,avg(sal) avg_sal,deptno from emp group by deptno) t where dept.deptno = t.deptno ;

[Figure: rows with the same id form a group executed from top to bottom; groups with larger ids run first]

2. select_type

(1) simple: a simple query with no subquery and no union

explain select * from emp;

[Figure: simple denotes a simple query]

(2) primary: if the query contains subqueries, the outermost query is marked primary

explain select * from emp where ename not in (select ename from emp where ename like '%S%') ;

[Figure: in a query that contains a subquery, the outermost query is marked primary]

(3) union: the second and later select statements in a union are marked union

explain select * from emp where deptno = 10 union select * from emp where sal >2000;

[Figure: the union select type]

(4) DEPENDENT UNION: like union, but dependent indicates that the result combined by union or union all depends on the outer query

explain select * from emp e where e.empno  in ( select empno from emp where deptno = 10 union select empno from emp where sal >2000);

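[Figure: the DEPENDENT UNION select type]

(5) union result: the result set of a union, returned as a separate table; this row appears after the union operation, and the result may then be joined with other tables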

explain select * from emp where deptno = 10 union select * from emp where sal >2000;

(6) SUBQUERY: a select that is used as a subquery inside another query

explain select * from emp where sal > (select avg(sal) from emp) ;

[Figure: the SUBQUERY select type]

(7) DEPENDENT SUBQUERY: like SUBQUERY, but the subquery depends on values from the outer query

explain select e.empno,e.ename,e.sal from emp e where e.sal < (select e2.sal from emp e2 where e2.empno = e.mgr);

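[Figure: the DEPENDENT SUBQUERY select type]

(8) DERIVED: a subquery in the from clause, for which MySQL generates a temporary table; this value marks the select that produces the derived table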

explain select t.job from (select min(sal) min_sal,job from emp group by job) t where t.min_sal > 2500;

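[Figure: the DERIVED select type]

(9) DEPENDENT DERIVED: like DERIVED, but the derived table depends on parts of the outer query (no example found)

(10) MATERIALIZED: the subquery result is materialized (stored in a temporary table) for use in a later join. Building the temporary table has a cost, but it lets the subquery run once instead of once per outer row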

EXPLAIN 
select * from emp where deptno in (select deptno from (select min(sal) min_sal,deptno from emp group by deptno) a where min_sal < '2000') ;

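[Figure: the MATERIALIZED select type]

(11) UNCACHEABLE SUBQUERY: a subquery whose result cannot be cached, so it is re-evaluated every time (no example found)

(12) UNCACHEABLE UNION: a union whose result cannot be cached, so it is re-evaluated every time (no example found)

3. table

(1) A concrete table name (or its alias) means the data is read from an actual physical table

(2) A name of the form <derivedN> means the row refers to the derived table produced by the query with id N

(3) For a union result row, the name has the form <unionM,N>, where M and N are the ids of the selects that take part in the union

4. type

(1) ALL: full table scan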

explain select * from emp;

[Figure: the ALL access type]

(2) index: full index scan

explain  select empno from emp;

[Figure: the access type is a full index scan]

(3) range: index range scan

explain select * from emp where empno between 7000 and 7500;

[Figure: the range access type]

(4) index_subquery: similar to unique_subquery, but the subquery uses a non-unique secondary index

SET optimizer_switch='materialization=off';
EXPLAIN select * from emp where ename not in (select dname from dept where dname like '%SALES' );
SET optimizer_switch='materialization=on';

[Figure: a query that uses a secondary index]

(5) unique_subquery: the subquery result is covered by the clustered index or a unique index

SET optimizer_switch='materialization=off';
EXPLAIN select * from emp where deptno not in (select deptno from dept where deptno >20 );
SET optimizer_switch='materialization=on';

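[Figure: the unique_subquery access type]

(6) index_merge: index merge; the where clause has conditions on columns from different indexes, and the results of the separate index scans are merged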

explain select * from emp where ename='SMITH' or deptno = 10;

(7) ref_or_null: like ref, with an additional search for rows where the column is null

explain select * from emp  where ename = 'SMITH' or ename is null;

[Figure: the ref_or_null access type]

(8) ref: rows are found through a non-unique secondary index

alter table emp add index idx_name(ename);
explain select * from emp  where ename = 'SMITH';

[Figure: the ref access type]

(9) eq_ref: rows are found through a unique index

explain select * from emp e,emp e2 where e.empno = e2.empno;

[Figure: the eq_ref access type]

(10) const: the table has at most one matching row

explain select * from emp where empno = 7369;

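[Figure: the const access type]

(11) system: the table has only one row (a system table); this is a special case of const and rarely appears in practice

5. possible_keys: the indexes the query might use

explain select * from emp where ename = 'SMITH' and deptno = 10;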

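[Figure: the execution plan lists the indexes that might be used]

6. key: the index actually used

7. key_len: the number of bytes of the index that the query uses

8. ref: shows which columns or constants are compared against the index named in key; this is useful for non-unique index lookups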

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

[Figure: what the ref column shows]

9. rows: the estimated number of rows to examine

explain select * from emp;

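[Figure: rows shows the estimated number of rows the query has to scan]

10. filtered: the estimated percentage of rows that remain after the table condition is applied

11. extra: additional information about the query

(1) Using filesort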

explain select * from emp order by sal;

[Figure: the query uses a filesort]

(2) Using temporary

explain select ename,count(*) from emp where deptno = 10 group by ename;

[Figure: the grouped aggregate query builds a temporary table]

(3) Using index

explain select deptno,count(*) from emp group by deptno limit 10;

[Figure: the query uses a covering index]

(4) Using where

explain select * from emp where job='SMITH';

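[Figure: Extra shows Using where]

(5) Using join buffer: the join is executed through the join buffer, typically because the join column has no usable index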

explain select * from t3 join t2 on t3.c1 = t2.c1;

(6) Impossible WHERE: the where condition always evaluates to false

explain select * from emp where 1=0;

[Figure: the where condition always evaluates to false]