0
点赞
收藏
分享

微信扫一扫

Found a misplaced row,分区表修复MYSQL

在MySQL中,分区表的数据被分隔到不同的表空间中,但仍被视为单个表。在处理大量数据集的某些情况下,分区可能是一种有用的方法。在分区表中删除庞大的数据集可以加快速度,但是如果处理不当,它可能会把数据放在表的不对应的位置中。

测试环境:mysql 8.0.30

建表

master [localhost:22031] {msandbox} (test) > create table sbtest (
    -> id bigint not null,
    -> age int not null,
    -> from_date date not null,
    -> to_date date not null,
    -> primary key (id,from_date)
    -> )
    -> partition by range columns(from_date)
    -> (PARTITION p01 VALUES LESS THAN ('2014-12-31') ,
    -> PARTITION p02 VALUES LESS THAN ('2015-12-31') ,
    -> PARTITION p03 VALUES LESS THAN ('2016-12-31') ,
    -> PARTITION p04 VALUES LESS THAN ('2017-12-31') ,
    -> PARTITION p05 VALUES LESS THAN ('2018-12-31') ,
    -> PARTITION p06 VALUES LESS THAN ('2019-12-31') ,
    -> PARTITION p07 VALUES LESS THAN ('2020-12-31') ,
    -> PARTITION p08 VALUES LESS THAN ('2021-12-31') ,
    -> PARTITION p09 VALUES LESS THAN ('2022-12-31') ,
    -> PARTITION p10 VALUES LESS THAN (MAXVALUE) );
Query OK, 0 rows affected (0.05 sec)

create table sbtest_ex_p20(
id bigint not null,
age int not null,
from_date date not null,
to_date date not null,
primary key (id,from_date)
);

在大数据量的情况下,为了更快的替换分区数据,我们会执行如下操作

master [localhost:22031] {msandbox} (test) > ALTER TABLE sbtest EXCHANGE PARTITION p08 WITH TABLE sbtest_ex_p20 WITHOUT VALIDATION;
Query OK, 0 rows affected (0.02 sec)

现在查表的总数据。

master [localhost:22031] {msandbox} (test) > select * from sbtest ;
+----+-----+------------+------------+
| id | age | from_date  | to_date    |
+----+-----+------------+------------+
|  1 |  10 | 2014-06-01 | 2014-06-01 |
|  2 |  20 | 2015-06-01 | 2015-06-01 |
|  3 |  30 | 2016-06-01 | 2016-06-01 |
|  4 |  40 | 2017-06-01 | 2014-06-01 |
|  5 |  50 | 2018-06-01 | 2014-06-01 |
|  6 |  40 | 2019-06-01 | 2014-06-01 |
|  7 |  50 | 2020-06-01 | 2014-06-01 |
|  8 |  40 | 2021-06-01 | 2014-06-01 |
|  9 |  50 | 2022-06-01 | 2014-06-01 |
| 10 |  50 | 2023-06-01 | 2014-06-01 |
| 11 |  50 | 2023-05-01 | 2014-06-01 |
+----+-----+------------+------------+
11 rows in set (0.00 sec)

当按条件查询时:

master [localhost:22031] {msandbox} (test) > select * from sbtest where id=5 and from_date='2018-06-01';
Empty set (0.00 sec)

而明明全表查询的时候有ID=5 from_date='2018-06-01'的数据,根据条件查询确没有,再看一下执行计划

master [localhost:22031] {msandbox} (test) > explain select * from sbtest where id=5 and from_date='2018-06-01';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | no matching row in const table |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
1 row in set, 1 warning (0.01 sec)

执行计划显示no matching row in const table。

这种结果是因为在做表交换分区的时候用了WITHOUT VALIDATION,不加这个选项时,分区交换是会报错的。那出现这种分区数据不在对应分区表中如可解决呢。可以有2种方法。

第一种:check table;

master [localhost:22031] {msandbox} (test) > check table sbtest;
+-------------+-------+----------+-------------------------------------------------------------------------------------+
| Table       | Op    | Msg_type | Msg_text                                                                            |
+-------------+-------+----------+-------------------------------------------------------------------------------------+
| test.sbtest | check | error    | Found a misplaced row in part 7 should be in part 0:
 from_date:2014-06-01          |
| test.sbtest | check | error    | Partition p08 returned error                                                        |
| test.sbtest | check | error    | Table upgrade required. Please do "REPAIR TABLE `sbtest`" or dump/reload to fix it! |
+-------------+-------+----------+-------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

按照提示方案解决

master [localhost:22031] {msandbox} (test) > REPAIR TABLE `sbtest`;
+-------------+--------+----------+-------------------------+
| Table       | Op     | Msg_type | Msg_text                |
+-------------+--------+----------+-------------------------+
| test.sbtest | repair | warning  | Moved 10 misplaced rows |
| test.sbtest | repair | status   | OK                      |
+-------------+--------+----------+-------------------------+
2 rows in set (0.00 sec)

再执行查询看看

master [localhost:22031] {msandbox} (test) > select * from sbtest where id=5 and from_date='2018-06-01';
+----+-----+------------+------------+
| id | age | from_date  | to_date    |
+----+-----+------------+------------+
|  5 |  50 | 2018-06-01 | 2014-06-01 |
+----+-----+------------+------------+
1 row in set (0.00 sec)

查询有数据,显示正常。但是如果表的数据量很大,这个repair表操作会锁表,影响线上业务。

第二种:pt工具,最小可能影响业务。

pt-online-schema-change --execute  --alter "ENGINE=InnoDB" h=localhost,D=test,t=sbtest

相当于是重建表,这种操作锁表的时间短,建议在业务低峰执行

举报

相关推荐

0 条评论