Two Interesting Production Incidents - CFANZ Programming Community

Preface

Two interesting production incidents turned up within one short day today, and both are well worth sharing ~

Dropping a column silently drops every index that depends on it

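The first case was reported by a colleague: a developer dropped a column, the composite index that included it was dropped along with it, and the queries avalanched.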

Here is an example. The developers query normally through this composite index:

postgres=# create table t1(id int,info text);
CREATE TABLE
postgres=# insert into t1 select n,md5(random()::text) from generate_series(1,100000) as n;
INSERT 0 100000
postgres=# create index t1_idx on t1(id,info);
CREATE INDEX
postgres=# analyze t1;
ANALYZE
postgres=# explain select id,info from t1 where id = '99';
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Index Only Scan using t1_idx on t1  (cost=0.42..8.44 rows=1 width=37)
   Index Cond: (id = 99)
(2 rows)

Then the business shipped a release with a schema change that dropped the info column:

postgres=# alter table t1 drop column info;
ALTER TABLE
postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          |

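The DBA was stunned 😮: the composite index was dropped too, without any notice. As a result, every SQL statement that should have used the composite index today fell back to a sequential scan, and things avalanched... (Oracle behaves the same way.)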

This may look perfectly normal: once the column is gone, of course the index is gone too, as with a single-column index:

postgres=# create table t2(id int,info text);
CREATE TABLE
postgres=# create index mydx on t2(info);
CREATE INDEX
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          | 
 info   | text    |           |          | 
Indexes:
    "mydx" btree (info)

postgres=# alter table t2 drop column info;
ALTER TABLE
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          |

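For a composite index, though, it gets subtle. I think the more graceful behavior would be to remove just that column from the composite index and rebuild it, rather than dropping the whole composite index in one go (Oracle does the same) without any notice. Otherwise it is easy to fall into this trap, and the blame for it is heavy.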

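So I strongly recommend checking the dependencies before dropping a column: which tables contain the column, and whether any composite index includes it.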

postgres=# create table test(id int,info text);
CREATE TABLE
postgres=# create table test3(info int);
CREATE TABLE
postgres=# create index myidx on test(id);
CREATE INDEX
postgres=# create index mydix2 on test(id,info);
CREATE INDEX
postgres=# select
    t.relname as table_name,
    i.relname as index_name,
    a.attname as column_name
from
    pg_class t,
    pg_class i,
    pg_index ix,
    pg_attribute a
where
    t.oid = ix.indrelid
    and i.oid = ix.indexrelid
    and a.attrelid = t.oid
    and a.attnum = ANY(ix.indkey)
    and t.relkind = 'r'
    and t.relname like 'test%'
order by
    t.relname,
    i.relname;
 table_name | index_name | column_name 
------------+------------+-------------
 test       | mydix2     | id
 test       | mydix2     | info
 test       | myidx      | id
(3 rows)

postgres=# SELECT
    t.table_schema,
    t.table_name
FROM
    information_schema.tables t
    INNER JOIN information_schema.columns c ON c.table_name = t.table_name
        AND c.table_schema = t.table_schema
WHERE
    c.column_name = 'info'
    AND t.table_schema NOT IN ('information_schema', 'pg_catalog')
    AND t.table_type = 'BASE TABLE'
ORDER BY
    t.table_schema;
 table_schema | table_name 
--------------+------------
 public       | test3
 public       | test
(2 rows)

This way, if I want to drop the info column, these queries tell me that the composite index mydix2 will be dropped in cascade as well!
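The two lookups above can be narrowed to a single check for one table and one column. The following is only a minimal sketch, assuming psql can reach the database with its default connection settings. The helper name indexes_using_column is made up for illustration, and the query only sees plain key columns recorded in pg_index.indkey, so expression indexes and partial-index predicates that mention the column are not covered.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): list every index on a
# table whose key columns include the given column.
# Usage: indexes_using_column <table> <column> [extra psql options...]
indexes_using_column() {
    tbl=$1
    col=$2
    shift 2
    # psql substitutes :'tbl' and :'col' as quoted literals because the
    # script is read from stdin (substitution does not happen with -c).
    psql -X -At -v tbl="$tbl" -v col="$col" "$@" <<'SQL'
SELECT i.indexrelid::regclass        AS index_name,
       pg_get_indexdef(i.indexrelid) AS definition
FROM pg_index i
JOIN pg_attribute a
  ON a.attrelid = i.indrelid
 AND a.attnum = ANY (i.indkey)
WHERE i.indrelid = :'tbl'::regclass
  AND a.attname = :'col'
ORDER BY i.indexrelid;
SQL
}
```

Against the sample table above, running indexes_using_column test info should list mydix2 together with its definition, which makes it easy to see what would have to be recreated after the column is dropped.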

Or, to be tougher, use an event trigger to forbid such drops outright. The example below rejects dropping specific columns:

create function fetg() returns event_trigger language plpgsql as $$
begin
  if exists (
    select 1
    from pg_event_trigger_dropped_objects() as t
    where
      t.object_type = 'table column' and
      t.object_identity like any(array['%.t.y', 'public.tt.zz']))
  then
    raise exception 'Columns t.y and public.tt.zz are important!';
  end if;
end $$;

create event trigger etg on sql_drop execute procedure fetg();

postgres=# create table t(x int, y int);
CREATE TABLE
postgres=# alter table t drop column y;
ERROR:  Columns t.y and public.tt.zz are important!
CONTEXT:  PL/pgSQL function fetg() line 10 at RAISE

pg_restore does not restore indexes

This case also turned up today: a colleague came to me asking why pg_restore does not restore indexes. What on earth? Let's reproduce it.

postgres=# create table test_dump(id int,info text);
CREATE TABLE
postgres=# create index on test_dump(id);
CREATE INDEX
postgres=# \d test_dump
             Table "public.test_dump"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          | 
 info   | text    |           |          | 
Indexes:
    "test_dump_id_idx" btree (id)

postgres=# insert into test_dump values(1,'test');
INSERT 0 1
postgres=# insert into test_dump values(2,'handsome');
INSERT 0 1

My colleague used the directory format (-Fd), because the table to export is fairly large and only -Fd supports parallel dumps:

The “directory” format is the only format that supports parallel dumps.

Let's do the same, adding -v to print detailed information about the dump:

[postgres@xiongcc ~]$ pg_dump -Fd -t test_dump -f dump_dir -v 
pg_dump: last built-in OID is 16383
pg_dump: reading extensions
...
...
pg_dump: reading indexes
pg_dump: flagging indexes in partitioned tables
...
pg_dump: dumping contents of table "public.test_dump"

As you can see, pg_dump does pick up the indexes. You can also look at the TOC file (toc.dat) in the dump directory.

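Some folks may not know about pg_restore --list. It shows what an -Fd dump contains, and it also enables many flexible operations. A typical scenario is exporting functions, for example: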

pg_dump -U username --format=c --schema-only -f dump_test your_database
pg_restore --list dump_test | grep FUNCTION > function_list    # extract every function entry from the list
pg_restore -U username -d your_other_database -L function_list dump_test

Of course, you can also export functions the old-fashioned way, by querying all of them with SQL first:

[postgres@xiongcc ~]$ psql -At postgres > func_file.sql <<"__END__"
SELECT pg_get_functiondef(f.oid)
FROM pg_catalog.pg_proc f
INNER JOIN pg_catalog.pg_namespace n ON (f.pronamespace = n.oid)
WHERE n.nspname = 'public';
__END__

[postgres@xiongcc ~]$ cat func_file.sql 
create or replace function loop_test_01() returns void
as $$
declare
 n numeric := 0;
begin
 loop
     n := n + 1;
     raise notice 'n 的当前值为: %',n;
     exit when n >= 10;
     -- return;
   end loop;
end;
$$ language plpgsql;

I digress. Look at the dump contents: the index is indeed in there.

[postgres@xiongcc ~]$ pg_restore --list dump_dir/
;
; Archive created at 2022-06-23 16:27:19 CST
;     dbname: postgres
;     TOC Entries: 8
;     Compression: -1
;     Dump Version: 1.14-0
;     Format: DIRECTORY
;     Integer: 4 bytes
;     Offset: 8 bytes
;     Dumped from database version: 15beta1
;     Dumped by pg_dump version: 15beta1
;
;
; Selected TOC Entries:
;
218; 1259 16455 TABLE public test_dump postgres
3465; 0 16455 TABLE DATA public test_dump postgres
3321; 1259 16460 INDEX public test_dump_id_idx postgres

Now let's try a restore:

[postgres@xiongcc ~]$ pg_restore -t test_dump -d postgres dump_dir/
[postgres@xiongcc ~]$ psql
psql (15beta1)
Type "help" for help.

postgres=# \d test_dump 
             Table "public.test_dump"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          | 
 info   | text    |           |          |

The index really is missing! So where did my index go?

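After some digging, it turned out that with pg_restore, -t selects only the table itself. Indexes are separate objects, so you also have to name them with -I to get them restored. This is how pg_restore's -t behaves in general, not something specific to -Fd: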

If you use -t, you limit the operation to the table you named. Indices are different objects and need to be selected with the -I (or --index=) selector.

[postgres@xiongcc ~]$ psql -c "drop table test_dump"
DROP TABLE
[postgres@xiongcc ~]$ pg_restore -t test_dump -d postgres -I test_dump_id_idx dump_dir/
[postgres@xiongcc ~]$ psql
psql (15beta1)
Type "help" for help.

postgres=# \d test_dump 
             Table "public.test_dump"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           |          | 
 info   | text    |           |          | 
Indexes:
    "test_dump_id_idx" btree (id)

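It feels a bit odd. The plain format (-Fp) is still handy: with a pipe nothing even has to land on disk (pg_dump -Fp | psql), and you can read and edit the dump SQL yourself. The plain-text format does not support parallelism, though, which is perhaps its drawback, whereas custom and directory support reordering (as above) and compress by default. So you really need to know the differences among these three formats and the pitfalls of each.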
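As an alternative to naming every index with -I, the TOC listing can be filtered and handed back with -L, the same way the function export earlier does. The sketch below is hedged: the helper name toc_for_table is made up, and because an INDEX line in the listing does not name its table, it matches indexes by the default naming convention (index name begins with the table name plus an underscore). That is an assumption, and it will miss indexes with custom names such as mydx.

```shell
#!/bin/sh
# Hypothetical helper: keep only the TOC lines for one table, its data,
# and the indexes whose names start with "<table>_".
# Reads `pg_restore --list` output on stdin, writes the filtered list to stdout.
toc_for_table() {
    awk -v t="$1" '
        /^;/ { next }                                              # skip header comment lines
        $4 == "TABLE" && $5 == "DATA" && $7 == t { print; next }   # TABLE DATA <schema> <table>
        $4 == "TABLE" && $5 != "DATA" && $6 == t { print; next }   # TABLE <schema> <table>
        $4 == "INDEX" && index($6, t "_") == 1   { print }         # INDEX <schema> <table>_...
    '
}

# Intended usage (needs a real dump directory, so it is left commented out):
#   pg_restore --list dump_dir/ | toc_for_table test_dump > restore_list
#   pg_restore -d postgres -L restore_list dump_dir/
```

It is worth reviewing restore_list by hand before restoring, since the prefix match is only a heuristic.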

Summary

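In terms of impact, the first case is the nastier one: a moment's carelessness can cause a production incident. So I have decided to add a rule to our production guidelines: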

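If a release needs to drop columns from a table, first run the SQL to check whether any composite index includes those columns, because dropping a column drops in cascade every index that includes it, without any notice! Other queries that depend on the composite index may then fall back to sequential scans and cause an avalanche.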

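Also, in the apprentice group today I saw someone eagerly asking for articles on PostgreSQL high availability, such as HA comparisons, HA selection, and the pros and cons of each HA solution. I will publish the ones I wrote a while ago, so stay tuned!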

References

https://dba.stackexchange.com/questions/251838/prevent-drop-column-with-index-on-it-in-postgresql

https://stackoverflow.com/questions/5347050/postgresql-sql-script-to-get-a-list-of-all-tables-that-have-a-particular-column

https://www.codegrepper.com/code-examples/sql/postgresql+search+all+tables+for+column+name

https://serverfault.com/questions/806271/pg-restore-on-a-single-table-not-restoring-indexes

https://stackoverflow.com/questions/13758003/how-to-take-backup-of-functions-only-in-postgres