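Preface
You are probably already familiar with xmin and xmax in PostgreSQL: they are used to decide whether row versions are visible across different transactions, so I will not repeat that here. Besides these two system columns, every table has several others (you can see them in pg_attribute): cmin, cmax, ctid and tableoid.
ctid is the physical location of a row version within its table, expressed as (block number, item offset within that block). tableoid identifies the table a row belongs to, which is handy for finding the actual source table of a row when querying a partitioned table or a UNION.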
CREATE TABLE ptab01 (
id int not null,
tm timestamptz not null
) PARTITION BY RANGE (tm);
create table ptab01_202001 partition of ptab01 for values from ('2020-01-01') to ('2020-02-01');
create table ptab01_202002 partition of ptab01 for values from ('2020-02-01') to ('2020-03-01');
create table ptab01_202003 partition of ptab01 for values from ('2020-03-01') to ('2020-04-01');
create table ptab01_202004 partition of ptab01 for values from ('2020-04-01') to ('2020-05-01');
create table ptab01_202005 partition of ptab01 for values from ('2020-05-01') to ('2020-06-01');
insert into ptab01 select extract(epoch from seq), seq from generate_series('2020-01-01'::timestamptz, '2020-05-31 23:59:59'::timestamptz, interval '10 seconds') as seq;
postgres=# select tableoid::regclass as relname,* from ptab01 where tm='2020-01-07'::timestamptz;
-[ RECORD 1 ]-------------------
relname | ptab01_202001 --- tableoid identifies the child partition
id | 1578326400
tm | 2020-01-07 00:00:00+08
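So what are cmin and cmax for? The names give it away: like xmin and xmax, they are used to decide row visibility, but within a single transaction.
What is cmin
In the source code, cmin and cmax are defined as follows. A single 4-byte CommandId field represents both: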
typedef struct HeapTupleFields
{
    TransactionId t_xmin;       /* inserting xact ID */
    TransactionId t_xmax;       /* deleting or locking xact ID */
    union
    {
        CommandId     t_cid;    /* inserting or deleting command ID, or both */
        TransactionId t_xvac;   /* old-style VACUUM FULL xact ID */
    } t_field3;
} HeapTupleFields;
The comment at the top of the header file gives a brief explanation of this part:
* We store five "virtual" fields Xmin, Cmin, Xmax, Cmax, and Xvac in three
* physical fields. Xmin and Xmax are always really stored, but Cmin, Cmax
* and Xvac share a field. This works because we know that Cmin and Cmax
* are only interesting for the lifetime of the inserting and deleting
* transaction respectively. If a tuple is inserted and deleted in the same
* transaction, we store a "combo" command id that can be mapped to the real
* cmin and cmax, but only by use of local state within the originating
* backend. See combocid.c for more details. Meanwhile, Xvac is only set by
* old-style VACUUM FULL, which does not have any command sub-structure and so
* does not need either Cmin or Cmax. (This requires that old-style VACUUM
* FULL never try to move a tuple whose Cmin or Cmax is still interesting,
* ie, an insert-in-progress or delete-in-progress tuple.)
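PostgreSQL stores xmin, xmax, cmin, cmax and xvac in three physical fields. xmin and xmax are needed at all times because they drive visibility checks across transactions, so each always gets its own 4-byte field.
cmin, cmax and xvac, by contrast, share a union, a close relative of the more familiar struct.
Each member of a struct occupies its own memory, and members do not affect one another. All members of a union occupy the same memory, so modifying one member affects all the others. A struct is at least as large as the sum of its members (there may be padding between them), while a union is as large as its largest member. A union overlays its members, so it holds the value of only one member at a time; assigning to another member overwrites the previous value. The shared field t_field3 therefore means different things at different points in a tuple's life cycle: cmin, cmax or xvac.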
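To make the overlay concrete, here is a minimal sketch that reproduces the layout of HeapTupleFields outside PostgreSQL. The two typedefs are stand-ins for the real ones (both are 32-bit unsigned integers), and SeparateFields and overlay_demo are hypothetical names introduced only for comparison:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs; both are 32-bit unsigned integers. */
typedef uint32_t TransactionId;
typedef uint32_t CommandId;

typedef struct HeapTupleFields
{
    TransactionId t_xmin;       /* inserting xact ID */
    TransactionId t_xmax;       /* deleting or locking xact ID */
    union
    {
        CommandId     t_cid;    /* inserting or deleting command ID, or both */
        TransactionId t_xvac;   /* old-style VACUUM FULL xact ID */
    } t_field3;
} HeapTupleFields;

/* Hypothetical layout with one slot per virtual field, for comparison only. */
typedef struct SeparateFields
{
    TransactionId t_xmin;
    TransactionId t_xmax;
    CommandId     t_cmin;
    CommandId     t_cmax;
    TransactionId t_xvac;
} SeparateFields;

/* Three physical 4-byte fields versus five. */
static_assert(sizeof(HeapTupleFields) == 12, "three 4-byte fields");
static_assert(sizeof(SeparateFields) == 20, "five 4-byte fields");

/* Write through one union member and read back through the other. */
static TransactionId
overlay_demo(CommandId cid)
{
    HeapTupleFields f = {0};

    f.t_field3.t_cid = cid;
    return f.t_field3.t_xvac;   /* same bytes, so the same value comes back */
}
```

Calling overlay_demo(7) returns 7: the value written as t_cid is read back as t_xvac because both names refer to the same four bytes. The size difference (12 versus 20 bytes per tuple header) is the saving the header comment describes.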
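What is it for
As mentioned above, cmin is the ID of the command that inserted the tuple and cmax is the ID of the command that deleted it. In the source, the single field CommandId t_cid represents both, so the cmin and cmax system columns always show the same value.
So why have cmin and cmax at all? Their purpose is to decide whether row-version changes made by different commands within the same transaction are visible. If a transaction executes strictly one command after another, each command can always see the changes made by the previous ones, and cmin and cmax are not needed.
But consider cursors. A cursor sees a snapshot taken when it is declared, and later changes do not affect what it returns, so the same kind of visibility problem that arises between interleaved transactions arises here too.
In the example below, FETCH sees the data as of the cursor's declaration rather than as of the FETCH itself; changes made after the declaration are invisible to the cursor: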
postgres=# begin;
BEGIN
postgres=*# insert into cur_test values(1);
INSERT 0 1
postgres=*# select xmin,xmax,cmin,cmax,id from cur_test;
xmin | xmax | cmin | cmax | id
-------+------+------+------+----
18188 | 0 | 0 | 0 | 1
(1 row)
postgres=*# declare mycursor cursor for select xmin,xmax,cmin,cmax,id from cur_test ;
DECLARE CURSOR
postgres=*# update cur_test set id = 10 where id = 1;
UPDATE 1
postgres=*# fetch all from mycursor;
xmin | xmax | cmin | cmax | id
-------+-------+------+------+----
18188 | 18188 | 0 | 0 | 1
(1 row)
postgres=*# select xmin,xmax,cmin,cmax,id from cur_test;
xmin | xmax | cmin | cmax | id
-------+------+------+------+----
18188 | 0 | 1 | 1 | 10
(1 row)
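Here the row returned by the cursor shows cmin and cmax of 0, while the updated row shows 1 for both. The new version was created by a command whose ID is not less than the command ID recorded when the cursor was declared, so FETCH does not see it and returns the old version instead. Next, let's test how cmin and cmax are assigned: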
postgres=# insert into test values(1);
INSERT 0 1
postgres=# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
(1 row)
postgres=# insert into test values(2);
INSERT 0 1
postgres=# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
(2 rows)
postgres=# begin; --- start a new transaction
BEGIN
postgres=*# insert into test values(3); --- the command ID starts at 0 when a transaction begins
INSERT 0 1
postgres=*# insert into test values(4); --- command ID 1
INSERT 0 1
postgres=*# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
(4 rows)
postgres=*# insert into test values(5); --- command ID 2
INSERT 0 1
postgres=*# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
2 | 2 | 5
(5 rows)
postgres=*# commit ;
COMMIT
postgres=# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
2 | 2 | 5
(5 rows)
postgres=# begin; --- start a new transaction
BEGIN
postgres=*# insert into test values(6); --- the command ID is back to 0
INSERT 0 1
postgres=*# insert into test values(7); --- command ID 1
INSERT 0 1
postgres=*# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
2 | 2 | 5
0 | 0 | 6
1 | 1 | 7
(7 rows)
postgres=# begin; --- start a new transaction
BEGIN
postgres=*# insert into test values(8); --- command ID 0
INSERT 0 1
postgres=*# select id from test where id = 8 for update; --- the command ID advances to 1
id
----
8
(1 row)
postgres=*# insert into test values(9); --- command ID 2
INSERT 0 1
postgres=*# select cmin,cmax,id from test ; --- a plain query does not increment the command ID
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
2 | 2 | 5
0 | 0 | 6
1 | 1 | 7
0 | 0 | 8
2 | 2 | 9
(9 rows)
postgres=*# insert into test values(10); --- command ID 3
INSERT 0 1
postgres=*# select cmin,cmax,id from test ;
cmin | cmax | id
------+------+----
0 | 0 | 1
0 | 0 | 2
0 | 0 | 3
1 | 1 | 4
2 | 2 | 5
0 | 0 | 6
1 | 1 | 7
0 | 0 | 8
2 | 2 | 9
3 | 3 | 10
(10 rows)
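From these observations we can draw several conclusions:
- cmin and cmax always show the same value, because both are backed by the same field, t_cid
- The first two INSERT and SELECT statements each ran in their own transaction, so visibility between them is decided by xmin, xmax and the snapshot
- Only statements that actually modify data, such as INSERT, UPDATE and SELECT ... FOR UPDATE, increment the command ID
- The command ID is a uint32 (unsigned 32-bit integer), so the number of commands in one transaction is bounded. If commands keep accumulating, the transaction eventually fails with: cannot have more than 2^32-2 commands in a transaction. This is also why a plain SELECT does not increment the command ID; as with virtual transaction IDs, the point is to save resources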
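The allocation rule observed above can be modelled in a few lines. This is a simplified sketch, not the server's code: XactModel, xact_begin and xact_run are hypothetical names, and the only assumption carried over is that the counter advances after a statement only if that statement actually used its command ID to write something.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t CommandId;

#define FirstCommandId   ((CommandId) 0)
#define InvalidCommandId (~(CommandId) 0)   /* reserved, never handed out */

/* Simplified per-transaction state (hypothetical struct for this sketch). */
typedef struct XactModel
{
    CommandId current_cid;   /* command ID the next statement will run under */
    bool      cid_used;      /* has the current ID been stamped on a tuple? */
} XactModel;

static void
xact_begin(XactModel *x)
{
    x->current_cid = FirstCommandId;
    x->cid_used = false;
}

/* Run one statement and return the command ID it executed under. */
static CommandId
xact_run(XactModel *x, bool writes)
{
    CommandId cid = x->current_cid;

    if (writes)
        x->cid_used = true;

    /* End of statement: advance only if the ID was actually used. */
    if (x->cid_used)
    {
        x->current_cid += 1;
        x->cid_used = false;
        /* The real server raises an error here; the model just asserts. */
        assert(x->current_cid != InvalidCommandId);
    }
    return cid;
}
```

Replaying the last transaction from the transcript (insert 8, SELECT ... FOR UPDATE, insert 9, plain SELECT, insert 10) with writes set to true, true, true, false, true yields command IDs 0, 1, 2, 3, 3: the plain SELECT runs under ID 3 without consuming it, so the following INSERT gets 3 as well.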
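These results show cmin and cmax always identical, which raises a question: if a row is inserted and then updated or deleted in the same transaction, which value does the field hold? For this case PostgreSQL introduced the combo CID mechanism, signalled by an infomask flag bit that was mentioned in an earlier article:
#define HEAP_COMBOCID 0x0020 /* t_cid is a combo cid */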
postgres=# begin;
BEGIN
postgres=*# insert into test values(1);
INSERT 0 1
postgres=*# insert into test values(2);
INSERT 0 1
postgres=*# update test set id = 99 where id = 1;
UPDATE 1
postgres=*# select cmin,cmax,id from test;
cmin | cmax | id
------+------+----
1 | 1 | 2
2 | 2 | 99
(2 rows)
postgres=*# select lp, t_xmin, t_xmax, t_ctid,
postgres-*# infomask(t_infomask, 1) as infomask,
postgres-*# infomask(t_infomask2, 2) as infomask2
postgres-*# from heap_page_items(get_raw_page('test', 0));
lp | t_xmin | t_xmax | t_ctid | infomask | infomask2
----+--------+--------+--------+----------------------+-----------------
1 | 18229 | 18229 | (0,3) | COMBOCID | HOT_UPDATED
2 | 18229 | 0 | (0,2) | XMAX_INVALID |
3 | 18229 | 0 | (0,3) | UPDATED|XMAX_INVALID | HEAP_ONLY_TUPLE
(3 rows)
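The first tuple was inserted and then superseded by an update in the same transaction, which made it an old version, so its infomask has the COMBOCID flag set. When a visibility check finds this flag, it uses the combo CID to look up the real cmin and cmax. The logic lives in src/backend/utils/time/combocid.c, whose header comment reads: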
* Before version 8.3, HeapTupleHeaderData had separate fields for cmin
* and cmax. To reduce the header size, cmin and cmax are now overlayed
* in the same field in the header. That usually works because you rarely
* insert and delete a tuple in the same transaction, and we don't need
* either field to remain valid after the originating transaction exits.
* To make it work when the inserting transaction does delete the tuple,
* we create a "combo" command ID and store that in the tuple header
* instead of cmin and cmax. The combo command ID can be mapped to the
* real cmin and cmax using a backend-private array, which is managed by
* this module.
*
* To allow reusing existing combo cids, we also keep a hash table that
* maps cmin,cmax pairs to combo cids. This keeps the data structure size
* reasonable in most cases, since the number of unique pairs used by any
* one transaction is likely to be small.
*
* With a 32-bit combo command id we can represent 2^32 distinct cmin,cmax
* combinations. In the most perverse case where each command deletes a tuple
* generated by every previous command, the number of combo command ids
* required for N commands is N*(N+1)/2. That means that in the worst case,
* that's enough for 92682 commands. In practice, you'll run out of memory
* and/or disk space way before you reach that limit.
*
* The array and hash table are kept in TopTransactionContext, and are
* destroyed at the end of each transaction.
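Before 8.3, cmin and cmax were stored separately. Since 8.3 they share one field to shrink the tuple header, because inserting and deleting a tuple in the same transaction is rare.
Also, because a combo CID is an unsigned 32-bit integer, there are at most 2^32 distinct (cmin, cmax) combinations. In the most extreme case, where every command deletes a tuple produced by every previous command, N commands need N*(N+1)/2 combo CIDs, so the limit is reached at roughly N = 92682 commands; in practice you would run out of memory long before that.
First, the data structures: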
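The worst-case bound is easy to check by brute force. The sketch below assumes nothing beyond the formula N*(N+1)/2 from the comment; combo_cids_needed and max_commands are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Combo CIDs needed when each of n commands deletes a tuple
 * generated by every previous command. */
static uint64_t
combo_cids_needed(uint64_t n)
{
    return n * (n + 1) / 2;
}

/* Largest n whose worst-case demand still fits in `capacity` combo CIDs. */
static uint64_t
max_commands(uint64_t capacity)
{
    uint64_t n = 0;

    assert(capacity < (UINT64_C(1) << 40));   /* keeps n * (n + 1) far from overflow */
    while (combo_cids_needed(n + 1) <= capacity)
        n++;
    return n;
}
```

Calling max_commands(UINT64_C(1) << 32) lands within one of the 92682 quoted in the source comment, which is simply the square root of 2^33 rounded to an integer. The point is the order of magnitude: the combo CID space runs out after tens of thousands of commands in this pathological pattern, far earlier than the plain command ID limit.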
/* Key and entry structures for the hash table */
typedef struct
{
CommandId cmin;
CommandId cmax;
} ComboCidKeyData;
typedef ComboCidKeyData *ComboCidKey;
typedef struct
{
ComboCidKeyData key;
CommandId combocid;
} ComboCidEntryData;
During a visibility check, if the HEAP_COMBOCID flag is set, GetRealCmin and GetRealCmax are called to obtain the real cmin and cmax:
CommandId
HeapTupleHeaderGetCmin(HeapTupleHeader tup)
{
CommandId cid = HeapTupleHeaderGetRawCommandId(tup);
Assert(!(tup->t_infomask & HEAP_MOVED));
Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tup)));
if (tup->t_infomask & HEAP_COMBOCID)
return GetRealCmin(cid);
else
return cid;
}
CommandId
HeapTupleHeaderGetCmax(HeapTupleHeader tup)
{
CommandId cid = HeapTupleHeaderGetRawCommandId(tup);
Assert(!(tup->t_infomask & HEAP_MOVED));
/*
* Because GetUpdateXid() performs memory allocations if xmax is a
* multixact we can't Assert() if we're inside a critical section. This
* weakens the check, but not using GetCmax() inside one would complicate
* things too much.
*/
Assert(CritSectionCount > 0 ||
TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tup)));
if (tup->t_infomask & HEAP_COMBOCID)
return GetRealCmax(cid);
else
return cid;
}
The lookup itself is simple:
static CommandId
GetRealCmin(CommandId combocid)
{
Assert(combocid < usedComboCids);
return comboCids[combocid].cmin;
}
static CommandId
GetRealCmax(CommandId combocid)
{
Assert(combocid < usedComboCids);
return comboCids[combocid].cmax;
}
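The other direction, handing out a combo CID for a (cmin, cmax) pair, is not shown above. A minimal sketch of the idea follows. It is a deliberate simplification: combocid.c keeps a hash table for reuse and a growable array, whereas this version uses a linear search over a fixed-size array, and the lower-case function names and MAX_COMBO_CIDS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t CommandId;

typedef struct
{
    CommandId cmin;
    CommandId cmax;
} ComboCidKeyData;

#define MAX_COMBO_CIDS 128   /* fixed capacity, chosen only for this sketch */

/* Backend-private array: the index is the combo CID, the entry the real pair. */
static ComboCidKeyData comboCids[MAX_COMBO_CIDS];
static uint32_t usedComboCids = 0;

/* Return the combo CID for (cmin, cmax), reusing an existing entry if any. */
static CommandId
get_combo_cid(CommandId cmin, CommandId cmax)
{
    for (uint32_t i = 0; i < usedComboCids; i++)
    {
        if (comboCids[i].cmin == cmin && comboCids[i].cmax == cmax)
            return i;               /* pair already known: reuse its ID */
    }

    assert(usedComboCids < MAX_COMBO_CIDS);
    comboCids[usedComboCids].cmin = cmin;
    comboCids[usedComboCids].cmax = cmax;
    return usedComboCids++;         /* new entry: its index is the combo CID */
}

static CommandId
get_real_cmin(CommandId combocid)
{
    assert(combocid < usedComboCids);
    return comboCids[combocid].cmin;
}

static CommandId
get_real_cmax(CommandId combocid)
{
    assert(combocid < usedComboCids);
    return comboCids[combocid].cmax;
}
```

For the earlier example (insert at command 0, update at command 2), get_combo_cid(0, 2) returns 0 on a fresh transaction; that 0 is what ends up in t_cid, and get_real_cmin(0) and get_real_cmax(0) give back 0 and 2. Asking for the same pair again returns the same combo CID, which is what keeps the array small in ordinary workloads.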
Hands-on verification
GDB beats a good memory, so let's verify.
postgres=# begin;
BEGIN
postgres=*# insert into test values(1);
INSERT 0 1
postgres=*# insert into test values(2);
INSERT 0 1
postgres=*# update test set id = 99 where id = 1;
UPDATE 1
postgres=*# select cmin,cmax,id from test;
(the session blocks here)
In another window, attach GDB and set breakpoints:
(gdb) b GetRealCmax
Breakpoint 1 at 0xb04661: file combocid.c, line 289.
(gdb) b GetRealCmin
Breakpoint 2 at 0xb0461d: file combocid.c, line 282.
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000b04661 in GetRealCmax at combocid.c:289
2 breakpoint keep y 0x0000000000b0461d in GetRealCmin at combocid.c:282
(gdb) bt
#0 0x00007ff99fe0fd23 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1 0x00000000009026fe in WaitEventSetWaitBlock (set=0x1c90468, cur_timeout=-1, occurred_events=0x7fff41c6b910, nevents=1) at latch.c:1295
#2 0x00000000009025da in WaitEventSetWait (set=0x1c90468, timeout=-1, occurred_events=0x7fff41c6b910, nevents=1, wait_event_info=100663296)
at latch.c:1247
#3 0x0000000000778e19 in secure_read (port=0x1c866d0, ptr=0xfd63c0 <PqRecvBuffer>, len=8192) at be-secure.c:184
#4 0x0000000000782949 in pq_recvbuf () at pqcomm.c:947
#5 0x00000000007829f9 in pq_getbyte () at pqcomm.c:990
#6 0x0000000000930cba in SocketBackend (inBuf=0x7fff41c6baf0) at postgres.c:337
#7 0x000000000093113d in ReadCommand (inBuf=0x7fff41c6baf0) at postgres.c:510
#8 0x0000000000935ad4 in PostgresMain (argc=1, argv=0x1c90668, dbname=0x1c90580 "postgres", username=0x1c3baa8 "postgres") at postgres.c:4269
#9 0x00000000008857a0 in BackendRun (port=0x1c866d0) at postmaster.c:4526
#10 0x0000000000884f90 in BackendStartup (port=0x1c866d0) at postmaster.c:4210
#11 0x00000000008815a1 in ServerLoop () at postmaster.c:1739
#12 0x0000000000880e78 in PostmasterMain (argc=3, argv=0x1c39a20) at postmaster.c:1412
#13 0x000000000078805c in main (argc=3, argv=0x1c39a20) at main.c:210
(gdb) c
Continuing.
Breakpoint 2, GetRealCmin (combocid=0) at combocid.c:282
282 Assert(combocid < usedComboCids);
(gdb) s
283 return comboCids[combocid].cmin;
(gdb) p comboCids[combocid].cmin
$1 = 0
(gdb) p comboCids[combocid].cmax
$2 = 2
After continuing with c, the query prints its result. The real cmin is 0 and the real cmax is 2, which matches the earlier conclusion:
postgres=# begin;
BEGIN
postgres=*# insert into test values(1); --- command ID 0
INSERT 0 1
postgres=*# insert into test values(2); --- command ID 1
INSERT 0 1
postgres=*# update test set id = 99 where id = 1; --- command ID 2
UPDATE 1
postgres=*# select cmin,cmax,id from test;
cmin | cmax | id
------+------+----
1 | 1 | 2
2 | 2 | 99
(2 rows)
When there is no combo CID, the work is done by HeapTupleSatisfiesMVCC. A brief walkthrough follows; it is essentially the usual MVCC logic, and the subtransaction cases are not analyzed in depth:
/* the tuple was inserted by the current transaction */
else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
{
    /* tuple's cmin >= snapshot's curcid: inserted after the snapshot, invisible */
    if (HeapTupleHeaderGetCmin(tuple) >= snapshot->curcid)
        return false; /* inserted after scan started */
    /* inserted and never deleted: visible */
    if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */
        return true;
    /* xmax is only a row lock: visible */
    if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) /* not deleter */
        return true;
    if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
    {
        TransactionId xmax;
        xmax = HeapTupleGetUpdateXid(tuple);
        /* not LOCKED_ONLY, so it has to have an xmax */
        Assert(TransactionIdIsValid(xmax));
        /* not deleted by the current transaction: visible */
        /* updating subtransaction must have aborted */
        if (!TransactionIdIsCurrentTransactionId(xmax))
            return true;
        /* cmax >= curcid: deleted after the snapshot, visible; otherwise invisible */
        else if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
            return true; /* updated after scan started */
        else
            return false; /* updated before scan started */
    }
    if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tuple)))
    {
        /* deleting subtransaction must have aborted */
        SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                    InvalidTransactionId);
        return true;
    }
    if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
        return true; /* deleted after scan started */
    else
        return false; /* deleted before scan started */
}
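Stripped of lock-only xmax values, multixacts and subtransactions, the branch above reduces to two comparisons against the snapshot's curcid. The sketch below models only that core rule; OwnTuple and visible_to_own_xact are hypothetical names, and the tuple is assumed to have been inserted by the current transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t CommandId;

/* Hypothetical, flattened view of a tuple written by the current transaction. */
typedef struct OwnTuple
{
    CommandId cmin;      /* command that inserted the tuple */
    bool      deleted;   /* deleted or updated by this same transaction? */
    CommandId cmax;      /* command that deleted it; meaningful only if deleted */
} OwnTuple;

/* Is the tuple visible to a scan whose snapshot was taken at curcid? */
static bool
visible_to_own_xact(const OwnTuple *tup, CommandId curcid)
{
    /* A tuple cannot be deleted by a command earlier than its insertion. */
    assert(!tup->deleted || tup->cmax >= tup->cmin);

    if (tup->cmin >= curcid)
        return false;               /* inserted after scan started */
    if (!tup->deleted)
        return true;                /* inserted earlier, still live */
    return tup->cmax >= curcid;     /* visible only if deleted after scan started */
}
```

Applied to the cursor example: the cursor's snapshot has curcid 1, the old version has cmin 0 and cmax 1, and the new version has cmin 1. The old version is visible to the cursor (0 < 1, and 1 >= 1), while the new one is not (1 >= 1). A later plain SELECT runs with curcid 2 and sees exactly the opposite.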
pg_filedump
We can also observe this directly with pg_filedump:
postgres=# truncate table test;
TRUNCATE TABLE
postgres=# select pg_relation_filepath('test');
pg_relation_filepath
----------------------
base/13578/24817
(1 row)
postgres=# insert into test values(generate_series(1,10));
INSERT 0 10
postgres=# checkpoint ;
CHECKPOINT
postgres=# begin;
BEGIN
postgres=*# delete from test where id = 1;
DELETE 1
postgres=*# delete from test where id = 2;
DELETE 1
postgres=*# delete from test where id = 3;
DELETE 1
postgres=*# commit ;
COMMIT
postgres=# checkpoint ;
CHECKPOINT
Looking at the file with pg_filedump, it is clear that CID|XVAC for the first three tuples is 0, 1 and 2 respectively:
[postgres@xiongcc ~]$ pg_filedump -i -D int pgdata/base/13578/24817
*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: pgdata/base/13578/24817
* Options used: -i -D int
*******************************************************************
Block 0 ********************************************************
<Header> -----
Block Offset: 0x00000000 Offsets: Lower 64 (0x0040)
Block: Size 8192 Version 4 Upper 7872 (0x1ec0)
LSN: logid 0 recoff 0x23c6d388 Special 8192 (0x2000)
Items: 10 Free Space: 7808
Checksum: 0x0000 Prune XID: 0x00004741 Flags: 0x0000 ()
Length (including item array): 64
<Data> -----
Item 1 -- Length: 28 Offset: 8160 (0x1fe0) Flags: NORMAL
XMIN: 18240 XMAX: 18241 CID|XVAC: 0
Block Id: 0 linp Index: 1 Attributes: 1 Size: 24
infomask: 0x0100 (XMIN_COMMITTED|KEYS_UPDATED)
COPY: 1
Item 2 -- Length: 28 Offset: 8128 (0x1fc0) Flags: NORMAL
XMIN: 18240 XMAX: 18241 CID|XVAC: 1
Block Id: 0 linp Index: 2 Attributes: 1 Size: 24
infomask: 0x0100 (XMIN_COMMITTED|KEYS_UPDATED)
COPY: 2
Item 3 -- Length: 28 Offset: 8096 (0x1fa0) Flags: NORMAL
XMIN: 18240 XMAX: 18241 CID|XVAC: 2
Block Id: 0 linp Index: 3 Attributes: 1 Size: 24
infomask: 0x0100 (XMIN_COMMITTED|KEYS_UPDATED)
COPY: 3
Item 4 -- Length: 28 Offset: 8064 (0x1f80) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 4 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 4
Item 5 -- Length: 28 Offset: 8032 (0x1f60) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 5 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 5
Item 6 -- Length: 28 Offset: 8000 (0x1f40) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 6 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 6
Item 7 -- Length: 28 Offset: 7968 (0x1f20) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 7 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 7
Item 8 -- Length: 28 Offset: 7936 (0x1f00) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 8 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 8
Item 9 -- Length: 28 Offset: 7904 (0x1ee0) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 9 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 9
Item 10 -- Length: 28 Offset: 7872 (0x1ec0) Flags: NORMAL
XMIN: 18240 XMAX: 0 CID|XVAC: 0
Block Id: 0 linp Index: 10 Attributes: 1 Size: 24
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
COPY: 10
*** End of File Encountered. Last Block Read: 0 ***
postgres=# truncate table test;
TRUNCATE TABLE
postgres=# insert into test values(generate_series(1,3));
INSERT 0 3
postgres=# begin;
BEGIN
postgres=*# insert into test values(4);
INSERT 0 1
postgres=*# update test set id = 99 where id = 4;
UPDATE 1
postgres=*# delete from test where id = 1;
DELETE 1
postgres=*# delete from test where id = 2;
DELETE 1
postgres=*# SELECT t_ctid, raw_flags, combined_flags
FROM heap_page_items(get_raw_page('test', 0)),
LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2)
WHERE t_infomask IS NOT NULL OR t_infomask2 IS NOT NULL;
t_ctid | raw_flags | combined_flags
--------+--------------------------------------------------+----------------
(0,1) | {HEAP_XMIN_COMMITTED,HEAP_KEYS_UPDATED} | {}
(0,2) | {HEAP_XMIN_COMMITTED,HEAP_KEYS_UPDATED} | {}
(0,3) | {HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID} | {}
(0,5) | {HEAP_COMBOCID,HEAP_HOT_UPDATED} | {}
(0,5) | {HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE} | {}
(5 rows)
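The flag names in these dumps are just bits of the 16-bit t_infomask. A small decoder sketch is shown below. HEAP_COMBOCID is the value quoted earlier and the 0x0900 pattern matches the pg_filedump output; the remaining bit value is as I read it in htup_details.h, so check it against your PostgreSQL version. FlagName, flag_names and format_infomask are hypothetical helpers, and only four flags are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define HEAP_COMBOCID        0x0020   /* t_cid is a combo cid */
#define HEAP_XMIN_COMMITTED  0x0100
#define HEAP_XMAX_INVALID    0x0800
#define HEAP_UPDATED         0x2000

typedef struct
{
    uint16_t    bit;
    const char *name;
} FlagName;

/* Only the t_infomask bits that appear in this article. */
static const FlagName flag_names[] = {
    {HEAP_COMBOCID,       "COMBOCID"},
    {HEAP_XMIN_COMMITTED, "XMIN_COMMITTED"},
    {HEAP_XMAX_INVALID,   "XMAX_INVALID"},
    {HEAP_UPDATED,        "UPDATED"},
};

/* Write the set flags as "A|B|C" into buf; return how many were written. */
static int
format_infomask(uint16_t infomask, char *buf, size_t buflen)
{
    size_t used = 0;
    int    count = 0;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++)
    {
        int n;

        if ((infomask & flag_names[i].bit) == 0)
            continue;
        n = snprintf(buf + used, buflen - used, "%s%s",
                     count > 0 ? "|" : "", flag_names[i].name);
        if (n < 0 || (size_t) n >= buflen - used)
            break;                  /* output truncated; stop here */
        used += (size_t) n;
        count++;
    }
    return count;
}
```

Feeding it 0x0900 writes two flags, XMIN_COMMITTED and XMAX_INVALID, the same pair pg_filedump prints for the untouched tuples. A tuple whose mask has bit 0x0020 set is the one for which t_cid must be translated through the combo CID array before it can be compared with a snapshot's curcid.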
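Summary
- Before 8.3, cmin and cmax were stored separately. Because inserting and then deleting a row within one transaction is uncommon, they were merged into a single field to save space. Mailing-list discussion: http://www.postgresql.cn/message-id/450FF1D6.5070500%40enterprisedb.com
- Because old-style VACUUM FULL does not care about cmin or cmax, xvac can share the same union, which saves even more space
- cmin and cmax always show the same value, because both are backed by the same field, t_cid
- cmin and cmax are used for visibility checks within a single transaction, where interleaved execution also occurs, for example with cursors
- If a transaction only inserts or only deletes a tuple, t_cid directly holds the inserting or deleting command ID. If a tuple is inserted and then deleted in the same transaction, t_cid holds a combo CID, and the real cmin and cmax are fetched from a backend-private array; a hash table maps (cmin, cmax) pairs to existing combo CIDs so that they can be reused
- Only statements that actually modify data, such as INSERT, UPDATE and SELECT ... FOR UPDATE, increment the command ID. A plain SELECT does not; as with virtual transaction IDs, the point is to save resources
- The command ID is a uint32 (unsigned 32-bit integer), so the number of commands in one transaction is bounded. If commands keep accumulating, the transaction fails with: cannot have more than 2^32-2 commands in a transaction. In the most extreme case, where every command deletes a tuple produced by every previous command, N commands need N*(N+1)/2 combo CIDs, so the combo CID space is exhausted at roughly N = 92682 commands; in practice you would run out of memory long before that.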