author | Andres Freund <andres@anarazel.de> | 2018-10-10 13:53:02 -0700 |
---|---|---|
committer | Andres Freund <andres@anarazel.de> | 2018-10-10 13:53:02 -0700 |
commit | a88482dd24aafeca555cf80aa58cf9ae39f25f9d (patch) | |
tree | c7a302d3fcfca01941079a4acb96606042907c1f /src/backend/replication/logical/reorderbuffer.c | |
parent | a653569c14c3a0b7f48a874a2770d58ce39e07d0 (diff) |
Fix logical decoding error when system table w/ toast is repeatedly rewritten.
Repeatedly rewriting a mapped catalog table with VACUUM FULL or
CLUSTER could cause logical decoding to fail with:
ERROR, "could not map filenode \"%s\" to relation OID"
To trigger the problem the rewritten catalog had to have live tuples
with toasted columns.
The problem arose because, during catalog table rewrites, the
heap_insert() check that prevents logical decoding information from
being emitted for system catalogs failed to treat the new heap's toast
table as a system catalog (because the new heap is not recognized as a
catalog table via RelationIsLogicallyLogged()). The relmapper, in
contrast to the normal catalog contents, does not contain historical
information. After a single rewrite of a mapped table the new relation
is known to the relmapper, but if the table is rewritten twice before
logical decoding occurs, the relfilenode can no longer be mapped to a
relation, which then leads us to error out. This only
happens for toast tables, because the main table contents aren't
re-inserted with heap_insert().
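The lookup failure described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: ToyRelMapEntry, toy_rewrite() and toy_filenode_to_oid() are hypothetical names, and the OID and filenode values are arbitrary. The one assumption it encodes is that the map stores a single current filenode per relation and keeps no history.

```c
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy stand-in for the relmapper: one current filenode per relation. */
typedef struct ToyRelMapEntry
{
	Oid			reloid;
	Oid			relfilenode;
} ToyRelMapEntry;

/* Arbitrary illustrative values, not real catalog OIDs. */
static ToyRelMapEntry toy_relmap[] = {
	{5000, 1000},
};

#define TOY_RELMAP_SIZE (sizeof(toy_relmap) / sizeof(toy_relmap[0]))

/* A rewrite overwrites the mapping in place; the old filenode is forgotten. */
void
toy_rewrite(Oid reloid, Oid new_filenode)
{
	for (size_t i = 0; i < TOY_RELMAP_SIZE; i++)
	{
		if (toy_relmap[i].reloid == reloid)
			toy_relmap[i].relfilenode = new_filenode;
	}
}

/* Reverse lookup as decoding would need it; InvalidOid means "not mapped". */
Oid
toy_filenode_to_oid(Oid filenode)
{
	for (size_t i = 0; i < TOY_RELMAP_SIZE; i++)
	{
		if (toy_relmap[i].relfilenode == filenode)
			return toy_relmap[i].reloid;
	}
	return InvalidOid;
}
```

In this model, after toy_rewrite(5000, 2000) a change logged against filenode 2000 still resolves to relation 5000. After a second toy_rewrite(5000, 3000), the same lookup for filenode 2000 returns InvalidOid, which corresponds to the case where decoding errors out.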
The fix is simple: add a new heap_insert() flag that prevents logical
decoding information from being emitted, and accept during decoding
that there might not be tuple data for toast tables.
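A rough sketch of the two halves of such a fix follows. It is a simplified model, not the actual patch: TOY_INSERT_NO_LOGICAL, ToyChange, toy_heap_insert() and toy_toast_append_chunk() are hypothetical names, and the flag value is arbitrary. The emit side drops tuple data when the flag is set; the decode side tolerates a change that carries no tuple data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option bit; the name and value are illustrative only. */
#define TOY_INSERT_NO_LOGICAL 0x01

typedef struct ToyChange
{
	const char *newtuple;		/* NULL when no tuple data was logged */
} ToyChange;

/*
 * Emit side: attach tuple data only if the relation is logically logged
 * and the caller did not ask to suppress logical decoding information.
 */
ToyChange
toy_heap_insert(const char *tuple, bool rel_logically_logged, int options)
{
	ToyChange	change;
	bool		need_tuple_data;

	need_tuple_data = rel_logically_logged &&
		(options & TOY_INSERT_NO_LOGICAL) == 0;
	change.newtuple = need_tuple_data ? tuple : NULL;
	return change;
}

/*
 * Decode side: queue a toast chunk only when it carries tuple data.
 * Returns the updated number of queued chunks.
 */
int
toy_toast_append_chunk(const ToyChange *change, int nchunks)
{
	if (change->newtuple != NULL)
		nchunks++;
	return nchunks;
}
```

With this model, a rewrite path would pass TOY_INSERT_NO_LOGICAL for the new heap's toast inserts, so the resulting change has newtuple == NULL and the decode side skips it instead of trying to assemble a toast chunk from it.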
Unfortunately that does not fix pre-existing logical decoding
errors. Doing so would require not throwing an error when a filenode
cannot be mapped to a relation during decoding, and that seems too
likely to hide bugs. If it's crucial to fix decoding for an existing
slot, temporarily changing the ERROR in ReorderBufferCommit() to a
WARNING appears to be the best fix.
Author: Andres Freund
Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de
Backpatch: 9.4-, where logical decoding was introduced
Diffstat (limited to 'src/backend/replication/logical/reorderbuffer.c')
-rw-r--r-- | src/backend/replication/logical/reorderbuffer.c | 25 |
1 files changed, 20 insertions, 5 deletions
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 63213b79975..15d8d307b76 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1588,8 +1588,16 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid,
 										change->data.tp.relnode.relNode);
 
 					/*
-					 * Catalog tuple without data, emitted while catalog was
-					 * in the process of being rewritten.
+					 * Mapped catalog tuple without data, emitted while
+					 * catalog table was in the process of being rewritten. We
+					 * can fail to look up the relfilenode, because the the
+					 * relmapper has no "historic" view, in contrast to normal
+					 * the normal catalog during decoding. Thus repeated
+					 * rewrites can cause a lookup failure. That's OK because
+					 * we do not decode catalog changes anyway. Normally such
+					 * tuples would be skipped over below, but we can't
+					 * identify whether the table should be logically logged
+					 * without mapping the relfilenode to the oid.
 					 */
 					if (reloid == InvalidOid &&
 						change->data.tp.newtuple == NULL &&
@@ -1644,10 +1652,17 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid,
 						 * transaction's changes. Otherwise it will get
 						 * freed/reused while restoring spooled data from
 						 * disk.
+						 *
+						 * But skip doing so if there's no tuple-data. That
+						 * happens if a non-mapped system catalog with a toast
+						 * table is rewritten.
 						 */
-						dlist_delete(&change->node);
-						ReorderBufferToastAppendChunk(rb, txn, relation,
-													  change);
+						if (change->data.tp.newtuple != NULL)
+						{
+							dlist_delete(&change->node);
+							ReorderBufferToastAppendChunk(rb, txn, relation,
+														  change);
+						}
 					}
 
 				change_done: