diff options
| author | Tom Lane <tgl@sss.pgh.pa.us> | 2012-07-17 16:55:56 -0400 |
|---|---|---|
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2012-07-17 16:55:56 -0400 |
| commit | 7587286f667079245cd9692a927f74245b662709 (patch) | |
| tree | 5fdf55699ee947473ad727367ebb5c4ca4a2ea79 /src/backend/storage | |
| parent | f81d50dc87ae73f2817c67b205fdcf8d266ca3fb (diff) | |
Improve coding around the fsync request queue.
In all branches back to 8.3, this patch fixes a questionable assumption in
CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are
no uninitialized pad bytes in the request queue structs. This would only
cause trouble if (a) there were such pad bytes, which could happen in 8.4
and up if the compiler makes enum ForkNumber narrower than 32 bits, but
otherwise would require not-currently-planned changes in the widths of
other typedefs; and (b) the kernel has not uniformly initialized the
contents of shared memory to zeroes. Still, it seems a tad risky, and we
can easily remove any risk by pre-zeroing the request array for ourselves.
In addition to that, we need to establish a coding rule that struct
RelFileNode can't contain any padding bytes, since such structs are copied
into the request array verbatim. (There are other places that are assuming
this anyway, it turns out.)
In 9.1 and up, the risk was a bit larger because we were also effectively
assuming that struct RelFileNodeBackend contained no pad bytes, and with
fields of different types in there, that would be much easier to break.
However, there is no good reason to ever transmit fsync or delete requests
for temp files to the bgwriter/checkpointer, so we can revert the request
structs to plain RelFileNode, getting rid of the padding risk and saving
some marginal number of bytes and cycles in fsync queue manipulation while
we are at it. The savings might be more than marginal during deletion of
a temp relation, because the old code transmitted an entirely useless but
nonetheless expensive-to-process ForgetRelationFsync request to the
background process, and also had the background process perform the file
deletion even though that can safely be done immediately.
In addition, make some cleanup of nearby comments and small improvements to
the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
Diffstat (limited to 'src/backend/storage')
| -rw-r--r-- | src/backend/storage/smgr/md.c | 12 |
1 files changed, 9 insertions, 3 deletions
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c index c0e41c7053b..340611003bd 100644 --- a/src/backend/storage/smgr/md.c +++ b/src/backend/storage/smgr/md.c @@ -126,7 +126,7 @@ static MemoryContext MdCxt; /* context for all md.c allocations */ typedef struct { RelFileNode rnode; /* the targeted relation */ - ForkNumber forknum; + ForkNumber forknum; /* which fork */ BlockNumber segno; /* which segment */ } PendingOperationTag; @@ -1212,7 +1212,7 @@ mdpostckpt(void) * If there is a local pending-ops table, just make an entry in it for * mdsync to process later. Otherwise, try to pass off the fsync request * to the background writer process. If that fails, just do the fsync - * locally before returning (we expect this will not happen often enough + * locally before returning (we hope this will not happen often enough * to be a performance problem). */ static void @@ -1239,6 +1239,9 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg) /* * register_unlink() -- Schedule a file to be deleted after next checkpoint * + * We don't bother passing in the fork number, because this is only used + * with main forks. + * * As with register_dirty_segment, this could involve either a local or * a remote pending-ops table. */ @@ -1349,6 +1352,9 @@ RememberFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno) MemoryContext oldcxt = MemoryContextSwitchTo(MdCxt); PendingUnlinkEntry *entry; + /* PendingUnlinkEntry doesn't store forknum, since it's always MAIN */ + Assert(forknum == MAIN_FORKNUM); + entry = palloc(sizeof(PendingUnlinkEntry)); entry->rnode = rnode; entry->cycle_ctr = mdckpt_cycle_ctr; @@ -1398,7 +1404,7 @@ RememberFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno) } /* - * ForgetRelationFsyncRequests -- forget any fsyncs for a rel + * ForgetRelationFsyncRequests -- forget any fsyncs for a relation fork */ void ForgetRelationFsyncRequests(RelFileNode rnode, ForkNumber forknum) |
