From 2ffcb0cb6a5bf97de22f0ce58f55537ce1c87653 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 9 Dec 2010 13:03:11 -0500 Subject: Eliminate O(N^2) behavior in parallel restore with many blobs. With hundreds of thousands of TOC entries, the repeated searches in reduce_dependencies() become the dominant cost. Get rid of that searching by constructing reverse-dependency lists, which we can do in O(N) time during the fix_dependencies() preprocessing. I chose to store the reverse dependencies as DumpId arrays for consistency with the forward-dependency representation, and keep the previously-transient tocsByDumpId[] array around to locate actual TOC entry structs quickly from dump IDs. While this fixes the slow case reported by Vlad Arkhipov, there is still a potential for O(N^2) behavior with sufficiently many tables: fix_dependencies itself, as well as mark_create_done and inhibit_data_for_failed_table, are doing repeated searches to deal with table-to-table-data dependencies. Possibly this work could be extended to deal with that, although the latter two functions are also used in non-parallel restore where we currently don't run fix_dependencies. Another TODO is that we fail to parallelize restore of multiple blobs at all. This appears to require changes in the archive format to fix. Back-patch to 9.0 where the problem was reported. 8.4 has potential issues as well; but since it doesn't create a separate TOC entry for each blob, it's at much less risk of having enough TOC entries to cause real problems. --- src/bin/pg_dump/pg_backup_archiver.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'src/bin/pg_dump/pg_backup_archiver.h') diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h index 0a135ee126f..9f826b678a1 100644 --- a/src/bin/pg_dump/pg_backup_archiver.h +++ b/src/bin/pg_dump/pg_backup_archiver.h @@ -321,6 +321,8 @@ typedef struct _tocEntry struct _tocEntry *par_next; /* these are NULL if not in either list */ bool created; /* set for DATA member if TABLE was created */ int depCount; /* number of dependencies not yet restored */ + DumpId *revDeps; /* dumpIds of objects depending on this one */ + int nRevDeps; /* number of such dependencies */ DumpId *lockDeps; /* dumpIds of objects this one needs lock on */ int nLockDeps; /* number of such dependencies */ } TocEntry; -- cgit v1.2.3