From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
Subject: Re: Possible future performance improvement: sort updates/deletes by ctid
Date: 2008-03-23 01:23:24
Message-ID: 200803230123.m2N1NOt16671@momjian.us
Lists: pgsql-hackers
Added to TODO:
* Sort large UPDATE/DELETEs so they are done in heap order
http://archives.postgresql.org/pgsql-hackers/2008-01/msg01119.php
---------------------------------------------------------------------------
Tom Lane wrote:
> We've had a couple of discussions recently revolving around the
> inefficiency of using hashjoin/hashaggregation output to update a target
> table, because of the resulting very random access pattern. I believe
> this same mechanism is underlying the slowness of Stephen Denne's
> "alternate query" described here:
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00227.php
>
> I made up the attached doubtless-oversimplified test case to model what
> he was seeing. It's cut down about 4x from the table size he describes,
> but the UPDATE still takes forever --- I gave up waiting after 2 hours,
> when it had deleted about a fifth of its hashjoin temp files, suggesting
> that the total runtime would be about 10 hours.
>
> A brute force idea for fixing this is to sort the intended update or
> delete operations of an UPDATE/DELETE command according to the target
> table's ctid, which is available for free anyway since the executor top
> level must have it to perform the operation. I made up an even more
> brute force patch (also attached) that forces that to happen for every
> UPDATE or DELETE --- obviously we'd not want that for real, it's just
> for crude performance testing. With that patch, I got the results
>
>                                                                    QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------
>  Sort (cost=6075623.03..6085623.05 rows=4000008 width=618) (actual time=2078726.637..3371944.124 rows=4000000 loops=1)
>    Sort Key: df.ctid
>    Sort Method: external merge Disk: 2478992kB
>    -> Hash Join (cost=123330.50..1207292.72 rows=4000008 width=618) (actual time=20186.510..721120.455 rows=4000000 loops=1)
>          Hash Cond: (df.document_id = d.id)
>          -> Seq Scan on document_file df (cost=0.00..373334.08 rows=4000008 width=614) (actual time=11.775..439993.807 rows=4000000 loops=1)
>          -> Hash (cost=57702.00..57702.00 rows=4000200 width=8) (actual time=19575.885..19575.885 rows=4000000 loops=1)
>                -> Seq Scan on document d (cost=0.00..57702.00 rows=4000200 width=8) (actual time=0.039..14335.615 rows=4000000 loops=1)
>  Total runtime: 3684037.097 ms
>
> or just over an hour runtime --- still not exactly speedy, but it
> certainly compares favorably to the estimated 10 hours for unsorted
> updates.
>
> This is with default shared_buffers (32MB) and work_mem (1MB);
> a more aggressive work_mem would have meant fewer hash batches and fewer
> sort runs and hence better performance in both cases, but with the
> majority of the runtime going into the sort step here, I think that the
> sorted update would benefit much more.
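
As a side note for anyone rerunning the test: a more aggressive work_mem can
be set per-session before invoking the attached script; 256MB below is purely
an illustrative figure, not a recommendation.

    SET work_mem = '256MB';   -- session-local; the run above used the 1MB default
    \i bighash.sql            -- rerun the attached test case from psql
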
>
> Nowhere near a workable patch of course, but seems like food for
> thought.
>
> regards, tom lane
>
Content-Description: bighash.sql
> drop table if exists document;
> drop table if exists document_file ;
>
> create table document (document_type_id int, id int primary key);
> create table document_file (document_type_id int, document_id int primary key,
> filler char(600));
>
> insert into document_file select x,x,'z' from generate_series(1,4000000) x;
> insert into document select x,x from generate_series(1,4000000) x;
>
> analyze document_file;
> analyze document;
>
> set enable_mergejoin = false;
>
> explain analyze UPDATE ONLY document_file AS df SET document_type_id = d.document_type_id FROM document AS d WHERE d.id = document_id;
Content-Description: ctid-sort.patch
> Index: src/backend/optimizer/prep/preptlist.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/optimizer/prep/preptlist.c,v
> retrieving revision 1.88
> diff -c -r1.88 preptlist.c
> *** src/backend/optimizer/prep/preptlist.c 1 Jan 2008 19:45:50 -0000 1.88
> --- src/backend/optimizer/prep/preptlist.c 30 Jan 2008 03:06:30 -0000
> ***************
> *** 32,37 ****
> --- 32,38 ----
> #include "optimizer/var.h"
> #include "parser/analyze.h"
> #include "parser/parsetree.h"
> + #include "parser/parse_clause.h"
> #include "parser/parse_coerce.h"
>
>
> ***************
> *** 103,108 ****
> --- 104,120 ----
>           tlist = list_copy(tlist);
>
>           tlist = lappend(tlist, tle);
> +
> +         /*
> +          * Force the query result to be sorted by CTID, for better update
> +          * speed. (Note: we expect parse->sortClause to be NIL here,
> +          * but this code will do no harm if it's not.)
> +          */
> +         parse->sortClause = addTargetToSortList(NULL, tle,
> +                                                 parse->sortClause, tlist,
> +                                                 SORTBY_DEFAULT,
> +                                                 SORTBY_NULLS_DEFAULT,
> +                                                 NIL, false);
>       }
>
>       /*
>
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +