From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Declarative partitioning - another take |
Date: | 2016-11-30 15:54:56 |
Message-ID: | CA+TgmoYuPJUpnO21XLyYCbomk+6NtJrTWEqHYmbToe9aAwgBww@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Nov 29, 2016 at 6:24 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> # All times in seconds (on my modestly-powerful development VM)
> #
> # nrows = 10,000,000 generated using:
> #
> # INSERT INTO $tab
> # SELECT '$last'::date - ((s.id % $maxsecs + 1)::bigint || 's')::interval,
> # (random() * 5000)::int % 4999 + 1,
> # case s.id % 10
> # when 0 then 'a'
> # when 1 then 'b'
> # when 2 then 'c'
> # ...
> # when 9 then 'j'
> # end
> # FROM generate_series(1, $nrows) s(id)
> # ORDER BY random();
> #
> # The first item in the select list is basically a date that won't fall
> # outside the defined partitions.
>
> Time for a plain table = 98.1 sec
>
> #part   parted   tg-direct-map   tg-if-else
> =====   ======   =============   ==========
>    10    114.3          1483.3        742.4
>    50    112.5          1476.6       2016.8
>   100    117.1          1498.4       5386.1
>   500    125.3          1475.5           --
>  1000    129.9          1474.4           --
>  5000    137.5          1491.4           --
> 10000    154.7          1480.9           --
Very nice!
Obviously, it would be nice if the overhead were even lower, but it's
clearly a vast improvement over what we have today.
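To put numbers on that, here is a rough sketch that divides each timing in the quoted table by the 98.1 sec plain-table time. The figures are copied from the table; the names (TIMINGS, slowdown, fmt) are made up for illustration and are not part of any patch.

```python
# Timings in seconds, copied from the quoted table; None marks "--" (not reported).
PLAIN_TABLE_SECS = 98.1

# number of partitions -> (parted, tg-direct-map, tg-if-else)
TIMINGS = {
    10: (114.3, 1483.3, 742.4),
    50: (112.5, 1476.6, 2016.8),
    100: (117.1, 1498.4, 5386.1),
    500: (125.3, 1475.5, None),
    1000: (129.9, 1474.4, None),
    5000: (137.5, 1491.4, None),
    10000: (154.7, 1480.9, None),
}


def slowdown(seconds, baseline=PLAIN_TABLE_SECS):
    """Return how many times slower a load is than the plain-table load."""
    return seconds / baseline


def fmt(seconds):
    """Format a slowdown factor, keeping "--" for unreported timings."""
    return "--" if seconds is None else f"{slowdown(seconds):.2f}x"


print(f"{'#part':>6} {'parted':>8} {'tg-direct-map':>14} {'tg-if-else':>11}")
for nparts in sorted(TIMINGS):
    parted, direct_map, if_else = TIMINGS[nparts]
    print(f"{nparts:>6} {fmt(parted):>8} {fmt(direct_map):>14} {fmt(if_else):>11}")
```

On these numbers the declaratively partitioned case stays under about 1.6x the plain table even at 10000 partitions, while the trigger-based approaches start at several times slower.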
> Regarding tuple-mapping-required vs no-tuple-mapping-required, all cases
> currently require tuple-mapping, because the decision is based on the
> result of comparing parent and partition TupleDesc using
> equalTupleDescs(), which fails so quickly because TupleDesc.tdtypeid are
> not the same. Anyway, I simply commented out the tuple-mapping statement
> in ExecInsert() to observe just slightly improved numbers as follows
> (comparing with numbers in the table just above):
>
> #part   (sub-)parted
> =====   =================
>    10   113.9 (vs. 127.0)
>   100   135.7 (vs. 156.6)
>   500   182.1 (vs. 191.8)
I think you should definitely try to get that additional speedup when
you can. It doesn't seem like a lot when you think of how much is
already being saved, but a healthy number of users are going to
compare it to the performance on an unpartitioned table rather than to
our historical performance. 127/98.1 = 1.29, but 113.9/98.1 = 1.16
-- and obviously a 16% overhead from partitioning is way better than a
29% overhead, even if the old overhead was a million percent.
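For anyone who wants to redo that arithmetic with other timings, a minimal sketch follows. It assumes, as above, that the 98.1 sec plain-table load is the baseline; overhead_pct is just an illustrative helper name.

```python
# Baseline: the plain-table load time reported in the quoted message.
PLAIN_TABLE_SECS = 98.1


def overhead_pct(seconds, baseline=PLAIN_TABLE_SECS):
    """Return the extra time relative to the baseline, as a percentage."""
    return (seconds / baseline - 1.0) * 100.0


# 10-partition timings from the quoted message, in seconds.
with_tuple_mapping = 127.0
without_tuple_mapping = 113.9

for label, secs in [("with tuple mapping", with_tuple_mapping),
                    ("without tuple mapping", without_tuple_mapping)]:
    print(f"{label}: {secs / PLAIN_TABLE_SECS:.2f}x plain table, "
          f"{overhead_pct(secs):.0f}% overhead")
```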
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company