From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] maintenance_work_mem and CREATE INDEX time |
Date: | 2013-07-30 00:29:22 |
Message-ID: | CA+HiwqHM4G8meKARFYMATnXKPKz1DVR856Ot7y8kzeZZrsuLcQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Hi Jeff,
On Tue, Jul 30, 2013 at 3:25 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Jul 23, 2013 at 10:56 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Wed, Jul 24, 2013 at 3:20 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>>
>>> The heap structure used in external sorts is cache-unfriendly. The
>>> bigger the heap used, the more this unfriendliness becomes apparent.
>>> And the bigger maintenance_work_mem, the bigger the heap used.
>>>
>>> The bigger heap also means you have fewer "runs" to merge in the
>>> external sort. However, as long as the number of runs still fits in
>>> the same number of merge passes, this is generally not a meaningful
>>> difference.
>>
>> Does fewer runs mean more time (in whichever phase of external sort)?
>
> That's complicated. In general fewer runs are faster, as the heap
> used at that stage is smaller. But this difference is small. If you
> can get the number of runs down to a level that needs fewer passes
> over the data, that will make things faster. But this is rare. If
> the sort isn't already being done in a single pass, then your sort
> must be huge or your working memory setting is pathologically tiny.
>
> There is a rough conservation of total heap layers between the two
> phases: the initial tuple heap, and the merge stage heap-of-tapes.
> Say for example that by increasing work_mem, you can increase the
> initial heap from 25 layers to 27 layers, while decreasing the merge
> phase heap from 5 layers to 3 layers. The total number of comparisons
> for the entire sort will be about the same, but the comparisons across
> the 27 layer heap are much more likely to need to go to main RAM,
> rather than come from L3 cache (or whatever the cache level is).
>
If I my assumption that fewer runs mean longer runs is plausible, may
it be correct to think that performsort step (performsort_done -
performsort_starting) time increases when such longer runs are created
due to larger workMem?
--
Amit Langote
From | Date | Subject | |
---|---|---|---|
Next Message | Sergey Konoplev | 2013-07-30 03:54:25 | Re: Fastest Index/Algorithm to find similar sentences |
Previous Message | Klaus Ita | 2013-07-29 22:37:27 | Fwd: corrupted files |
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2013-07-30 00:56:27 | Re: Bison 3.0 updates |
Previous Message | Stephen Frost | 2013-07-30 00:24:17 | Re: [COMMITTERS] pgsql: WITH CHECK OPTION support for auto-updatable VIEWs |