Quick Links

Re: [HACKERS] maintenance_work_mem and CREATE INDEX time

From:	Amit Langote <amitlangote09(at)gmail(dot)com>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	Postgres General <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] maintenance_work_mem and CREATE INDEX time
Date:	2013-07-30 00:29:22
Message-ID:	CA+HiwqHM4G8meKARFYMATnXKPKz1DVR856Ot7y8kzeZZrsuLcQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-hackers

Hi Jeff,

On Tue, Jul 30, 2013 at 3:25 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Jul 23, 2013 at 10:56 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Wed, Jul 24, 2013 at 3:20 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>>
>>> The heap structure used in external sorts is cache-unfriendly. The
>>> bigger the heap used, the more this unfriendliness becomes apparent.
>>> And the bigger maintenance_work_mem, the bigger the heap used.
>>>
>>> The bigger heap also means you have fewer "runs" to merge in the
>>> external sort. However, as long as the number of runs still fits in
>>> the same number of merge passes, this is generally not a meaningful
>>> difference.
>>
>> Does fewer runs mean more time (in whichever phase of external sort)?
>
> That's complicated. In general fewer runs are faster, as the heap
> used at that stage is smaller. But this difference is small. If you
> can get the number of runs down to a level that needs fewer passes
> over the data, that will make things faster. But this is rare. If
> the sort isn't already being done in a single pass, then your sort
> must be huge or your working memory setting is pathologically tiny.
>
> There is a rough conservation of total heap layers between the two
> phases: the initial tuple heap, and the merge stage heap-of-tapes.
> Say for example that by increasing work_mem, you can increase the
> initial heap from 25 layers to 27 layers, while decreasing the merge
> phase heap from 5 layers to 3 layers. The total number of comparisons
> for the entire sort will be about the same, but the comparisons across
> the 27 layer heap are much more likely to need to go to main RAM,
> rather than come from L3 cache (or whatever the cache level is).
>

If I my assumption that fewer runs mean longer runs is plausible, may
it be correct to think that performsort step (performsort_done -
performsort_starting) time increases when such longer runs are created
due to larger workMem?

--
Amit Langote

In response to

Re: maintenance_work_mem and CREATE INDEX time at 2013-07-29 18:25:15 from Jeff Janes

Browse pgsql-general by date

	From	Date	Subject
Next Message	Sergey Konoplev	2013-07-30 03:54:25	Re: Fastest Index/Algorithm to find similar sentences
Previous Message	Klaus Ita	2013-07-29 22:37:27	Fwd: corrupted files

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Noah Misch	2013-07-30 00:56:27	Re: Bison 3.0 updates
Previous Message	Stephen Frost	2013-07-30 00:24:17	Re: [COMMITTERS] pgsql: WITH CHECK OPTION support for auto-updatable VIEWs