Re: A qsort template

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A qsort template
Date: 2021-07-30 11:47:18
Message-ID: CAEudQAozhKrf278ye1B43mkqy02rso6dnCyeQum1uftPRVR2Sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Em qui., 29 de jul. de 2021 às 21:34, John Naylor <
john(dot)naylor(at)enterprisedb(dot)com> escreveu:

>
> On Sun, Jul 4, 2021 at 12:27 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
> wrote:
> >
> > Since you are experimenting with tuplesort and likely thinking similar
> > thoughts, here's a patch I've been using to explore that area. I've
> > seen it get, for example, ~1.18x speedup for simple index builds in
> > favourable winds (YMMV, early hacking results only). Currently, it
> > kicks in when the leading column is of type int4, int8, timestamp,
> > timestamptz, date or text + friends (when abbreviatable, currently
> > that means "C" and ICU collations only), while increasing the
> > executable by only 8.5kB (Clang, amd64, -O2, no debug).
> >
> > These types are handled with just three specialisations. Their custom
> > "fast" comparators all boiled down to comparisons of datum bits,
> > varying only in signedness and width, so I tried throwing them away
> > and using 3 new common routines. Then I extended
> > tuplesort_sort_memtuples()'s pre-existing specialisation dispatch to
> > recognise qualifying users of those and select 3 corresponding sort
> > specialisations.
>
> I got around to getting a benchmark together to serve as a starting point.
> I based it off something I got from the archives, but don't remember where
> (I seem to remember Tomas Vondra wrote the original, but not sure). To
> start I just used types that were there already -- int, text, numeric. The
> latter two won't be helped by this patch, but I wanted to keep something
> like that so we can see what kind of noise variation there is. I'll
> probably cut text out in the future and just keep numeric for that purpose.
>
> I've attached both the script and a crude spreadsheet. I'll try to figure
> out something nicer for future tests, and maybe some graphs. The
> "comparison" sheet has the results side by side (min of five). There are 6
> distributions of values:
> - random
> - sorted
> - "almost sorted"
> - reversed
> - organ pipe (first half ascending, second half descending)
> - rotated (sorted but then put the smallest at the end)
> - random 0s/1s
>
> I included both "select a" and "select *" to make sure we have the recent
> datum sort optimization represented. The results look pretty good for ints
> -- about the same speed up master gets going from tuple sorts to datum
> sorts, and those got faster in turn also.
>
> Next I think I'll run microbenchmarks on int64s with the test harness you
> attached earlier, and experiment with the qsort parameters a bit.
>
> I'm also attaching your tuplesort patch so others can see what exactly I'm
> comparing.
>
The patch attached does not apply cleanly,
please can fix it?

error: patch failed: src/backend/utils/sort/tuplesort.c:4776
error: src/backend/utils/sort/tuplesort.c: patch does not apply

regards,
Ranier Vilela

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhihong Yu 2021-07-30 12:12:53 Re: Segment fault when excuting SPI function On PG with commit 41c6a5be
Previous Message Julien Rouhaud 2021-07-30 11:40:52 Re: pg_upgrade does not upgrade pg_stat_statements properly