From: | Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Patch: ResourceOwner optimization for tables with many partitions |
Date: | 2016-01-25 16:18:27 |
Message-ID: | 20160125191827.615fe277@fujitsu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello, Tom
> Also, there are defined ways to convert between Datum format and
> other formats (DatumGetPointer() etc). This code isn't using 'em.
Fixed. But I was not sure how to deal with File and Buffer types since
they are ints (see fd.h and buf.h) and there is no DatumGetInt macro,
only DatumGetInt32/Int64. I don't know if there is a good reason for
this. So for now I just added these definitions right into resowner.c:
```
/* Convert File to Datum */
#define FileGetDatum(file) ((Datum)(file))
/* Convert Datum to File */
#define DatumGetFile(datum)((File)(datum))
/* Convert Buffer to Datum */
#define BufferGetDatum(buffer)((Datum)(buffer))
/* Convert Datum to Buffer */
#define DatumGetBuffer(datum)((Buffer)(datum))
```
... to make all code look similar and all intentions --- clear.
I have a feeling you could suggest a better solution :)
> This is certainly not doing what I had in mind for communication
> between ResourceOwnerGetAny and ResourceOwnerRelease, because the
> latter pays zero attention to "lastidx", but instead does a blind
> hash search regardless.
>
> In general, I'm suspicious of the impact of this patch on "normal"
> cases with not-very-large resource arrays. It might be hard to
> measure that because in such cases resowner.c is not a bottleneck;
> but that doesn't mean I want to give up performance in cases that
> perform well today. You could probably set up a test harness that
> exercises ResourceOwnerAdd/Release/etc in isolation and get good
> timings for them that way, to confirm or disprove whether there's
> a performance loss for small arrays.
>
> I still suspect it would be advantageous to have two different
> operating modes depending on the size of an individual resource
> array, and not bother with hashing until you got to more than a
> couple dozen entries. Given that you're rehashing anyway when you
> enlarge the array, this wouldn't cost anything except a few more
> lines of code, ISTM --- and you already have a net code savings
> because of the refactoring.
To be honest I don't have much faith in such micro benchmarks. They
don't consider how code will be executed under real load. Therefore any
results of such a benchmark wouldn't give us a clear answer which
solution is preferable. Besides different users can execute the same
code in different fashion.
I compared two implementations - "always use hashing" (step2a.path) and
"use hashing only for large arrays" (step2b.path). Both patches give
the same performance according to benchmark I described in a first
message of this thread.
Since none of these patches is better in respect of problem I'm trying
to solve I suggest to use step2b.path. It seems to be a good idea not to
use hashing for searching in small arrays. Besides in this case
behaviour of PostgreSQL will be changed less after applying a patch.
Users will probably appreciate this.
Best regards,
Aleksander
Attachment | Content-Type | Size |
---|---|---|
resource-owner-optimization-v5-step1.patch | text/x-patch | 30.2 KB |
resource-owner-optimization-v5-step2a.patch | text/x-patch | 7.3 KB |
resource-owner-optimization-v5-step2b.patch | text/x-patch | 7.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2016-01-25 16:29:31 | Re: [patch] Proposal for \crosstabview in psql |
Previous Message | Shulgin, Oleksandr | 2016-01-25 16:11:17 | Re: More stable query plans via more predictable column statistics |