From: | Ancoron Luciferis <ancoron(dot)luciferis(at)googlemail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Custom opclass for column statistics? |
Date: | 2019-07-06 15:35:33 |
Message-ID: | 8ef092e6-d0c1-4f7a-86ae-e708e1322337@googlemail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 06/07/2019 15:38, Tomas Vondra wrote:
> On Sat, Jul 06, 2019 at 11:02:27AM +0200, Ancoron Luciferis wrote:
>> Hi,
>>
>> I've been wondering whether it is possible somehow to have the standard
>> column statistics to respect a certain operator class?
>>
>> The reason why I am asking for this is that I have a UUID column with a
>> unique index at it using a custom operator class which implies a
>> different sort order than for the default UUID operator class.
>>
>> This results into planner mistakes when determining whether to use the
>> index for row selection or not. Too often it falls back into sequential
>> scan due to this.
>>
>
> Can you share an example demonstrating the issue?
>
>
> regards
>
Yes, I have an opclass as follows:
CREATE OPERATOR CLASS uuid_timestamp_ops FOR TYPE uuid
USING btree AS
OPERATOR 1 <*,
OPERATOR 1 <~ (uuid, timestamp with time zone),
OPERATOR 2 <=*,
OPERATOR 2 <=~ (uuid, timestamp with time zone),
OPERATOR 3 =,
OPERATOR 3 =~ (uuid, timestamp with time zone),
OPERATOR 4 >=*,
OPERATOR 4 >=~ (uuid, timestamp with time zone),
OPERATOR 5 >*,
OPERATOR 5 >~ (uuid, timestamp with time zone),
FUNCTION 1 uuid_timestamp_cmp(uuid, uuid),
FUNCTION 1 uuid_timestamp_only_cmp(uuid, timestamp
with time zone),
FUNCTION 2 uuid_timestamp_sortsupport(internal)
;
...and e.g. operator "<*" is defined as:
CREATE FUNCTION uuid_timestamp_lt(uuid, uuid)
RETURNS bool
AS 'MODULE_PATHNAME', 'uuid_timestamp_lt'
LANGUAGE C
IMMUTABLE
LEAKPROOF
STRICT
PARALLEL SAFE;
COMMENT ON FUNCTION uuid_timestamp_lt(uuid, uuid) IS 'lower than';
CREATE OPERATOR <* (
LEFTARG = uuid,
RIGHTARG = uuid,
PROCEDURE = uuid_timestamp_lt,
COMMUTATOR = '>*',
NEGATOR = '>=*',
RESTRICT = scalarltsel,
JOIN = scalarltjoinsel
);
The function "uuid_timestamp_lt" is basically defined as follows:
1. if not version 1 UUID fallback to standard uuid compare
2. extract timestamp values and compare
3. if equal timestamps fallback to standard uuid compare
...so that a chronological order is established.
The test table is created as follows:
CREATE TABLE uuid_v1_ext (id uuid);
CREATE UNIQUE INDEX idx_uuid_v1_ext ON uuid_v1_ext (id uuid_timestamp_ops);
The values for "histogram_bounds" of the test table look like this (due
to the default sort order for standard type UUID):
00003789-97bf-11e9-b6bb-e03f49f7f733
008b88f8-6deb-11e9-901a-e03f4947f477
010a8b22-586a-11e9-8258-e03f49ce78f3
...
6f682e68-978d-11e9-901a-e03f4947f477
6ff412ee-926f-11e9-901a-e03f4947f477
7079ffe2-642f-11e9-b0cc-e03f49e7fd3b
70ffaeca-4645-11e9-adf9-e03f494677fb
...
fef26b41-9b9d-11e9-b0cc-e03f49e7fd3b
ff779ce8-9e52-11e9-8258-e03f49ce78f3
ffff6bfc-4de4-11e9-b0d4-e03f49d6f6bf
...and I think that's where the planner gets the decision for a query
such as:
DELETE FROM uuid_v1_ext WHERE id <* '4bdf6f81-56ad-11e9-8258-e03f49ce78f3';
...which then get's executed as sequential scan instead of an index scan.
I was also thinking about changing the selectivity function used by the
custom operator, but I didn't find any hints how to implement that
without duplicating a lot of internal code.
Cheers,
Ancoron
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-07-06 15:57:53 | Re: Custom opclass for column statistics? |
Previous Message | Tom Lane | 2019-07-06 13:58:58 | Re: Custom opclass for column statistics? |