Hash join operator question

From: Paolo Tavalazzi <ptavalazzi(at)charta(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Hash join operator question
Date: 2005-05-23 13:17:11
Message-ID: 200505231517.12066.ptavalazzi@charta.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'd like to understand if it is possible to find a solution to the problem
that we have on ours DB in production.

I make an example simplified in order to explain itself better:

We have 2 table :

TABLE vendor (
group TEXT,
client TEXT,
vdr_venue_code CHAR(8),
vdr_location_code CHAR(8)
)

TABLE venue (
title TEXT,
date timestamp,
.......
code CHAR(8),
location CHAR(8)
)

For being able to couple some tuples of vendor with all the tuple of venue
that are inserted from other systems ,we have used a default character "*",
so that whichever is venue.code it comes coupled to the vendor tuple thet have
vdr_venue_code = "*".

For this reason base query base will be:

SELECT group FRON vendor,venue WHERE vdr_venue_code in (venue.code,"*") and
vdr_location_code in (venue.location,"*") and venue.data < .... and ....;

This type of query does not allow planner to use HASH JOIN slowing down the
query for great amounts of data.
Therefore we have tried to create an operator and a function that they
supported the hash and they resolved this case:



strcmp_left_default(PG_FUNCTION_ARGS)
{
text *str = PG_GETARG_TEXT_P(0);
text *cmp = PG_GETARG_TEXT_P(1);
char *my_str = NULL;
char *my_cmp = NULL;
bool result;

textInChar(&my_str,str);
textInChar(&my_cmp,cmp);

result = (strcmp(my_str, my_cmp) == 0 ||
strcmp(my_str, "*") == 0);

if (my_str != NULL)
pfree(my_str);

if (my_cmp != NULL)
pfree(my_cmp);

PG_FREE_IF_COPY(str, 0);
PG_FREE_IF_COPY(cmp, 1);

PG_RETURN_BOOL(result);
}



CREATE OPERATOR ==* (
PROCEDURE = strcmp_left_default,
LEFTARG = text,
RIGHTARG = text,
COMMUTATOR = OPERATOR(*==),
HASHES,
RESTRICT = eqsel,
JOIN = eqjoinsel,
SORT1 = <,
SORT2 = <,
LTCMP = <,
GTCMP = >
);



CREATE OPERATOR *== (
PROCEDURE = strcmp_right_default,
LEFTARG = text,
RIGHTARG = text,
COMMUTATOR = OPERATOR(==*),
HASHES,
RESTRICT = eqsel,
JOIN = eqjoinsel,
SORT1 = <,
SORT2 = <,
LTCMP = <,
GTCMP = >
);

CREATE OPERATOR CLASS text_default_ops
FOR TYPE text USING btree AS
OPERATOR 3 ==*(text,text) ,
FUNCTION 1 bttextcmp(text,text);

CREATE OPERATOR CLASS text_default_ops
FOR TYPE text USING hash AS
OPERATOR 1 ==*(text,text) ,
FUNCTION 1 hashtext(text);

For being able to have query of the type :

SELECT group FROM vendor,venue WHERE vendor.vdr_venue_code ==* venue.code and
vendor.vdr_location_code ==* venue.location and ...;



Effectively it comes used the hash join clause,but this cannot be the
solution,because my operator come used after the creation of the buckets
of the hash, so bucket that do not have correspondence on the key
(vdr_venue_code,code) does not come considers, even if has default value "*".

The situation does not change also using one our various function, different
from hashtext, for the creation of the hashtable,
in fact I cannot force the comparison between a value of venue.code with a
different bucket value where the default value "*" resides.

My feeling is that having to only confront the value key venue_code with a
variable value and a constant value "*",
it can be possible to create an operator that it manages this type of query
using a hashjoin clause.
Is it possible??

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Postgres General 2005-05-23 13:17:26 PITR restore hot standby
Previous Message Shaun Clements 2005-05-23 13:11:01 Postgres PHP error