Re: Planner: rows=1 after "similar to" where condition.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joris Dobbelsteen" <Joris(at)familiedobbelsteen(dot)nl>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Planner: rows=1 after "similar to" where condition.
Date: 2008-03-04 23:28:41
Message-ID: 28326.1204673321@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Joris Dobbelsteen" <Joris(at)familiedobbelsteen(dot)nl> writes:
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
>> If you increase the stats target for the column to 100 or
>> more then it
>> will try actually applying the regexp to all the histogram entries.
>> That might or might not give you a better estimate.

> A new test case (I did a fresh VACUUM ANALYZE with your statistics for
> text set to 100):
> Arround 5288 rows out of 4.3 Million match.

Ah, you had not given us that number before. That's one in 800 rows,
more or less, which means that there's no chance of getting a
well-founded statistical estimate with less than 800 items in the stats
collection. Does it do any better with stats target set to 1000?

I think though that the real problem may be that the index condition

> " Recheck Cond: (((program)::text = 'amavis'::text) AND
> ((facility)::text = 'mail'::text))"

selects rows that match the regex with much higher probability than the
general row population does. Since we don't yet have any cross-column
statistics the planner has no chance of realizing that.

> My construction with the regex as "substring()" construction and a
> "WHERE substring() IS NOT NULL" seems to give a better estimate in these
> cases. The result seems equivalent.

Actually, it's got exactly 0 knowledge about substring() and is giving
you a completely generic guess for this clause :-(

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ralph Smith 2008-03-04 23:53:22 I don't understand this WARNING on pg_ctl startup
Previous Message Tom Lane 2008-03-04 23:14:50 Re: GRANT ALL ON recursive for all tables in my scheme?