Re: No index only scan on md5 index

From: Adam Brusselback <adambrusselback(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: No index only scan on md5 index
Date: 2015-11-26 01:16:40
Message-ID: CAMjNa7eUFYytPhhQY4mK9ZABPTy9mPawD84pWijuR1mK+x=SDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I appreciate the response Tom, and you are correct that the workaround
would not work in my case.

So no index expressions can return the their value without recomputing
without that work around? I learn something new every day it seems.
Thank you for the alternate method.

-Adam

On Wed, Nov 25, 2015 at 8:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Adam Brusselback <adambrusselback(at)gmail(dot)com> writes:
> > CREATE TABLE attachment
> > (
> > attachment_id uuid NOT NULL DEFAULT gen_random_uuid(),
> > attachment_name character varying NOT NULL,
> > attachment_bytes_size integer NOT NULL,
> > attachment_bytes bytea NOT NULL,
> > CONSTRAINT attachment_pkey PRIMARY KEY (attachment_id)
> > );
> > CREATE INDEX idx_attachment_bytes_md5 ON attachment
> > ((md5(attachment_bytes)::uuid));
>
> > But if I wanted to return the md5 value, it seems to be totally unable to
> > use an index only scan:
> > SELECT md5(attachment_bytes)::uuid
> > FROM attachment;
>
> Nope, sorry, you're out of luck on that, because the check for whether an
> index-only scan is feasible checks whether all the variables used in the
> query are available from the index. (Testing whether an index expression
> could match everything asked for would greatly slow down planning, whether
> or not the index turned out to be applicable, so we don't try. I have
> some rough ideas about making that better, but don't hold your breath.)
>
> IIRC, it does actually get it right in terms of constructing the
> finished plan, if you can get past the index-only-scan-is-feasible test.
> So some people have done something like this to avoid recalculations of
> expensive functions:
>
> create table ff(f1 float8);
> create index on ff(sin(f1), f1);
> select sin(f1) from ff; -- can generate IOS and not re-evaluate sin()
>
> But if I'm right in guessing that attachment_bytes can be large,
> that's not going to be a workable hack for your case.
>
> Probably the only thing that's going to work for you is to store
> md5(attachment_bytes) in its own plain column (you can use a trigger
> to compute it for you), and then build a regular index on that,
> and query for that column not the md5() expression.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Albe Laurenz 2015-11-26 08:20:19 Re: No index only scan on md5 index
Previous Message Adam Brusselback 2015-11-26 01:06:10 Re: No index only scan on md5 index