Re: No index only scan on md5 index

From: Adam Brusselback <adambrusselback(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: No index only scan on md5 index
Date: 2015-11-26 01:06:10
Message-ID: CAMjNa7cDDb+gc5Djpqb+2OBWzi4WPNwPrB+PG8NqWXP+VK+qXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Main reason I was hoping to not do that, is the value that would be stored
in that column is dependent on what is stored in the attachment_bytes
column, so to be 100% sure it's correct, you'd need that column controlled
by a trigger, disallowing any explicit inserts or updates to the value.
Was having a hard time finding info on this type of thing online though, so
I was unsure if Postgres was working as intended, or if I had made a
mistake somehow.

If you do know, what are the instances it is able to return data directly
from an index instead of having to go to heap?

On Wed, Nov 25, 2015 at 7:55 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Wednesday, November 25, 2015, Adam Brusselback <
> adambrusselback(at)gmail(dot)com> wrote:
>
>> Hey all,
>>
>> I have an attachment table in my database which stores a file in a bytea
>> column, the file name, and the size of the file.
>>
>> Schema:
>> CREATE TABLE attachment
>> (
>> attachment_id uuid NOT NULL DEFAULT gen_random_uuid(),
>> attachment_name character varying NOT NULL,
>> attachment_bytes_size integer NOT NULL,
>> attachment_bytes bytea NOT NULL,
>> CONSTRAINT attachment_pkey PRIMARY KEY (attachment_id)
>> );
>>
>> I do lookups on this table based on the md5 of the attachment_bytes
>> column, so I added an index:
>> CREATE INDEX idx_attachment_bytes_md5 ON attachment
>> ((md5(attachment_bytes)::uuid));
>>
>> Queries like this are sped up by the index no problem:
>> SELECT attachment_id
>> FROM attachment
>> WHERE md5(attachment_bytes)::uuid = 'b2ab855ece13a72a398096dfb6c832aa';
>>
>> But if I wanted to return the md5 value, it seems to be totally unable to
>> use an index only scan:
>> SELECT md5(attachment_bytes)::uuid
>> FROM attachment;
>>
>>
> Ok.
>
> Any reason not to add the uuid column to the table?
>
> AFAIK The system is designed to return data from the heap, not an index.
> While it possibly can in some instances if you need to return data you
> should store it directly in the table.
>
> David J.
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Adam Brusselback 2015-11-26 01:16:40 Re: No index only scan on md5 index
Previous Message Tom Lane 2015-11-26 01:01:01 Re: No index only scan on md5 index