From: | Anil Menon <gakmenon(at)gmail(dot)com> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Performance question |
Date: | 2014-11-20 04:08:47 |
Message-ID: | CAHzbRKf8d4c4OVGkDhryMmzbKh4MeCX6PDsyWW9Ws93GahDXbQ@mail.gmail.com |
Lists: | pgsql-general |
Thanks Adrian
On Thu, Nov 20, 2014 at 3:46 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 11/19/2014 08:26 AM, Anil Menon wrote:
>
>> Hello,
>>
>> I would like to ask from your experience which would be the best
>> "generic" method for checking whether rows matching a certain condition
>> exist, from within a PL/pgSQL function.
>>
>> I know of 4 methods so far (please feel free to add if I missed out any
>> others)
>>
>> 1) get a count (my previous experience with ORCL shaped this option)
>>
>> select count(*) into vcnt
>> from table
>> where <<condition>>
>> if vcnt >0 then
>> do X
>> else
>> do y
>> end if
>> Cons : It seems doing a count(*) is not the best option for PG
>>
>
>
> Well that would depend on the table size, whether it was 100 rows vs
> 1,000,000 rows
>
>
The table is estimated/guesstimated to be ~900 million rows (~30M a day,
90 days history, though initially it would be ~30M), though the <<where>>
part of the query would return between 0 and ~2 rows.
>
>> 2) Use a non-count option
>> select primary_key_Col into vcnt
>> from table
>> where <<condition>>
>> if found then
>> do X
>> else
>> do y
>> end if
>> Cons : Some people seem not to prefer this as (AFAIU) it causes
>> plpgsql->sql->plpgsql switches
>>
>
> plpgsql is fairly tightly coupled to SQL, so I have not really seen any
> problems. But then I am not working on large datasets.
>
I think that ~900M rows would most likely constitute a large data set.
>
>
>> 3) using perform
>> perform primary_key_Col into vcnt
>> from table
>> where <<condition>>
>> if found then
>> do X
>> else
>> do y
>> end if
>>
>> Seems to remove the above (item 2) issues (if any)
>>
>
> AFAIK, you cannot do the above as written. PERFORM does not return a
> result:
>
> http://www.postgresql.org/docs/9.3/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT
>
> It would have to be more like:
>
> perform primary_key_Col from table where <<condition>>
>
>
You are absolutely right - my bad.
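For the record, a minimal sketch of option 3 in the corrected form, i.e.
PERFORM without INTO, followed by a check of FOUND. The table "events", the
column "device_id" and the value 42 are made-up names for illustration, not
the real schema:

```sql
-- Hypothetical table, for illustration only:
--   events(id bigint primary key, device_id int, created_at timestamptz)
DO $$
BEGIN
    -- PERFORM runs the query and discards the result; no INTO is allowed.
    -- LIMIT 1 is enough because only existence matters here.
    PERFORM 1 FROM events WHERE device_id = 42 LIMIT 1;

    -- FOUND is true if PERFORM produced at least one row.
    IF FOUND THEN
        RAISE NOTICE 'do x';
    ELSE
        RAISE NOTICE 'do y';
    END IF;
END;
$$;
```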
>
>> 4) using exists
>> if exists ( select 1 from table where <<condition>> ) then
>> do x
>> else
>> do y
>> end if
>>
>>
>> My question is what would be the best (in terms of performance) method
>> to use? My gut feeling is to use option 4 for PG. Am I right or is there
>> any other method?
>>
>
> All of the above is context specific. To know for sure you will need to
> test on actual data.
>
Absolutely right, just that I want to ensure that I follow the most
optimal method before the DB goes into production, after which priorities
change on what needs to be changed.
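
As a rough sketch of how I would compare the options on real data before
go-live: run the candidate queries under EXPLAIN ANALYZE and look at the plan
and timings. The table "events", the column "device_id", the value 42 and the
function name "has_events" are placeholders I made up, not the actual schema:

```sql
-- Hypothetical table, for illustration only:
--   events(id bigint primary key, device_id int, created_at timestamptz)

-- Option 1: count(*) has to visit every row that matches the condition.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events WHERE device_id = 42;

-- Option 4: EXISTS can stop as soon as the first matching row is found.
EXPLAIN (ANALYZE, BUFFERS)
SELECT EXISTS (SELECT 1 FROM events WHERE device_id = 42);

-- Option 4 wrapped in a PL/pgSQL function.
CREATE OR REPLACE FUNCTION has_events(p_device_id int)
RETURNS boolean
LANGUAGE plpgsql
AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM events WHERE device_id = p_device_id) THEN
        RETURN true;   -- "do x" branch
    ELSE
        RETURN false;  -- "do y" branch
    END IF;
END;
$$;
```

Since the condition matches only 0 to ~2 rows, I would expect the difference
between the options to be small as long as an index supports the condition,
but that is an assumption to be verified against the actual plans.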
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
I guess the best answer would be "it's very context specific", but picking
the brains of experienced resources helps :-)
Thanks again
Anil