Re: Partition performance causing ddl commands to slow down significantly

From: <fburgess(at)radiantblue(dot)com>
To: "John R Pierce" <pierce(at)hogranch(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Partition performance causing ddl commands to slow down significantly
Date: 2013-04-12 19:10:02
Message-ID: 20130412121002.5a830134ae84016b0174832fdc1a3173.1349262d35.wbe@email11.secureserver.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

<html><body><span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>Hi John,</div>
<div>&nbsp;</div>
<div>The SQL query should be: select count(*) from dna_strands where cid = 1;</div>
<div>&nbsp;</div>
<div>I&nbsp; just realize don't think this is not going to work. if for the&nbsp;sake of argument that cid = 1 is much more likely be be found in a more recent partition,&nbsp;any inverse search mechanism in the planner will find that match first but then continue through all of the other partitions, Right? The only way to optimize this is to find some way of adding the partition key date to the where clause.</div>
<div>&nbsp;</div>
<div>In actuality the query's are being generated&nbsp;via Hibernate</div>
<div>&nbsp;</div>
<div>thanks</div>
<BLOCKQUOTE style="BORDER-LEFT: blue 2px solid; PADDING-LEFT: 8px; FONT-FAMILY: verdana; COLOR: black; MARGIN-LEFT: 8px; FONT-SIZE: 10pt" id=replyBlockquote webmail="1">
<DIV id=wmQuoteWrapper>-------- Original Message --------<BR>Subject: Re: [BUGS] Partition performance causing ddl commands to slow<BR>down significantly<BR>From: John R Pierce &lt;<a href="mailto:pierce(at)hogranch(dot)com">pierce(at)hogranch(dot)com</a>&gt;<BR>Date: Fri, April 12, 2013 11:53 am<BR>To: <a href="mailto:pgsql-bugs(at)postgresql(dot)org">pgsql-bugs(at)postgresql(dot)org</a><BR><BR>
<DIV class=moz-cite-prefix>On 4/12/2013 11:46 AM, <A class=moz-txt-link-abbreviated href="mailto:fburgess(at)radiantblue(dot)com" target=_blank>fburgess(at)radiantblue(dot)com</A>&nbsp;wrote:<BR></DIV>
<BLOCKQUOTE cite=mid:20130412114630(dot)5a830134ae84016b0174832fdc1a3173(dot)fc7948457c(dot)wbe(at)email11(dot)secureserver(dot)net type="cite"><SPAN style="FONT-FAMILY: Verdana; COLOR: #000000; FONT-SIZE: 10pt">
<DIV>We are having performance related problems on one of our big data Partition tables. The table is partitioned by date and the partitions are organized from Jan 2003 thru Dec 2013. <BR>We have 268 child partitions associated with the Parent table, and we have constraint_exclusion=partition set. </DIV>
<DIV>The execution of the SQL query:&nbsp; select count(*) from dna_strands; </DIV>
<DIV>yields:&nbsp; QUERY PLAN<BR>_____________________________________________________________________________________________&nbsp;&nbsp; </DIV>
<DIV>Aggregate (cost=2246778.49..2246778.50 rows=1 width=0)<BR>&nbsp; -&gt; Append (0.00..2159647.04 rows=34852580 width=0)<BR>&nbsp;&nbsp;&nbsp;&nbsp; -&gt; Seq Scan on dna_strands (cost=0.00..0.00 rows=1 width)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Filter: (cid = 1)<BR>&nbsp;&nbsp;&nbsp;&nbsp; -&gt; Index Scan using dna_strands_y2003m01_cid on dna_strands_y2003m01 dna_strands (cost=0.00..677652 rows=1 width=0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Index Cond: (cid = 1)<BR>&nbsp;&nbsp;&nbsp;&nbsp; -&gt; Index Scan using dna_strands_y2003m02_cid on dna_strands_y2003m02 dna_strands (cost=0.00..974423 rows=1 width=0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Index Cond: (cid = 1)<BR>&nbsp;&nbsp;&nbsp;&nbsp; -&gt; Index Scan using dna_strands_y2003m03_cid on dna_strands_y2003m03 dna_strands (cost=0.00..992301 rows=1 width=0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Index Cond: (cid = 1)<BR>&nbsp;&nbsp;&nbsp;&nbsp; ...<BR>&nbsp;&nbsp;&nbsp;&nbsp; ...</DIV>
<DIV>&nbsp;&nbsp;&nbsp;&nbsp; -&gt; Index Scan using dna_strands_y2013m12_cid on dna_strands_y2013m12 dna_strands (cost=0.00..8.27 rows=1 width=0)<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Index Cond: (cid = 1)</DIV>
<DIV>Question: Is there any way to modify the Planner to do the inverse of the Index Scan's.&nbsp; In other words, to start the index scans in reverse order from <BR>the most recent date to the oldest date, i.e. "dna_strands_y2013m12" backwards. Our application users query much more heavily at the most recent data that <BR>has been ingested into the PostgreSQL database.&nbsp; Would this capability speed up query performance?</DIV></SPAN></BLOCKQUOTE><BR>it wouldn't speed up your example, as your example has to scan every single row of the whole mess.&nbsp;&nbsp; not sure where Filter: (cid=1) comes from, since you showed the query as SELECT COUNT(*) FROM dna_strands;<BR><BR><BR><PRE class=moz-signature cols="72">-- john r pierce 37N 122W somewhere on the middle of the left coast</PRE></DIV></BLOCKQUOTE></span></body></html>

Attachment Content-Type Size
unknown_filename text/html 4.3 KB

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message John R Pierce 2013-04-12 19:45:21 Re: Partition performance causing ddl commands to slow down significantly
Previous Message John R Pierce 2013-04-12 18:53:11 Re: Partition performance causing ddl commands to slow down significantly