From: | "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com> |
---|---|
To: | Allen Landsidel <alandsidel(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Strange (?) Index behavior? |
Date: | 2004-11-11 21:48:27 |
Message-ID: | 4193DE2B.5060407@commandprompt.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
>>>-----------------------------------------------------------------------------------------------------------------------------------------------
>>> Index Scan using sname_unique on "testtable" (cost=0.00..34453.74
>>>rows=8620 width=20) (actual time=77.004..537065.079 rows=74612
>>>loops=1)
>>> Index Cond: ((sname >= 'AA'::text) AND (sname < 'AB'::text))
>>> Filter: (sname ~~ 'AA%'::text)
>>> Total runtime: 537477.737 ms
>>>(4 rows)
>>>
>>>Time: 537480.571 ms
>>
>>Nothing you're going to do to the query is going to come up with a more
>>effective plan than this. It's using the index after all. It's never going to
>>be lightning fast because it has to process 75k rows.
>>
>>However 75k rows shouldn't be taking nearly 10 minutes. It should be taking
>>about 10 seconds.
I am confused about this statement. I have a table with 1.77 million
rows that I use gist indexes on (TSearch) and I can pull out of it in
less than 2 seconds.
Are you saying it should be taking 10 seconds because of the type of
plan? 10 seconds seems like an awfullong time for this.
Sincerely,
Joshua D. Drake
>
>
> That's my feeling as well, I thought the index was to blame because it
> will be quite large, possibly large enough to not fit in memory nor be
> quickly bursted up.
>
>
>>The 77ms before finding the first record is a bit suspicious. Have you
>>vacuumed this table regularly? Try a VACUUM FULL VERBOSE, and send the
>>results. You might try to REINDEX it as well, though I doubt that would help.
>
>
> This table is *brand spanking new* for lack of a better term. I have
> the data for it in a CSV. I load the CSV up which takes a bit, then
> create the indexes, do a vacuum analyze verbose, and then posted the
> results above. I don't think running vacuum a more times is going to
> change things, at least not without tweaking config settings that
> affect vacuum. Not a single row has been inserted or altered since the
> initial load.. it's just a test.
>
> I can't give vacuum stats right now because the thing is reloading
> (again) with different newfs settings -- something I figure I have the
> time to fiddle with now, and seldom do at other times. These numbers
> though don't change much between 8K on up to 64K 'cluster' sizes. I'm
> trying it now with 8K page sizes, with 8K "minimum fragment" sizes.
> Should speed things up a tiny bit but not enough to really affect this
> query.
>
> Do you still see a need to have the output from the vacuum?
>
>
>>Actually you might consider clustering the table on sname_unique. That would
>>accomplish the same thing as the VACUUM FULL command and also speed up the
>>index scan. And the optimizer knows (if you analyze afterwards) it so it
>>should be more likely to pick the index scan. But currently you have to rerun
>>cluster periodically.
>
>
> Clustering is really unworkable in this situation. It would work now,
> in this limited test case, but using it if this were to go into
> production is unrealistic. It would have to happen fairly often since
> this table is updated frequently, which will break the clustering
> quickly with MVCC.
>
> Running it often.. well.. it has 70M+ rows, and the entire table is
> copied, reordered, and rewritten.. so that's a lot of 'scratch space'
> needed. Finally, clustering locks the table..
>
> Something I'd already considered but quickly ruled out because of
> these reasons..
>
> More ideas are welcome though. ;)
>
> -Allen
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
--
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Attachment | Content-Type | Size |
---|---|---|
jd.vcf | text/x-vcard | 640 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2004-11-12 18:42:02 | Clarification on two bits on VACUUM FULL VERBOSE output |
Previous Message | Tom Lane | 2004-11-11 21:41:51 | Re: Strange (?) Index behavior? |