From: | Ben-Nes Yonatan <da(at)canaan(dot)co(dot)il> |
---|---|
To: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>, Dann Corbit <DCorbit(at)connx(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Query results caching? |
Date: | 2005-08-22 22:27:39 |
Message-ID: | 430A515B.6080307@canaan.co.il |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> On Mon, Aug 22, 2005 at 10:13:49PM +0200, Ben-Nes Yonatan wrote:
>
>>I think that I was misunderstood, Ill make an example:
>>Lets say that im making the following query for the first time on the
>>"motorcycles" table which got an index on the "manufacturer" field:
>>
>>EXPLAIN ANALYZE SELECT manufacturer FROM motorcycles WHERE
>>manufacturer='suzuki';
>>... Total runtime: 3139.587 ms
>>
>>Now im doing the same query again and i get a much faster result (cause
>>of the "caching"): Total runtime: 332.53 ms
>>
>>After both of those queries I drop the index and query the table again
>>with the exact same query as before and now I receive: Total runtime:
>>216834.871 ms
>>
>>And for my last check I run the exact same query again (without creating
>>the INDEX back again) and I get quite similar result to my third query:
>>Total runtime: 209218.01 ms
>>
>>
>>My problem is that (maybe I just dont understand something basic
>>here...) the last 2 (also the second query but I dont care about that)
>>queries were using the "cache" that was created after the first query
>>(which had an INDEX) so none of them actually showed me what will happen
>>if a client will do such a search (without an INDEX) for the first time.
>>
>>I want to delete that "caching" after I do the first 2 queries so my
>>next queries will show me "real life results".
>
>
Ok I tried to handle both of your replies cause I got them at 2 seperate
emails.
Dann Corbit wrote:
> These results are all what I would expect. When you delete the index,
> the query will be forced to do a table scan (to examine every single
> record in the table one by one). If the table is non-trivial it is
> unlikely that either the OS or the database will cache the whole thing
> in memory. However, when you query a small record set, then it is
> likely to be retained in RAM which is literally thousands of times
> faster than disk.
Didnt know that, good to know though doesnt assure me...
What if I drop the INDEX but create a diffrent INDEX which also make the
process alot faster then without an INDEX but slower/faster then the one
before, will it wont use the former "caching"?
> If a search is to be made on a frequent basis, you should create an
> index.
> The query results above show you why.
Obvious :)
> Think about this for a minute. The real life results you want are
> very fast results. For that reason, you should try to model the
> customer queries as nearly as possible. If you have a canned
> application like order entry, then the real parameterized query set
> will probably be quite small in real life. If you are creating a
> server for ad-hoc queries then it will be far more difficult to model
> in real life.
>
> What is the real purpose of the application that you are writing?
>
> Will users be using a pre-programmed front end, or will they be typing
> in queries free-form for whatever their heart desires?
Ok ill try to describe the system as short & precise as possible (its
also passed midnight here :)).
Each day I receive about 4 million rows of data (products) which I
insert into table1 (after I delete all of the previous data it had),
along it I receive for every row about another 15 keywords which I
insert into table2 (where as in table1 I delete all of the previous data
it had also), this process is a fact that I cant change.
Now the users of the site can search for data from table1 by typing
whichever (and up to 4) words as they want at a text field (search input
string) and the server should display the correct results by querying
table1 & join table2 for its keywords.
I succeded to do it quite fast but when I tried to ORDER BY my results
its times jumped up drastically (2-3 seconds for a query... and thats
after the caching..).
I can't allow a situation where a user will search with a keyword which
wasnt 'cached' before and because of that he will wait 15 seconds for a
result.
Jim C. Nasby wrote:
> Emptying the cache will not show real-life results. You are always going
> to have some stuff cached, even if you get a query for something new. In
> this case (since you'll obviously want those indexes there), after some
> amount of time you will have most (if not all) of the non-leaf index
> pages cached, since they take a fairly small amount of memory and are
> frequently accessed. This makes index traversal *much* faster than your
> initial case shows, even if you query on something different each time.
> Testing with a completely empty cache just isn't that realistic.
As far as I understand it at my situation where all of the data is
deleted and inserted each day from the start (INDEX will get lost with
it..) & the endless variety of possible keywords search's & the immense
size of the tables, the following reason wont last.. or am I wrong here?
Because of all of that I want to be able to see how much time a query
takes when its the first time its being run..... or I'm wrong again and
failing to understand something?
Again everyone THANKS ALOT its really amazing the help that I receive
from you!
Ben-Nes Yonatan
From | Date | Subject | |
---|---|---|---|
Next Message | Ben-Nes Yonatan | 2005-08-22 22:28:28 | Re: Query results caching? |
Previous Message | Dann Corbit | 2005-08-22 22:24:52 | Re: Query results caching? |