From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: WIP: store additional info in GIN index |
Date: | 2012-12-22 16:15:49 |
Message-ID: | CAPpHfdt+i0rjVouRNqiGSQBBDgaYsM3UewYLmAvOU-_OfAGkfg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi!
On Thu, Dec 6, 2012 at 5:44 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> Then I've run a simple benchmarking script, and the results are not as
> good as I expected, actually I'm getting much worse performance than
> with the original GIN index.
>
> The following table contains the time of loading the data (not a big
> difference), and number of queries per minute for various number of
> words in the query.
>
> The queries look like this
>
> SELECT id FROM messages
> WHERE body_tsvector @@ plainto_tsquery('english', 'word1 word2 ...')
>
> so it's really the simplest form of FTS query possible.
>
>            without patch | with patch
> --------------------------------------------
> loading    750 sec       | 770 sec
> 1 word     1500          | 1100
> 2 words    23000         | 9800
> 3 words    24000         | 9700
> 4 words    16000         | 7200
> --------------------------------------------
>
> I'm not saying this is a perfect benchmark, but the differences (of
> querying) are pretty huge. Not sure where this difference comes from,
> but it seems to be quite consistent (I usually get +-10% results, which
> is negligible considering the huge difference).
>
> Is this an expected behaviour that will be fixed by another patch?
>
Other patches that significantly accelerate index search will follow.
This patch changes only the storage of GIN posting lists/trees. However,
it was not expected to change index scan speed significantly in either
direction.
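As a rough illustration of the size of the reported gap, the sketch below computes patched throughput as a fraction of unpatched throughput. It uses only the queries-per-minute figures from the quoted table; the names `results` and `relative_throughput` are made up for this example and are not part of any benchmarking script from the thread.

```python
# Queries per minute from the quoted benchmark table:
# label -> (without patch, with patch).
results = {
    "1 word": (1500, 1100),
    "2 words": (23000, 9800),
    "3 words": (24000, 9700),
    "4 words": (16000, 7200),
}


def relative_throughput(without_patch, with_patch):
    """Return patched throughput as a fraction of unpatched throughput."""
    return with_patch / without_patch


# A value below 1.0 means the patched build answered fewer queries per minute.
ratios = {label: relative_throughput(*qpm) for label, qpm in results.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.0%} of unpatched throughput")
```

Every ratio is below 1.0, and the multi-word queries fall to well under half of the unpatched rate, far outside the +-10% run-to-run variation mentioned in the quoted message.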
> The database contains ~680k messages from the mailing list archives,
> i.e. about 900 MB of data (in the table), and the GIN index on tsvector
> is about 900MB too. So the whole dataset nicely fits into memory (8GB
> RAM), and it seems to be completely CPU bound (no I/O activity at all).
>
> The configuration was exactly the same in both cases
>
> shared_buffers = 1GB
> work_mem = 64MB
> maintenance_work_mem = 256MB
>
> I can either upload the database somewhere, or provide the benchmarking
> script if needed.
Unfortunately, I can't reproduce such a huge slowdown on my test cases.
Could you share both the database and the benchmarking script?
------
With best regards,
Alexander Korotkov.