From: Gábor SZŰCS <surrano(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Matthew Hall <mhall(at)mhcomputing(dot)net>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: insert and query performance on big string table with pg_trgm
Date: 2017-11-25 09:19:59
Message-ID: CAHEufv1Y+-G5HJteunb45DvXUS+Xqte+TKAqtPFHdHBQUVkxrQ@mail.gmail.com
Lists: pgsql-performance
Don't know if it would make PostgreSQL happier, but how about adding a
hash-value column and creating the unique index on that one instead? Hash
collisions could reject a few non-duplicate strings as false duplicates,
but the unique index would be way smaller, speeding up inserts. (Sketch
below.)
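
A minimal sketch of the idea (table and column names here are
hypothetical, not from the thread; a unique expression index on the
built-in md5() gives the same effect without storing an extra column):

    -- Enforce uniqueness on a fixed-size 128-bit hash of the string
    -- instead of on the (potentially long) string itself.
    CREATE UNIQUE INDEX words_word_md5_key ON words (md5(word));

    -- De-duplicating inserts can then target that index:
    INSERT INTO words (word)
    VALUES ('example')
    ON CONFLICT ((md5(word))) DO NOTHING;

With md5 the false-duplicate risk is negligible; it mainly matters for
short hashes such as hashtext().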
On Nov 25, 2017 at 7:35, "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> On Nov 21, 2017 00:05, "Matthew Hall" <mhall(at)mhcomputing(dot)net> wrote:
>
>
> > Are all indexes present at the time you insert? It will probably be
> much faster to insert without the gin index (at least) and build it after
> the load.
>
> There is some flexibility on the initial load, but future updates will
> require the de-duplication capability. I'm willing to accept a somewhat
> slower load process in exchange for accurate updates, provided we can
> still try to meet the read-side goal I wrote about, or at least figure
> out why it's impossible, so I understand what I'd need to fix to make it
> possible.
>
>
> As long as you don't let anyone use the table between the initial load and
> when the index build finishes, you don't have to compromise on
> correctness. But yeah, makes sense to worry about query speed first.
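
A minimal sketch of that drop-and-rebuild load pattern (table, index, and
file names are hypothetical placeholders):

    -- Bulk-load without the GIN index, then build it once at the end.
    DROP INDEX IF EXISTS words_word_trgm_idx;
    COPY words (word) FROM '/path/to/words.csv' (FORMAT csv);
    CREATE INDEX words_word_trgm_idx ON words
        USING gin (word gin_trgm_ops);  -- requires the pg_trgm extension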
>
> > If you repeat the same query, is it then faster, or is it still slow?
>
> If you keep the expression exactly the same, it still takes a few
> seconds, as you'd expect for such a torture-test query, but it's still
> WAY faster than the first run. If you change it to a different
> expression, it's slow again, of course. There does seem to be a
> low-to-medium correlation between the number of rows found and the query
> completion time.
>
>
> To make this quick, you will need to get most of the table and most of the
> index cached into RAM. A good way to do that is with pg_prewarm. Of
> course that only works if you have enough RAM in the first place.
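
A sketch of warming both relations with pg_prewarm (the relation names are
hypothetical):

    CREATE EXTENSION IF NOT EXISTS pg_prewarm;
    -- Read the heap and the GIN index into shared buffers.
    SELECT pg_prewarm('words');
    SELECT pg_prewarm('words_word_trgm_idx');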
>
> What is the size of the table and the gin index?
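
For reference, one way to check (names hypothetical):

    SELECT pg_size_pretty(pg_total_relation_size('words')) AS table_plus_indexes,
           pg_size_pretty(pg_relation_size('words_word_trgm_idx')) AS gin_index;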
>
> Cheers,
>
> Jeff