From: | Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Parallel tuplesort (for parallel B-Tree index creation) |
Date: | 2016-12-05 05:08:00 |
Message-ID: | CAJrrPGfJZVkZHXrK3T4KocOF1HH9GLROcwC4m4ncw0iuX_OYAA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Dec 5, 2016 at 7:44 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Sat, Dec 3, 2016 at 7:23 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > I do share your concerns about unpredictable behavior - that's
> > particularly worrying for pg_restore, which may be used for time-
> > sensitive use cases (DR, migrations between versions), so unpredictable
> > changes in behavior / duration are unwelcome.
>
> Right.
>
> > But isn't this more a deficiency in pg_restore, than in CREATE INDEX?
> > The issue seems to be that the reltuples value may or may not get
> > updated, so maybe forcing ANALYZE (even very low statistics_target
> > values would do the trick, I think) would be a more appropriate solution?
> > Or maybe it's time to add at least some rudimentary statistics into the
> > dumps (the reltuples field seems like a good candidate).
>
> I think that there are a number of reasonable ways of looking at it. It
> might also be worthwhile to have a minimal ANALYZE performed by CREATE
> INDEX directly, iff there are no preexisting statistics (there is
> definitely going to be something pg_restore-like that we cannot fix --
> some ETL tool, for example). Perhaps, as an additional condition to
> proceeding with such an ANALYZE, it should also only happen when there
> is any chance at all of parallelism being used (but then you get into
> having to establish the relation size reliably in the absence of any
> pg_class.relpages, which isn't very appealing when there are many tiny
> indexes).
>
> In summary, I would really like it if a consensus emerged on how
> parallel CREATE INDEX should handle the ecosystem of tools like
> pg_restore, reindexdb, and so on. Personally, I'm neutral on which
> general approach should be taken. Proposals from other hackers about
> what to do here are particularly welcome.
>
>
Moved to next CF with "needs review" status.
Regards,
Hari Babu
Fujitsu Australia