Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <akapila(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode
Date: 2021-03-24 12:05:34
Message-ID: CA+TgmoZOA7Co=6OrGrQ0Bu=xa4TtJ4SJb-vF3GQ32oAziCxTng@mail.gmail.com
Lists: pgsql-committers

On Tue, Mar 23, 2021 at 11:13 PM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
> One problem with caching the result is that the first access in each session has to experience the slow processing. Some severe customers of our proprietary database, which is not based on Postgres, have requested to eliminate even the overhead associated with the first access, and we have provided features for them. As for the data file, users can use pg_prewarm. But what can we recommend users to do in this case? Maybe the logon trigger feature, which is ready for committer in PG 14, can be used to allow users to execute possible queries at session start (or establishing a connection pool), but I feel it's inconvenient.

Well, I don't mind if somebody thinks up an even better solution.

> Regarding the picked xid assignment, I didn't think it's so grotty. Yes, in fact, I felt it's a bit unclean. But it's only a single line of code. With a single line of code, we can provide great value to users. Why don't we go for it? As discussed in the thread, the xid is wasted only when the source data is empty, which is impractical provided that the user wants to load much data probably for ETL.

The amount of code isn't the issue. I'd rather expend a little more
code and solve the problem in a better way.

> (I'm afraid "grotty" may be too strong a word considering the CoC statement "We encourage thoughtful, constructive discussion of the software and this community, their current state, and possible directions for development. The focus of our discussions should be the code and related technology, community projects, and infrastructure.")

I did not mean to give offense, but I also don't think grotty is a
strong word. I consider it a pretty mild word.

> > Likewise, the XXX comment you added to max_parallel_hazard_walker
> > claims that some of the code introduced there is to compensate for an
> > unspecified bug in the rewriter. I'm a bit skeptical that the comment
> > is correct, and there's no way to find out because the comment doesn't
> > say what the bug supposedly is, but let's just say for the sake of
> > argument that it's true. Well, you *could* have fixed the bug, but
> > instead you hacked around it, and in a relatively expensive way that
> > affects every query with a CTE in it whether it can benefit from this
> > patch or not. That's not a responsible way of maintaining the core
> > PostgreSQL code.
>
> It'd be too sad if we have to be bothered by an existing bug and give up an attractive feature. Adding more explanation in the comment is OK? Anyway, I think we can separate this issue.

I don't think I agree. These checks are adding a significant amount of
overhead, and one of the problems with this whole thing is that it
adds a lot of overhead.

--
Robert Haas
EDB: http://www.enterprisedb.com
