Re: Parallel queries in single transaction

From: Paul Muntyanu <pmuntyanu(at)gmail(dot)com>
To: tomas(dot)vondra(at)2ndquadrant(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallel queries in single transaction
Date: 2018-07-16 13:21:14
Message-ID: CACnYr+geQM1RmGArOrt5bimO6qNVk73zEjAhba7QV72Ez38dRQ@mail.gmail.com
Lists: pgsql-hackers

> Well, sure. But you could just as well open multiple connections and
> make the queries concurrent that way. Or change the GUC to increase the
> number of workers for the nightly ETL.

This is an option right now, using permanent staging tables for a later
join. I mistakenly said ETL when I meant ELT, which means that most of the
operations happen in the database, so we try to keep all changes in db code
instead of changing the execution engine. In PG11 we have parallel CTAS,
which is a dramatic improvement for us, but there will still be
operations (query plans) which are not parallel.

Having PostgreSQL fully ACID is an amazing feature. When we need to run an
ELT operation outside a single transaction and guarantee that the ELT job
completed successfully by checking that all steps (multiple transactions
with staging tables) succeeded (with graceful rollback + cleanup in case of
failure), things get more complex. Still, I agree that it is possible to
work around this at the application level.
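
A minimal sketch of what that application-level workaround could look like,
assuming each step is an independent callable. The names run_elt_steps,
steps and cleanup are hypothetical and only for illustration; in real use
each callable would open its own connection and fill one staging table in
its own transaction, which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_elt_steps(steps, cleanup):
    """Run independent ELT steps concurrently; clean up all of them if any fails.

    steps:   dict mapping a step name to a zero-argument callable
             (hypothetical: one connection + one transaction per step).
    cleanup: callable taking a step name, e.g. dropping its staging table.
    Returns a dict of step name -> exception for the steps that failed.
    """
    errors = {}
    with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
        futures = {name: pool.submit(fn) for name, fn in steps.items()}
        for name, future in futures.items():
            try:
                future.result()  # waits for the step; re-raises its error
            except Exception as exc:
                errors[name] = exc
    if errors:
        # No single transaction spans the steps, so undo every step by hand.
        for name in steps:
            cleanup(name)
    return errors
```

The job counts as successful only when the returned dict is empty; the
manual cleanup loop is exactly the part a single transaction would
otherwise give us for free.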
-P

On Mon, Jul 16, 2018 at 2:28 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

>
>
> On 07/16/2018 12:03 PM, Paul Muntyanu wrote:
> > Hi Tomas, thanks for looking into it. I am talking more about queries
> > which cannot be optimized, e.g.
> > * fullscan of the table and heavy calculations for another one.
> > * query through FDW for both queries(e.g. one query fetches data from
> > Kafka and another one is fetching from remote Postgres. There are no
> > bounds for both queries for anything except local CPU, network and
> > remote machine)
> >
> > IO bound is not a problem if you have multiple tablespaces.
>
> But it was you who mentioned "query stuck" not me. I merely pointed out
> that in such cases running queries concurrently won't help.
>
> > And CPU bound may not be the case when you have 32 cores and 6 max
> > workers per query. Then, during nightly ETL, I do not have anything
> > except a single query running == 6 cores are occupied. If I can run
> > queries in parallel, I would occupy two IO stacks (two tablespaces) +
> > 12 cores instead of sequentially 6 and then again 6.
> >
>
> Well, sure. But you could just as well open multiple connections and
> make the queries concurrent that way. Or change the GUC to increase the
> number of workers for the nightly ETL.
>
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
