Re: Add parallelism and glibc dependent only options to reindexdb

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Add parallelism and glibc dependent only options to reindexdb
Date: 2019-07-09 07:52:38
Message-ID: CAOBaU_YZWa4dF+14_Q0BTepQtXgwmfJ+90CgQnS8w-7+bzAxWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review.

On Tue, Jul 9, 2019 at 9:24 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Jul 08, 2019 at 11:02:14PM +0200, Julien Rouhaud wrote:
> > On Mon, Jul 8, 2019 at 9:08 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >> I'll resubmit the parallel patch using per-table only
> >> approach
> >
> > Attached.
>
> I have done a lookup of this patch set with a focus on the refactoring
> part, and the split is a bit confusing.

Yes, that wasn't a smart split :(

> +void
> +DisconnectDatabase(ParallelSlot *slot)
> +{
> + char errbuf[256];
> common.c has already an API to connect to a database. It would be
> more natural to move the disconnection part also to common.c and have
> the caller of DisconnectDatabase reset the slot connection by itself?

Ok.

> $ git grep select_loop
> scripts_parallel.c: /* We must reconstruct the fd_set for each
> call to select_loop */
> scripts_parallel.c: i = select_loop(maxFd, &slotset, &aborting);
> scripts_parallel.c:select_loop(int maxFd, fd_set *workerset, bool
> *aborting)
> scripts_parallel.h:extern int select_loop(int maxFd, fd_set
> *workerset, bool *aborting);
>
> select_loop is only present in scripts_parallel.c, so it can remain
> static.

Good point.

> + slots = (ParallelSlot *) pg_malloc(sizeof(ParallelSlot) *
> concurrentCons);
> + init_slot(slots, conn);
> + if (parallel)
> + {
> + for (i = 1; i < concurrentCons; i++)
> + {
> + conn = connectDatabase(dbname, host, port,
> username, prompt_password,
> +
> progname, echo, false, true);
> + init_slot(slots + i, conn);
> + }
> + }
>
> This comes from 0002 and could be more refactored as vacuumdb does the
> same thing. Based on 0001, init_slot() is called now in vacuumdb.c
> and initializes a set of slots while connecting to a given database.
> In short, in input we have a set of parameters and the ask to open
> connections with N slots, and the return result is an pg_malloc'd
> array of slots ready to be used. We could just call that
> ParallelSlotInit() (if you have a better name feel free).

Given how the rest of the functions are named, I'll probably use
InitParallelSlots().

>
> + /*
> + * Get the connection slot to use. If in parallel mode, here we wait
> + * for one connection to become available if none already is. In
> + * non-parallel mode we simply use the only slot we have, which we
> + * know to be free.
> + */
> + if (parallel)
> This business also is duplicated in both reindexdb.c and vacuumdb.c.
>
> +bool
> +GetQueryResult(PGconn *conn, const char *progname)
> +{
> This also does not stick with the parallel stuff, as that's basically
> only getting a query result. We could stick that into common.c.

This function also has a bad name, as it's discarding the result via
ProcessQueryResult. Maybe we should rename them to GetQuerySuccess()
and ConsumeAndTrashQueryResult()?

> Patch 2 has no documentation. The option as well as the restrictions
> in place need to be documented properly.

I forgot that I had forgotten to add documentation :( will fix this time.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Zakirov 2019-07-09 08:10:49 Re: Extra quote_all_identifiers in _dumpOptions
Previous Message Michael Paquier 2019-07-09 07:41:16 Re: Extra quote_all_identifiers in _dumpOptions