Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

From: Allan Barrielle <allan(dot)barrielle(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org, Pierrick Martin <pierrick(at)mart1(dot)dev>
Subject: Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
Date: 2021-07-08 14:12:07
Message-ID: CANrBv8q6s=C-=CULunJKUUUVt6QZR-=dFTq0rY77GTDHoTrnSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

fsync is off and full_page_writes is off because the script works one time.
We create the db, we load the data, then we dump the data and kill the db.
No need to handle servers crashed or anything like that.

0.1 vacuum_cleanup_index_scale_factor is the default value.

On Thu, Jul 8, 2021 at 4:06 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Thu, Jul 08, 2021 at 03:49:12PM +0200, Allan Barrielle wrote:
> > > Is it true that the SELECTs have no joins in them ?
> >
> > Yes there is a lot of LEFT JOIN.
> >
> > > It'd be very useful to get "explain analyze" for a working query and
> for a
> > > stuck query. It sound like the stuck query never finishes, so maybe
> the second
> > > part is impossible (?)
> >
> > We run an explain analysis and we see some very interesting stuff going
> on.
> > It seems without explicitly adding a `ANALYZE`, the query has a cost of
> > over billions, so the query is not stuck but took forever.
> > When I run the same scripts with an ANALYZE right before running the
> query,
> > the query is exec is 50secondes and the cost is normal
>
> It sounds like sometimes autoanalyze processes important tables being
> queried,
> but sometimes it doesn't.
>
> Since there are JOINs involved, you should analyze the tables after
> populating
> them and before querying them. The same as if it were a temp table, or
> anything else.
>
> > The configuration is tuned by aws aurora, [...]
>
> > fsync,off
> > full_page_writes,off
>
> really?
>
> > vacuum_cleanup_index_scale_factor,0.1
>
> also interesting
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Nagaraj Raj 2021-07-08 21:01:45 ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Previous Message Justin Pryzby 2021-07-08 14:06:03 Re: ETL - sql orchestrator is stuck when there is not sleep() between queries