Re: PGSQL 9.3 - Materialized View - multithreading

From: Nicolas Paris <niparisco(at)gmail(dot)com>
To: "Graeme B(dot) Bell" <grb(at)skogoglandskap(dot)no>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PGSQL 9.3 - Materialized View - multithreading
Date: 2014-04-07 13:56:43
Message-ID: CA+ssMOQ42D0BKDiza-_cXAA2a3-V+ryja2q-4s6LDcn0C=R0nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Excellent.

Maybe the last sub-question :

Those 3600 mat views do have *indexes*.
I guess I will get better performances in *dropping indexes* first, then
refresh, then *re-creating indexes*.

Are there other way to improve performances (like mat views storage
parameters<http://www.postgresql.org/docs/9.3/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS>),
because this routines will be at night, and need to be finished quickly.

Thanks

Nicolas PARIS

2014-04-07 14:59 GMT+02:00 Graeme B. Bell <grb(at)skogoglandskap(dot)no>:

>
> Hi again Nick.
>
> Glad it helped.
>
> Generally, I would expect that doing all the A's first, then all the B's,
> and so on, would be fastest since you can re-use the data from cache.
>
> Concurrency when reading isn't generally a problem. Lots of things can
> read at the same time and it will be nice and fast.
> It's concurrent writes or concurrent read/write of the same data item that
> causes problems with locking. That shouldn't be happening here, judging by
> your description.
>
> If possible, try to make sure nothing is modifying those source tables
> A/B/C/D/E/F when you are doing your view refresh.
>
> Graeme.
>
> On 07 Apr 2014, at 14:49, Nicolas Paris <niparisco(at)gmail(dot)com> wrote:
>
> > Hello,
> > Thanks for this clear explanation !
> >
> > Then I have a sub-question :
> > Supposed I have 3600 materialised views say 600 mat views from 6 main
> table. (A,B,C,D,E,F are repetead 600 times with some differences)
> > Is it faster to :
> > 1) parallel refresh 600 time A, then 600 time B etc,
> > OR
> > 2) parallel refresh 600 time A,B,C,D,E,F
> >
> > I guess 1) is faster because they are 600 access to same table loaded in
> memory ? But do parallel access to the same table implies concurency
> > and bad performance ?
> >
> > Thanks
> >
> > Nicolas PARIS
> >
> >
> > 2014-04-07 12:29 GMT+02:00 Graeme B. Bell <grb(at)skogoglandskap(dot)no>:
> > On 04 Apr 2014, at 18:29, Nicolas Paris <niparisco(at)gmail(dot)com> wrote:
> >
> > > Hello,
> > >
> > > My question is about multiprocess and materialized View.
> > >
> http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html
> > > I (will) have something like 3600 materialised views, and I would like
> to know the way to refresh them in a multithread way
> > > (anderstand 8 cpu cores -> 8 refresh process in the same time)
> >
> > Hi Nick,
> >
> > out of DB solution:
> >
> > 1. Produce a text file which contains the 3600 refresh commands you want
> to run in parallel. You can do that with select and format() if you don't
> have a list already.
> >
> > 2. I'm going to simulate your 3600 'refresh' commands here with some
> select and sleep statements that finish at unknown times.
> >
> > (In BASH):
> > for i in {1..3600} ; do echo "echo \"select
> pg_sleep(1+random()::int*10); select $i\" | psql mydb" ; done > 3600commands
> >
> > 3. Install Gnu Parallel and type:
> >
> > parallel < 3600commands
> >
> > 4. Parallel will automatically work out the appropriate number of
> cores/threads for your CPUs, or you can control it manually with -j.
> > It will also give you a live progress report if you use --progress.
> > e.g. this command balances 8 jobs at a time, prints a dynamic progress
> report and dumps stdout to /dev/null
> >
> > parallel -j 8 --progress < 3600commands > /dev/null
> >
> > 5. If you want to make debugging easier use the parameter --tag to tag
> output for each command.
> >
> > Of course it would be much more elegant if someone implemented something
> like Gnu Parallel inside postgres or psql ... :-)
> >
> > Hope this helps & have a nice day,
> >
> > Graeme.
> >
> >
> >
> >
> >
> >
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Graeme B. Bell 2014-04-07 14:05:11 Re: PGSQL 9.3 - Materialized View - multithreading
Previous Message Tatsuo Ishii 2014-04-07 13:46:10 Re: performance degradation after launching postgres cluster using pgpool-II