| From: | "Graeme B(dot) Bell" <grb(at)skogoglandskap(dot)no> | 
|---|---|
| To: | Nicolas Paris <niparisco(at)gmail(dot)com> | 
| Cc: | postgres performance list <pgsql-performance(at)postgresql(dot)org> | 
| Subject: | Re: PGSQL 9.3 - Materialized View - multithreading | 
| Date: | 2014-04-07 12:59:25 | 
| Message-ID: | D8DF4544-81A4-4465-8AD4-25DA3CC20596@skogoglandskap.no | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-performance | 
Hi again Nick.
Glad it helped.
Generally, I would expect that doing all the A's first, then all the B's, and so on, would be fastest since you can re-use the data from cache.
Concurrency when reading isn't generally a problem. Lots of things can read at the same time and it will be nice and fast.
It's concurrent writes or concurrent read/write of the same data item that causes problems with locking. That shouldn't be happening here, judging by your description.
If possible, try to make sure nothing is modifying those source tables A/B/C/D/E/F when you are doing your view refresh.
Graeme.
On 07 Apr 2014, at 14:49, Nicolas Paris <niparisco(at)gmail(dot)com> wrote:
> Hello,
> Thanks for this clear explanation !
> 
> Then I have a sub-question :
> Supposed I have 3600 materialised views say 600 mat views from 6 main table. (A,B,C,D,E,F are repetead 600 times with some differences) 
> Is it faster to :
> 1) parallel refresh  600 time A, then 600 time B etc,
> OR
> 2) parallel refresh  600 time A,B,C,D,E,F
> 
> I guess 1) is faster because they are 600 access to same table loaded in memory ? But do parallel access to the same table implies concurency
>  and bad performance ?
> 
> Thanks
> 
> Nicolas PARIS
> 
> 
> 2014-04-07 12:29 GMT+02:00 Graeme B. Bell <grb(at)skogoglandskap(dot)no>:
> On 04 Apr 2014, at 18:29, Nicolas Paris <niparisco(at)gmail(dot)com> wrote:
> 
> > Hello,
> >
> > My question is about multiprocess and materialized View.
> > http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html
> > I (will) have something like 3600 materialised views, and I would like to know the way to refresh them in a multithread way
> > (anderstand 8 cpu cores -> 8 refresh process  in the same time)
> 
> Hi Nick,
> 
> out of DB solution:
> 
> 1. Produce a text file which contains the 3600 refresh commands you want to run in parallel. You can do that with select and format() if you don't have a list already.
> 
> 2. I'm going to simulate your 3600 'refresh' commands here with some select and sleep statements that finish at unknown times.
> 
> (In BASH):
>   for i in {1..3600} ; do echo "echo \"select pg_sleep(1+random()::int*10); select $i\" | psql mydb" ; done > 3600commands
> 
> 3. Install Gnu Parallel     and type:
> 
> parallel < 3600commands
> 
> 4. Parallel will automatically work out the appropriate number of cores/threads for your CPUs, or you can control it manually with -j.
> It will also give you a live progress report if you use --progress.
> e.g. this command balances 8 jobs at a time, prints a dynamic progress report and dumps stdout to /dev/null
> 
> parallel -j 8 --progress  < 3600commands > /dev/null
> 
> 5. If you want to make debugging easier use the parameter --tag to tag output for each command.
> 
> Of course it would be much more elegant if someone implemented something like Gnu Parallel inside postgres or psql ... :-)
> 
> Hope this helps & have a nice day,
> 
> Graeme.
> 
> 
> 
> 
> 
> 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2014-04-07 13:46:10 | Re: performance degradation after launching postgres cluster using pgpool-II | 
| Previous Message | Heikki Linnakangas | 2014-04-07 12:53:33 | Re: Batch update query performance |