Re: Google Summer of code 2013

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Karel K(dot) Rozhoň <karel(dot)rozhon(at)gmail(dot)com>
Cc: pgsql-students(at)postgresql(dot)org
Subject: Re: Google Summer of code 2013
Date: 2013-04-15 16:27:45
Message-ID: 20130415162745.GT4361@tamriel.snowman.net
Lists: pgsql-students

* Karel K. Rozhoň (karel(dot)rozhon(at)gmail(dot)com) wrote:
> Of course I don't see all aspects of this problem, so I cannot tell what would be good for the future. But I have profiled some GROUP BY selects, and I believe that calling some of the hash procedures in parallel could help.

There seems to be some confusion here. It's certainly true that *many*
(most? all?) pieces of query processing would benefit from parallel
execution; there is no debate on that.

The issue is that PG is not currently set up to do *any* per-query
parallel processing and it is *not* a trivial thing to change that. We
can talk all day about how wonderful it'd be to do parallel hashing,
parallel sorting, etc, but until PG has a way to parallelize query
processing, there's really no point to writing code to parallelize
individual nodes.

> Of course I know this simple case is only theoretical and that data in real tables are much more complicated, but as far as I can see, almost 40% of CPU time was spent in a single hash function: hash_search_with_hash_value.

Improvements to that would be great, but you can't simply call
pthread_create() in a PG backend and expect things to work.

Thanks,

Stephen
