From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re:
Date: 2016-09-27 14:42:49
Message-ID: CAM3SWZRc1ik5O0ZLEDL-Mg2d8bM2G4Y=e+RpAUy=y82HOP8C1g@mail.gmail.com
Lists: pgsql-hackers
On Tue, Sep 27, 2016 at 3:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> You seem to have erased the subject line from this email somehow.
I think that that was Gmail. Maybe that's new? Generally, I have to go
out of my way to change the subject line, so it seems unlikely that I
fat-fingered it. I wish that they'd stop changing things...
> On Tue, Sep 27, 2016 at 10:18 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> No, a shm_mq is not a mutual exclusion mechanism. It's a queue!
>
> A data dependency and a lock aren't the same thing. If your point in
> saying "we need something like an LWLock" was actually "the leader
> will have to wait if it's merged all tuples from a worker and no more
> are immediately available" then I think that's a pretty odd way to
> make that point. To me, the first statement appears to be false,
> while the second is obviously true.
Okay. Sorry for being unclear.
> OK, well I'm not taking any position on whether what Heikki is
> proposing will turn out to be good from a performance point of view.
> My intuitions about sorting performance haven't turned out to be
> particularly good in the past. I'm only saying that if you do decide
> to queue the tuples passing from worker to leader in a shm_mq, you
> shouldn't read from the shm_mq objects in a round robin, but rather
> read multiple tuples at a time from the same worker whenever that is
> possible without blocking. If you read tuples from workers one at a
> time, you get a lot of context-switch thrashing, because the worker
> wakes up, writes one tuple (or even part of a tuple) and immediately
> goes back to sleep. Then it shortly thereafter does it again.
> Whereas if you drain the worker's whole queue each time you read from
> that queue, then the worker can wake up, refill it completely, and go
> back to sleep again. So you incur fewer context switches for the same
> amount of work. Or at least, that's how it worked out for Gather, and
> what I am saying is that it will probably work out the same way for
> sorting, if somebody chooses to try to implement this.
Maybe. This would involve overlapping multiple worker merges, run in
parallel, with a single leader merge. I don't think that would work
out all that well. There would be a trade-off around disk bandwidth
utilization too, so I'm particularly doubtful that the complexity
would pay for itself. But, of course, I too have been wrong about
sorting performance characteristics in the past, so I can't be sure.
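
For what it's worth, here is a rough sketch of the leader-side loop I
understand you to be describing: drain each worker's queue completely
with non-blocking reads, and only sleep on the leader's latch once
every queue has come up empty. drain_worker_queues() and the
consume_tuple callback are names I just made up; shm_mq_receive() and
the latch calls are the real interfaces, but this isn't compile-tested
and isn't meant to be actual tuplesort code:

#include "postgres.h"

#include "miscadmin.h"
#include "storage/latch.h"
#include "storage/shm_mq.h"

/*
 * Sketch only.  Drain each worker's queue completely using
 * non-blocking reads, sleeping only once every queue has come up
 * empty.  consume_tuple() stands in for whatever the leader does with
 * each tuple it pulls off a queue.
 */
static void
drain_worker_queues(shm_mq_handle **handles, int nworkers,
                    void (*consume_tuple) (void *data, Size len))
{
    int         active = nworkers;

    while (active > 0)
    {
        bool        got_any = false;
        int         i;

        for (i = 0; i < nworkers; i++)
        {
            if (handles[i] == NULL)
                continue;       /* this worker already detached */

            for (;;)
            {
                void       *data;
                Size        len;
                shm_mq_result res;

                /* nowait = true: never block while draining */
                res = shm_mq_receive(handles[i], &len, &data, true);
                if (res == SHM_MQ_SUCCESS)
                {
                    consume_tuple(data, len);
                    got_any = true;
                    continue;   /* keep draining this same queue */
                }
                if (res == SHM_MQ_DETACHED)
                {
                    handles[i] = NULL;
                    active--;
                }
                break;          /* queue empty for now, or worker gone */
            }
        }

        if (!got_any && active > 0)
        {
            /* every queue was empty; sleep until a worker sets our latch */
            WaitLatch(MyLatch, WL_LATCH_SET, 0);
            ResetLatch(MyLatch);
            CHECK_FOR_INTERRUPTS();
        }
    }
}

The point being that the inner loop keeps pulling from the same queue
until it reports SHM_MQ_WOULD_BLOCK, so a worker that wakes up gets to
refill its whole queue before the leader looks at it again, which is
where the context-switch savings you describe would come from.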
--
Peter Geoghegan