From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re:
Date: 2016-09-27 14:42:49
Message-ID: CAM3SWZRc1ik5O0ZLEDL-Mg2d8bM2G4Y=e+RpAUy=y82HOP8C1g@mail.gmail.com
Lists: pgsql-hackers
On Tue, Sep 27, 2016 at 3:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> You seem to have erased the subject line from this email somehow.
I think that that was Gmail. Maybe that's new? Generally, I have to go
out of my way to change the subject line, so it seems unlikely that I
fat-fingered it. I wish that they'd stop changing things...
> On Tue, Sep 27, 2016 at 10:18 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> No, a shm_mq is not a mutual exclusion mechanism. It's a queue!
>
> A data dependency and a lock aren't the same thing. If your point in
> saying "we need something like an LWLock" was actually "the leader
> will have to wait if it's merged all tuples from a worker and no more
> are immediately available" then I think that's a pretty odd way to
> make that point. To me, the first statement appears to be false,
> while the second is obviously true.
Okay. Sorry for being unclear.
> OK, well I'm not taking any position on whether what Heikki is
> proposing will turn out to be good from a performance point of view.
> My intuitions about sorting performance haven't turned out to be
> particularly good in the past. I'm only saying that if you do decide
> to queue the tuples passing from worker to leader in a shm_mq, you
> shouldn't read from the shm_mq objects in a round robin, but rather
> read multiple tuples at a time from the same worker whenever that is
> possible without blocking. If you read tuples from workers one at a
> time, you get a lot of context-switch thrashing, because the worker
> wakes up, writes one tuple (or even part of a tuple) and immediately
> goes back to sleep. Then it shortly thereafter does it again.
> Whereas if you drain the worker's whole queue each time you read from
> that queue, then the worker can wake up, refill it completely, and go
> back to sleep again. So you incur fewer context switches for the same
> amount of work. Or at least, that's how it worked out for Gather, and
> what I am saying is that it will probably work out the same way for
> sorting, if somebody chooses to try to implement this.
Maybe. This would involve overlapping multiple worker merges, run in
parallel, with a single leader merge. I don't think that would work
out all that well. There would be a trade-off around disk bandwidth
utilization too, so I'm particularly doubtful that the complexity
would pay for itself. But, of course, I too have been wrong about
sorting performance characteristics in the past, so I can't be sure.
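
For what it's worth, here is a rough sketch of the leader-side loop I
understand you to be describing: drain each worker's queue completely
with non-blocking reads, and only sleep on the leader's latch once
every queue has come up empty. drain_worker_queues() and the
consume_tuple callback are names I just made up; shm_mq_receive() and
the latch calls are the real interfaces, but this isn't compile-tested
and isn't meant to be actual tuplesort code:

#include "postgres.h"

#include "miscadmin.h"
#include "storage/latch.h"
#include "storage/shm_mq.h"

/*
 * Sketch only.  Drain each worker's queue completely using
 * non-blocking reads, sleeping only once every queue has come up
 * empty.  consume_tuple() stands in for whatever the leader does with
 * each tuple it pulls off a queue.
 */
static void
drain_worker_queues(shm_mq_handle **handles, int nworkers,
                    void (*consume_tuple) (void *data, Size len))
{
    int         active = nworkers;

    while (active > 0)
    {
        bool        got_any = false;
        int         i;

        for (i = 0; i < nworkers; i++)
        {
            if (handles[i] == NULL)
                continue;       /* this worker already detached */

            for (;;)
            {
                void       *data;
                Size        len;
                shm_mq_result res;

                /* nowait = true: never block while draining */
                res = shm_mq_receive(handles[i], &len, &data, true);
                if (res == SHM_MQ_SUCCESS)
                {
                    consume_tuple(data, len);
                    got_any = true;
                    continue;   /* keep draining this same queue */
                }
                if (res == SHM_MQ_DETACHED)
                {
                    handles[i] = NULL;
                    active--;
                }
                break;          /* queue empty for now, or worker gone */
            }
        }

        if (!got_any && active > 0)
        {
            /* every queue was empty; sleep until a worker sets our latch */
            WaitLatch(MyLatch, WL_LATCH_SET, 0);
            ResetLatch(MyLatch);
            CHECK_FOR_INTERRUPTS();
        }
    }
}

The point being that the inner loop keeps pulling from the same queue
until it reports SHM_MQ_WOULD_BLOCK, so a worker that wakes up gets to
refill its whole queue before the leader looks at it again, which is
where the context-switch savings you describe would come from.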
--
Peter Geoghegan