Re: Threads

From: Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To: shridhar_daithankar(at)persistent(dot)co(dot)in
Cc: "<PGHackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Threads
Date: 2003-01-07 16:06:12
Message-ID: 1041955572.17639.148.camel@mouse.copelandconsulting.net
Lists: pgsql-hackers

On Tue, 2003-01-07 at 02:00, Shridhar Daithankar wrote:
> On 6 Jan 2003 at 6:48, Greg Copeland wrote:
> > > 1) Get I/O time used fruitfully
> > AIO may address this without the need for integrated threading.
> > Arguably, from the long thread that last appeared on the topic of AIO,
> > some hold that AIO doesn't even offer anything beyond the current
> > implementation. As such, it's highly doubtful that integrated threading
> > is going to offer anything beyond what a sound AIO implementation can
> > achieve.
>
> Either way, a complete aio or threading implementation is not available on
> the major platforms that postgresql runs on. Linux definitely does not have
> one, last I checked.
>

There are two or three significant AIO implementation efforts currently
underway for Linux. One such implementation is available from the Red
Hat Server Edition (IIRC) and has been available for some time now. I
believe Oracle is using it. SGI also has an effort and I forget where
the other one comes from. Nonetheless, I believe it's going to be a
hard-fought battle to get AIO implemented, simply because I don't think
anyone can yet truly argue a case on gain vs. effort.

> If postgresql is not using aio or threading, we should start using one of them,
> is what I feel. What do you say?
>

I did originally say that I'd like to see an AIO implementation. Then
again, I don't currently have a position to stand on other than simply
saying it *might* perform better. ;) Not exactly a position that's going
to win the masses over.

> > was expecting something that we could actually pull the trigger with.
>
> That could be done.
>

I'm sure it can, but that's probably the easiest item to address.

> >
> > o Code isn't very portable. Looked fairly okay for pthread platforms,
> > however, there is new emphasis on the Win32 platform. I think it would
> > be a mistake to introduce something as significant as threading without
> > addressing Win32 from the get-go.
>
> If you search for "pthread" in thread.c, there are not many instances. Same
> goes for thread.h. From what I understand of windows threading, it would be
> less than a 10 minute job to #ifdef the pthread related part in either file.
>
> It is just that I have not played with windows threading, nor am I inclined
> to...;-)
>

Well, the method above is going to create a semi-ugly mess. I've
written thread abstraction layers which cover OS/2, NT, and pthreads.
Each has subtle distinctions. What really needs to be done is the
creation of another abstraction layer which your current code would sit
on top of. That way, everything contained within is clear and easy to
read. The big bonus is that as additional threading implementations
need to be added, only the "low-level" abstraction stuff needs to be
modified. Done properly, each thread implementation would be its own
module requiring little #if clutter.

As you can see, that's a fair amount of work and far from where the code
currently is.
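
A minimal sketch of the kind of layer described above, assuming a
pthreads backend only. Every thr_* name here is hypothetical and is not
taken from the submitted thread.c/thread.h. The idea is that callers see
only the portable names, while the backend detail lives in one module;
an NT or OS/2 backend would be a separate file exposing the same
functions.

```c
/* Hypothetical portable thread layer: callers use only the thr_* names. */
#include <assert.h>
#include <stdlib.h>
#include <pthread.h>   /* the only backend-specific include in this sketch */

typedef void (*thr_func)(void *arg);   /* portable entry-point signature */

typedef struct thr_thread {
    pthread_t handle;   /* backend detail, hidden from callers */
    thr_func  func;
    void     *arg;
} thr_thread;

/* Adapts the portable signature to the one pthreads expects. */
static void *thr_trampoline(void *p)
{
    thr_thread *t = p;
    t->func(t->arg);
    return NULL;
}

/* Starts func(arg) in a new thread. Returns a handle, or NULL on failure. */
thr_thread *thr_create(thr_func func, void *arg)
{
    thr_thread *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->func = func;
    t->arg = arg;
    if (pthread_create(&t->handle, NULL, thr_trampoline, t) != 0) {
        free(t);
        return NULL;
    }
    return t;
}

/* Waits for the thread to finish and frees the handle. 0 on success. */
int thr_join(thr_thread *t)
{
    int rc;
    assert(t != NULL);
    rc = pthread_join(t->handle, NULL);
    free(t);
    return rc == 0 ? 0 : -1;
}

/* Example work function, used only to exercise the layer. */
void thr_demo_add_one(void *arg)
{
    int *n = arg;
    *n += 1;
}
```

A caller would write thr_create(thr_demo_add_one, &n) followed by
thr_join(), and never mention pthread_t. This covers only create and
join; the synchronization primitives discussed elsewhere in this thread
would need the same treatment.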

> >
> > o I would desire a more highly abstracted/portable interface which
> > allows for different threading and synchronization primitives to be
> > used. Current implementation is tightly coupled to pthreads.
> > Furthermore, on platforms such as Solaris, I would hope it would easily
> > allow for plugging in its native threading primitives which are touted
> > to be much more efficient than pthreads on said platform.
>
> Same as above. If there can be two cases separated with #ifdef, there can be
> more.. But what is important is to have a thread that can be woken up as and
> when required with any function desired. That is the basic idea.
>

Again, there's a lot of work in creating a well-formed abstraction layer
for all of the mechanics that are required. Furthermore, different
thread implementations have slightly different semantics, which further
complicates things. Worse, some types of primitives are simply not
available with some thread implementations. That means those platforms
require them to be written from the primitives that are available on the
platform. Yet more work.
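
As an illustration of building a missing primitive from the ones that
are available, here is a sketch of a counting semaphore made from a
mutex and a condition variable. It assumes pthreads as the underlying
implementation; the thr_sem names are hypothetical and error handling is
kept to a minimum.

```c
/* Hypothetical counting semaphore built from a mutex and a condition. */
#include <assert.h>
#include <pthread.h>

typedef struct thr_sem {
    pthread_mutex_t lock;      /* protects count */
    pthread_cond_t  nonzero;   /* signaled when count goes above zero */
    unsigned        count;
} thr_sem;

/* Initializes the semaphore with a starting count. 0 on success. */
int thr_sem_init(thr_sem *s, unsigned initial)
{
    assert(s != NULL);
    if (pthread_mutex_init(&s->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&s->nonzero, NULL) != 0) {
        pthread_mutex_destroy(&s->lock);
        return -1;
    }
    s->count = initial;
    return 0;
}

/* Blocks until the count is positive, then decrements it. */
void thr_sem_wait(thr_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Non-blocking variant: 0 if acquired, -1 if the count was zero. */
int thr_sem_trywait(thr_sem *s)
{
    int rc = -1;
    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        rc = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

/* Increments the count and wakes one waiter, if any. */
void thr_sem_post(thr_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}

/* Releases the underlying primitives. No thread may be waiting. */
void thr_sem_destroy(thr_sem *s)
{
    pthread_cond_destroy(&s->nonzero);
    pthread_mutex_destroy(&s->lock);
}
```

Each platform lacking a native primitive would need something similar
written and tested against that platform's own semantics.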

> > o Code is fairly trivial and does not address other primitives
> > (semaphores, mutexes, conditions, TSS, etc) portably which would be
> > required for anything but the most trivial of threaded work. This is
> > especially true in such an application where data IS the application.
> > As such, you must reasonably assume that threads need some form of
> > portable serialization primitives, not to mention mechanisms for
> > non-trivial communication.
>
> I don't get this. Probably I should post a working example. It is not the
> thread's responsibility to make a function that is changed on the fly thread
> safe. The function has to make sure that it is thread safe. That is an
> altogether different effort..

You're right, it's not the thread's responsibility; however, it is the
threading toolkit's. In this case, you're offering to be the toolkit
which functions across two platforms, just for starters. Reasonably,
you should expect a third to quickly follow.

>
> > o Does not address issues such as thread signaling or status reporting.
>
> From what I learnt from pthreads on linux, I would not mix threads and signals.
> One can easily add code in runner function that disables any signals for thread
> while the thread starts running. This would leave original signal handling
> mechanism in place.
>
> As far as status reporting is concerned, the thread should be initiated when
> the back-end starts and terminated with backend termination. What about
> status reporting?
>
> > o Pool interface is rather simplistic. Does not currently support
> > concepts such as wake pool, stop pool, pool status, assigning a pool to
> > work, etc. In fact, it's not altogether obvious what the intended
> > capabilities of the current pool implementation are.
>
> Could you please elaborate? I am using same interface in c++ for a server
> application and never faced a problem like that..;-)
>
>
> > o Doesn't seem to address any form of thread communication facilities
> > (mailboxes, queues, etc).
>
> Not part of this abstraction of threading mechanism. Intentionally left out to
> keep things clean.
>
> > There are probably other things that I can find if I spend more than
> > just a couple of minutes looking at the code. Honestly, I love threads
> > but I can see that the current code offering is not much more than a
> > token in its current form. No offense meant.
>
> None taken. Point is it is useful and that is enough for me. If you could
> elaborate examples for any problems you see, I can probably modify it. (Code
> documentation is what I will do now)
>
> > After it's all said and done, I'd have to see a lot more meat before I'd
> > be convinced that threading is ready for PostgreSQL; from both a social
> > and technological perspective.
>
> Tell me about it..
>

Long story short, if PostgreSQL is to use threads, it shouldn't be
handicapped by having a very limited subset of functionality. With the
code as currently submitted, I don't believe you could even
effectively implement a parallel sort.
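
For reference, a sketch of roughly the smallest parallel sort, written
directly against pthreads rather than the submitted code. It shows what
even this simple case needs: handing a work item to a thread, waiting
for its completion, and only then merging. The parallel_sort and
sort_job names are hypothetical.

```c
/* Hypothetical two-way parallel sort: one helper thread plus the caller. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct sort_job {
    int   *base;   /* start of the slice to sort */
    size_t n;      /* number of elements in the slice */
} sort_job;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Thread entry point: sorts one slice in place. */
static void *sort_worker(void *p)
{
    sort_job *job = p;
    qsort(job->base, job->n, sizeof(int), cmp_int);
    return NULL;
}

/* Sorts a[0..n) ascending. Returns 0 on success, -1 on failure. */
int parallel_sort(int *a, size_t n)
{
    size_t mid = n / 2, i = 0, j = mid, k = 0;
    sort_job left = { a, mid };
    sort_job right = { a + mid, n - mid };
    pthread_t helper;
    int *tmp;

    if (n < 2)
        return 0;
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* The helper sorts the left half while this thread sorts the right. */
    if (pthread_create(&helper, NULL, sort_worker, &left) != 0) {
        free(tmp);
        return -1;
    }
    sort_worker(&right);

    /* The merge must not start until the helper has finished. */
    if (pthread_join(helper, NULL) != 0) {
        free(tmp);
        return -1;
    }

    while (i < mid && j < n)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *tmp);
    free(tmp);
    return 0;
}
```

A version that scales past two threads would additionally want a work
queue and a way to report per-thread failure, which is where the
communication and status facilities listed earlier come in.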

To get an idea of the types of things that would be needed, check out
the ACE Toolkit. There are a couple of other fairly popular toolkits as
well. Nonetheless, it's a significant effort and the current code is a
long ways off from being usable.

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

In response to

  • Re: Threads at 2003-01-07 08:00:05 from Shridhar Daithankar