Re: GSoC - Idea Discussion

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	hitesh ramani <hiteshramani(at)hotmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: GSoC - Idea Discussion
Date:	2015-03-19 21:31:19
Message-ID:	550B4027.3040207@2ndquadrant.com
Lists:	pgsql-hackers

On 03/19/15 21:41, hitesh ramani wrote:
> Hello Tomas,
>
>
> > Could you please elaborate more why to choose CUDA, a nvidia-only
> > technology, rather than OpenCL, supported by much wider range of
> > companies and projects? Why do you consider OpenCL unsuitable?
> >
> > Not that CUDA is bad - it certainly works better in some scenarios, but
> > this is a cost/benefits question, and it only works with devices
> > manufactured by a single company. That significantly limits the
> > usefulness of the work, IMHO.
>
>
> I will never say OpenCL is unsuitable, I just meant, as per the research
> I did, CUDA came out with better results. I do agree OpenCL is also a
> great tool to exploit the power of GPUs. My aim is to enhance the
> performance using CUDA, though OpenCL implementation might work great too!

My point was that using open standards and frameworks (OpenCL) has a much
higher chance of being welcomed by the community of open source
projects, compared to proprietary technologies like CUDA.

>
> > You mention that you ran into issues with PG Strom. What issues?
>
> While I was trying to compile, I ran into the error "src/main.c:27:29:
> fatal error: utils/ruleutils.h: No such file or directory". When I ran
> make against the branch of Postgres suggested in the description, i.e. the
> custom_join branch, I still ran into the same issue. Moreover, I
> couldn't locate the file.

That's strange, and you should probably ask the people on the PG Strom
project. I haven't tried PG Strom for a long time, but the compilation
worked fine some time ago.

>
> > Can we see some examples, what this actually means? What you can and
> > can't do at this point, etc.? Can you share some numbers how this
> > improves the performance?
>
> I did some benchmarking on quicksort for 1M random numbers (range 0 to
> 0xffffff) on GPU and CPU, the results showed enhancement of 700% on the GPU.

So you've created an array of 1M integers, and it's 7x faster on GPU
compared to pg_qsort(), correct?

Well, it might surprise you, but PostgreSQL almost never sorts numbers
like this. PostgreSQL sorts tuples, which is way more complicated and,
considering the variable length of tuples (causing issues with memory
access), rather unsuitable for GPU devices. I might be missing
something, of course.

Also, it often needs additional information, like collations when
sorting by a text field, for example.
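
To illustrate the difference, here is a minimal Python sketch (not
PostgreSQL code). The names Row, collation_key and compare_rows are
hypothetical, and the case-insensitive rule is only an assumed stand-in
for a real, locale-dependent collation. It shows that sorting rows means
comparing variable-length keys through a comparison rule, with the rest
of the row travelling along, rather than ordering a flat array of integers.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass
class Row:
    # Hypothetical stand-in for a tuple: a variable-length text key
    # plus a variable-length payload that moves together with it.
    name: str
    payload: bytes


def collation_key(text: str) -> str:
    # Assumption: a toy "collation" that ignores case. Real collations
    # depend on the locale and are considerably more involved.
    return text.casefold()


def compare_rows(a: Row, b: Row) -> int:
    # Three-way comparison (negative, zero, positive), the shape a
    # qsort-style comparator has to provide.
    ka, kb = collation_key(a.name), collation_key(b.name)
    return (ka > kb) - (ka < kb)


def sort_rows(rows):
    # Sort whole rows using the comparator, not just the bare key values.
    return sorted(rows, key=cmp_to_key(compare_rows))
```

With rows named "Banana", "apple" and "cherry", a plain byte-wise sort
of the names puts "Banana" first, while sort_rows puts "apple" first.
The comparison rule, not the raw bytes, decides the order.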

> What this means and what I can do at this point - My aim was to
> integrate CUDA with Postgres so that I can make a call to the GPU for
> the sorting operation. To start, I made a simple CUDA hello world program
> and edited the code to call it from qsort. I ran into name mangling
> issues, which I sorted out by creating 2 different .h files, one for the
> CUDA program and one for the call I made from qsort. Finally, I edited the
> makefile to compile the CUDA program as part of the Postgres build,
> so now when I compile my Postgres code, the CUDA file gets compiled too
> and prints the expected output on the server end.

Why don't you show us the source code? That would be simpler than
explaining what it does.

>
> What I still haven't done - I still haven't actually enhanced the
> sorting yet, I'm still analyzing the code, how to tinker with it, the
> right approach.
>

I'd recommend discussing the code here. It's certainly quite complex,
especially if this is your first encounter with it.

>
> > That's really difficult to judge, because you have not provided any
> > source code, examples or anything else to support this.
> >
> > >
> > > Please give in your valuable suggestions and views on this.
> >
> > From where I sit, this looks interesting, but more as a research
> > project than something that can be integrated into PostgreSQL in
> > the foreseeable future. Not sure that's what GSoC is intended for.
> >
> > Also, we badly need more details on this - current status, examples, and
> > especially project plan explaining the scope. It's impossible to say
> > whether the sort can be implemented within the GSoC time frame.
>
> What I actually see it as is a branch of Postgres which has
> CUDA-compatible features. I wanted to start it with sorting, which can

I find it very unlikely that this project will choose something that is
intended as a fork.

> further be improved. To be honest, I'm still analyzing the sort code
> for more than a million integer elements (in a single row, for
> now) so that the use of GPUs is actually significant. As I saw,
> Postgres uses external sort for that.

PostgreSQL uses adaptive sort - in-memory when it fits into work_mem,
on-disk when it does not. This is decided at runtime.

You'll have to do the same thing, because the amount of memory available
on GPUs is limited to a few GBs, and it needs to work for datasets
exceeding that limit (the amount of data is uncertain at planning time).
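
A rough Python sketch of that runtime decision follows. This is not how
tuplesort is implemented; adaptive_sort and budget are hypothetical
names, with budget standing in for work_mem (or GPU memory) counted in
items rather than bytes. Runs are kept in Python lists only to keep the
example short, where a real external sort would write them to disk.

```python
import heapq


def adaptive_sort(items, budget):
    """Sort items, buffering at most `budget` of them at any one time.

    Returns (sorted_list, mode), where mode is "in-memory" when the
    input fit into the budget and "external" when sorted runs had to
    be produced and merged.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")

    runs = []     # sorted runs; a real implementation would spill these to disk
    buffer = []   # the part currently held "in memory"

    for item in items:
        if len(buffer) == budget:
            # Buffer is full and more input is arriving: only now, at
            # run time, do we learn that the data does not fit.
            buffer.sort()
            runs.append(buffer)
            buffer = []
        buffer.append(item)

    if not runs:
        # Everything fit into the budget, so one in-memory sort is enough.
        buffer.sort()
        return buffer, "in-memory"

    # Flush the last partial run, then do a k-way merge of all runs.
    buffer.sort()
    runs.append(buffer)
    # Materialized as a list for simplicity; a real merge would stream.
    return list(heapq.merge(*runs)), "external"
```

The point of the sketch is that the choice between the two paths cannot
be made up front: the code only switches to the external path once the
input has actually overflowed the budget.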

>
> If you feel this isn't feasible in such a time span, I would love to
> hear any suggestion for a small function which could benefit from
> parallelism.

I honestly don't know.

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: GSoC - Idea Discussion at 2015-03-19 20:41:45 from hitesh ramani