Re: PL/R etc.

From:	Mark Morgan Lloyd <markMLl(dot)pgsql-general(at)telemetry(dot)co(dot)uk>
To:	pgsql-general(at)PostgreSQL(dot)org
Subject:	Re: PL/R etc.
Date:	2013-05-10 20:22:45
Message-ID:	kmjkun$vdp$1@pye-srv-01.telemetry.co.uk
Lists:	pgsql-general

Merlin Moncure wrote:
> On Fri, May 10, 2013 at 2:32 PM, Mark Morgan Lloyd
> <markMLl(dot)pgsql-general(at)telemetry(dot)co(dot)uk> wrote:
>> Merlin Moncure wrote:
>>> On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd
>>> <markMLl(dot)pgsql-general(at)telemetry(dot)co(dot)uk> wrote:
>>>> I don't know whether anybody active on the list has R (and in particular
>>>> PL/R) experience, but just in case... :-)
>>>>
>>>> i) Something like APL can operate on an array with minimal regard for
>>>> index order, i.e. operations across the array are as easily-expressed and
>>>> as efficient as operations down the array. Does this apply to PL/R?
>>>>
>>>> ii) Things like OpenOffice can be very inefficient if operating over a
>>>> table comprising a non-trivial number of rows. Does PL/R offer a
>>>> significant improvement, e.g. by using a cursor rather than trying to
>>>> read an entire resultset into memory?
>>>
>>> pl/r (via R) is very terse and expressive. it will probably meet or beat
>>> any performance expectations you have coming from openoffice. that
>>> said, it's definitely a memory bound language; typically problem
>>> solving involves stuffing data into huge data frames which are then
>>> passed to the high level problem solving functions like glm.
>>>
>>> you have full access to sql within the pl/r function, so nothing is
>>> keeping you from paging data into the frame via a cursor, but that
>>> only helps so much.
>>>
>>> a lot depends on the specific problem you solve of course.
>>
>> Thanks Merlin and Joe. As an occasional APL user "terse and oppressive"
>> doesn't really bother me :-)
>>
>> As a particular example of the sort of thing I'm thinking, using "pure" SQL
>> the operation of summing the columns in each row and summing the rows in
>> each column are very different.
>>
>> In contrast, in APL if I have an array
>>
>> B
>> 1 2 3 4
>> 5 6 7 8
>> 9 10 11 12
>>
>> I can perform a reduction operation using + over whichever axis I specify:
>>
>> +/[1]B
>> 15 18 21 24
>> +/[2]B
>> 10 26 42
>>
>> or even by default
>>
>> +/B
>> 10 26 42
>>
>> Does PL/R provide that sort of abstraction in a uniform fashion?
>
> certainly (for example see here:
> http://stackoverflow.com/questions/13352180/sum-different-columns-in-a-data-frame)
> -- getting good at R can take some time but it's worth it. R is
> "hot" right now with all the buzz around big data lately. The main
> challenge actually is the language is so rich it can be difficult to
> zero in on the precise behaviors you need. Also, the documentation
> is all over the place.
>
> pl/r plays in nicely because with some thought you can marry the R
> analysis functions directly to the query in terms of both inputs and
> outputs -- basically very, very sweet syntax sugar. It's a little
> capricious though (and be advised: Joe has put up some very important
> and necessary fixes quite recently) so usually I work out the R code
> in the R console first before putting it in the database.

[Peruse] Thanks, I think I get the general idea. I'm aware of the
significance of R, and in particular that it's attracting attention
because hiding functionality in spreadsheets, which usurped APL for
certain types of operation, is undesirable.
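
For concreteness, the reduction over either axis shown in the APL example
quoted above can be sketched as follows. This is only an illustration in
Python, not R or PL/R code, and the helper name reduce_axis is hypothetical.
In R itself the same results should come from colSums(B) and rowSums(B), or
more generally apply(B, 2, sum) and apply(B, 1, sum).

```python
from functools import reduce
from operator import add


def reduce_axis(op, rows, axis):
    """Reduce a rectangular list of rows with a binary op along one axis.

    The axis is 1-based to mirror the APL notation: axis 1 combines
    down the rows (one result per column), axis 2 combines across the
    columns (one result per row).
    """
    if axis == 1:
        # zip(*rows) regroups the values column by column.
        return [reduce(op, col) for col in zip(*rows)]
    if axis == 2:
        return [reduce(op, row) for row in rows]
    raise ValueError("axis must be 1 or 2")


# The array B from the APL example.
B = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

col_sums = reduce_axis(add, B, 1)  # corresponds to +/[1]B
row_sums = reduce_axis(add, B, 2)  # corresponds to +/[2]B

print(col_sums)  # [15, 18, 21, 24]
print(row_sums)  # [10, 26, 42]
```

The point of the sketch is that the operator and the axis are both ordinary
arguments, so swapping add for another binary function (operator.mul, max)
changes the reduction without changing its shape.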

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

In response to

Re: PL/R etc. at 2013-05-10 19:46:15 from Merlin Moncure
