Re: Does people favor to have matrix data type?

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Joe Conway <mail(at)joeconway(dot)com>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Does people favor to have matrix data type?
Date: 2016-05-31 02:05:40
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8011F7E14@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> -----Original Message-----
> From: Gavin Flower [mailto:GavinFlower(at)archidevsys(dot)co(dot)nz]
> Sent: Tuesday, May 31, 2016 9:47 AM
> To: Kaigai Kouhei(海外 浩平); Joe Conway; Jim Nasby; Ants Aasma; Simon Riggs
> Cc: pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Does people favor to have matrix data type?
>
> On 31/05/16 12:01, Kouhei Kaigai wrote:
> >> On 05/29/2016 04:55 PM, Kouhei Kaigai wrote:
> >>> For the closer integration, it may be valuable if PL/R and PL/CUDA can exchange
> >>> the data structure with no serialization/de-serialization when PL/R code tries
> >>> to call SQL functions.
> >> I had been thinking about something similar. Maybe PL/R can create an
> >> extension within the R environment that wraps PL/CUDA directly or at the
> >> least provides a way to use a fast-path call. We should probably try to
> >> start out with one common use case to see how it might work and how much
> >> benefit there might be.
> >>
> > My thought is the second option above. If SPI interface supports fast-path
> > like 'F' protocol, it may become a natural way for other PLs also to
> > integrate SQL functions in other languages.
> >
> >>> IIUC, pg.spi.exec("SELECT my_function(...)") is the only way to call SQL functions
> >>> inside PL/R scripts.
> >>
> >> Correct (currently).
> >>
> >> BTW, this is starting to drift off topic I think -- perhaps we should
> >> continue off list?
> >>
> > Some elements are common to PostgreSQL (the matrix data type and the fastpath SPI
> > interface). I'd like to keep the discussion on the list.
> > As for the PoC on a particular use case, that might be an off-list
> > discussion.
> >
> > Thanks,
> > --
> > NEC Business Creation Division / PG-Strom Project
> > KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> >
> Possibly there should be two matrix types in Postgres: the first would
> be the default and optimized for small dense matrices, the second would
> store large sparse matrices efficiently in memory at the expense of
> speed (possibly with one or more parameters relating to how sparse it is
> likely to be?) - for appropriate definitions of 'small' & 'large', though
> memory savings for the latter type might not kick in unless the matrices
> are big enough (so a small sparse matrix might consume more memory than
> a nominally larger dense matrix type & a sparse matrix might have to be
> sufficiently sparse to make real memory savings at any size).
>
One idea I have in mind is to represent a sparse matrix as a grid of
smaller matrices and to omit the all-zero areas. It works like an indirect
pointer reference: the header of the matrix type holds an offset to each
sub-matrix in the grid. If offset == 0, that sub-matrix contains only zeros.

For performance reasons, the location of each element must be computable
without walking the data structure. This approach guarantees that we can
reach any individual element in 2 steps.
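
The offset-table idea and the 2-step lookup can be sketched as below. This is
only an illustration in Python, not a proposed on-disk format: the header
layout (nrows, ncols, block_rows, block_cols), the 32-bit offsets, the float8
elements, and the names pack_matrix and get_element are all assumptions made
for the example. Offset 0 can serve as the "all-zero" marker because the
header itself occupies offset 0.

```python
import struct

# Hypothetical layout: header, then one offset per sub-matrix, then the
# non-zero sub-matrices stored densely in row-major order.
HEADER = struct.Struct("<IIII")  # nrows, ncols, block_rows, block_cols
OFFSET = struct.Struct("<I")     # byte offset of a sub-matrix, 0 = all-zero
ELEM = struct.Struct("<d")       # one float8 element


def pack_matrix(rows, block_rows, block_cols):
    """Serialize a list-of-lists matrix, omitting all-zero sub-matrices."""
    nrows, ncols = len(rows), len(rows[0])
    grid_rows = -(-nrows // block_rows)  # ceiling division
    grid_cols = -(-ncols // block_cols)
    base = HEADER.size + grid_rows * grid_cols * OFFSET.size
    offsets = []
    body = bytearray()
    for gi in range(grid_rows):
        for gj in range(grid_cols):
            block = []
            for r in range(gi * block_rows, (gi + 1) * block_rows):
                for c in range(gj * block_cols, (gj + 1) * block_cols):
                    # Pad sub-matrices on the right/bottom edge with zeros.
                    inside = r < nrows and c < ncols
                    block.append(rows[r][c] if inside else 0.0)
            if any(v != 0.0 for v in block):
                offsets.append(base + len(body))
                body += struct.pack("<%dd" % len(block), *block)
            else:
                offsets.append(0)  # omitted: sub-matrix is all-zero
    table = b"".join(OFFSET.pack(o) for o in offsets)
    return HEADER.pack(nrows, ncols, block_rows, block_cols) + table + bytes(body)


def get_element(buf, r, c):
    """Fetch element (r, c) without walking the structure."""
    nrows, ncols, block_rows, block_cols = HEADER.unpack_from(buf, 0)
    if not (0 <= r < nrows and 0 <= c < ncols):
        raise IndexError("element out of range")
    grid_cols = -(-ncols // block_cols)
    # Step 1: read the offset of the sub-matrix that holds (r, c).
    slot = (r // block_rows) * grid_cols + (c // block_cols)
    (offset,) = OFFSET.unpack_from(buf, HEADER.size + slot * OFFSET.size)
    if offset == 0:
        return 0.0
    # Step 2: read the element at its fixed position inside that sub-matrix.
    index = (r % block_rows) * block_cols + (c % block_cols)
    (value,) = ELEM.unpack_from(buf, offset + index * ELEM.size)
    return value
```

Both steps are plain arithmetic plus one read each, so the cost of reaching an
element does not depend on how many sub-matrices are stored.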

A flat matrix can be represented as a special case of the sparse matrix.
If the entire matrix consists of a 1x1 grid, it is a flat matrix.
So we may not need to define two separate data types.
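
To illustrate the special case: if "1x1 grid" is read as a single sub-matrix
covering the whole matrix (my interpretation for this sketch), the 2-step
address calculation collapses to the usual row-major index of a flat array.
The function name locate and its parameters are hypothetical.

```python
def locate(ncols, block_rows, block_cols, r, c):
    """Return (slot in the offset table, index inside that sub-matrix)."""
    grid_cols = -(-ncols // block_cols)  # ceiling division
    slot = (r // block_rows) * grid_cols + (c // block_cols)
    index = (r % block_rows) * block_cols + (c % block_cols)
    return slot, index


# A 3x5 matrix stored as one 3x5 sub-matrix, i.e. a 1x1 grid.
nrows, ncols = 3, 5
flat = [locate(ncols, nrows, ncols, r, c)
        for r in range(nrows) for c in range(ncols)]

# Every element lands in slot 0, and its index equals r * ncols + c.
assert flat == [(0, i) for i in range(nrows * ncols)]
```

With a single slot the offset table has one entry, so the same lookup code
would serve both the flat and the sparse representation.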

> Probably good to think of 2 types at the start, even if only the
> first is implemented initially.
>
I agree.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
