Re: Could use some advice on search architecture

From: Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com>
To: Robin <robinstc(at)live(dot)co(dot)uk>
Cc: PostgreSql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Could use some advice on search architecture
Date: 2014-04-19 12:23:54
Message-ID: CANsFX045_b24aP2HzU+9jH-TJWpr9F7R_8rg6Wnv7iK30V-ySQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Postgresql has 2 column store, 1-in memory(cant remember the name) and
http://www.citusdata.com/blog/76-postgresql-columnar-store-for-analytics

On Sat, Apr 19, 2014 at 2:10 PM, Robin <robinstc(at)live(dot)co(dot)uk> wrote:

> bottom post
> On 19/04/2014 12:46, R. Pasch wrote:
>
> On 19-4-2014 9:38, Robin wrote:
>
>
> Well, given that there are known limited attributes, this is the type of
> application that really really suits a column oriented database, such as
> Sybase IQ (now sold by SAP). Its a neat product that scales. Great
> performance with drag'n'drop analytics.
>
> Unless you can charm IQ out of SAP (it has been known to happen), you
> might have to look at some other techniques
>
> So consider some binary data representation
> Red - 1 (0000 0001)
> Orange - 2 (0000 0010)
> Yellow - 4 (0000 0100)
> Green - 8 (0000 1000)
> Blue - 16 (0001 0000)
> Indigo - 32 (0010 0000)
> Violet - 64 (0100 0000)
>
> This way, you can encode several colours in 1 value
> Red or Green or Indigo = 1 + 8 + 32 = 41 = 0010 1001
>
>
> Robin
>
>
> I stopped reading when I heard the word "sold by SAP" ;-) This project is
> solely build with open-source and freely available software.
>
> I've been thinking about using a binary data representation but didn't
> come to a solution to this specific problem quite yet. Per property of a
> product, only one bit would be 1 and the rest would be 0. What would a
> query look like to match all products that have a bit in the correct
> position?
>
> Say for instance these are a couple records (and yes, property values can
> be null as well)
>
> title, property1, property2, property3
> ================================
> product1, 0000 0001, 0000 0010, NULL
> product2, 0000 0100, 0100 0000, 0010 0000
> product3, 0010 0000, 0010 0000, 0100 0000
>
> Say that I would like to retrieve the products that either have property1
> as 0010 0000, 1000 000 or 0000 0001. Combined that would be 0010 1001 and
> would have to match product1 and product3 as they both have their
> individual bit matching one of the bits being asked for. What would a where
> statement look like using this type of binary representation?
>
> If that would be fairly simple to do and fast (most important factor) then
> I could do an OR construction on all property columns and have something
> count the amount of properties that actually matched. Is that something you
> can do with a binary operator of some sort as well? Count the amount of
> overlapping bits?
>
> Say for instance I have a binary value of 0110 0101 and another binary
> value of 1100 0100, how could I found out how many bits matched? (in this
> case the number of matching bits would be 2)
>
>
> I understand the reluctance to pay SAP-style rates, as a longtime DB user,
> I have learned some 'charm' techniques.
>
> However, I poked around a bit for alternatives, as I do like the
> column-oriented approach, and found something called - *MonetDB *<http://www.monetdb.com/Home/Features>-
> it apparently has a column-store db kernel, and is open source - I suggest
> you have a look, if it does what it says on the label, then it looks like a
> find.
>
> There is a discussion of bitmask-trickiness here also dealing with colours<http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>
>
> Robin
> <http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andy Colson 2014-04-19 15:04:16 Re: Could use some advice on search architecture
Previous Message Robin 2014-04-19 12:10:48 Re: Could use some advice on search architecture