From: | Dorian Hoxha <dorian(dot)hoxha(at)gmail(dot)com> |
---|---|
To: | Robin <robinstc(at)live(dot)co(dot)uk> |
Cc: | PostgreSql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Could use some advice on search architecture |
Date: | 2014-04-19 12:23:54 |
Message-ID: | CANsFX045_b24aP2HzU+9jH-TJWpr9F7R_8rg6Wnv7iK30V-ySQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Postgresql has 2 column store, 1-in memory(cant remember the name) and
http://www.citusdata.com/blog/76-postgresql-columnar-store-for-analytics
On Sat, Apr 19, 2014 at 2:10 PM, Robin <robinstc(at)live(dot)co(dot)uk> wrote:
> bottom post
> On 19/04/2014 12:46, R. Pasch wrote:
>
> On 19-4-2014 9:38, Robin wrote:
>
>
> Well, given that there are known limited attributes, this is the type of
> application that really really suits a column oriented database, such as
> Sybase IQ (now sold by SAP). Its a neat product that scales. Great
> performance with drag'n'drop analytics.
>
> Unless you can charm IQ out of SAP (it has been known to happen), you
> might have to look at some other techniques
>
> So consider some binary data representation
> Red - 1 (0000 0001)
> Orange - 2 (0000 0010)
> Yellow - 4 (0000 0100)
> Green - 8 (0000 1000)
> Blue - 16 (0001 0000)
> Indigo - 32 (0010 0000)
> Violet - 64 (0100 0000)
>
> This way, you can encode several colours in 1 value
> Red or Green or Indigo = 1 + 8 + 32 = 41 = 0010 1001
>
>
> Robin
>
>
> I stopped reading when I heard the word "sold by SAP" ;-) This project is
> solely build with open-source and freely available software.
>
> I've been thinking about using a binary data representation but didn't
> come to a solution to this specific problem quite yet. Per property of a
> product, only one bit would be 1 and the rest would be 0. What would a
> query look like to match all products that have a bit in the correct
> position?
>
> Say for instance these are a couple records (and yes, property values can
> be null as well)
>
> title, property1, property2, property3
> ================================
> product1, 0000 0001, 0000 0010, NULL
> product2, 0000 0100, 0100 0000, 0010 0000
> product3, 0010 0000, 0010 0000, 0100 0000
>
> Say that I would like to retrieve the products that either have property1
> as 0010 0000, 1000 000 or 0000 0001. Combined that would be 0010 1001 and
> would have to match product1 and product3 as they both have their
> individual bit matching one of the bits being asked for. What would a where
> statement look like using this type of binary representation?
>
> If that would be fairly simple to do and fast (most important factor) then
> I could do an OR construction on all property columns and have something
> count the amount of properties that actually matched. Is that something you
> can do with a binary operator of some sort as well? Count the amount of
> overlapping bits?
>
> Say for instance I have a binary value of 0110 0101 and another binary
> value of 1100 0100, how could I found out how many bits matched? (in this
> case the number of matching bits would be 2)
>
>
> I understand the reluctance to pay SAP-style rates, as a longtime DB user,
> I have learned some 'charm' techniques.
>
> However, I poked around a bit for alternatives, as I do like the
> column-oriented approach, and found something called - *MonetDB *<http://www.monetdb.com/Home/Features>-
> it apparently has a column-store db kernel, and is open source - I suggest
> you have a look, if it does what it says on the label, then it looks like a
> find.
>
> There is a discussion of bitmask-trickiness here also dealing with colours<http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>
>
> Robin
> <http://postgresql.1045698.n5.nabble.com/Bitmask-trickiness-td1865258.html>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Andy Colson | 2014-04-19 15:04:16 | Re: Could use some advice on search architecture |
Previous Message | Robin | 2014-04-19 12:10:48 | Re: Could use some advice on search architecture |