From: | Benjamin Arai <me(at)benjaminarai(dot)com> |
---|---|
To: | Joshua D(dot)Drake <jd(at)commandprompt(dot)com> |
Cc: | Brandon Shalton <brandon(at)cydataservices(dot)com>, PostgreSQL <pgsql-general(at)postgresql(dot)org>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views |
Date: | 2007-08-25 18:12:54 |
Message-ID: | 29A8F3CE-03F9-4E92-A354-6F4FC3C810F4@benjaminarai.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-performance |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
As stated in the previous email if I use partitioning then queries
will be executed sequentially - i.e., instead of log(n) it would be
(# partitions) * log(n). Right?
Benjamin
On Aug 25, 2007, at 9:18 AM, Joshua D. Drake wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Benjamin Arai wrote:
>> This kind of disappointing, I was hoping there was more that could
>> be done.
>>
>> There has to be another way to do incremental indexing without
>> loosing
>> that much performance.
>
> What makes you think you are loosing performance by using
> partitioning?
>
> Joshua D. Drake
>
>>
>> Benjamin
>>
>> On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>
>>> Brandon Shalton wrote:
>>>> Benjamin,
>>>>
>>>>
>>>>>
>>>>> In order to avoid the re-indexing I was thinking of instead
>>>>> creating
>>>>> a new
>>>>> table each month (building its indexes and etc) and accessing
>>>>> them all
>>>>> through a view. This way I only have to index the new data each
>>>>> month.
>>>>>
>>>>
>>>> Take a look at bizgres.org (based on postgres).
>>>>
>>>> They have a parent-child structure.
>>>>
>>>> The way i use it, is I have about 30M records a day that are
>>>> inserted
>>>> into the database.
>>>>
>>>> Each day is a "child" table to the "parent".
>>>>
>>>> so example:
>>>>
>>>> the parent table is called "logfile"
>>>>
>>>> each day, is a child, with the structure like "logfile_YYMMDD"
>>>>
>>>> the "child" inherits the table structure of the parent, such
>>>> that you
>>>> could query the child table name directly, or you run the query
>>>> against
>>>> the parent (ie. logfile table) and get all the data.
>>>>
>>>> the indexes are done on a per table basis, so new data that
>>>> comes in, is
>>>> a lesser amount, and doesn't require re-indexing.
>>
>>
>>> PostgreSQL can do all of this too.
>>
>>> Sincerely,
>>
>>> Joshua D. Drake
>>
>>>>
>>>>
>>>> example:
>>>>
>>>> select * from logfile_070825 where datafield = 'foo'
>>>>
>>>> if i knew i wanted to specifically go into that child, or:
>>>>
>>>> select * from logfile where datafield = 'foo'
>>>>
>>>> and all child tables are searched and results merged. You can
>>>> perform
>>>> any kind of sql query and field structures are you normally do.
>>>>
>>>> the downside is that the queries are run sequentially.
>>>>
>>>> so if you had 100 child tables, each table is queried via
>>>> indexes, then
>>>> results are merged.
>>>>
>>>> but, this approach does allow me to dump alot of data in,
>>>> without having
>>>> the re-indexing issues you are facing.
>>>>
>>>> at some point, you could roll up the days, in to weekly child
>>>> tables,
>>>> then monthly tables, etc.
>>>>
>>>> I believe Bizgres has a new version of their system that does
>>>> parallel
>>>> queries which would certainly speed things up.
>>>>
>>>> For your documents, you can do it by the day it was checked in,
>>>> or maybe
>>>> you have some other way of logically grouping, but the parent/child
>>>> table structure really helped to solve my problem of adding in
>>>> millions
>>>> of records each day.
>>>>
>>>> The closest thing in mysql is using merge tables, which is not
>>>> really
>>>> practical when it comes time to do the joins to the tables.
>>>>
>>>> -brandon
>>>>
>>>> http://www.t3report.com - marketing intelligence for online
>>>> marketing
>>>> and affiliate programs
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------(end of
>>>> broadcast)---------------------------
>>>> TIP 6: explain analyze is your friend
>>>>
>>
>>
>>> - --
>>
>>> === The PostgreSQL Company: Command Prompt, Inc. ===
>>> Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
>>> PostgreSQL solutions since 1997 http://www.commandprompt.com/
>>> UNIQUE NOT NULL
>>> Donate to the PostgreSQL Project: http://www.postgresql.org/about/
>>> donate
>>> PostgreSQL Replication: http://www.commandprompt.com/products/
>>
>>>
>
>
> - ---------------------------(end of
> broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
>
>
> - --
>
> === The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
> PostgreSQL solutions since 1997 http://www.commandprompt.com/
> UNIQUE NOT NULL
> Donate to the PostgreSQL Project: http://www.postgresql.org/about/
> donate
> PostgreSQL Replication: http://www.commandprompt.com/products/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFG0FZKATb/zqfZUUQRAsfRAJ4mjQP+1ltG7pqLFQ+Ru52LA5e7XACcDqKr
> PIihth2x3gx3qTEI8WfWNjo=
> =AhJx
> -----END PGP SIGNATURE-----
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)
iQIVAwUBRtBxK/yqRf6YpodNAQKVkhAAgF4DaXeMxplX1EUXZMuw9aqr+75NxNcp
ZOCJPSFN0jwzY3MlFCRVjL1kzXmRJB4L3fE2xVQX9reY62TPfYC8m/xatey1X6nc
RdfNb9IzL6OyAghcpnUnwYntQtmGRpJtS7LQrx/SiDz8LWIp2S5v3Q9S8alKNTUS
FupCNy1bL3yJf9tySSvol6JSH2edVt8f48J1j03f5B9zh+G/rKrQ+muuKOHyU3mb
cVJ+gbSWCesuo+9rfaJ24m2ODwZm/YA+ENhlc3EOvD8z+cYn2OjuvAqvHABRsEKe
+E9NWBPK/7UT4/T4B/LcBW1B6VISFqyETkwe2fhY5kVZnF+f0KtQIxXh/9qMsnnh
tWthI9YmG4MIBmCsJwdneABHdfMJDp8IlawXqMlX4VkPHUrUtiQV/oDNsHMrU8BM
SZOK5m0ADgXk0rndkEWXhERsyuFaocFj+snvaJEVH9PJSDVgjo7EMW5Qfo6p3NFg
ujBurhLaSuj52vClbdOs3lYp0Drbuf9iQnot3pD4XsCKAOTQm3S7BvgKMd5FUHLX
HBFn4KiSRGx7hwlrss4rjqJ8BoJKbtvGxyNSiwZkrAOke+gqEML6pPdvlAj3Dif8
KrsKcEu/cuR8euqX9IYCZIw4GYLqgs3mewfQIt5bSfw3yHvFyOgolyUeYfnYYlbr
+u145pL2KZc=
=T4dg
-----END PGP SIGNATURE-----
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2007-08-25 18:18:33 | Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views |
Previous Message | Shane Ambler | 2007-08-25 17:31:27 | Re: Bigtime scaling of Postgresql (cluster and stuff I suppose) |
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2007-08-25 18:18:33 | Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views |
Previous Message | Luke Lonergan | 2007-08-25 16:56:42 | FW: was [PERFORM] partitioned table and ORDER BY indexed_field DESC LIMIT 1 |