From: | Alexy Khrabrov <deliverable(at)gmail(dot)com> |
---|---|
To: | Craig James <craig_james(at)emolecules(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: two memory-consuming postgres processes |
Date: | 2008-05-02 21:09:52 |
Message-ID: | C25E5DBA-B649-4483-B1A2-2A881522B891@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On May 2, 2008, at 2:02 PM, Craig James wrote:
> On Fri, May 2, 2008 at 2:26 PM, Alexy Khrabrov
> <deliverable(at)gmail(dot)com> wrote:
>> I naively thought that if I have a 100,000,000 row table, of the form
>> (integer,integer,smallint,date), and add a real coumn to it, it
>> will scroll
>> through the memory reasonably fast.
>
> In Postgres, an update is the same as a delete/insert. That means
> that changing the data in one column rewrites ALL of the columns for
> that row, and you end up with a table that's 50% dead space, which
> you then have to vacuum.
>
> Sometimes if you have a "volatile" column that goes with several
> "static" columns, you're far better off to create a second table for
> the volatile data, duplicating the primary key in both tables. In
> your case, it would mean the difference between 10^8 inserts of
> (int, float), very fast, compared to what you're doing now, which is
> 10^8 insert and 10^8 deletes of (int, int, smallint, date, float),
> followed by a big vacuum/analyze (also slow).
>
> The down side of this design is that later on, it requires a join to
> fetch all the data for each key.
>
> You do have a primary key on your data, right? Or some sort of index?
I created several indices for the primary table, yes. Sure I can do a
table for a volatile column, but then I'll have to create a new such
table for each derived column -- that's why I tried to add a column to
the existing table. Yet seeing this is really slow, and I need to to
many derived analyses like this -- which are later scanned in other
computations, so should persist -- I indeed see no other way but to
procreate derived tables with the same key, one column per each...
Cheers,
Alexy
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2008-05-02 21:23:45 | Re: two memory-consuming postgres processes |
Previous Message | Ernesto | 2008-05-02 21:07:58 | Very slow INFORMATION_SCHEMA |