Re: Problem with 11 M records table

From: Craig James <craig_james(at)emolecules(dot)com>
To: idc danny <idcdanny(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Problem with 11 M records table
Date: 2008-05-13 17:57:08
Message-ID: 4829D674.1060801@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

idc danny wrote:
> Hi James,
>
> Than you for your response.
>
> What I want to achieve is to give to the application
> user 10k rows where the records are one after another
> in the table, and the application has a paginating GUI
> ("First page", "Previous page", "Next page", "Last
> page" - all links & "Jump to page" combobox) where
> thsi particular query gets to run if the user clicks
> on the "Last page" link.
> The application receive the first 10k rows in under a
> second when the user clicks on "First page" link and
> receive the last 10k rows in about 60 seconds when he
> clicks on "Last page" link.

You need a sequence that automatically assigns an ascending "my_rownum" to each row as it is added to the table, and an index on that my_rownum column. Then you select your page by (for example)

select * from my_table where my_rownum >= 100 and id < 110;

That will do what you want, with instant performance that's linear over your whole table.

If your table will have deletions, then you have to update the row numbering a lot, which will cause you terrible performance problems due to the nature of the UPDATE operation in Postgres. If this is the case, then you should keep a separate table just for numbering the rows, which is joined to your main table when you want to retrieve a "page" of data. When you delete data (which should be batched, since this will be expensive), then you truncate your rownum table, reset the sequence that generates your row numbers, then regenerate your row numbers with something like "insert into my_rownum_table (select id, nextval('my_rownum_seq') from my_big_table)". To retrieve a page, just do "select ... from my_table join my_rownum_table on (...)", which will be really fast since you'll have indexes on both tables.

Note that this method requires that you have a primary key, or at least a unique column, on your main table, so that you have something to join with your row-number table.

Craig

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2008-05-13 18:02:20 Re: Problem with 11 M records table
Previous Message Craig James 2008-05-13 17:17:04 Re: Problem with 11 M records table