Re: Air-traffic benchmark

From: Craig James <craig_james(at)emolecules(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Lefteris <lsidir(at)gmail(dot)com>, Jochen Erwied <jochen(at)pgsql-performance(dot)erwied(dot)eu>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Air-traffic benchmark
Date: 2010-01-07 16:31:25
Message-ID: 4B460C5D.7080307@emolecules.com
Lists: pgsql-performance

Alvaro Herrera wrote:
> No amount of tinkering is going to change the fact that a seqscan is the
> fastest way to execute these queries. Even if you got it to be all in
> memory, it would still be much slower than the other systems which, I
> gather, are using columnar storage and thus are perfectly suited to this
> problem (unlike Postgres). The talk about "compression ratios" caught
> me by surprise until I realized it was columnar stuff. There's no way
> you can get such high ratios on a regular, row-oriented storage.

One of the "good tricks" with Postgres is to convert a very wide table into a set of narrow tables, then use a view to create something that looks like the original table. It requires you to modify the write portions of your app, but the read portions can stay the same.

A seq scan on one column will be *much* faster when you rearrange your database this way, since it's only scanning relevant data. You pay the price of an extra join on primary keys, though.
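
As a rough illustration of the narrow-tables-plus-view layout described above, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it runs without a server; the DDL has the same shape in Postgres. The table, column, and function names (flight_carrier, flight_delay, flights, add_flight) are hypothetical and do not come from the benchmark schema.

```python
import sqlite3

# Hypothetical schema: one "wide" flights table split into two narrow tables
# that share the primary key flight_id. All names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flight_carrier (flight_id INTEGER PRIMARY KEY, carrier TEXT);
    CREATE TABLE flight_delay   (flight_id INTEGER PRIMARY KEY, dep_delay INTEGER);

    -- The view joins on the primary key, so readers still see one wide table.
    CREATE VIEW flights AS
        SELECT c.flight_id, c.carrier, d.dep_delay
        FROM flight_carrier c
        JOIN flight_delay d ON d.flight_id = c.flight_id;
""")


def add_flight(flight_id, carrier, dep_delay):
    """Write path: one logical row becomes one insert per narrow table."""
    with conn:  # both inserts commit together or not at all
        conn.execute("INSERT INTO flight_carrier VALUES (?, ?)", (flight_id, carrier))
        conn.execute("INSERT INTO flight_delay VALUES (?, ?)", (flight_id, dep_delay))


add_flight(1, "AA", 5)
add_flight(2, "UA", 30)
add_flight(3, "AA", 12)

# Read path: existing queries can keep using the view like the old table.
wide_rows = conn.execute(
    "SELECT flight_id, carrier, dep_delay FROM flights ORDER BY flight_id"
).fetchall()

# A single-column aggregate can go straight to the narrow table, so the
# scan only touches that column's data and skips the join.
avg_delay = conn.execute("SELECT AVG(dep_delay) FROM flight_delay").fetchone()[0]
```

Queries that go through the view pay the primary-key join; the aggregate at the end reads the narrow table directly, which is one way to get the cheaper scan.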

If you have just a few columns in a very wide table that are seq-scanned a lot, you can pull out just those columns and leave the rest in the wide table.

The same trick is also useful if you have one or a few columns that are updated frequently: pull them out, and use a view to recreate the original appearance. It saves a lot on garbage collection (vacuuming of dead row versions), since each update rewrites only a narrow row.
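
The frequently-updated-column case might look like the sketch below, again using Python's built-in sqlite3 purely so it runs standalone. The names (item_static, item_hits, item, record_hit) are made up for illustration. In Postgres the point is that an UPDATE writes a new version of the row, so a row holding just a key and a counter leaves far less dead data behind than a wide one; sqlite's storage differs, so this shows only the schema shape, not the savings.

```python
import sqlite3

# Hypothetical example: a rarely-changing wide part plus a hot counter column
# moved into its own narrow table. All names are illustrative only.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE item_static (
        item_id     INTEGER PRIMARY KEY,
        name        TEXT,
        description TEXT
    );
    CREATE TABLE item_hits (
        item_id INTEGER PRIMARY KEY REFERENCES item_static(item_id),
        hits    INTEGER NOT NULL DEFAULT 0
    );

    -- The view recreates the original appearance for readers.
    CREATE VIEW item AS
        SELECT s.item_id, s.name, s.description, h.hits
        FROM item_static s
        JOIN item_hits h ON h.item_id = s.item_id;
""")

with db:
    db.execute("INSERT INTO item_static VALUES (1, 'widget', 'a long description')")
    db.execute("INSERT INTO item_hits (item_id) VALUES (1)")


def record_hit(item_id):
    """Frequent update: only the narrow (item_id, hits) row is rewritten."""
    with db:
        db.execute("UPDATE item_hits SET hits = hits + 1 WHERE item_id = ?", (item_id,))


for _ in range(3):
    record_hit(1)

item_row = db.execute("SELECT item_id, name, description, hits FROM item").fetchone()
```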

Craig
