Re: perf problem with huge table

From: Leo Mannhart <leo(dot)mannhart(at)beecom(dot)ch>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: perf problem with huge table
Date: 2010-02-11 14:39:01
Message-ID: 4B741685.5070700@beecom.ch
Lists: pgsql-performance

Dave Crooke wrote:
> On Wed, Feb 10, 2010 at 5:30 PM, Jon Lewison <jlewison1(at)gmail(dot)com
> <mailto:jlewison1(at)gmail(dot)com>> wrote:
>
>
>
> Just a nit, but Oracle implements MVCC. 90% of the databases out
> there do.
>
>
> Sorry, I spoke imprecisely. What I meant was the difference in how the
> rows are stored internally .... in Oracle, the main tablespace contains
> only the newest version of a row, which is (where possible) updated in
> place - queries in a transaction that can still "see" an older version
> have to pull it from the UNDO tablespace (rollback segments in Oracle 8
> and older).
>
> In Postgres, all versions of all rows are in the main table, and have
> validity ranges associated with them ("this version of this row existed
> between transaction ids x and y"). Once a version goes out of scope, it
> has to be garbage collected by the vacuuming process so the space can be
> re-used.
>
> In general, this means Oracle is faster *if* you're only doing lots of
> small transactions (consider how these different models handle an update
> to a single field in a single row) but it is more sensitive to the scale
> of transactions .... doing a really big transaction against a database
> with an OLTP workload can upset Oracle's digestion as it causes a lot of
> UNDO lookups; PG's performance is a lot more predictable in this regard.
>
> Both models have benefits and drawbacks ... when designing a schema for
> performance it's important to understand these differences.
>
>
> I find partitioning pretty useful in this scenario if the data
> allows it. Aging out data just means dropping a partition rather
> than a delete statement.
>
>
> Forgot to say this - yes, absolutely agree .... dropping a table is a
> lot cheaper than a transactional delete.
>
> In general, I think partitioning is more important / beneficial with
> PG's style of MVCC than with Oracle or SQL-Server (which I think is
> closer to Oracle than PG).

I would like to disagree here a little bit.

Where Oracle's table partitioning comes in very handy is, for example,
when you have to replace the data of a big (read-only) table on a
regular basis (typically data replicated from another system).
In this case, you just create a partitioned table with exactly the same
columns/indexes etc. as the data table.

To load, you then load the data into the partitioned table (see the
sketch below), i.e.:
- truncate the partitioned table, disable constraints, drop indexes
- load the data into the partitioned table
- rebuild all indexes etc. on the partitioned table
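
A rough sketch of that load phase in Oracle SQL; STAGE_T, DATA_T and
the index/constraint names are made up for illustration, they are not
from any real schema:

  -- STAGE_T: partitioned staging table with a single partition P_CUR,
  -- mirroring the live table DATA_T column for column
  TRUNCATE TABLE stage_t;
  ALTER TABLE stage_t DISABLE CONSTRAINT stage_t_pk;
  DROP INDEX stage_t_ix1;

  -- direct-path bulk load; could equally be SQL*Loader or an external table
  INSERT /*+ APPEND */ INTO stage_t SELECT * FROM source_table;
  COMMIT;

  -- rebuild indexes and constraints on the staging table
  CREATE INDEX stage_t_ix1 ON stage_t (some_col) LOCAL;
  ALTER TABLE stage_t ENABLE CONSTRAINT stage_t_pk;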

During all this time (even if it takes hours) the application can still
access the data in the data table without being affected by the bulk load.

Once you have prepared the data in the partitioned table, you
- exchange the partition with the data table
which is a (data) dictionary operation; that means the application is
blocked (if at all) only during this operation, which completes in the
sub-second range.
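
In Oracle syntax, the exchange itself could look roughly like this
(again with the made-up names from the sketch above):

  -- swap the freshly loaded partition's segment with the live table's
  -- segment; a metadata-only operation, no data is copied
  ALTER TABLE stage_t EXCHANGE PARTITION p_cur WITH TABLE data_t
    INCLUDING INDEXES WITHOUT VALIDATION;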

If you had to do this with conventional updates or deletes/inserts
instead, it might not even be possible in the given timeframe.

just as an example
Leo

p.s. just to make it a little bit clearer about the famous ORA-01555:
Oracle is not "forgetting" the data, as the Oracle RDBMS is of course
also ACID-compliant. The ORA-01555 can happen:

- when the rollback tablespace is really too small to hold all the data
changed in the transaction (which I consider a configuration error)

- when a long-running (read) transaction needs a record which has
already been updated AND COMMITTED by another transaction. The key
here is that a second transaction has changed a record which is also
needed by the first transaction, and the second transaction has
committed its work. Committing the change means the data in the rollback
segment is no longer needed, as it can be read directly from the data
block (after all, it is committed, and that means valid and visible to
other transactions). If the first transaction now tries to read the data
from the rollback segment to see the unchanged state, it will still
succeed (it is still there; nothing has happened to the rollback segment
so far).

The problem of the ORA-01555 shows up only if a third transaction now
needs space in the rollback segment. As the entry from the first/second
transaction is marked committed (and therefore no longer needed), it is
perfectly valid for transaction #3 to grab this rollback segment and
store its old value there. If THEN (and only then) transaction #1 comes
back, asking for the old, unchanged value as of when its transaction
started, THEN the famous ORA-01555 is raised, as this value has now been
overwritten by transaction #3.
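
A timeline sketch of that second case (pseudo-session SQL, purely
illustrative; table and column names are invented):

  -- Session 1: long-running reader, snapshot taken at time T0
  SET TRANSACTION READ ONLY;
  SELECT * FROM big_table;             -- still scanning...

  -- Session 2: updates and commits a row the reader has not reached yet
  UPDATE big_table SET amount = 0 WHERE id = 42;
  COMMIT;                              -- its undo entry is now expendable

  -- Session 3: needs undo space and overwrites session 2's expired entry

  -- Session 1, on finally reaching id = 42, cannot reconstruct the
  -- T0 version of the row and fails with:
  -- ORA-01555: snapshot too old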
That's why in newer versions you have to set the retention time of the
rollback blocks/segments to a value bigger than your expected longest
transaction. This will decrease the likelihood of the ORA-01555
drastically (but it is still not zero, as you could easily construct an
example where it still fails with ORA-01555: a transaction can still run
longer than you expected, or the changed data can be bigger than the
whole rollback tablespace).
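
For illustration, the retention could be set like this (the value is
hypothetical and would have to be tuned to your longest-running
transaction):

  -- UNDO_RETENTION is specified in seconds; 10800 = 3 hours
  ALTER SYSTEM SET UNDO_RETENTION = 10800;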

>
>
> Cheers
> Dave
>
>
>
