Re: Performance problem with row count trigger

From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: tonyceb(at)andrew(dot)cmu(dot)edu
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Performance problem with row count trigger
Date: 2009-04-06 14:29:28
Message-ID: 948fef7e99f6ecc30c0abd802f530fe4@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> A separate table just for that one column? Would that really help,
> given that I'd have to add the foreign key dataset_id to the related
> table? How does splitting activity across dataset and, say,
> dataset_counts help things?

Well, it reduce the size (and bloat) of the main table and let's you offload
the indexes as well. (especially important for pre-HOT systems). It's also
a win if there are many queries against the main table that *don't* hit the
summary count column. The cost of another table join for queries that *do* hit
it is probably offset by keeping the main table small and only updated when it
really needs to be. Still, it depends a lot on your particular circumstances;
the thread was started because of the pain of updating this one column, but
only you can make the final call about whether a separate table is a Good Idea
or a Silly Microoptimization.

>> 2) Do you really need bigint for the counts?

> Probably not. Still, changing to INTEGER hasn't changed the performance
> in any measurable way.

I was simply thinking about the amount of space used here, not speed. Of course,
if there is any conceivable way that the amounts in question would *ever* exceed
two billion, you should keep it BIGINT, as changing it later would be painful.

> I'm not sure I understand the danger you're talking about here. Doesn't
> putting the whole start_bulkload_assoc_count(), bulk insert, and
> end_bulkload_assoc_count() process in a transaction save me from any of
> these problems? Or is there some more subtle problem I'm not seeing?

No, that would work perfectly fine. The danger I was referring to was someone
calling the first function and then committing before they remembered
to call the second one. It pays to be paranoid around databases :), but if
you have control of the environment, and that scenario seems unlikely,
it should be fine the way it is.

Oh, and I just remembered that the end_.. function should be clearing the
temporary hash we build up - I think the version I emailed neglected to do that.
Wouldn't want those numbers to stick around in the session.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation
PGP Key: 0x14964AC8 200904061028
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAknaEdIACgkQvJuQZxSWSsgeeACfQRXopdyHdYoj5SLTiwedIYAc
bDUAoNvouyYtixHeXLowWqYr9Oc/jS/t
=sJ+s
-----END PGP SIGNATURE-----

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Peter Koczan 2009-04-06 16:42:29 Re: pl/pgsql or control structures outside of a function?
Previous Message Greg Sabino Mullane 2009-04-06 02:21:59 Re: How would I get rid of trailing blank line?