From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why we lost Uber as a user |
Date: | 2016-07-26 22:19:16 |
Message-ID: | CA+TgmobFp2feymV6xgKgPs00xMEQ+LpO-+gGoGOJTGngKhUzgQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jul 26, 2016 at 6:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> To explain this in concrete terms, which the blog post does not:
>
>> 1. Create a small table, but one with enough rows that indexes make
>> sense (say 50,000 rows).
>
>> 2. Make this table used in JOINs all over your database.
>
>> 3. To support these JOINs, index most of the columns in the small table.
>
>> 4. Now, update that small table 500 times per second.
>
>> That's a recipe for runaway table bloat; VACUUM can't do much because
>> there's always some minutes-old transaction hanging around (and SNAPSHOT
>> TOO OLD doesn't really help, we're talking about minutes here), and
>> because of all of the indexes HOT isn't effective.
>
> Hm, I'm not following why this is a disaster. OK, you have circa 100%
> turnover of the table in the lifespan of the slower transactions, but I'd
> still expect vacuuming to be able to hold the bloat to some small integer
> multiple of the minimum possible table size. (And if the table is small,
> that's still small.) I suppose really long transactions (pg_dump?) could
> be pretty disastrous, but there are ways around that, like doing pg_dump
> on a slave.
>
> Or in short, this seems like an annoyance, not a time-for-a-new-database
> kind of problem.
I've seen multiple cases where this kind of thing causes a
sufficiently large performance regression that the system just can't
keep up. Things are OK when the table is freshly-loaded, but as soon
as somebody runs a query on any table in the cluster that lasts for a
minute or two, so much bloat accumulates that the performance drops to
an unacceptable level. This kind of thing certainly doesn't happen to
everybody, but equally certainly, this isn't the first time I've heard
of it being a problem. Sometimes, with careful tending and a very
aggressive autovacuum configuration, you can live with it, but it's
never a lot of fun.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2016-07-26 22:58:03 | Re: Why we lost Uber as a user |
Previous Message | Josh Berkus | 2016-07-26 22:17:38 | Re: Why we lost Uber as a user |