From: | David Hinkle <hinkle(at)cipafilter(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Bad planning data resulting in OOM killing of postgres |
Date: | 2017-02-16 20:22:29 |
Message-ID: | CACw4T0pNRus3gaxUu4wmsb+m4WER9YerqtmvPfmSRQ4WnF5TiQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Yep, 420ish million records out of 540 million records have a titleid
of 1. There are about 880,000 other unique values, but most of the
records are 1. Of course, n_distinct is only 292. I'm surprised
it's not eliminating the duplicates while it builds that hash table.
This is what I'm doing for a work around right now. Getting
n_distinct right seems to be preventing the system from breaking.
It's going to be executed once a week during the weekly maintenance.
It's setting the n_distinct of each column to the number of rows in
the associated table.
CREATE OR REPLACE FUNCTION patch_ndistinct(_table varchar, _column
varchar, _string_table varchar)
RETURNS real AS
$$
DECLARE _cnt REAL;
BEGIN
SELECT reltuples INTO _cnt from pg_class where relname = _string_table;
EXECUTE 'ALTER TABLE ' || _table || ' ALTER COLUMN ' || _column
|| ' SET (n_distinct=' || _cnt || ')';
RETURN _cnt;
END
$$ LANGUAGE plpgsql;
select patch_ndistinct('log_raw', 'titleid', 'titles');
select patch_ndistinct('log_raw', 'urlid', 'urls');
select patch_ndistinct('log_raw', 'hostid', 'hosts');
ANALYZE log_raw;
On Thu, Feb 16, 2017 at 10:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Hinkle <hinkle(at)cipafilter(dot)com> writes:
>> Tom, there are three columns in this table that exhibit the problem,
>> here is the statistics data after an analyze, and the real data to
>> compare it to.
>
>> attname | n_distinct | most_common_freqs
>
>> titleid | 292 | {0.767167}
>
> Ouch. That's saying there's some single value of titleid that accounts
> for more than three-quarters of the entries ... does that square with
> reality? That'd certainly explain why a hash join goes nuts.
>
> regards, tom lane
--
David Hinkle
Senior Software Developer
Phone: 800.243.3729x3000
Email: hinkle(at)cipafilter(dot)com
Hours: Mon-Fri 8:00AM-5:00PM (CT)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-02-16 22:39:31 | Re: Autovacuum stuck for hours, blocking queries |
Previous Message | Adrian Klaver | 2017-02-16 20:10:08 | Re: disk writes within a transaction |