| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | David Hinkle <hinkle(at)cipafilter(dot)com> |
| Cc: | "pgsql-general(at)postgresql(dot)org >> PG-General Mailing List" <pgsql-general(at)postgresql(dot)org> |
| Subject: | Re: Bad planning data resulting in OOM killing of postgres |
| Date: | 2017-02-13 17:54:10 |
| Message-ID: | CAFj8pRCQ+kiqyUU9Zh8UnAoSt9n0dLY0OgrszyBk9sj3ByiKVw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
Hi
2017-02-13 18:40 GMT+01:00 David Hinkle <hinkle(at)cipafilter(dot)com>:
> I'm having trouble with purges related to a large table. The delete
> query consumes ram until postgres crashes due to OOM. I have a very
> large table called log_raw. There are half a dozen related tables,
> such as 'urls' and 'titles'. log_raw.urlid = urls.urlid and urls
> contains the text of the various urls, for example.
>
> Each time I try to purge these side tables the unit OOM's.
>
> psql:postgres(at)cipafilter = explain DELETE FROM titles WHERE NOT EXISTS
> ( SELECT 1 FROM log_raw WHERE log_raw.titleid = titles.titleid );
> QUERY PLAN
> ────────────────────────────────────────────────────────────
> ─────────────────────────────
> Delete on titles (cost=22166473.44..24850954.67 rows=870382 width=12)
> -> Hash Anti Join (cost=22166473.44..24850954.67 rows=870382 width=12)
> Hash Cond: (titles.titleid = log_raw.titleid)
> -> Seq Scan on titles (cost=0.00..17871.64 rows=870664 width=10)
> -> Hash (cost=12744792.64..12744792.64 rows=542011264 width=10)
> -> Seq Scan on log_raw (cost=0.00..12744792.64
> rows=542011264 width=10)
> (6 rows)
>
>
what is your work_mem setting?
Regards
Pavel
> psql:postgres(at)cipafilter = select count(*) from (select titleid from
> log_raw group by titleid) as a;
> count
> ────────
> 872210
> (1 row)
>
> cipafilter=# select n_distinct from pg_stats where tablename =
> 'log_raw' and attname = 'titleid';
> n_distinct
> ------------
> 282
> (1 row)
>
> The planning data is wildly low for each of these fields, and I wonder
> if because of that error the planner thinks it can keep all these id's
> in ram while it works. Analyze doesn't fix it. Increasing the
> statistics target improves the data in n_distinct but not
> considerably, as increasing it 3 or 4 fold leads to it still being
> wildly off. ALTER TABLE set n_distinct doesn't seem to be used by the
> planner as it doesn't change any of the plans I've generated or seem
> to be taken into account in the row estimates. I'm out of ideas.
> Anybody have any ideas?
>
>
> --
> David Hinkle
>
> Senior Software Developer
>
> Phone: 800.243.3729x3000
>
> Email: hinkle(at)cipafilter(dot)com
>
> Hours: Mon-Fri 8:00AM-5:00PM (CT)
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | John R Pierce | 2017-02-13 17:59:17 | Re: Auto-Rollback option |
| Previous Message | David Hinkle | 2017-02-13 17:40:28 | Bad planning data resulting in OOM killing of postgres |