Re: Bad planning data resulting in OOM killing of postgres

From: David Hinkle <hinkle(at)cipafilter(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bad planning data resulting in OOM killing of postgres
Date: 2017-02-13 22:44:32
Message-ID: CACw4T0onXPz5Wx7mvHS4Emnm75pW8AMo2j4FSK9+iVymSGKHGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I manually updated the pg_statistics data by literally set it to an
appropriate amount, and the planner picked a new plan and the new plan
worked. Any idea what I should do about this? Is manually updating
these values my best bet?

psql:daemon(at)cipafilter = update pg_statistic set stadistinct = 850000
where starelid = 16678;

psql:daemon(at)cipafilter = EXPLAIN (ANALYZE, BUFFERS) delete from titles
WHERE NOT EXISTS ( SELECT 1 FROM log_raw WHERE log_raw.titleid =
titles.titleid );
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delete on titles (cost=111982485.81..117653772.60 rows=26291
width=12) (actual time=726777.769..726777.769 rows=0 loops=1)
Buffers: shared hit=769918 read=7323762 dirtied=1, temp
read=1944017 written=2177366
-> Merge Anti Join (cost=111982485.81..117653772.60 rows=26291
width=12) (actual time=726777.766..726777.766 rows=0 loops=1)
Merge Cond: (titles.titleid = log_raw.titleid)
Buffers: shared hit=769918 read=7323762 dirtied=1, temp
read=1944017 written=2177366
-> Index Scan using titles_pkey on titles
(cost=0.42..44451.84 rows=876291 width=10) (actual time=0.006..316.375
rows=874829 loops=1)
Buffers: shared hit=769000 dirtied=1
-> Materialize (cost=109899462.67..112609518.99
rows=542011264 width=10) (actual time=545806.339..687770.786
rows=425297005 loops=1)
Buffers: shared hit=918 read=7323762, temp read=1944017
written=2177366
-> Sort (cost=109899462.67..111254490.83
rows=542011264 width=10) (actual time=545806.336..652402.716
rows=425297005 loops=1)
Sort Key: log_raw.titleid
Sort Method: external merge Disk: 11013224kB
Buffers: shared hit=918 read=7323762, temp
read=1944017 written=2177366
-> Seq Scan on log_raw (cost=0.00..12744792.64
rows=542011264 width=10) (actual time=0.003..274920.970 rows=544670242
loops=1)
Buffers: shared hit=918 read=7323762
Planning time: 0.158 ms
Execution time: 727533.213 ms
(17 rows)

On Mon, Feb 13, 2017 at 3:57 PM, David Hinkle <hinkle(at)cipafilter(dot)com> wrote:
> I managed to get this version to finish:
>
> psql:postgres(at)cipafilter = explain (ANALYZE, BUFFERS) select count(*)
> from (select titleid from log_raw group by titleid) as a;
> QUERY PLAN
> ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Aggregate (cost=14099827.15..14099827.16 rows=1 width=0) (actual
> time=248805.549..248805.549 rows=1 loops=1)
> Buffers: shared hit=598 read=7324082 dirtied=34
> -> HashAggregate (cost=14099820.80..14099823.62 rows=282 width=4)
> (actual time=248504.756..248760.382 rows=874750 loops=1)
> Group Key: log_raw.titleid
> Buffers: shared hit=598 read=7324082 dirtied=34
> -> Seq Scan on log_raw (cost=0.00..12744792.64
> rows=542011264 width=4) (actual time=0.002..145554.907 rows=544654818
> loops=1)
> Buffers: shared hit=598 read=7324082 dirtied=34
> Planning time: 0.072 ms
> Execution time: 248807.285 ms
> (9 rows)
>
> On Mon, Feb 13, 2017 at 3:47 PM, David Hinkle <hinkle(at)cipafilter(dot)com> wrote:
>> psql:postgres(at)cipafilter = EXPLAIN (ANALYZE, BUFFERS) select titleid
>> from titles WHERE NOT EXISTS ( SELECT 1 FROM log_raw WHERE
>> log_raw.titleid = titles.titleid );
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>>
>> Nope, that pops too. The query runs for a long time at a somewhat
>> normal rate of ram consumption, using ~1G of RSS then suddenly spikes
>> to about 6G, at which point the OOM killer pops it. Box has 8G of ram
>> and 4G of swap.
>>
>> On Mon, Feb 13, 2017 at 3:21 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>> On Mon, Feb 13, 2017 at 12:43 PM, David Hinkle <hinkle(at)cipafilter(dot)com>
>>> wrote:
>>>>
>>>> Thanks Jeff,
>>>>
>>>> No triggers or foreign key constrains:
>>>>
>>>> psql:postgres(at)cipafilter = \d+ titles
>>>> Table "public.titles"
>>>> Column │ Type │ Modifiers
>>>> │ Storage │ Stats target │ Description
>>>>
>>>> ─────────┼───────────────────┼──────────────────────────────────────────────────────────┼──────────┼──────────────┼─────────────
>>>> title │ character varying │
>>>> │ extended │ │
>>>> titleid │ integer │ not null default
>>>> nextval('titles_titleid_seq'::regclass) │ plain │ │
>>>> Indexes:
>>>> "titles_pkey" PRIMARY KEY, btree (titleid)
>>>> "titles_md5_title_idx" btree (md5(title::text))
>>>>
>>>> Do you see anything in there that would be problematic?
>>>
>>>
>>>
>>> I'm out of ideas here. What happens if you just select the rows, rather
>>> than deleting them? Does it have memory problems then? If not, can you
>>> post the explain (analyze, buffers) of doing that?
>>>
>>> Cheers,
>>>
>>> Jeff
>>
>>
>>
>> --
>> David Hinkle
>>
>> Senior Software Developer
>>
>> Phone: 800.243.3729x3000
>>
>> Email: hinkle(at)cipafilter(dot)com
>>
>> Hours: Mon-Fri 8:00AM-5:00PM (CT)
>
>
>
> --
> David Hinkle
>
> Senior Software Developer
>
> Phone: 800.243.3729x3000
>
> Email: hinkle(at)cipafilter(dot)com
>
> Hours: Mon-Fri 8:00AM-5:00PM (CT)

--
David Hinkle

Senior Software Developer

Phone: 800.243.3729x3000

Email: hinkle(at)cipafilter(dot)com

Hours: Mon-Fri 8:00AM-5:00PM (CT)

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Lynn Dobbs 2017-02-13 22:56:39 xmlelement AND timestamps.
Previous Message Thomas Kellerer 2017-02-13 22:26:27 Documentation inconsistency (at least to me)