From: | Richard Huxton <dev(at)archonet(dot)com> |
---|---|
To: | ARTEAGA Jose <Jose(dot)Arteaga(at)alcatel-lucent(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Subject: | Re: Limitations on 7.0.3? |
Date: | 2007-07-17 07:03:26 |
Message-ID: | 469C69BE.5020109@archonet.com |
Lists: | pgsql-general |
ARTEAGA Jose wrote:
> I have spent the last month battling and looking deeper into the issue,
> here's a summary of where I'm at:
> - Increasing shared buffers improved performance but did not resolve the
> backend FATAL disconnect error.
> - Dumping and recreating entire database also did not resolve the issue.
OK, so it's not a corrupted index/file then.
> - re-initializing the DB and recreating from the dump also did not
> resolve the issue.
> In both cases above the issue re-occurred within 2-3 days of run-time
> (insert of new records).
>
> I got the issue narrowed down to the point where I was able to re-create
> the issue at will by just inserting enough data, the data content did
> not matter. The issue always occurred while inserting into my
> "teststeprun" table, which is the largest of my tables (~15 Mill rows).
> The issue is that once I got this table to a certain size, then the
> backend system would crash.
>
> Since I was able to reproduce, I then decided to analyze the core dumps.
> Looking at the core dumps I immediately began to see a pattern, even the
> same pattern was there from the initial core dumps I had when the problem
> began occurring back two months ago. In every case the dump indicated
> the last instruction was always in the call to tag_hash(). I also
> noticed that each time the values passed to tag_hash which are used to
> generate the key were just below the 32-bit max value, and tag_hash
> should be returning a uint32 value. Now I'm really suspecting that there
> is some issue with this. Below are the traces of the four core dumps
> which point to the issue I'm suspecting.
I think tag_hash (in /backend/utils/hash/hashfn.c) is responsible for
internal hash-tables (rather than hash indexes). It takes a pointer to a
key to hash and a keysize (in bytes), so either the pointer is bad or
the size is too long and it's reading off the end.
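To illustrate the pointer-plus-size contract I mean, here is a minimal
sketch. It is NOT the real tag_hash code: DemoTag, demo_tag_init,
demo_tag_hash and demo_bucket are hypothetical names, and the hash itself is
plain FNV-1a standing in for whatever hashfn.c actually does. The point is
only that the function trusts the caller's keysize and reads exactly that
many bytes from the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout for illustration; not PostgreSQL's real tag. */
typedef struct DemoTag {
    uint32_t rel_id;
    uint32_t block_num;
} DemoTag;

/* Zero the whole struct first so any padding bytes hash consistently. */
void demo_tag_init(DemoTag *tag, uint32_t rel_id, uint32_t block_num)
{
    memset(tag, 0, sizeof(*tag));
    tag->rel_id = rel_id;
    tag->block_num = block_num;
}

/*
 * Stand-in for tag_hash (assumption: FNV-1a, not the real algorithm).
 * Reads exactly keysize bytes starting at key. If key is a bad pointer,
 * or keysize is larger than the object behind it, the loop reads memory
 * it does not own, which is the failure mode discussed above.
 */
uint32_t demo_tag_hash(const void *key, size_t keysize)
{
    const unsigned char *p = (const unsigned char *) key;
    uint32_t h = 2166136261u;      /* FNV offset basis */
    size_t i;

    assert(key != NULL || keysize == 0);
    for (i = 0; i < keysize; i++) {
        h ^= p[i];
        h *= 16777619u;            /* FNV prime; wraps modulo 2^32 */
    }
    return h;
}

/* Map a tag to a bucket; the size comes from the type, not the caller. */
uint32_t demo_bucket(const DemoTag *tag, uint32_t nbuckets)
{
    return demo_tag_hash(tag, sizeof(*tag)) % nbuckets;
}
```

Key values close to the 32-bit maximum are not a problem in themselves here,
since unsigned arithmetic simply wraps. In this sketch only a wrong pointer
or a wrong size can make the hash function fault.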
At the other end of your call, _bt_insertonpg
(/backend/access/nbtree/nbtinsert.c) is inserting into a btree index. In
one case it's splitting the index page it tries to insert into (because
it's full) but not in the others.
If it's not a hardware-related problem, then it's a bug, but you're
unlikely to get a fix given how old the code is. If an upgrade to 8.2
looks like it will take a lot of effort, perhaps consider an
intermediate upgrade to 7.2 - I think schemas were introduced in 7.3 so
before that should be easier.
There is a chance that you might reduce the problem by REINDEXing the
table concerned every night. That's just a guess though, and your real
solution will be to upgrade to something more recent.
--
Richard Huxton
Archonet Ltd