From: | Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Multicolumn index corruption on 8.4 beta 2 |
Date: | 2009-06-09 11:40:28 |
Message-ID: | 4A2E4A2C.3090207@je-eigen-domein.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Richard Huxton wrote:
> Not a hacker myself, but I can tell you that the first question you'll
> be asked is "can you produce a test case"? If you can generate the
> problem from a test table+generated data that will let people figure out
> the problem for you.
Unfortunately, I have not been able to produce a test case (yet) on a
small data set.
While the data in the database is public information, the whole database
is about 100 GB, and therefore kinda hard to share.
> If not, details of the table schema will be needed, and is there any
> pattern to the missed rows? Also - compile settings, character set and
> locale details might be relevant too.
==
Compile settings
==
No fancy settings.
- Clean Opensolaris 2009.06 installation
- Installed gcc and gmake packages.
- Downloaded source and did a ./configure --disable-readline ; gmake ;
gmake install
==
Postgresql settings
==
The following settings differ from the defaults:
--
shared_buffers=3500MB
maintenance_work_mem = 128MB
fsync = off
synchronous_commit = off
checkpoint_segments = 25
--
The locale used when creating the database is SQL_ASCII
==
Hardware
==
Tyan barebone
2x Opteron 2376 quadcore
32 GB reg ecc memory
1x Intel X25-E 32 GB SSD for OS and pg_xlog directory
2x Intel X25-E 64 GB SSD (ZFS striping) for the database
==
Table layout
==
--
Table "public.posts_index"
Column | Type |
Modifiers
------------+------------------------+-----------------------------------------------------------
cid | integer | not null default
nextval('posts_index_cid
_seq'::regclass)
groupid | integer | not null
startdate | integer | not null
poster | character varying(64) | not null
basefile | character varying(64) | not null
subject | character varying(255) | not null
size | real |
nfo | boolean |
c | boolean |
parts | integer |
totalparts | integer |
imdb | integer |
ng1 | boolean | default false
g2 | integer | default 0
g3 | integer | default 0
data | bytea |
Indexes:
"posts_index5_pkey" PRIMARY KEY, btree (cid) CLUSTER
"gr_idx" btree (groupid, (- cid))
"pgb_idx" btree (poster, groupid, basefile)
--
Only noticed problems with the pgb_idx index so far.
The problem only occurs on a subset of the rows, at a time.
After adding/updating rows and doing a reindex, the rows that were
missing before sometimes suddenly do work, but then different ones do not.
> And can you post an explain plan for the incorrect scan? In particular is it using a bitmap index scan or a regular index scan? Or does it happen with either?
Happens with both.
Index scan:
===
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11.25..11.26 rows=1 width=0)
-> Index Scan using pgb_idx on posts_index (cost=0.00..11.25
rows=1 width=0)
Index Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
0
===
When I disable index scan, it uses bitmap without luck:
==
=> set enable_indexscan=false;
SET
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11.26..11.27 rows=1 width=0)
-> Bitmap Heap Scan on posts_index (cost=7.24..11.26 rows=1 width=0)
Recheck Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
-> Bitmap Index Scan on pgb_idx (cost=0.00..7.24 rows=1 width=0)
Index Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
0
==
Sequential scan does find the row:
==
=> set enable_indexscan=false;
SET
=> set enable_bitmapscan=false;
SET
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=288153.28..288153.29 rows=1 width=0)
-> Seq Scan on posts_index (cost=0.00..288153.28 rows=1 width=0)
Filter: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
(3 rows)
=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
1
==
Yours sincerely,
Floris Bos
From | Date | Subject | |
---|---|---|---|
Next Message | Kedar Potdar | 2009-06-09 11:51:30 | Re: Patch for automating partitions in PostgreSQL 8.4 Beta 2 |
Previous Message | gj | 2009-06-09 10:44:45 | Re: Patch for automating partitions in PostgreSQL 8.4 Beta 2 |