From: | scott(at)deltaex(dot)com |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Cc: | scott(at)deltaex(dot)com |
Subject: | BUG #14769: Logical replication error "cache lookup failed for type 0" |
Date: | 2017-08-02 23:59:27 |
Message-ID: | 20170802235927.8423.44538@wrigleys.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 14769
Logged by: Scott Milliken
Email address: scott(at)deltaex(dot)com
PostgreSQL version: 10beta2
Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu
Description:
I'm testing postgresql 10beta2, and found a bug with logical replication.
The bug halts logical replication, and produces the error "cache lookup
failed for type 0" in the secondary's log.
The error can be reproduced by dropping a column in a replicated table. When
a column is dropped, it zeroes out attypeid in pg_attribute:
```
test=# select * from pg_attribute where attrelid =
'public.testtbl'::regclass and atttypid = 0 order by attnum;
-[ RECORD 1 ]-+-----------------------------
attrelid | 16386
attname | ........pg.dropped.2........
atttypid | 0
attstattarget | 0
attlen | -1
attnum | 2
attndims | 0
attcacheoff | -1
atttypmod | -1
attbyval | f
attstorage | x
attalign | i
attnotnull | f
atthasdef | f
attidentity |
attisdropped | t
attislocal | t
attinhcount | 0
attcollation | 100
attacl | ¤
attoptions | ¤
attfdwoptions | ¤
```
Perhaps there's a missing check against pg_attribute.attisdropped
somewhere?
Here's full reproduction steps:
Primary:
```
mkdir -p /tmp/test0
PGPORT=5301 initdb -D /tmp/test0
echo "wal_level = logical" >> /tmp/test0/postgresql.conf
PGPORT=5301 pg_ctl -D /tmp/test0 start
for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi; sleep
1; done
psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER
CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
createdb -p 5301 -E utf8 test
psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);"
psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
testtbl_pkey PRIMARY KEY (id);"
psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE
testtbl;"
psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1,
'a');"
# continue with "Secondary" instructions
psql -p 5301 -U test test -c "UPDATE testtbl SET name = 'b';"
# continue with "Secondary" instructions
psql -p 5301 -U test test -c "ALTER TABLE testtbl DROP COLUMN name;"
psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD COLUMN name2 text;"
psql -p 5301 -U test test -c "UPDATE testtbl SET name2 = 'b';"
# continue with "Secondary" instructions
```
Secondary:
```
mkdir -p /tmp/test1
PGPORT=5302 initdb -D /tmp/test1
PGPORT=5302 pg_ctl -D /tmp/test1 start
for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi; sleep
1; done
psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER
CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
createdb -p 5302 -E utf8 test
psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);"
psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
testtbl_pkey PRIMARY KEY (id);"
psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub CONNECTION
'port=5301 user=test dbname=test' PUBLICATION testpub;"
psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # OK
# continue with "Primary" instructions
psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # OK
psql -p 5302 -U test test -c "ALTER TABLE testtbl DROP COLUMN name;"
psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD COLUMN name2 text;"
# continue with "Primary" instructions
psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # not OK, doesn't
receive update, errors in log
```
Here's the log:
```
2017-07-23 22:42:28.719 PDT [7706] LOG: logical replication apply worker
for subscription "test0_testpub" has started
2017-07-23 22:42:28.725 PDT [7706] ERROR: cache lookup failed for type 0
2017-07-23 22:42:28.725 PDT [7706] CONTEXT: processing remote data for
replication target relation "public.testtbl" column "id", remote type
integer, local type integer
2017-07-23 22:42:28.725 PDT [3359] LOG: worker process: logical replication
worker for subscription 16395 (PID 7706) exited with exit code 1
```
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2017-08-03 00:59:50 | Re: BUG #14758: Segfault with logical replication on a function index |
Previous Message | Peter Eisentraut | 2017-08-02 18:48:09 | Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values |