Re: [GENERAL] Possible bug with row_to_json

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Jack Christensen <jack(at)jackchristensen(dot)com>
Subject: Re: [GENERAL] Possible bug with row_to_json
Date: 2013-08-06 14:23:35
Message-ID: 27771.1375799015@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Jack Christensen <jack(at)jackchristensen(dot)com> writes:
> jack=# create table player(
> jack(# player_id serial primary key,
> jack(# name varchar not null unique
> jack(# );
> CREATE TABLE
> jack=# insert into player(name) values('Jack');
> INSERT 0 1
> jack=# select row_to_json(t)
> jack-# from (
> jack(# select player_id as renamed, name
> jack(# from player
> jack(# order by name
> jack(# ) t;
> row_to_json
> -------------------------------
> {"player_id":1,"name":"Jack"}
> (1 row)

> It ignored the rename.

I looked into this and found that the culprit is the optimization that
skips ExecProject() if a scan plan node is not doing any useful
projection. In the plan for this query:

Subquery Scan on t (cost=0.15..81.98 rows=1230 width=60)
Output: row_to_json(t.*)
-> Index Scan using player_name_key on public.player (cost=0.15..66.60 rows=1230 width=36)
Output: player.player_id, player.name

the indexscan node decides it can just return its "scan tuple" slot
directly to the caller, rather than projecting into the "result tuple"
slot. The problem is that the scan tuple slot's tuple descriptor is that
of the table being scanned, and in particular it has the actual column
names of the underlying table, plus a tdtypeid matching the table's
rowtype OID. This info is what's looked at by ExecEvalWholeRowVar to form
a tuple datum, so the datum ends up looking like it has exactly the
table's rowtype, and then row_to_json is going to be given the table's
tuple descriptor when it asks lookup_rowtype_tupdesc for a tupdesc.

The attached quick-hack patch fixes the reported case. I *think* it's
safe as is, but I wonder whether we ought to install additional checks
to verify physical compatibility of the two tuple descriptors before
we decide it's okay to skip projection. Another potential issue is
that we're losing whatever benefit there may be in having tuple datums
marked with named rowtypes rather than RECORD --- I'm not sure there's
any meaningful performance difference, but I'm not sure there isn't,
either. We could be even more paranoid by choosing to only install
the other descriptor if there's actually a difference in column names;
but is it worth the extra complexity?

Comments?

regards, tom lane

Attachment Content-Type Size
install-correct-descriptor.patch text/x-diff 2.4 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2013-08-06 14:32:00 Re: postgre restarts in short period
Previous Message Adrian Klaver 2013-08-06 14:16:18 Re: postgre restarts in short period

Browse pgsql-hackers by date

  From Date Subject
Next Message 'Bruce Momjian' 2013-08-06 14:25:54 Re: File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Previous Message Amit Kapila 2013-08-06 13:00:18 Re: File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])