On Mon, Jun 23, 2014 at 4:36 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
> On Mon, Jun 23, 2014 at 2:13 PM, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com> wrote:
>
>> I am guessing that the bug origin is in pg_parse_json in the way nested
>> json is managed, it is the only code path of populate_recordset_worker
>> where a switch on JSON[B]OID is used.
>>
> Digging into that, I am seeing that the hash table used to find the field
> values queried in populate_recordset_object_end in the hash table
> (PopulateRecordsetState *)->json_hash has null entries for all the columns
> inserted before the last nested json value. For example in my last example
> with '[{"a":2,"b":3,"c":{"z":4}, "d":6}]', this results in having null
> values for "a" and "b", "c" and "d" remaining correct.
> populate_recordset_object_field_end inserts those values correctly within
> the hash table though, so something strange is going on when inserting in
> json_hash directly a json value.
>
Digging more into that, I have found the issue and a fix for it. It happens
that populate_recordset_object_start, which is used to initialize the
process for the population of the record, is taken *each* time a json
object is found, re-creating every time the hash table for the parsing
process, hence removing from PopulateRecordsetState all the entries already
parsed and creating the problem reported by Matti. The fix I am proposing
to fix this issue is rather simple: simply bypass the creation of the hash
table if lex_level > 1 as we are in presence of a nested object and rely on
the existing hash table.
Patch is attached, and should be backpatched to REL9_4_STABLE where
json_to_recordset has been introduced.
Regards,
--
Michael