From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-01-03 15:12:37
Message-ID: CA+TgmoZYB2xDi76w5EnuZHYtpyEEoq_Kt5c0rXPZ2XZ-MYhtxw@mail.gmail.com
Lists: pgsql-hackers
On Wed, Jan 3, 2024 at 9:59 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Say we have a document with an array of 1M objects, each with a field
> called "color". As it stands we'll allocate space for that field name 1M
> times. Using a hash table we'd allocate space for it once. And
> allocating the memory isn't free, although it might be cheaper than
> doing hash lookups.
>
> I guess we can benchmark it and see what the performance impact of using
> a hash table might be.
>
> Another possibility would be simply to have the callback free the field
> name after use. For the parse_manifest code that could be a one-line
> addition to the code at the bottom of json_object_manifest_field_start().
Yeah. So I'm arguing that allocating the memory each time and then
freeing it sounds cheaper than looking it up in the hash table every
time, discovering it's there, and thus skipping the allocate/free.
I might be wrong about that. It's just that allocating and freeing a
small chunk of memory should boil down to popping it off of a linked
list and then pushing it back on. And that sounds cheaper than hashing
the string and looking for it in a hash bucket.
--
Robert Haas
EDB: http://www.enterprisedb.com