| From: | Paul Ramsey <pramsey(at)cleverelephant(dot)ca> | 
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Datum values consistency within one query | 
| Date: | 2020-04-02 22:48:28 | 
| Message-ID: | CACowWR3JEgEQmWJNbRK6UyPcMHdsa8UHKW7i_OTLMfv05JaV2w@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Imagine a function that was going to take a table of, say, images, and
so something to them over and over, like:
  SELECT pixel_count(img), clr_name, img_name
  FROM images img
  CROSS JOIN colors clr
When you run this function, you find that a great amount of time is
being spend in the decompression/detoasting routines, so you think: I
have a nested loop here, driven on the 'img' side, if I can avoid
re-loading the big image object over and over I can make things
faster.
Getting the datum value is really fast, so I can have a cache that
keeps the latest detoasted object around, and update it when the datum
changes, and store the cache information in the parent context. Like
so:
struct {
    Datum d;
    bytea *ba;
} DatumCache;
PG_FUNCTION_INFO_V1(pixel_count);
Datum pixel_count(PG_FUNCTION_ARGS)
{
    Datum d = PG_GETARG_DATUM(0);
    DatumCache *dcache = fcinfo->flinfo->fn_extra;
    bytea *ba;
    if (!dcache)
    {
        dcache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
sizeof(DatumCache));
        fcinfo->flinfo->fn_extra = dcache;
    }
    if (dcache->d != d)
    {
        if (dcache->ba) pfree(dcache->ba);
        MemoryContext old_context =
MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
        dcache->ba = PG_GETARG_BYTEA_P_COPY(0);
        MemoryContextSwitchTo(old_context);
    }
ba = dcache->ba;
    /* now do things with ba here */
}
Now, notwithstanding any concerns about the particularities of my
example (I've found order-of-magnitude improvements on PostGIS
workloads avoiding the detoasting overhead this way) is my core
assumption correct: within the context of a single SQL statement, will
the Datum values for a particular object remain constant?
They *seem* to, in the examples I'm running. But do they always?
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2020-04-02 23:18:29 | Re: [PATCH] Fix for slow GIN index queries when "gin_fuzzy_search_limit" setting is relatively small for large tables | 
| Previous Message | Justin Pryzby | 2020-04-02 22:26:39 | Re: Add A Glossary |