Analytic Function Bug

From: Rumpi Gravenstein <rgravens(at)gmail(dot)com>
To: PostgreSQL <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Analytic Function Bug
Date: 2024-08-29 23:18:16
Message-ID: CAEpg1wDJU6DzUJxJdFAUZFzmw6bY+p17bXvCLupwsxc0V6vqrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Experts,

I am running on

PostgreSQL 14.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-22), 64-bit

I have the following query which returns what I expect:

with

d(logical_partition_key, model_usage) as ( values( 'TEST_DATA' ,
'F(T61)(EXPORT)' )

)

,

usg_txt as (

SELECT DISTINCT logical_partition_key,

MODEL_USAGE as usage_text,

REPLACE ( REPLACE ( REPLACE ( REPLACE ( PIU.MODEL_USAGE::text, '<t
context="USAGE_TEXT">', ''), '<t context="FCN_NAME_MODFR">', ''), '<t
context="FCN_USAGE_MODFR">', ''), '</t>', '') AS txt

FROM d /*CAO_CALLOUT_GHOST_COMB_W*/ piu

)

,

parse( logical_partition_key, usage_text,txt, rpo_txt, indx ) as

(

select d.logical_partition_key, d.usage_text,d.txt, coalesce(a.rpo[1],a.rpo[
2]) as rpo_txt, a.pos

from usg_txt d

left join lateral regexp_matches( txt, '([ ,\-()/&])|([^ ,\-()/&]+)','g')
with ordinality as a(rpo,pos) on true

)

,

prv_nxt_token( logical_partition_key, usage_text,txt, rpo_txt, indx,
mx_indx, prev,nxt,nxt2, prv2,prv3) as

(

/* Get prior and next token to support later logic */

select p.logical_partition_key, p.usage_text,

p.txt, p.rpo_txt, indx,

max( indx) over ( partition by p.txt ) mx_indx,

lag( p.rpo_txt,1 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prev,

lag( p.rpo_txt,-1 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) nxt,

lag( p.rpo_txt,-2) over ( partition by p.logical_partition_key,p.txt order
by indx ) nxt2,

lag( p.rpo_txt,2 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prv2,

lag( p.rpo_txt,3 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prv3

from parse p

)

select * from prv_nxt_token;

logical_partition_keyusage_texttxtrpo_txtindxmx_indxprevnxtnxt2prv2prv3
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "F" "1" "7" "[NULL]" "(" "T61"
"[NULL]" "[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "2" "7" "F" "T61" ")"
"[NULL]" "[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "T61" "3" "7" "(" ")" "(" "F"
"[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "4" "7" "T61" "(" "EXPORT"
"(" "F"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "5" "7" ")" "EXPORT" ")"
"T61" "("
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "EXPORT" "6" "7" "(" ")"
"[NULL]" ")" "T61"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "7" "7" "EXPORT" "[NULL]"
"[NULL]" "(" ")"

When I run the same statement, except this time using a large table instead
of a values statement, I get the wrong answer. The difference in the SQL
statement that produced the following data is that the large table (10M
records) and there is a closing where condition used to limit the result to
what is shown:

select * from prv_nxt_token;
is replaced by:

select * from prv_nxt_token where logical_partition_key='TEST_DATA' and
usage_text='F(T61)(EXPORT)';

Which returns:
logical_partition_keyusage_texttxtrpo_txtindxmx_indxprvnxtnxt2prv2prv3
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "F" "1" "7" *"F"* "(" "(" "[NULL]"
"[NULL]"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "2" "7" "F" "(" "T61" "F" "[NULL]"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "T61" "3" "7" *"T61"* ")" ")" "(" "("
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "4" "7" "T61" ")" "(" "T61" "("
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "5" "7" "*("* "EXPORT" "EXPORT" ")"
")"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "EXPORT" "6" "7" "(" "EXPORT" ")" "(" ")"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "7" "7" ")" "[NULL]" "[NULL]" "EXPORT"
"EXPORT"
Notice that the prv column (lag - 1 ) is just wrong. I've highlighted
obvious bad values. Other columns are wrong as well.

Is this a PostgreSQL bug?

--
Rumpi Gravenstein

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2024-08-30 00:07:13 Re: Analytic Function Bug
Previous Message Tom Lane 2024-08-29 20:53:03 Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.