From: | Rumpi Gravenstein <rgravens(at)gmail(dot)com> |
---|---|
To: | PostgreSQL <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Analytic Function Bug |
Date: | 2024-08-29 23:18:16 |
Message-ID: | CAEpg1wDJU6DzUJxJdFAUZFzmw6bY+p17bXvCLupwsxc0V6vqrg@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Experts,
I am running on
PostgreSQL 14.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-22), 64-bit
I have the following query which returns what I expect:
with
d(logical_partition_key, model_usage) as ( values( 'TEST_DATA' ,
'F(T61)(EXPORT)' )
)
,
usg_txt as (
SELECT DISTINCT logical_partition_key,
MODEL_USAGE as usage_text,
REPLACE ( REPLACE ( REPLACE ( REPLACE ( PIU.MODEL_USAGE::text, '<t
context="USAGE_TEXT">', ''), '<t context="FCN_NAME_MODFR">', ''), '<t
context="FCN_USAGE_MODFR">', ''), '</t>', '') AS txt
FROM d /*CAO_CALLOUT_GHOST_COMB_W*/ piu
)
,
parse( logical_partition_key, usage_text,txt, rpo_txt, indx ) as
(
select d.logical_partition_key, d.usage_text,d.txt, coalesce(a.rpo[1],a.rpo[
2]) as rpo_txt, a.pos
from usg_txt d
left join lateral regexp_matches( txt, '([ ,\-()/&])|([^ ,\-()/&]+)','g')
with ordinality as a(rpo,pos) on true
)
,
prv_nxt_token( logical_partition_key, usage_text,txt, rpo_txt, indx,
mx_indx, prev,nxt,nxt2, prv2,prv3) as
(
/* Get prior and next token to support later logic */
select p.logical_partition_key, p.usage_text,
p.txt, p.rpo_txt, indx,
max( indx) over ( partition by p.txt ) mx_indx,
lag( p.rpo_txt,1 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prev,
lag( p.rpo_txt,-1 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) nxt,
lag( p.rpo_txt,-2) over ( partition by p.logical_partition_key,p.txt order
by indx ) nxt2,
lag( p.rpo_txt,2 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prv2,
lag( p.rpo_txt,3 ) over ( partition by p.logical_partition_key,p.txt order
by indx ) prv3
from parse p
)
select * from prv_nxt_token;
logical_partition_keyusage_texttxtrpo_txtindxmx_indxprevnxtnxt2prv2prv3
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "F" "1" "7" "[NULL]" "(" "T61"
"[NULL]" "[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "2" "7" "F" "T61" ")"
"[NULL]" "[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "T61" "3" "7" "(" ")" "(" "F"
"[NULL]"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "4" "7" "T61" "(" "EXPORT"
"(" "F"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "5" "7" ")" "EXPORT" ")"
"T61" "("
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" "EXPORT" "6" "7" "(" ")"
"[NULL]" ")" "T61"
"TEST_DATA" "F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "7" "7" "EXPORT" "[NULL]"
"[NULL]" "(" ")"
When I run the same statement, except this time using a large table instead
of a values statement, I get the wrong answer. The difference in the SQL
statement that produced the following data is that the large table (10M
records) and there is a closing where condition used to limit the result to
what is shown:
select * from prv_nxt_token;
is replaced by:
select * from prv_nxt_token where logical_partition_key='TEST_DATA' and
usage_text='F(T61)(EXPORT)';
Which returns:
logical_partition_keyusage_texttxtrpo_txtindxmx_indxprvnxtnxt2prv2prv3
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "F" "1" "7" *"F"* "(" "(" "[NULL]"
"[NULL]"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "2" "7" "F" "(" "T61" "F" "[NULL]"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "T61" "3" "7" *"T61"* ")" ")" "(" "("
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "4" "7" "T61" ")" "(" "T61" "("
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "(" "5" "7" "*("* "EXPORT" "EXPORT" ")"
")"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" "EXPORT" "6" "7" "(" "EXPORT" ")" "(" ")"
"TEST_DATA"
"F(T61)(EXPORT)" "F(T61)(EXPORT)" ")" "7" "7" ")" "[NULL]" "[NULL]" "EXPORT"
"EXPORT"
Notice that the prv column (lag - 1 ) is just wrong. I've highlighted
obvious bad values. Other columns are wrong as well.
Is this a PostgreSQL bug?
--
Rumpi Gravenstein
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-08-30 00:07:13 | Re: Analytic Function Bug |
Previous Message | Tom Lane | 2024-08-29 20:53:03 | Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. |