A question about leakproof

From: "qiumingcheng" <qiumingcheng(at)aliyun(dot)com>
To: "pgsql-general" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: A question about leakproof
Date: 2022-10-17 01:15:20
Message-ID: 9cc93fdc-2f99-4963-b895-169135b742a8.qiumingcheng@aliyun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello, My questions are as follows:
Problem description
After testing, we don't find the difference between functions of proleakproof=true and functions of proleakproof=false (the function is described in pg_proc). Can you give specific examples to show that functions of proleakproof=true are more secure or can prevent data disclosure than functions of proleakproof=false. My related testing process is as follows (the rsp_user and wumk used below are the two database users that have been created).
1. Operation under user rsp_user
1.1 create table
drop table if exists tb_a cascade;
create table tb_a(id int4,c1 int, c2 int, pad text, effective_date timestamp without time zone NOT NULL) ;
CREATE INDEX tb_a_t_idx_id ON tb_a USING btree (id) TABLESPACE pg_default;
CREATE INDEX tb_a_t_idx_ed ON tb_a USING btree (effective_date) TABLESPACE pg_default;
CREATE INDEX tb_a_t_idx_c2 ON tb_a USING btree (c2) TABLESPACE pg_default;
1.2 insert data
insert into tb_a select id, id %200, id%1000, 'ss', current_date - floor((random() * 10000))::int from (select generate_series(1,10000) id) tb_a;
analyze;
1.3 create view
a. condition of view:effective_date > now() - TIME'23:00', in particular, type of effective_date is ‘timestamp without time zone NOT NULL’,but the type of the result of now() - TIME'23:00' is:timestamp with time zone,the related SQL statements are as follows:
drop view if exists tb_a_date_v1;
CREATE VIEW tb_a_date_v1 AS select * from tb_a where effective_date > now() - TIME'23:00';
b. the condition of the view is id=183. Note that type of id field is int4. the relevant SQL statements are as follows:
drop view if exists tb_a_int4_v1;
CREATE VIEW tb_a_int4_v1 AS select * from tb_a where id=183;
1.4 Authorize the view to user wumk
GRANT ALL ON SCHEMA public TO wumk;
GRANT ALL ON table tb_a_date_v1 TO wumk;
GRANT ALL ON table tb_a_int4_v1 TO wumk;
1.5 test SQL
Execute the following SQL statements respectively:explain select * from tb_a_int4_v1; explain select * from tb_a_date_v1。the results are as follows, the plans for both SQL statements are indexscan。
2. Operation under user wumk
1.1. test SQL,looking the plans
Execute the following SQL statements respectively:explain select * from tb_a_int4_v1; explain select * from tb_a_date_v1。The results are as follows:
As shown above, the plan of view tb_a_date_ v1 is seqscan (under the user rsp_user, its plan is indexscan),After analyzing the source code of postgresql, it is found that this problem is related to the following code:
In the above figure, func_oid=2523, the condition (effective_date > now() - TIME'23:00' in view b_a_data_v1 ) will use the function(oid=2523) for calculation, the function name is timestamp_gt_timestamptz,as shown in the figure below:
The proleakproof of the function is false,so,for the selectivity calculation of effective_date > now() - TIME'23:00 don’t use statistical information
,so the selectivity calculation of indexscan is higher than the actual,So finally, seqscan is selected.
While condition id=183 in in view tb_a_int4_v1 uses a comparison function whose proleakproof is true, so tb_a_int4_v1 view will normally use statistics info to calculate the selectivity, so that the correct indexscan is finally selected
So check the official postgresql document about leakproof:
According to my understanding, if the proleakproof is true, the function will not cause data leakage, and if the proleakproof is false, the function may cause data leakage. So I had tested the proleakproof about data leakage in sections 2.2 and 2.3.
1.2. test the data leakage of seqscan
a. Execute the following SQL and the results are as follows:
create or replace function leak_date(timestamp with time zone) returns bool as $$begin raise notice 'abc:%', $1; return true; end$$ language plpgsql cost 0.0000000000000001;
create or replace function leak_int4(int4) returns bool as $$begin raise notice 'abc:%', $1; return true; end$$ language plpgsql cost 0.0000000000000001;
set enable_indexscan=off;
set enable_bitmapscan=off;
select * from tb_a_date_v1 where leak_date(effective_date);
select * from tb_a_int4_v1 where leak_int4(id);
Analysis conclusion:
It is found that whether the proleakproof is true or false, there is a problem of data leakage under the seqscan plan.
1.3. test the data leakage of indexscan
To ensure that the plan must be indexscan, you need to set the following parameters:
set enable_seqscan=off; set enable_indexscan=on; set enable_bitmapscan=on;
Then execute the sqls:
select * from tb_a_date_v1 where leak_date(effective_date);
select * from tb_a_int4_v1 where leak_int4(id);
The results are as follows:
Analysis conclusion:
It is found that whether the proleakproof is true or false, there is no problem of data leakage under the indexscan plan

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Julien Rouhaud 2022-10-17 01:40:20 Re: A question about leakproof
Previous Message Alex Theodossis 2022-10-15 22:57:31 Re: Attaching database