Re: Postgres not using correct indices for views.

From: "Thomas Rosenstein" <thomas(dot)rosenstein(at)creamfinance(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres not using correct indices for views.
Date: 2019-08-08 20:30:18
Message-ID: 87BDD0B0-6617-4182-A0C3-53C67E836484@creamfinance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

To add additional info, the same behaviour is exhibited with the owner,
and the user which only has read priviledges on the view!

On 8 Aug 2019, at 22:04, Thomas Rosenstein wrote:

> Hi,
>
> I'm upgraded to 10.10 from today (on the replicated instance - main db
> is still 10.5), but still have the issue.
>
> The table is owned by the user "creamfinance", and the view is also
> owned by the same user - based on the text you quoted this should
> allow the correct access.
>
> The planner estimates the correct row counts, but still does the wrong
> planning.
>
> Wrong:
>
> ```
> Limit (cost=1880359.00..1880359.03 rows=9 width=1508) (actual
> time=25093.258..25093.270 rows=9 loops=1)
> -> Sort (cost=1880359.00..1884101.04 rows=1496816 width=1508)
> (actual time=25093.257..25093.257 rows=9 loops=1)
> Sort Key: p.customer_id DESC
> Sort Method: top-N heapsort Memory: 33kB
> -> Hash Join (cost=359555.11..1849150.95 rows=1496816
> width=1508) (actual time=1081.081..24251.466 rows=543231 loops=1)
> Hash Cond: (p.customer_id = l.customer_id)
> Join Filter: ((p.date - '3 days'::interval day) <=
> l.duedate)
> Rows Removed by Join Filter: 596120
> -> Seq Scan on payments p (cost=0.00..393323.74
> rows=10046437 width=228) (actual time=0.013..13053.366 rows=10054069
> loops=1)
> -> Hash (cost=333367.49..333367.49 rows=153409
> width=1272) (actual time=689.835..689.835 rows=156682 loops=1)
> Buckets: 32768 Batches: 8 Memory Usage: 7737kB
> -> Bitmap Heap Scan on loans l
> (cost=22732.48..331833.40 rows=153409 width=1272) (actual
> time=64.142..398.893 rows=156682 loops=1)
> Recheck Cond: (location_id = 46)
> Heap Blocks: exact=105938
> -> Bitmap Index Scan on
> loans_location_id_repaid_desc_id_index (cost=0.00..22694.12
> rows=153409 width=0) (actual time=41.324..41.324 rows=157794 loops=1)
> Index Cond: (location_id = 46)
> ```
>
> Correct:
>
> ```
> Limit (cost=0.87..52.60 rows=9 width=1471)
> -> Nested Loop (cost=0.87..2961441.25 rows=515233 width=1471)
> -> Index Scan Backward using loans_customer_id_index on
> loans (cost=0.43..2215467.63 rows=153409 width=1257)
> Filter: (location_id = 46)
> -> Index Scan using payments_customer_id_idx on payments
> (cost=0.43..4.76 rows=10 width=206)
> Index Cond: (customer_id = loans.customer_id)
> Filter: ((date - '3 days'::interval day) <=
> loans.duedate)
> ```
>
> Thanks
>
> Thomas
>
> On 8 Aug 2019, at 18:05, Tom Lane wrote:
>
>> "Thomas Rosenstein" <thomas(dot)rosenstein(at)creamfinance(dot)com> writes:
>>> we have created restricted view for our tables, so that we can allow
>>> access to non-gdpr relevant data but hide everything else.
>>> For exactly those views, the Query Planner uses the wrong indices,
>>> when
>>> executing exactly the same query, once it takes 0.1 s and on the
>>> views
>>> it takes nearly 18 sec (it does a full table scan, or uses the wrong
>>> indices).
>>> Do we have to GRANT additional rights? I see it's using some
>>> indices,
>>> just not the correct ones!
>>
>> Does EXPLAIN show reasonable rowcount estimates when you query
>> directly, but bad ones when you query via the views?
>>
>> If so, a likely guess is that you're falling foul of the restrictions
>> added for CVE-2017-7484:
>>
>> Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
>> Branch: master Release: REL_10_BR [e2d4ef8de] 2017-05-08 09:26:32
>> -0400
>> Branch: REL9_6_STABLE Release: REL9_6_3 [c33c42362] 2017-05-08
>> 09:18:57 -0400
>> Branch: REL9_5_STABLE Release: REL9_5_7 [d45cd7c0e] 2017-05-08
>> 09:19:07 -0400
>> Branch: REL9_4_STABLE Release: REL9_4_12 [3e5ea1f9b] 2017-05-08
>> 09:19:15 -0400
>> Branch: REL9_3_STABLE Release: REL9_3_17 [4f1b2089a] 2017-05-08
>> 09:19:23 -0400
>> Branch: REL9_2_STABLE Release: REL9_2_21 [d035c1b97] 2017-05-08
>> 09:19:42 -0400
>>
>> Add security checks to selectivity estimation functions
>>
>> Some selectivity estimation functions run user-supplied operators
>> over
>> data obtained from pg_statistic without security checks, which
>> allows
>> those operators to leak pg_statistic data without having
>> privileges on
>> the underlying tables. Fix by checking that one of the following
>> is
>> satisfied: (1) the user has table or column privileges on the
>> table
>> underlying the pg_statistic data, or (2) the function
>> implementing the
>> user-supplied operator is leak-proof. If neither is satisfied,
>> planning
>> will proceed as if there are no statistics available.
>>
>> At least one of these is satisfied in most cases in practice.
>> The only
>> situations that are negatively impacted are user-defined or
>> not-leak-proof operators on a security-barrier view.
>>
>> Reported-by: Robert Haas <robertmhaas(at)gmail(dot)com>
>> Author: Peter Eisentraut <peter_e(at)gmx(dot)net>
>> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>>
>> Security: CVE-2017-7484
>>
>>
>> However, if you're not on the latest minor releases, you might
>> find that updating would fix this for you, because of
>>
>> Author: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
>> Branch: master Release: REL_12_BR [a0905056f] 2019-05-06 11:54:32
>> +0100
>> Branch: REL_11_STABLE Release: REL_11_3 [98dad4cd4] 2019-05-06
>> 11:56:37 +0100
>> Branch: REL_10_STABLE Release: REL_10_8 [ca74e3e0f] 2019-05-06
>> 11:58:32 +0100
>> Branch: REL9_6_STABLE Release: REL9_6_13 [71185228c] 2019-05-06
>> 12:00:00 +0100
>> Branch: REL9_5_STABLE Release: REL9_5_17 [01256815a] 2019-05-06
>> 12:01:44 +0100
>> Branch: REL9_4_STABLE Release: REL9_4_22 [3c0999909] 2019-05-06
>> 12:05:05 +0100
>>
>> Use checkAsUser for selectivity estimator checks, if it's set.
>>
>> In examine_variable() and examine_simple_variable(), when
>> checking the
>> user's table and column privileges to determine whether to grant
>> access to the pg_statistic data, use checkAsUser for the
>> privilege
>> checks, if it's set. This will be the case if we're accessing the
>> table via a view, to indicate that we should perform privilege
>> checks
>> as the view owner rather than the current user.
>>
>> This change makes this planner check consistent with the check in
>> the
>> executor, so the planner will be able to make use of statistics
>> if the
>> table is accessible via the view. This fixes a performance
>> regression
>> introduced by commit e2d4ef8de8, which affects queries against
>> non-security barrier views in the case where the user doesn't
>> have
>> privileges on the underlying table, but the view owner does.
>>
>> Note that it continues to provide the same safeguards controlling
>> access to pg_statistic for direct table access (in which case
>> checkAsUser won't be set) and for security barrier views, because
>> of
>> the nearby checks on rte->security_barrier and
>> rte->securityQuals.
>>
>> Back-patch to all supported branches because e2d4ef8de8 was.
>>
>> Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
>>
>> regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2019-08-08 22:45:07 Re: Postgres not using correct indices for views.
Previous Message Thomas Rosenstein 2019-08-08 20:04:31 Re: Postgres not using correct indices for views.