RE: Why the index is not used ?

From: ROS Didier <didier(dot)ros(at)edf(dot)fr>
To: "greatvovan(at)gmail(dot)com" <greatvovan(at)gmail(dot)com>
Cc: "pgsql-sql(at)lists(dot)postgresql(dot)org" <pgsql-sql(at)lists(dot)postgresql(dot)org>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Why the index is not used ?
Date: 2018-10-08 11:47:06
Message-ID: 4c49366e251b4e708306c3d7064ab16c@PCYINTPEXMU001.NEOPROD.EDF.FR
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance pgsql-sql

Hi Vlad
Sorry for this delay, but apparently the subject is of interest to many people in the community. I received a lot of comments and answers.
I wrote my answers in the body of your message below

Best Regards
Didier

De : greatvovan(at)gmail(dot)com [mailto:greatvovan(at)gmail(dot)com]
Envoyé : samedi 6 octobre 2018 18:51
À : ROS Didier <didier(dot)ros(at)edf(dot)fr>
Cc : pgsql-sql(at)lists(dot)postgresql(dot)org; pgsql-performance(at)lists(dot)postgresql(dot)org; pgsql-general(at)lists(dot)postgresql(dot)org
Objet : Re: Why the index is not used ?

Hello Didier,

>>
(3), (5) to find the match, you decrypt the whole table, apparently this take quite a long time.
Index cannot help here because indexes work on exact match of type and value, but you compare mapped value, not indexed. Functional index should help, but like it was said, it against the idea of encrypted storage.
<<
I tested the solution of the functional index. It works very well, but the data is no longer encrypted. This is not the right solution
>>
(6) I never used pgp_sym_encrypt() but I see that in INSERT INTO you supplied additional parameter 'compress-algo=2, cipher-algo=aes256' while in (6) you did not. Probably this is the reason.

In general matching indexed bytea column should use index, you can ensure in this populating the column unencrypted and using 'test value 32'::bytea for match.
In you case I believe pgp_sym_encrypt() is not marked as STABLE or IMMUTABLE that's why it will be evaluated for each row (very inefficient) and cannot use index. From documentation:

"Since an index scan will evaluate the comparison value only once, not once at each row, it is not valid to use a VOLATILE function in an index scan condition."
https://www.postgresql.org/docs/10/static/xfunc-volatility.html

If you cannot add STABLE/IMMUTABLE to pgp_sym_encrypt() (which apparently should be there), you can encrypt searched value as a separate operation and then search in the table using basic value match.
>>
you're right about the missing parameter 'compress-algo=2, cipher-algo=aes256'. I agree with you.
(1) I have tested your proposal :
DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), cc bytea);
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id, decode('test value ' || x.id,'escape') FROM generate_series(1,100000) AS x(id);

è I inserted unencrypted data into the bytea column
postgres=# select * from cartedecredit limit 5 ;
card_id | username | cc
---------+-------------+------------------------------
1 | individu 1 | \x746573742076616c75652031
2 | individu 2 | \x746573742076616c75652032
3 | individu 3 | \x746573742076616c75652033
4 | individu 4 | \x746573742076616c75652034
5 | individu 5 | \x746573742076616c75652035
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
SELECT encode(cc,'escape') FROM cartedecredit WHERE cc=decode('test value 32','escape');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using idx_cartedecredit_cc02 on cartedecredit (cost=0.42..8.44 rows=1 width=32) (actual time=0.033..0.034 rows=1 loops=1)
Index Cond: (cc = '\x746573742076616c7565203332'::bytea)
Heap Fetches: 1
Planning time: 0.130 ms
Execution time: 0.059 ms
(5 rows)

è It works but the data is not encrypted. everyone can have access to the data
(2) 2nd test :
DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), cc bytea);
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id, pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256') FROM generate_series(1,100000) AS x(id);
postgres=# select * from cartedecredit limit 5 ;
>>
card_id | username | cc
---------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------
1 | individu 1 | \xc30d0409030296304d007bf50ed768d2480153cd4a4e2d240249f94b31ec168391515ea80947f97970f7a4e058bff648f752df194498dd480c3b8a5c0d2942f90c6dde21a6b9bf4e9fd7986c6f986e3783
647e7a6205b48c03
2 | individu 2 | \xc30d0409030257b50bc0e6bcd8d270d248010984b60126af01ba922da27e2e78c33110f223f0210cf34da77243277305254cba374708d447fc7d653dd9e00ff9a96803a2c47ee95269534f2c24fab1c9dc
31f7909ca7adeaf0
3 | individu 3 | \xc30d040903023c5f8cb688c7945275d24801a518d70c6cc2d4a31f99f3738e736c5312f78bb9c3cc187a65d0cf7f893dbc9448825d39b79df5d0460508fc93336c2bec7794893bb08a290afd649ae15fe2
2b0433eff89222f7
4 | individu 4 | \xc30d04090302dcc3bb49a41b297578d2480167f17b09004e7dacc0891fc0cc7276dd551273eec72644520f8d0543abe8e795af7c1b84fc8e5b4adc33994c479d5ff17988e60bf446dc8c77caf3f3b008c1
c06bf0a3c4df41ae
5 | individu 5 | \xc30d04090302a8c3552fb4b297b567d24801c060fb9241355b49717479107ff59d2928b3c0d9001dabd0035a0419b1a54c0b15f1907a981f08a4227784ac5cf3994b32ba594eff35933825730ac42af8ca
76bd497c5079b127
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
SELECT pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, cipher-algo=aes256') FROM cartedecredit WHERE cc=pgp_sym_encrypt('test value 32', 'motdepasse'::text,'compress-algo=2, cipher-algo=aes256');
pgp_sym_decrypt
-----------------
(0 rows)

è No row returned !
Time: 116185.300 ms (01:56.185)
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on cartedecredit (cost=0.00..3309.00 rows=1 width=32) (actual time=105969.099..105969.099 rows=0 loops=1)
Filter: (cc = pgp_sym_encrypt('test value 32'::text, 'motdepasse'::text, 'compress-algo=2, cipher-algo=aes256'::text))
Rows Removed by Filter: 100000
Planning time: 0.150 ms
Execution time: 105969.166 ms
(5 rows)
Time: 105969.912 ms (01:45.970)
-> Index is not used .
Best Regards
Vlad

Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse.

Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message.

Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus.
____________________________________________________

This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval.

If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message.

E-mail communication cannot be guaranteed to be timely secure, error or virus-free.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message ROS Didier 2018-10-08 12:02:50 RE: Why the index is not used ?
Previous Message Michael Paquier 2018-10-08 09:24:51 Re: pg9.6: no backup history file (*.backup) created on hot standby

Browse pgsql-performance by date

  From Date Subject
Next Message ROS Didier 2018-10-08 12:02:50 RE: Why the index is not used ?
Previous Message Jean-Paul Argudo 2018-10-08 08:44:22 Re: Why the index is not used ?

Browse pgsql-sql by date

  From Date Subject
Next Message ROS Didier 2018-10-08 12:02:50 RE: Why the index is not used ?
Previous Message Jean-Paul Argudo 2018-10-08 08:44:22 Re: Why the index is not used ?