Re: [PATCH] Add log_transaction setting

From: Сергей Соловьев <sergey(dot)soloviev(at)tantorlabs(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add log_transaction setting
Date: 2024-08-14 18:08:00
Message-ID: 21111723657837@mail.yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


 

 

10.08.2024, 16:40, "Kirill Reshke" <reshkekirill(at)gmail(dot)com>:

On Thu, 4 Jul 2024 at 21:46, Sergey Solovev

<sergey(dot)soloviev(at)tantorlabs(dot)ru> wrote:


 Hi.

 We encountered a problem with excessive logging when transaction is

 sampled.

 When it is getting sampled every statement is logged, event SELECT. This

 can

 lead to performance degradation and log polluting.

 I have added new setting to filter statements when transaction is

 sampled - log_transaction.

 Overview

 ===================

 This parameter works very similar to log_statement, but it filters out

 statements only

 in sampled transaction. It has same values as log_statement, access

 rights (only superuser),

 setup in postgresql.conf or by superuser with SET.

 But by default it has value "all" in compliance with existing behaviour

 (to log every statement).

 Example usage

 ==================

 Log every DDL, but only subset of other statements.

 postgresql.conf

  > log_transaction = ddl

  > log_statement = all

  > log_transaction_sample_rate = 1

  > log_statement_sample_rate = 0.3

 backend:

  > BEGIN;

  > CREATE TABLE t1(value text);

  > INSERT INTO t1(value) VALUES ('hello'), ('world');

  > SELECT * FROM t1;

  > DROP TABLE t1;

  > COMMIT;

 logfile:

  > LOG: duration: 6.465 ms statement: create table t1(value text);

  > LOG: statement: insert into t1(value) values ('hello'), ('world');

  > LOG: duration: 6.457 ms statement: drop table t1;

 As you can see CREATE and DROP were logged with duration (because it is

 DDL), but

 only INSERT was logged (without duration) - not SELECT.

 Testing

 ===================

 All tests are passed - default configuration is compatible with existing

 behaviour.

 Honestly, I could not find any TAP/regress tests that would test logging

 - some

 tests only use logs to ensure test results.

 I did not find suitable place for such tests, so no tests provided

 Implementation details

 ===================

 I have modified signature of check_log_duration function - this accepts

 List of

 statements that were executed. This is to decide whether we should log

 current statement if transaction is sampled.

 But this list can be empty, in that case we decide to log. NIL is passed

 only

 in fast path, PARSE and BIND functions - by default these are logged

 if transaction is sampled, so we are compliant with existing behaviour

 again.

 In first implementation version, there was boolean flag (instead of

 List), but

 it was replaced by List to defer determining (function call) whether it

 is worth logging.

 -------------

 Best regards, Sergey Solovev

Hi!

As I understand, the need of this GUC variable comes from fact, that

is the transaction is sampled for logging, all statements within this

tx are logged and this is not configurable now, correct?

Well, if this is the case, why should we add a new GUC? Maybe we

should use `log_statement` in this case too, so there is a bug, that

log_statement is honored not during tx sampling?

Also, tests are failing[1]

[1] https://cirrus-ci.com/task/5645711230894080

 
--

Best regards,

Kirill Reshke

Hi, Kirill!

 

Thanks for review.



Also, tests are failing[1]

That is my mistake: I have tested patch on REL_17BETA2, but set 18 version, and

it seems that I run tests without proper ./configure flags. I reproduced error and fixed

it in new patch.


As I understand, the need of this GUC variable comes from fact, that

is the transaction is sampled for logging, all statements within this

tx are logged and this is not configurable now, correct?

Yes. That's the point - all or nothing.


Well, if this is the case, why should we add a new GUC? Maybe we

should use `log_statement` in this case too, so there is a bug, that

log_statement is honored not during tx sampling?


That can be good solution, but I proceed from possibility of flexible logging setup.


Shown example (when all DDL statements were logged) is just one of such use cases.


Another example is security policy: when we must log every data modification -


set `log_transaction = mod`.

Attachment Content-Type Size
unknown_filename text/html 5.0 KB
0002-Add-log_transaction-configuration-setting.patch text/x-diff 8.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-08-14 18:16:06 Re: Remove dependence on integer wrapping
Previous Message Jelte Fennema-Nio 2024-08-14 18:04:09 Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs