From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Vik Fearing <vik(dot)fearing(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Sample values for pg_stat_statements |
Date: | 2018-03-10 14:02:19 |
Message-ID: | 3240d7fc-ae7c-f93b-825e-b4daf415b580@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I've looked at this patch today. I like the idea / intent in general, as
it helps with some investigation tasks. That being said, I have a couple
of questions/comments based on read through the patch:
1) I see you've renamed the .sql script from 1.4 to 1.6. I thought we've
abandoned that approach some time ago, and are now only doing the
upgrade scripts. That is, keep 1.4 script and then 1.4--1.5 and 1.5-1.6.
That's how other extensions are doing it now, at least - see btree_gin
for example. But maybe pg_stat_statements has to do it this way for some
reason, not sure?
2) The patch should have updated doc/src/sgml/pgstatstatements.sgml
3) Do we really need both collect_consts and collect_params? I can't
really imagine wanting to set only one of those options. In any case,
the names seem unnecessarily abbreviated - just use collect_constants
and collect_parameters.
4) The width_threshold GUC name seems rather weird. I mean, I wouldn't
use "threshold" in this context, and it's really unclear size of what is
it referring to. We do have a precedent, though, as pg_stat_activity has
track_activity_query_size, so I suggest using either parameters_size or
max_parameters_size (prefixed by "pg_stat_statements." of course).
5) I don't quite see why keeping the first set of parameter values we
happen to see would be particularly useful. For example, I'm way more
interested in values for the longest execution - why not to keep those?
6) I suggest to use the same naming style as the existing functions, so
for example CollectParams should be pgss_CollectParams (and it's missing
a comment too).
7) There are a couple of places where the code style violates project
rules, e.g. by placing {} around a single command in if-statement.
8) I see Andres mentioned possible privacy issues (not quite sure what
is 'data minimalism', mentioned by Andres). I'm not sure it's a problem,
considering it can be disabled and it's subject to the usual role check
(susperuser/role_read_all_stats). Unfortunately we can't use the same
approach as pg_stat_activity (only showing data for user's own queries).
Well, we could keep per-user samples, but that might considerably
inflate the file size.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2018-03-10 14:05:11 | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Previous Message | Christos Maris | 2018-03-10 13:46:52 | Google Summer of Code: Potential Applicant |