From: | Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: More efficient truncation of pg_stat_activity query strings |
Date: | 2017-09-15 12:13:35 |
Message-ID: | CAGz5QCKxZLWbs-14-FTdRno=mK44_R938CAZnN_5MZjN07Nt_g@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Sep 14, 2017 at 11:33 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-09-12 00:19:48 -0700, Andres Freund wrote:
>> Hi,
>>
>> I've recently seen a benchmark in which pg_mbcliplen() showed up
>> prominently. Which it will basically in any benchmark with longer query
>> strings, but fast queries. That's not that uncommon.
>>
>> I wonder if we could avoid the cost of pg_mbcliplen() from within
>> pgstat_report_activity(), by moving some of the cost to the read
>> side. pgstat values are obviously read far less frequently in nearly all
>> cases that are performance relevant.
>>
>> Therefore I wonder if we couldn't just store a querystring that's
>> essentially just a memcpy()ed prefix, and do a pg_mbcliplen() on the
>> read side. I think that should work because all *server side* encodings
>> store character lengths in the *first* byte of a multibyte character
>> (at least one clientside encoding, gb18030, doesn't behave that way).
>>
>> That'd necessitate an added memory copy in pg_stat_get_activity(), but
>> that seems fairly harmless.
>>
>> Faults in my thinking?
>
> Here's a patch that implements that idea. Seems to work well. I'm a
> bit loath to add proper regression tests for this, seems awfully
> dependent on specific track_activity_query_size settings. I did confirm
> using gdb that I see incomplete characters before
> pgstat_clip_activity(), but not after.
>
> I've renamed st_activity to st_activity_raw to increase the likelihood
> that potential external users of st_activity notice and adapt. Increases
> the noise, but imo to a very bearable amount. Don't feel strongly
> though.
>
Hello,
The patch looks good to me. I've done some regression testing with a
custom script on my local system. The script contains the following
statement:
SELECT 'aaa..<repeated 600 times>' as col;
Test 1
-----------------------------------
duration: 300 seconds
clients/threads: 1
On HEAD TPS: 13181
+ 9.30% 0.27% postgres postgres [.] pgstat_report_activity
+ 8.88% 0.02% postgres postgres [.] pg_mbcliplen
+ 7.76% 4.77% postgres postgres [.] pg_encoding_mbcliplen
+ 4.06% 4.06% postgres postgres [.] pg_utf_mblen
With the patch TPS: 13628 (+3.39%)
+ 0.36% 0.21% postgres postgres [.] pgstat_report_activity
Test 2
-----------------------------------
duration: 300 seconds
clients/threads: 8
On HEAD TPS: 53084
+ 12.17% 0.20% postgres postgres [.] pgstat_report_activity
+ 11.83% 0.02% postgres postgres [.] pg_mbcliplen
+ 11.19% 8.03% postgres postgres [.] pg_encoding_mbcliplen
+ 3.74% 3.73% postgres postgres [.] pg_utf_mblen
With the patch TPS: 63949 (+20.4%)
+ 0.40% 0.25% postgres postgres [.] pgstat_report_activity
These results show that the patch nearly eliminates the overhead of
pgstat_report_activity: it drops from 9-12% of the profile on HEAD to
about 0.4% with the patch. Are there any other tests I should run?
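
For anyone following along, here is a minimal sketch of the idea being
benchmarked: the write side only does a byte-wise copy of a prefix, and
the read side trims a trailing partial character. This is not the
patch's actual code. The function names (sketch_utf8_mblen,
sketch_store_raw, sketch_clip_len) are hypothetical, and the sketch
assumes UTF-8 only, whereas the server handles every server-side
encoding through pg_mbcliplen().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Byte length of a UTF-8 character, derived from its first byte only.
 * This is the property the proposal relies on. (Hypothetical helper.)
 */
int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0)
        return 1;
    if ((*s & 0xe0) == 0xc0)
        return 2;
    if ((*s & 0xf0) == 0xe0)
        return 3;
    if ((*s & 0xf8) == 0xf0)
        return 4;
    return 1;               /* invalid lead byte: step over one byte */
}

/*
 * Write side: cheap byte-wise prefix copy. The stored string may end in
 * the middle of a multibyte character. (Hypothetical helper.)
 */
void
sketch_store_raw(char *dst, size_t dstsize, const char *query)
{
    size_t      len;

    assert(dstsize > 0);
    len = strlen(query);
    if (len > dstsize - 1)
        len = dstsize - 1;
    memcpy(dst, query, len);
    dst[len] = '\0';
}

/*
 * Read side: number of leading bytes of raw that form complete
 * characters. A trailing partial character is excluded.
 * (Hypothetical helper.)
 */
size_t
sketch_clip_len(const char *raw)
{
    size_t      len = strlen(raw);
    size_t      pos = 0;

    while (pos < len)
    {
        size_t      l = (size_t) sketch_utf8_mblen((const unsigned char *) raw + pos);

        if (pos + l > len)
            break;          /* last character was cut off by the copy */
        pos += l;
    }
    return pos;
}
```

With a 4-byte buffer and the 4-byte input "ab" followed by a two-byte
character, sketch_store_raw() keeps 3 bytes (including the dangling
lead byte), and sketch_clip_len() reports 2, so a reader would only
show "ab". The per-character walk happens only on the read side, which
is the point of the change.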
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com