| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Mike <mike(at)wolman(dot)co(dot)uk> | 
| Cc: | Marc Mamin <M(dot)Mamin(at)intershop(dot)de>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Re: regexp_replace to remove sql comments | 
| Date: | 2015-10-28 23:17:47 | 
| Message-ID: | 4747.1446074267@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
Mike <mike(at)wolman(dot)co(dot)uk> writes:
> Thanks with a bit of moving stuff about I think thats sorted it - in 
> case anyone every needs it:
>    SELECT
>     query,
>       trim(regexp_replace(
>          regexp_replace(
>             regexp_replace(query,'\/\*.+\*\/','','g'),
>          '--[^\r\n]*', ' ', 'g')
>       , '\s+', ' ', 'g')) as q
>     FROM public.pg_stat_statements
>     WHERE dbid IN (SELECT oid FROM pg_database WHERE datname =
>   current_database())
This doesn't look too reliable from here:
1. Doesn't handle multiline /* comments.
2. Does wrong thing if more than one /* comment appears on one line.
(You could improve that by using .*? instead of .+, but then it'd
do the wrong thing with nested /* comments.)
3. Breaks things if either -- or /* appear inside a string literal,
double-quoted identifier, or $$ literal.
I'm not at all sure that it's possible to handle this requirement 100%
correctly with regexes; they're unable to do context-sensitive processing.
But so far as pg_stat_statements is concerned, why would you need to
do this at all?  The duplicate-query elimination it does should be
insensitive to comments already.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Adrian Klaver | 2015-10-28 23:20:12 | Re: Postgresql Installation -- Red Hat vs OpenSUSE vs Ubuntu | 
| Previous Message | Mike | 2015-10-28 22:58:34 | Re: regexp_replace to remove sql comments |