From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Regex with > 32k different chars causes a backend crash |
Date: | 2013-04-03 15:11:28 |
Message-ID: | 515C46A0.3090002@vmware.com |
Lists: | pgsql-hackers |
While playing with Alexander's pg_trgm regexp patch, I noticed that the
regexp library trips an assertion (if enabled) or crashes, when passed
an input string that contains more than 32k different characters:

    select 'foo' ~ (select string_agg(chr(x),'') from generate_series(100, 35000) x) as nastyregex;

This is because it uses 'short' as the datatype to identify colors. When
it overflows, -32768 is used as index to the colordesc array, and you
get a crash. AFAICS this can't reliably be used for anything more
sinister than crashing the backend.
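
To illustrate the overflow described above, here is a minimal sketch in C. It is not the regex library's code: the `color` typedef only mirrors the 'short' type mentioned in the message, and `next_color_unchecked` is a hypothetical helper. Converting an out-of-range value back to 'short' is implementation-defined in C; the wrap to -32768 is what common compilers on two's-complement platforms produce.

```c
#include <limits.h>

/* Assumption: mirrors the 'short' color type described in the message. */
typedef short color;

/*
 * Hypothetical helper: hand out the color after 'last' with no overflow
 * check, the way a naive increment would.
 */
static color
next_color_unchecked(color last)
{
    /*
     * 'last + 1' is computed as int. Converting it back to short when it
     * exceeds SHRT_MAX is implementation-defined; on common compilers it
     * wraps around to SHRT_MIN.
     */
    return (color) (last + 1);
}
```

With a 16-bit 'short', calling `next_color_unchecked(SHRT_MAX)` would typically yield -32768, which is then an invalid negative index into an array such as colordesc.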
A regex with that many different colors is an extreme case, so I think
it's enough to turn the assertion in newcolor() into a run-time check,
and throw a "too many colors in regexp" error. Alternatively, we could
expand 'color' from short to int, but that would double the memory usage
of sane regexps with fewer distinct characters.
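
A rough sketch of the run-time check idea, under stated assumptions: `struct colormap_sketch`, `newcolor_checked`, `MAX_COLOR` and `ERR_TOO_MANY_COLORS` are hypothetical names used for illustration, not the actual newcolor() code or its error-reporting mechanism. The point is only that the limit is tested before the increment, so the 'short' never overflows.

```c
#include <limits.h>

/* Assumption: color stays a 'short', as in the first option above. */
typedef short color;

#define MAX_COLOR           SHRT_MAX    /* hypothetical limit name */
#define ERR_TOO_MANY_COLORS (-1)        /* hypothetical error code */

/* Hypothetical, simplified colormap: tracks only the highest color in use. */
struct colormap_sketch
{
    color       max;        /* highest color number handed out so far */
};

/*
 * Hand out the next color, or fail instead of overflowing.
 * Returns 0 on success and stores the new color in *out; returns
 * ERR_TOO_MANY_COLORS and leaves the map unchanged when the limit is hit.
 */
static int
newcolor_checked(struct colormap_sketch *cm, color *out)
{
    if (cm->max >= MAX_COLOR)
        return ERR_TOO_MANY_COLORS; /* caller would report "too many colors in regexp" */

    cm->max++;                      /* safe: max < SHRT_MAX here */
    *out = cm->max;
    return 0;
}
```

The cost of this approach is one comparison per new color, and the memory footprint of ordinary regexps stays as it is.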
Thoughts?
- Heikki