From: | Jack Orenstein <jack(dot)orenstein(at)hds(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Cc: | Matt McDonald <matt(dot)mcdonald(at)hds(dot)com> |
Subject: | LIKE with pattern containing backslash |
Date: | 2009-02-03 16:35:50 |
Message-ID: | 49887266.3090905@hds.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Suppose I have this table:
create table test(id int, x varchar)
And I want to find rows whose x contains at least one backslash. The backslash
has to be escaped according to section 9.7.1 of the postgres docs.
select *
from test
where x like E'%\\%'
I'll skip the results of my psql experimentation to avoid having to think about
escaping backslashes from the command-line, inside psql.
My test data set has three rows:
0 a\b
1 a\\b
2 a\\\b
I wrote a JDBC test program, with two variants.
1) Searching with literals, escaping the backslash, e.g.
statement.executeQuery("select id, x from test where x like E'%\\%'")
This turns up all three rows. But this:
statement.executeQuery("select id, x from test where x like E'%\\\\%'")
doesn't retrieve any rows. From the docs, I would expect the second query to
retrieve rows with ids 1 and 2.
2) Avoiding literals completely, I created a PreparedStatement, and bound
variables containing the patterns, e.g.
PreparedStatement find = connection.prepareStatement("select id, x from
test where x like ?");
String oneBackslash = new String(new byte[]{'%', '\\', '%'});
find.setString(1, oneBackslash);
ResultSet resultSet = find.executeQuery();
Now, searching for %\% turns up nothing, while searching for %\\% turns up all
three rows.
BOTH behaviors seem wrong to me. In the first test (pattern specified as a
literal), it looks like %\\% is not matching strings that do contain two
backslashes. In the second test (pattern specified as a bound variable), it
looks like the first slash in each pattern is interpreted as an escape. Which I
didn't expect for a bound variable. Section 9.7.1 says "Note that the backslash
already has a special meaning in string literals ..." This seems to apply to
literals only, not to bound variables. And the need to have escapes in a bound
variable escapes me (so to speak).
Can someone help in my understanding of what is correct behavior (in the literal
and bound variable cases)? Is there a bug in the driver? in postgres? in the
docs? Or in my understanding?
Jack Orenstein
P.S. If you want to play with this, I can send you my test programs for the
cases described above.
From | Date | Subject | |
---|---|---|---|
Next Message | Alex Neth | 2009-02-03 16:41:50 | Re: Full text index not being used |
Previous Message | rhubbell | 2009-02-03 16:32:15 | Re: calculating elapsed times between timestamps |