From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jean-Christophe Arnu <jcarnu(at)gmail(dot)com> |
Cc: | Artur Zakirov <zaartur(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Empty string in lexeme for tsvector |
Date: | 2021-09-29 19:36:11 |
Message-ID: | 2997142.1632944171@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Jean-Christophe Arnu <jcarnu(at)gmail(dot)com> writes:
> [ empty_string_in_tsvector_v4.patch ]

I looked through this patch a bit. I don't agree with adding
these new error conditions to tsvector_setweight_by_filter and
tsvector_delete_arr. Those don't prevent bad lexemes from being
added to tsvectors, so AFAICS they can have no effect other than
breaking existing applications. In fact, tsvector_delete_arr is
one thing you could use to fix existing bad tsvectors, so making
it throw an error seems actually counterproductive.

(By the same token, I think there's a good argument for
tsvector_delete_arr to just ignore nulls, not throw an error.
That's a somewhat orthogonal issue, though.)

What I'm wondering about more than that is whether array_to_tsvector
is the only place that can inject an empty lexeme ... don't we have
anything else that can add lexemes without going through the parser?

regards, tom lane
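
Editorial sketch, not part of the original message: the argument above amounts to validating lexemes where they enter a tsvector while keeping the removal path lenient, so that removal can still repair data that is already bad. The Python below is only a toy model of that design choice. It is not PostgreSQL code; the function names merely echo the SQL-level functions, a tsvector is assumed to be a plain sorted list of strings, and the error messages are illustrative. Ignoring nulls on the delete side is the suggestion made in the message, not a claim about current behaviour.

```python
from typing import Iterable, List, Optional


def array_to_tsvector(lexemes: Iterable[Optional[str]]) -> List[str]:
    """Toy entry point: build a sorted, de-duplicated lexeme list.

    Bad input is rejected here, because this is where an empty
    lexeme could otherwise get into a vector.
    """
    result = set()
    for lex in lexemes:
        if lex is None:
            raise ValueError("lexeme array may not contain nulls")
        if lex == "":
            raise ValueError("lexeme array may not contain empty strings")
        result.add(lex)
    return sorted(result)


def ts_delete(vector: List[str], to_delete: Iterable[Optional[str]]) -> List[str]:
    """Toy removal path: never raises on empty strings or None.

    Staying lenient lets callers strip an empty lexeme out of a
    vector that was built before the entry-point check existed.
    """
    drop = {lex for lex in to_delete if lex is not None}
    return [lex for lex in vector if lex not in drop]


# A hypothetical legacy vector that already contains an empty lexeme.
legacy = ["", "cat", "dog"]
cleaned = ts_delete(legacy, [""])
```

In this model, `cleaned` no longer contains the empty lexeme, while `array_to_tsvector(["cat", ""])` raises `ValueError`. The open question in the message still applies: the check only helps if every path that adds lexemes without going through the parser is covered.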