From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Morris de Oryx <morrisdeoryx(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Has there been any discussion of custom dictionaries being defined in the database? |
Date: | 2019-10-19 13:08:26 |
Message-ID: | 20191019130826.usuxx5k7rhwmmnr5@development |
Lists: | pgsql-general |
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:
>Morris de Oryx <morrisdeoryx(at)gmail(dot)com> writes:
>> Given that Amazon is bragging this week about turning off Oracle, it seems
>> like they could kick some resources towards contributing something to the
>> Postgres project. With that in mind, is the idea of defining dictionaries
>> within a table somehow meritless, or unexpectedly difficult?
>
>Well, it'd just be totally different. I don't think anybody cares to
>provide two separate definitions of common dictionaries (which'd have to
>somehow be kept in sync).
>
>As for why we did it with external text files in the first place ---
>for at least some of the dictionary types, the point is that you can
>drop in data files that are available from upstream sources, without any
>modification. Getting the same info into a table would require some
>nonzero amount of data transformation.
>
IMHO being able to load dictionaries from a table would be quite
useful, and not just because of RDS. For example, it's not entirely true
we're just using the upstream dictionaries verbatim - it's quite common
to add new words, particularly in specialized fields. That's way easier
when you can do that through a table and not through a file.

>Having said that ... in the end a dictionary is really just a set of
>functions implementing the dictionary API; where they get their data
>from is their business. So in theory you could roll your own
>dictionary that gets its data out of a table. But the dictionary API
>would be pretty hard to implement except in C, and I bet RDS doesn't
>let you install your own C functions either :-(
>
Not sure. Of course, if we expect the dictionary to work just like the
ispell one, with preprocessing the dictionary into shmem, then that
requires C. I don't think that's entirely necessary, though - we could
use the table directly. Yes, that would be slower, but maybe it'd be
sufficient.

But I think the idea is ultimately that we'd implement a new dict type
in core, and people would just specify which table to load data from.
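
To make the "use the table directly" option a bit more concrete, here is a
rough sketch in Python, with an in-memory SQLite table standing in for a
PostgreSQL table. The table name dict_words, its columns, and the functions
make_dictionary_table, add_word and lexize are all hypothetical; none of this
is an existing PostgreSQL API. It only shows the shape of the idea: every
lookup is a query against the table, so adding a word is a plain INSERT with
no file to edit and nothing to preprocess.

```python
import sqlite3


def make_dictionary_table(conn):
    # Hypothetical layout: one row per known word, mapped to its lexeme.
    conn.execute(
        "CREATE TABLE dict_words ("
        " word TEXT PRIMARY KEY,"
        " lexeme TEXT NOT NULL)"
    )


def add_word(conn, word, lexeme):
    # Adding a specialized term is an ordinary write to the table.
    conn.execute(
        "INSERT OR REPLACE INTO dict_words (word, lexeme) VALUES (?, ?)",
        (word.lower(), lexeme),
    )


def lexize(conn, token):
    # Look the token up directly in the table on every call.
    # Returns a list of lexemes for a known word, or None if the
    # word is not in the dictionary (assumed convention for this sketch).
    row = conn.execute(
        "SELECT lexeme FROM dict_words WHERE word = ?",
        (token.lower(),),
    ).fetchone()
    return [row[0]] if row is not None else None
```

With a connection from sqlite3.connect(":memory:"), calling
make_dictionary_table(conn) and then add_word(conn, "Stents", "stent") makes
lexize(conn, "stents") return ["stent"], while an unknown token returns None.
The cost is one query per token, which is the "slower, but maybe sufficient"
trade-off mentioned above.
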
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services