Re: Website mailing list search enhancement idea - one result per thread

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: PostgreSQL WWW <pgsql-www(at)lists(dot)postgresql(dot)org>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Website mailing list search enhancement idea - one result per thread
Date: 2021-02-06 22:48:04
Message-ID: 6f68bf53ede1c315ae7f52931653eec1@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-www

On 2021-02-06 18:05, Masahiro Ikeda wrote:
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>> When searching the mailing lists in advanced search mode we have list
>> filtering, timeframe filtering (pretty coarse though) and then sorting
>> mode. For moderately long running threads a search term is going to
>> return
>> many messages from the same thread. It would desirable to ask that
>> only
>> threads be shown and maybe how many individual messages in that thread
>> matched.
>
> +1. I always think so.
>
> On 2020-05-01 01:55, Tom Lane wrote:
>> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>>> On Thu, Apr 30, 2020 at 2:17 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>>>>> It would desirable to ask that only
>>>>> threads be shown and maybe how many individual messages in that
>>>>> thread
>>>>> matched.
>>
>>>> +1, an option to do that would be handy. No idea how hard it is ...
>>
>>> Not having actually looked at it, probably not too hard. The hardest
>>> part
>>> might be bikeshedding about the color (aka user interface).
>>
>>> Should this replace the current interface, or do we want to make
>>> something
>>> like this a choice?
>>
>> No no, not replace. It should be an option, "aggregate into threads"
>> or something like that.
>
> Why don't you implement the following feature?
>
> (1) add an option on the advanced search page.
>
> I agree with the idea to add the option to aggregate into threads
> on the advanced search page.
>
> I attached the image("add_option_to_aggregate_into_threads.png").
> What do you think?
>
>
> (2) change the message's URLs on the result pages if the option is
> checked.
>
> I think it's better to change the message's URLs to flatted ones on
> result pages
> if the messages are aggregated into threads because users want to
> access the
> discussion pages not per messages but per threads directly.
>
> ```
> --- a/templates/search/listsearch.html
> +++ b/templates/search/listsearch.html
> @@ -53,7 +59,12 @@
> <h2>Results {{firsthit}}-{{lasthit}} of {%if hitcount ==
> 1000%}more than 1000{%else%}{{hitcount}}{%endif%}.</h2>
> {%if pagelinks %}Result pages:
> {{pagelinks|safe}}<br/><br/>{%endif%}
> {%for hit in hits %}
> - {{forloop.counter0|add:firsthit}}. <a
> href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a>
> [{{hit.rank|floatformat:2}}]<br/>
> + {{forloop.counter0|add:firsthit}}.
> + {%if aggregate%}
> + <a
> href="https://www.postgresql.org/message-id/flat/{{hit.messageid}}">{{hit.subject}}</a>
> [{{hit.rank|floatformat:2}}]<br/>
> + {%else%}
> + <a
> href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a>
> [{{hit.rank|floatformat:2}}]<br/>
> + {%endif%}
> From {{hit.author}} on {{hit.date}}.<br/>
> {{hit.abstract|safe}}<br/>
> <a
> href="https://www.postgresql.org/message-id/{{hit.messageid}}">https://www.postgresql.org/message-id/{{hit.messageid}}</a><br/>
> ```
>
>
> If someone already works for this, it's ok to ignore the following.
> Although I'm not familiar with Django, I made the patches for
> pgarchives and pgweb.
>
> - pgweb:
> "0001-wip-add-an-search-option-to-aggregate-into-threads.patch"
>
> This patch changes the user interface the above (1) and (2).
>
>
> - pgarchives:
> "0001-wip-add-an-search-parameter-to-aggregate-into-thread.patch"
>
> This patch changes the search query to the Postgresql. The example
> query is the following.
> To aggregate into threads, it uses the window function to extract the
> highest-ranked
> messages in the same thread group.
>
>
> ```
> SELECT messageid, date, subject, _from, ts_rank_cd(fti,
> plainto_tsquery('public.pg', 'wal writer')),
> ts_headline(bodytxt, plainto_tsquery('public.pg', 'wal
> writer'),'StartSel="[[[[[[",StopSel="]]]]]]"')
> FROM
> (SELECT messageid, date, subject, _from, fti, bodytxt,
> -- add a sub query
> RANK() OVER (PARTITION BY threadid ORDER BY ts_rank_cd(fti,
> plainto_tsquery('wal writer')) DESC, id) AS rank -- this line is
> main change
> FROM messages m
> WHERE
> fti @@ plainto_tsquery('public.pg', 'wal writer')
> AND EXISTS (SELECT 1 FROM list_threads lt WHERE
> lt.threadid=m.threadid AND lt.listid=ANY(ARRAY[2]))
> AND m.date > '2020-02-07T08:44:56.738979'::timestamp
> ) m1
> WHERE m1.rank = 1
> ORDER BY ts_rank_cd(fti, plainto_tsquery('wal writer')) DESC
> LIMIT 1000;
> ```
>
> If you have any good ideas, please let me know.

I fixed hard-coded value in "0001-wip-add-an-search-parameter-
to-aggregate-into-thread.patch" for pgarchives.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
v2-0001-wip-add-an-search-parameter-to-aggregate-into-thread.patch text/x-diff 2.1 KB

In response to

Browse pgsql-www by date

  From Date Subject
Next Message Magnus Hagander 2021-02-07 09:58:38 Re: gitweb configuration, tabs and spaces
Previous Message Jonathan S. Katz 2021-02-06 16:52:46 Re: Documentation building broken in CFBot