Re: once more: documentation search indexing

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: PostgreSQL WWW <pgsql-www(at)lists(dot)postgresql(dot)org>
Subject: Re: once more: documentation search indexing
Date: 2021-06-12 21:37:53
Message-ID: 20210612213753.pagmpjjtixjztenj@alap3.anarazel.de
Lists: pgsql-www

Hi,

On 2021-06-12 17:05:22 -0400, Jonathan S. Katz wrote:
> Thank you for bringing this up. I applaud the suggested approach.

Glad to hear it.

> > Suggested small steps:
> >
> > - add a docs/current link to https://www.postgresql.org/docs/. Often
> > enough that's what a user wants anyway, and it's not useful to add
> > additional steps for users and search engines to navigate to
> > docs/current/.
>
> We do that at the very top: that is the first link in the main body.
> This was done back in Nov 2020[1]

Oh - I had not realized that at all. I think the similarity to the news
bar made me completely blend the "view the manual" element out.

> > I can see us either making it a separate row in the versioned table,
> > or to split the most recent released version's link into a /current/
> > and $major link.
>
> I'm not sure if that's any different than the above right now; if there
> is something you could cite around that, I'm happy to be convinced
> otherwise.

I don't think the existing link is particularly helpful - it's just
visually too different from the other links. And it doesn't indicate
which version it is for, etc.

> However, I'm also not opposed to putting a (Current) link next to the
> current version in the table. I think that'd at least be helpful from a
> user perspective, if they don't click the big button up top.

Yea, I think that'd be good.

> > - put version in page titles where it makes sense. E.g. change
> > "PostgreSQL: Documentation: 10: 6.1. Inserting Data" to
> > "PostgreSQL 10 Documentation: 6.1. Inserting Data"
> >
> > The current ordering doesn't seem like it has much going for it, and
> > it can't help search engines to have the version number people might
> > search for removed from the product name.
> >
> > Right now this seems to contribute to less than helpful titles in
> > search engine results. Searching anonymously for "postgres alter
> > table" I get the less than helpful "Documentation: 12: ALTER TABLE -
> > PostgreSQL" on google.
> >
> > It might also be worth going a bit further and putting the documentation
> > version *after* the page title, given that it's most likely already
> > clear to the reader that this is about postgres. I.e. something like
> > "ALTER TABLE - Documentation for PostgreSQL 14"
>
> I think having "PostgreSQL $MAJOR_VERSION" together would help both for
> some of the indexing issues + readability in the search engine. The
> question is how the content is ordered in the title.
>
> Doing "PostgreSQL $MAJOR_VERSION: Documentation: $page_title" might be
> the way to go. The other thing I see done for SEO is what you suggest, but
> just hyphenated, i.e. "ALTER TABLE - Documentation - PostgreSQL 14"
>
> Anyway, I'm generally in favor of combining at least "PostgreSQL
> $MAJOR_VERSION".

Yea, let's do that separately then.

WRT ordering, I do think I prefer the versions with the actual subject
of the page first. A leading "PostgreSQL 14 Documentation" is really not
helpful for distinguishing between different PG doc pages. I often have
multiple doc pages open in different tabs, and right now there's no way
to distinguish them, because there's never enough space for even just
"PostgreSQL 13: Documentation:", not to speak of an actual title.

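To illustrate the tab problem, here is a small Python sketch comparing
the current title layout with a subject-first one. The function names and
the 20-character tab width are assumptions made purely for illustration;
real browsers truncate by pixel width, and this is not how the website
actually builds its titles.

```python
def current_style(version: str, page_title: str) -> str:
    # Layout used today, e.g. "PostgreSQL: Documentation: 10: 6.1. Inserting Data"
    return f"PostgreSQL: Documentation: {version}: {page_title}"


def subject_first(version: str, page_title: str) -> str:
    # Suggested layout, e.g. "ALTER TABLE - Documentation for PostgreSQL 14"
    return f"{page_title} - Documentation for PostgreSQL {version}"


def tab_label(title: str, width: int = 20) -> str:
    # Crude stand-in for a narrow browser tab (width is an assumed value)
    if len(title) <= width:
        return title
    return title[: width - 3] + "..."


pages = ["ALTER TABLE", "6.1. Inserting Data"]
for build in (current_style, subject_first):
    # Print the truncated tab labels each layout would produce
    print([tab_label(build("13", page)) for page in pages])
```

With the current layout both labels are cut off before the page subject
is reached, so the two tabs look the same; with the subject first they
stay distinguishable.
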
> That all said, as stated and cited in some of those previous threads, I
> think the biggest lift is around making our documentation URLs
> canonical. After discussing with Magnus a bit, there are a few things
> that we need to consider in it:
>
> 1. Whether or not the documentation page is in "current"
> 2. If it's not in "current", which is the last version the page is a
> part of? We make that the canonical

Yea, I know that's a potentially significant improvement. I just didn't
feel it was useful to wade into the topic, because it's been discussed
for about a decade by now, and there are things we could make easier
progress on...
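
The two rules quoted above could be sketched roughly as follows in
Python. This is only an illustration and not the attached patch: the
function name, the list-of-versions input, and the choice to return no
canonical version for pages that exist solely in "devel" are all
assumptions.

```python
from typing import Optional, Sequence


def canonical_version(page_versions: Sequence[str], current: str) -> Optional[str]:
    """Pick the docs version whose URL would be the canonical one for a page.

    page_versions: versions that contain the page, e.g. ["9.6", "10", "devel"]
                   (hypothetical input format).
    current:       the major version that docs/current/ points at.
    """
    # Assumption: pages that only exist in "devel" get no canonical URL.
    released = [v for v in page_versions if v != "devel"]
    if not released:
        return None
    # Rule 1: if the page is part of "current", that is the canonical.
    if current in released:
        return "current"
    # Rule 2: otherwise use the last version the page was part of.
    return max(released, key=lambda v: tuple(int(part) for part in v.split(".")))


print(canonical_version(["12", "13", "devel"], current="13"))
print(canonical_version(["9.6", "10"], current="13"))
print(canonical_version(["devel"], current="13"))
```

Comparing versions as integer tuples keeps "10" sorting after "9.6",
which a plain string comparison would get wrong.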

> I've attached a patch that does this. The one part I'm not sure I like
> is how we treat something that is solely in "devel" -- knowing that
> eventually something in devel could end up in current. Perhaps if
> something is only in "devel", we exclude it from being part of the
> canonical tree?

Right now all of docs/devel is prevented from being indexed via
robots.txt:
Disallow: /docs/devel/

So it won't really matter for SEO purposes.
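
The effect of that rule can be checked with Python's standard
urllib.robotparser module. This is a minimal sketch under assumptions:
only the Disallow line is taken from the robots.txt quoted above, while
the "User-agent: *" line, the helper name, and the example page name are
there for illustration.

```python
from urllib.robotparser import RobotFileParser

# Only the Disallow line comes from the site's robots.txt as quoted above;
# the User-agent line is an assumption so the rule applies to every crawler.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /docs/devel/",
])


def is_crawlable(url: str) -> bool:
    # True if a generic crawler may fetch the URL under the rules above
    return rules.can_fetch("*", url)


for version in ("devel", "current"):
    url = f"https://www.postgresql.org/docs/{version}/sql-altertable.html"
    print(version, is_crawlable(url))
```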

Greetings,

Andres Freund
