Re: Fixing Google Search on the docs (redux)

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Fixing Google Search on the docs (redux)
Date: 2020-11-21 14:57:28
Message-ID: CABUevExxMMkQ78fHi9wkjcs1tTduUEE8ZrWxiRpdp0Tk1D0dcw@mail.gmail.com
Lists: pgsql-www

On Thu, Nov 19, 2020 at 8:50 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2020-11-18 18:28:49 +0100, Magnus Hagander wrote:
> > We've discussed this many times before, and I think so far they've all
> > bogged down at "google suck" :) The problem is that they don't even
> > consider the case like we have where the pages *aren't* identical, but
> > yet related.
>
> Is any search engine better at this? I don't think so?

I doubt it; most tend to copy Google. And in either case it doesn't
matter that much -- the *vast* majority of our inbound search traffic
comes from Google rather than the other search engines, by such a margin
that there's not even a point in considering the others.

> > The problem it usually comes down to is that if we do that, then you
> > will no longer be able to say search for something in the old docs *at
> > all*.
>
> I think that'd still be better than the current situation. But I hope we
> can do better:
>
> > A good example right now might be that recovery.conf stuff goes
> > away. Even if you explicitly search for "postgresql recovery.conf 11".
> > And I'd guess the majority of people are actually looking for things
> > in versions that are NOT the latest (though an even bigger majority of
> > people will be looking for things in versions that are not 9.1).
>
> E.g. not applying canonical when there's no newer version.

That we can definitely do. So for recovery.conf it would still work,
but for anything on a page that still exists in the newer version, I don't
see how we could separate that out and not do a canonical for that...
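
A minimal sketch of the rule being discussed here: emit a canonical link
only when the same page still exists in the current docs, and emit nothing
when the page has been removed (as with recovery.conf), so the old page stays
searchable. The function name, the set of current pages, and the URL layout
are assumptions made for illustration, not the actual pgweb code.

```python
from typing import Optional, Set

# Assumed URL layout for the "current" docs; illustrative only.
BASE_URL = "https://www.postgresql.org/docs"


def canonical_url(page: str, pages_in_current: Set[str]) -> Optional[str]:
    """Return the canonical URL for a versioned docs page, or None.

    None means "emit no rel=canonical tag", which leaves the old page
    indexable on its own.
    """
    if page in pages_in_current:
        # A newer copy of this page exists: point search engines at it.
        return f"{BASE_URL}/current/{page}"
    # The page was dropped in newer versions: no canonical, keep it findable.
    return None
```

With this rule, a page that exists only in old versions gets no canonical
link, while every page that still exists points at its /current/ copy, which
is exactly the case that can't be separated out.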

> > I don't know of any way to actually tell google to prioritise the new
> > versions. You used to be able to do this using the sitemap.xml stuff,
> > which is why we do that, but at some point they just stopped caring
> > about those, even in the cases where we're *lowering* our own
> > priority, under the argument of not letting us increase our priority.
>
> Have we evaluated not using canonical, but not including old versions in
> the sitemap?

AIUI from my reading, Google mostly ignores sitemaps these days. The
only thing they're used for is seeding *new* URLs into the search engine,
not removing old ones and not affecting priority. Probably
because they were abused too much.
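
For reference, a rough sketch of what "lowering our own priority" in a
sitemap looks like, using the <priority> element from the sitemaps.org
protocol. The helper name and the example priorities are hypothetical, not
what the site actually generates, and per the above Google most likely
ignores the value anyway.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls_with_priority):
    """Build a sitemap XML string from (url, priority) pairs.

    Priority is a hint between 0.0 and 1.0; old doc versions would get a
    lower value than the current ones.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in urls_with_priority:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")
```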

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
