Re: Fixing Google Search on the docs (redux)

From: Greg Stark <stark(at)mit(dot)edu>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Fixing Google Search on the docs (redux)
Date: 2020-11-19 14:19:06
Message-ID: CAM-w4HOheMMcDOJUCZn32YwKEux_VJYjPKjVXufLWnGkrWon_g@mail.gmail.com
Lists: pgsql-www

> all other URLs will be considered duplicate URLs and crawled less often

What Google crawls and what Google considers a valid search result to
serve users are two independent questions. Google may well crawl the
non-canonical results but never serve them. The crawl would still, for
example, add weight to pages linked from them. It's always really hard
to tell when reading Google's documentation whether it's talking about
crawl behaviour or search results behaviour.

> - Where a page has been removed entirely, mark the most recent version of it as the canonical one instead of the /current/ version.

This seems like a significant advance on previous ideas. If we have
enough metadata available to do this, that would be a big win. I think
it's rare that we remove information from a page but keep the same
page. Generally, things like recovery.conf would mean removing whole
pages and replacing them with new pages that document new functionality.
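
To make the quoted rule concrete, here is a rough sketch of how the canonical
URL could be chosen, assuming we know which versions each page exists in. The
function names, the /docs/<version>/<page> URL layout, and the example version
sets are all illustrative assumptions, not how pgweb actually stores or
computes this.

```python
def version_key(version):
    """Turn a version string such as "9.6" or "12" into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))


def canonical_url(page, versions_with_page, current_version):
    """Pick the canonical docs URL for a page (hypothetical helper).

    If the page still exists in the current version, /current/ is canonical.
    Otherwise the most recent version that still contains the page is used.
    """
    if not versions_with_page:
        raise ValueError(f"no known versions for {page}")
    if current_version in versions_with_page:
        return f"/docs/current/{page}"
    latest = max(versions_with_page, key=version_key)
    return f"/docs/{latest}/{page}"


# Illustrative data only: a page that was dropped after version 11,
# and a page that still exists in the current version.
removed = canonical_url("recovery-config.html", {"9.6", "10", "11"}, "13")
kept = canonical_url("sql-select.html", {"11", "12", "13"}, "13")
```

The only metadata this needs is the set of versions per page, which is the
"enough metadata" question above.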
