From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: robots.txt on git.postgresql.org
Date: 2013-07-11 14:05:59
Message-ID: CABUevEy9pS6ERtg3xqzo31wv_93br=AzEHfbeM5m4kidypgTRA@mail.gmail.com
Lists: pgsql-hackers
On Thu, Jul 11, 2013 at 3:43 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Wed, Jul 10, 2013 at 9:36 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> We already run this, that's what we did to make it survive at all. The
>> problem is there are so many thousands of different URLs you can get
>> to on that site, and google indexes them all by default.
>
> There's also https://support.google.com/webmasters/answer/48620?hl=en
> which lets us control how fast the Google crawler crawls. I think it's
> adaptive though, so if the pages are slow it should be crawling slowly.
Sure, but there are plenty of other search engines as well, not just
Google... Google is actually "reasonably good" at scaling back its
own speed, in my experience, which is not true of all the others. Of
course, that also means it takes a long time to actually crawl the
site, since there are so many different URLs...
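For what it's worth, a minimal robots.txt sketch of the kind of
per-crawler throttling being discussed might look like the following.
The gitweb path is illustrative only, not the actual layout of
git.postgresql.org, and Crawl-delay is only honored by some crawlers
(e.g. Bing and Yandex); Googlebot ignores it and has to be tuned
through the Webmaster Tools page linked above:

    # Honored by crawlers such as Bing and Yandex; Googlebot ignores
    # Crawl-delay and is rate-limited via Webmaster Tools instead.
    User-agent: *
    Crawl-delay: 10
    # Keep all robots out of the expensive, effectively unbounded
    # gitweb URL space (path is illustrative, not the real config).
    Disallow: /gitweb/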
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/