| From: | Greg Stark <stark(at)mit(dot)edu> |
|---|---|
| To: | Magnus Hagander <magnus(at)hagander(dot)net> |
| Cc: | Craig Ringer <craig(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: robots.txt on git.postgresql.org |
| Date: | 2013-07-11 13:43:21 |
| Message-ID: | CAM-w4HPdUbND-qA8ho1EB-wvj+tXcX=0H_6JtQNbkd_UZsDmHw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Jul 10, 2013 at 9:36 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> We already run this, that's what we did to make it survive at all. The
> problem is there are so many thousands of different URLs you can get
> to on that site, and google indexes them all by default.
There's also https://support.google.com/webmasters/answer/48620?hl=en
which lets us control how fast the Google crawler crawls. I think it's
adaptive, though, so if the pages are slow it should be crawling slowly.
--
greg