Re: Uber migrated from Postgres to MySQL

From: Chris Travers <chris(dot)travers(at)gmail(dot)com>
To: Scott Mead <scottm(at)openscg(dot)com>
Cc: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Uber migrated from Postgres to MySQL
Date: 2016-07-27 16:54:44
Message-ID: CAKt_ZftL1_N8178BQh=YyQ96-R53OkQX9edoPwxjz56Rho2P6g@mail.gmail.com
Lists: pgsql-general

On Wed, Jul 27, 2016 at 4:22 PM, Scott Mead <scottm(at)openscg(dot)com> wrote:

> On Wed, Jul 27, 2016 at 3:34 AM, Achilleas Mantzios <
> achill(at)matrix(dot)gatewaynet(dot)com> wrote:
>
>> On 27/07/2016 10:15, Condor wrote:
>>
>>> On 26-07-2016 21:04, Dorian Hoxha wrote:
>>>
>>>> Many comments: https://news.ycombinator.com/item?id=12166585
>>>>
>>>> https://www.reddit.com/r/programming/comments/4uph84/why_uber_engineering_switched_from_postgres_to/
>>>>
>>>> On Tue, Jul 26, 2016 at 7:39 PM, Guyren Howe <guyren(at)gmail(dot)com> wrote:
>>>>
>>>> Honestly, I've never heard of anyone doing that. But it sounds like
>>>>> they had good reasons.
>>>>>
>>>>> https://eng.uber.com/mysql-migration/
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> --
>>>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-general
>>>>>
>>>>
>>>
>>> They are right for upgrades.
>>> It's hard to shut down a 1 TB database and wait a couple of days for
>>> pg_upgrade to finish while the database is offline.
>>> In some distros, after a PG version upgrade you no longer have the old
>>> binaries and libraries, so you need a full dump and restore, which takes
>>> time and disk space.
>>>
>>
>> Our last 1TB upgrade from 9.0 -> 9.3 went like a charm in something like
>> seconds. (with the -k option)
>> However, be warned that the planning and testing took one full week.
>>
>
> That being said, it doesn't really provide a back-out plan. The beauty of
> replication is that you can halt the upgrade at any point if need be and
> cut your (hopefully small) losses.
>

Replication does have limits, though, and one limitation of incremental
backups is that you cannot restore from one major version to the next.
Another one, which I think they obliquely referred to (in the subtle
problems section), is that if you have longer-running queries on the
replica and a lot of updates on the master, you can get odd
autovacuum-induced errors (writes from autovacuum on the master can
interrupt queries on the slave). BTW, if there is interest in what could be
done about that, something which allows autovacuum to decide how long to
wait before cleaning up dead tuples would be a great enhancement.
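
For reference, a rough sketch of the existing settings that bear on the
replica query cancellations mentioned above. The setting names are real
postgresql.conf parameters in the 9.x series; the values are illustrative
assumptions, not recommendations, and they are fixed thresholds rather than
anything autovacuum decides for itself:

```
# On the standby: let replayed WAL wait this long for conflicting
# queries before cancelling them (default is 30s).
max_standby_streaming_delay = 5min

# On the standby: report the oldest running query back to the master so
# vacuum holds off removing rows it still needs (costs bloat on master).
hot_standby_feedback = on

# On the master: defer cleanup of dead rows by this many transactions
# (default is 0).
vacuum_defer_cleanup_age = 10000
```

Each of these trades replica query safety against replication lag or bloat
on the master, so which one fits depends on the workload.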

I was on a project once where I was told, "we use pg_dump for our upgrades"
for a multi-TB database. When I asked why, the answer made a lot of sense.
Namely, if something goes wrong you need to restore onto the new version
from a logical backup anyway, so you have to take a pg_dump backup before
you start regardless. So the thinking was that it was better to keep
expectations low than to promise low downtime and then have a two-week
outage.

> If you use -k, you are all in. Sure, you could setup a new standby, stop
> traffic, upgrade whichever node you'd like (using -k) and still have the
> other ready in the event of total catastrophe. More often than not, I see
> DBAs and sysads lead the conversation with "well, postgres can't replicate
> from one version to another, so instead.... " followed by a fast-glazing of
> management's eyes and a desire to buy a 'commercial database'.
>

This is one area where we need better presentation of what we have and what
it does.

Streaming replication works great for certain things, such as where you
have lots of small queries against the replica, where they don't have to be
absolutely up to date, or where what you are really after is guarantees
that you can keep moving after one of your servers suffers a catastrophic
failure.

Where what you want is a guarantee that the two systems are identical at
the filesystem level, it is great. Where that is not what you want, it is a
pretty bad solution. But then there are Slony, Bucardo, and other logical
replication solutions out there (plus the newer logical replication
approaches in PostgreSQL) which handle the other situations very well (with
a very different sort of added complexity).

>
> All in all, Evan's blog seemed to start out decently technical, it quickly
> took a turn with half-truths, outdated information and, in some cases,
> downright fud:
>
> "The bug we ran into only affected certain releases of Postgres 9.2 and
> has been fixed for a long time now. However, we still find it worrisome
> that this class of bug can happen at all. A new version of Postgres could
> be released at any time that has a bug of this nature, and because of the
> way replication works, this issue has the potential to spread into all of
> the databases in a replication hierarchy."
>
>
> ISTM that they needed a tire swing
> <http://i0.wp.com/blogs.perficient.com/perficientdigital/files/2011/07/treecomicbig.jpg>
> and were using a dump truck. Hopefully they vectored somewhere in the
> middle and got themselves a nice sandbox.
>

My first thought was, "If they know the database that well, surely they
could have built something that would work well!"

However, for what they seem to want to do specifically, MySQL might not
actually be a bad choice. In a case like theirs, nearly all of the lookups
are probably simple primary key lookups, and there InnoDB's design helps
more than it hurts. If I had to name one area where MySQL probably does
better, it would be looking up documents by simple primary key (no joins,
no relational math, no need for complex plans, just a single primary key
index lookup). But this is also a reason we might not want to worry about
this sort of thing too much. Of course, NFS might be another alternative at
that level of complexity....

So yeah, a sandbox ;-)

>
> --Scott
>
>
>>
>>
>>>
>>> Regards,
>>> Hristo S.
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Achilleas Mantzios
>> IT DEV Lead
>> IT DEPT
>> Dynacom Tankers Mgmt
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
>
> --
> --
> Scott Mead
> Sr. Architect
> *OpenSCG <http://openscg.com>*
> http://openscg.com
>

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor
lock-in.
http://www.efficito.com/learn_more
