Re: Postgres performance comparing GCP and AWS

From: Imre Samu <pella(dot)samu(at)gmail(dot)com>
To: Maurici Meneghetti <maurici(dot)meneghetti(at)bixtecnologia(dot)com(dot)br>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres performance comparing GCP and AWS
Date: 2021-02-24 15:16:25
Message-ID: CAJnEWwkPJmkqSvYPynkmFJK_oMbVtqk6TDeTeUSnPqAne511Cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> I expect my postgres on GPC to be at least similar to the one managed by
AWS RDS

imho:
- on Google Cloud you can test with "Cloud SQL for Postgresql" (
https://cloud.google.com/sql/docs/postgres )
- on Google Compute Engine ( VM ): you have to tune the disks ; linux ;
file system ; scheduler ;
and it is a complex task

imho: select the perfect disk types for the postgresql data ( and create
a fast RAID )
https://cloud.google.com/compute/docs/disks

*Compute Engine offers several types of storage options for your instances.
Each of the following storage options has unique price and performance
characteristics:*

*- Zonal persistent disk: Efficient, reliable block storage.*

*- Regional persistent disk: Regional block storage replicated in two
zones.*
*- Local SSD: High performance, transient, local block storage.*
*- Cloud Storage buckets: Affordable object storage.*
*- Filestore: High performance file storage for Google Cloud users.*

regards,
Imre

Maurici Meneghetti <maurici(dot)meneghetti(at)bixtecnologia(dot)com(dot)br> ezt írta
(időpont: 2021. febr. 23., K, 23:14):

> Hi everyone,
>
> I have 2 postgres instances created from the same dump (backup), one on a
> GCP VM and the other on AWS RDS. The first instance takes 18 minutes and
> the second one takes less than 20s to run this simples query:
> SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM
> "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime" BETWEEN
> '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
> I’ve run this query a few times to make sure both should be reading data
> from cache.
> I expect my postgres on GPC to be at least similar to the one managed by
> AWS RDS so that I can work on improvements parallelly and compare.
>
>
>
> *DETAILS:Query explain for Postgres on GCP VM:*Bitmap Heap Scan on
> SignalRecordsBlobs SignalRecordsBlobs (cost=18.80..2480.65 rows=799
> width=70) (actual time=216.766..776.032 rows=5122 loops=1)
> Filter: (("DateTime" >= \'2019-11-28 14:00:12.5402\'::timestamp
> without time zone) AND ("DateTime" <= \'2020-07-23
> 21:12:32.249\'::timestamp without time zone))
> Heap Blocks: exact=5223
> Buffers: shared hit=423 read=4821
> -> Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId
> (cost=0.00..18.61 rows=824 width=0) (actual time=109.000..109.001
> rows=5228 loops=1)
> Index Cond: ("SignalSettingId" = 103)
> Buffers: shared hit=3 read=18
> Planning time: 456.315 ms
> Execution time: 776.976 ms
>
>
> *Query explain for Postgres on AWS RDS:*Bitmap Heap Scan on
> SignalRecordsBlobs SignalRecordsBlobs (cost=190.02..13204.28 rows=6213
> width=69) (actual time=2.215..14.505 rows=5122 loops=1)
> Filter: (("DateTime" >= \'2019-11-28 14:00:12.5402\'::timestamp
> without time zone) AND ("DateTime" <= \'2020-07-23
> 21:12:32.249\'::timestamp without time zone))
> Heap Blocks: exact=5209
> Buffers: shared hit=3290 read=1948
> -> Bitmap Index Scan on IDX_SignalRecordsBlobs_SignalSettingId
> (cost=0.00..188.46 rows=6405 width=0) (actual time=1.159..1.159 rows=5228
> loops=1)
> Index Cond: ("SignalSettingId" = 103)
> Buffers: shared hit=3 read=26
> Planning time: 0.407 ms
> Execution time: 14.87 ms
>
>
> *PostgreSQL version number running:• VM on GCP*: PostgreSQL 11.10 (Debian
> 11.10-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6)
> 8.3.0, 64-bit
> *• Managed by RDS on AWS:* PostgreSQL 11.10 on x86_64-pc-linux-gnu,
> compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
>
>
> *How PostgreSQL was installed:• VM on GCP*: Already installed when
> created VM running Debian on Google Console.
> *• Managed by RDS on AWS:* RDS managed the installation.
>
>
> *Changes made to the settings in the postgresql.conf file:*Here are some
> postgres parameters that might be useful:
> *Instance on VM on GCP (2 vCPUs, 2 GB memory, 800 GB disk):*
> • effective_cache_size: 1496MB
> • maintenance_work_mem: 255462kB (close to 249MB)
> • max_wal_size: 1GB
> • min_wal_size: 512MB
> • shared_buffers: 510920kB (close to 499MB)
> • max_locks_per_transaction 1000
> • wal_buffers: 15320kB (close to 15MB)
> • work_mem: 2554kB
> • effective_io_concurrency: 200
> • dynamic_shared_memory_type: posix
> On this instance we installed a postgres extension called timescaledb to
> gain performance on other tables. Some of these parameters were set using
> recommendations from that extension.
>
> *Instance managed by RDS (2 vCPUs, 2 GiB RAM, 250GB disk, 750 de IOPS):*
> • effective_cache_size: 1887792kB (close to 1844MB)
> • maintenance_work_mem: 64MB
> • max_wal_size: 2GB
> • min_wal_size: 192MB
> • shared_buffers: 943896kB (close to 922MB)
> • max_locks_per_transaction 64
>
>
> *Operating system and version by runing "uname -a":• VM on GCP:* Linux
> {{{my instance name}}} 4.19.0-14-cloud-amd64 #1 SMP Debian 4.19.171-2
> (2021-01-30) x86_64 GNU/Linux
> *• Managed by AWS RDS:* Aparently Red Hay as shown using SELECT version();
>
> *Program used to connect to PostgreSQL:* Python psycopg2.connect() to
> create the connection and pandas read_sql_query() to query using that
> connection.
>
> Thanks in advance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Philip Semanchuk 2021-02-25 18:13:33 Re: Postgres performance comparing GCP and AWS
Previous Message Igor Gois 2021-02-24 15:11:30 Re: Postgres performance comparing GCP and AWS