From: | Philip Semanchuk <philip(at)americanefficient(dot)com> |
---|---|
To: | Igor Gois <igor(at)bixtecnologia(dot)com(dot)br> |
Cc: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, Maurici Meneghetti <maurici(dot)meneghetti(at)bixtecnologia(dot)com(dot)br>, pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Postgres performance comparing GCP and AWS |
Date: | 2021-02-25 22:32:32 |
Message-ID: | CD1E1A91-7F3D-4A6C-ACF7-AF59E6DCB9EA@americanefficient.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
> On Feb 25, 2021, at 4:04 PM, Igor Gois <igor(at)bixtecnologia(dot)com(dot)br> wrote:
>
> Philip,
>
> The results in first email in this thread were using explain analyze.
>
> I thought that you asked to run using only 'explain'. My bad.
>
> The point is, the execution time with explain analyze is less the 1 second. But the actual execution time (calculated from the python client) is 24 seconds (aws) and 300+ seconds in gcp
Oh OK, sorry, I wasn’t following. Yes, network speed sounds like the source of the problem.
Under AWS sometimes we log into an EC2 instance if we have to run a query that generates a lot of data so that both server and client are inside AWS. If GCP has something similar to EC2, it might be an interesting experiment to run your query from there and see how much, if any, that changes the time it takes to get results.
Hope this helps
Philip
>
> Em qui., 25 de fev. de 2021 às 17:53, Philip Semanchuk <philip(at)americanefficient(dot)com> escreveu:
>
>
> > On Feb 25, 2021, at 3:46 PM, Igor Gois <igor(at)bixtecnologia(dot)com(dot)br> wrote:
> >
> > Hi, Philip
> >
> > We ran: EXPLAIN (FORMAT JSON) SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime" BETWEEN '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
> >
> > but it was really fast. I think the results were discarded.
>
> EXPLAIN and EXPLAIN ANALYZE are different in an important way. EXPLAIN merely plans the query, EXPLAIN ANALYZE plans *and executes* the query. From the doc —
>
> "The ANALYZE option causes the statement to be actually executed, not only planned....Keep in mind that the statement is actually executed when the ANALYZE option is used. Although EXPLAIN will discard any output that a SELECT would return, other side effects of the statement will happen as usual. “
>
> https://www.postgresql.org/docs/12/sql-explain.html
>
>
> >
> > AWS Execution time select without explain: 24.96505s (calculated in python client)
> > AWS Execution time select with explain but without analyze: 0.03876s (calculated in python client)
> >
> > https://explain.depesz.com/s/5HRO
> >
> > Thanks in advance
> >
> >
> > Em qui., 25 de fev. de 2021 às 15:13, Philip Semanchuk <philip(at)americanefficient(dot)com> escreveu:
> >
> >
> > > On Feb 24, 2021, at 10:11 AM, Igor Gois <igor(at)bixtecnologia(dot)com(dot)br> wrote:
> > >
> > > Hi, Julien
> > >
> > > Your hypothesis about network transfer makes sense. The query returns a big size byte array blobs.
> > >
> > > Is there a way to test the network speed against the instances? I have access to the network speed in gcp (5 Mb/s), but don't have access in aws rds.
> >
> > Perhaps what you should run is EXPLAIN ANALYZE SELECT...? My understanding is that EXPLAIN ANALYZE executes the query but discards the results. That doesn’t tell you the network speed of your AWS instance, but it does isolate the query execution speed (which is what I think you’re trying to measure) from the network speed.
> >
> > Hope this is useful.
> >
> > Cheers
> > Philip
> >
> > >
> > >
> > > Em qua., 24 de fev. de 2021 às 10:35, Julien Rouhaud <rjuju123(at)gmail(dot)com> escreveu:
> > > Hi,
> > >
> > > On Wed, Feb 24, 2021 at 6:14 AM Maurici Meneghetti
> > > <maurici(dot)meneghetti(at)bixtecnologia(dot)com(dot)br> wrote:
> > > >
> > > > I have 2 postgres instances created from the same dump (backup), one on a GCP VM and the other on AWS RDS. The first instance takes 18 minutes and the second one takes less than 20s to run this simples query:
> > > > SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime" BETWEEN '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
> > > > I’ve run this query a few times to make sure both should be reading data from cache.
> > > > I expect my postgres on GPC to be at least similar to the one managed by AWS RDS so that I can work on improvements parallelly and compare.
> > > >
> > > > DETAILS:
> > > > [...]
> > > > Planning time: 456.315 ms
> > > > Execution time: 776.976 ms
> > > >
> > > > Query explain for Postgres on AWS RDS:
> > > > [...]
> > > > Planning time: 0.407 ms
> > > > Execution time: 14.87 ms
> > >
> > > Those queries were executed in respectively ~1s and ~15ms (one thing
> > > to note is that the slower one had less data in cache, which may or
> > > may note account for the difference). Does those plans reflect the
> > > reality of your slow executions? If yes it's likely due to quite slow
> > > network transfer. Otherwise we would need an explain plan from the
> > > slow execution, for which auto_explain can help you. See
> > > https://www.postgresql.org/docs/11/auto-explain.html for more details.
> > >
> > >
> > > --
> > > Att,
> > >
> > > Igor Gois | Sócio Consultor
> > > (48) 99169-9889 | Skype: igor_msg
> > > Site | Blog | LinkedIn | Facebook | Instagram
> > >
> > >
> >
> >
> >
> > --
> > Att,
> >
> > Igor Gois | Sócio Consultor
> > (48) 99169-9889 | Skype: igor_msg
> > Site | Blog | LinkedIn | Facebook | Instagram
> >
> >
>
>
>
> --
> Att,
>
> Igor Gois | Sócio Consultor
> (48) 99169-9889 | Skype: igor_msg
> Site | Blog | LinkedIn | Facebook | Instagram
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Arne Roland | 2021-02-25 23:07:08 | Disabling options lowers the estimated cost of a query |
Previous Message | Igor Gois | 2021-02-25 21:04:50 | Re: Postgres performance comparing GCP and AWS |