Re: Weird performance difference

From: Israel Brewster <israel(at)ravnalaska(dot)net>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Weird performance difference
Date: 2017-10-20 23:52:44
Message-ID: 033D1F44-56D4-4149-B9A4-8B8A8DFE96A5@ravnalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Oct 20, 2017, at 3:31 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Fri, Oct 20, 2017 at 03:08:26PM -0800, Israel Brewster wrote:
>> Summary: the following query takes around 12 seconds on my test machine. On my production machine, it's at half an hour and counting. What's going on?
>>
>> which, when run on my test server, has this explain analyze output: https://explain.depesz.com/s/4piv <https://explain.depesz.com/s/4piv>. Around 12 second runtime, which isn't too bad (in the grand scheme of things), although there is probably room for improvement.
>
> Are these cast to ::date cast is really needed (Alternately, do you have an index on column::date ?)
> |WHERE outtime::date>='2017-01-01'
> |ON outtime::date BETWEEN oag_schedules.startdate
> |AND outtime::date BETWEEN oag_batches.eff_from

Possibly not, but outtime is a timestamp while startdate and eff_from are simple date fields. I was being explicit :-) and no, I don't have an index on column::date, that's certainly something I can try to optimize things.

> The problem is clearly here:
>
> Merge Join (cost=30,604.12..31,301.12 ROWS=1 width=76) (actual time=1,153.883..9,812.434 ROWS=3,420,235 loops=1)
> Merge Cond: ((((oag_schedules.flightnum)::text) = (legdetail.flightnum)::text) AND ((oag_schedules.origin)::text = (legdetail.legfrom)::text))
> Join Filter: (((legdetail.outtime)::date >= oag_schedules.startdate) AND ((legdetail.outtime)::date <= COALESCE(oag_schedules.enddate, 'infinity'::date)) AND (date_part('isodow'::text, ((legdetail.outtime)::date)::timestamp without time zone) = ANY ((oag_schedules.frequency)::double precision[])))
> ROWS REMOVED BY JOIN FILTER: 6822878
>
> Can you send "explain" (not explain analyze) for the production server?

Not for the full query - it only just completed, after 70 minutes or so, and I wasn't running under EXPLAIN ANALYZE. Running with a shorter date range of only 7 days, as you suggest below: https://explain.depesz.com/s/r80j <https://explain.depesz.com/s/r80j>

I notice that this is a COMPLETELY different query plan from what I got on my test machine, but I'm not sure if that's a good thing or not. The end result is obviously not.

>
> And \d for those tables.

Sure:

flightlogs=# \d legdetail
Table "public.legdetail"
Column | Type | Modifiers
------------+--------------------------+--------------------------------------------------------
id | integer | not null default nextval('legdetail_id_seq'::regclass)
logid | integer |
flightnum | character varying(32) |
legfrom | character varying(6) |
legto | character varying(6) |
pax | integer | default 0
cargo | integer | default 0
legdate | date | default now()
outtime | timestamp with time zone | not null
ontime | timestamp with time zone |
offtime | timestamp with time zone |
intime | timestamp with time zone |
blocktime | interval |
flighttime | interval |
mail | integer | default 0
csom | character(1) |
nonrevpax | smallint |
legsic | character varying(128) |
nonrevfrt | integer |
ti | boolean | default false
Indexes:
"legdetail_pkey" PRIMARY KEY, btree (id)
"csom_idx" btree (csom)
"flightnum_idx" btree (flightnum)
"legdate_idx" btree (legdate)
"legfrom_idx" btree (legfrom)
"legto_idx" btree (legto)
"logid_idx" btree (logid)
"outtime_idx" btree (outtime)
Foreign-key constraints:
"logfk" FOREIGN KEY (logid) REFERENCES logs(id)
Triggers:
legdetail_audit AFTER UPDATE ON legdetail FOR EACH ROW WHEN (old.* <> new.*) EXECUTE PROCEDURE changeaudit()
update_hours_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_hours()
update_logtime_trigger AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE update_logtime()
update_status_legs AFTER INSERT OR DELETE OR UPDATE ON legdetail FOR EACH ROW EXECUTE PROCEDURE acstatupdate_loc()

flightlogs=# \d oag_batches
Foreign table "public.oag_batches"
Column | Type | Modifiers | FDW Options
----------+--------------------------+-----------+--------------------------
batchid | uuid | not null | (column_name 'batchid')
name | character varying(256) | not null | (column_name 'name')
created | timestamp with time zone | | (column_name 'created')
eff_from | date | | (column_name 'eff_from')
eff_to | date | | (column_name 'eff_to')
Server: localhost
FDW Options: (schema_name 'public', table_name 'batches')

flightlogs=# \d oag_schedules
Foreign table "public.oag_schedules"
Column | Type | Modifiers | FDW Options
--------------+------------------------+-----------+------------------------------
id | uuid | not null | (column_name 'id')
batchid | uuid | | (column_name 'batchid')
company | character varying(3) | | (column_name 'company')
flightnum | integer | not null | (column_name 'flightnum')
startdate | date | | (column_name 'startdate')
enddate | date | | (column_name 'enddate')
frequency | integer[] | | (column_name 'frequency')
origin | character varying(3) | | (column_name 'origin')
depart_time | time without time zone | | (column_name 'depart_time')
destination | character varying(3) | | (column_name 'destination')
arrival_time | time without time zone | | (column_name 'arrival_time')
equipment | character varying(3) | | (column_name 'equipment')
flight_type | character(1) | | (column_name 'flight_type')
num_seats | smallint | | (column_name 'num_seats')
Server: localhost
FDW Options: (schema_name 'public', table_name 'schedules')

>
> And/or EXPLAIN ANALYZE for a query with shorter date range on production (to
> confirm it has a similar problem in rowcount estimation).
>
> You can try munging the query to move/change the "Join Filter" components of
> the query to see which one is contributing most to the rowcount estimate being
> off by a factor of 3e6.
>
> Justin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2017-10-21 00:38:33 Re: Weird performance difference
Previous Message Justin Pryzby 2017-10-20 23:31:40 Re: Weird performance difference