From: | Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com> |
---|---|
To: | Herouth Maoz <herouth(at)unicell(dot)co(dot)il> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Slow update |
Date: | 2009-01-21 11:23:05 |
Message-ID: | 92869e660901210323w32fa0843s7111aaab20dd9a00@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
2009/1/21 Herouth Maoz <herouth(at)unicell(dot)co(dot)il>
> Hello.
>
> I have a daily process that synchronizes our reports database from our
> production databases. In the past few days, it happened a couple of times
> that an update query took around 7-8 hours to complete, which seems a bit
> excessive. This is the query:
>
> UPDATE rb
> SET service = b.service,
> status = b.status,
> has_notification = gateway_id NOT IN (4,101,102),
> operator = COALESCE(
> b.actual_target_network_id,
> b.requested_target_network_id
> )
> FROM sms.billing b
> WHERE b.time_arrived >= :date_start
> AND rb.time_stamp >= :date_start
> AND rb.delivered = 0
> AND rb.sms_user = b.user_id
> AND rb.reference = b.user_reference
> AND OVERLAY( rb.msisdn placing '972' from 1 for 1 ) = b.msisdn
> AND NOT mo_billed
> AND system_id <> 6 -- Exclude Corporate, as it aleady has
> service/status
> ;
>
> The variable ":date_start" is set to a date 3 days ago.
>
> I ran explain for this query and it gave me this:
>
>
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=21567.12..854759.82 rows=1 width=210)
> Join Filter: (((rb.sms_user)::text = (b.user_id)::text) AND
> (rb.reference = b.user_reference))
> -> Bitmap Heap Scan on rb (cost=21546.02..23946.16 rows=819 width=198)
> Recheck Cond: ((delivered = 0) AND (time_stamp >= '2009-01-18
> 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21
> 00:00:00'::timestamp without time zone))
> Filter: ((NOT mo_billed) AND (system_id <> 6))
> -> BitmapAnd (cost=21546.02..21546.02 rows=819 width=0)
> -> Bitmap Index Scan on rb_delivered_ind
> (cost=0.00..1419.99 rows=45768 width=0)
> Index Cond: (delivered = 0)
> -> Bitmap Index Scan on rb_timestamp_ind
> (cost=0.00..20125.37 rows=188994 width=0)
> Index Cond: ((time_stamp >= '2009-01-18
> 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21
> 00:00:00'::timestamp without time zone))
> -> Bitmap Heap Scan on billing b (cost=21.10..1004.77 rows=351
> width=49)
> Recheck Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text,
> 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2)))
> Filter: ((b.time_arrived >= '2009-01-18 00:00:00'::timestamp
> without time zone) AND (b.time_arrived < '2009-01-21 00:00:00'::timestamp
> without time zone))
> -> Bitmap Index Scan on billing_msisdn_sme_reference
> (cost=0.00..21.10 rows=351 width=0)
> Index Cond: ((b.msisdn)::text =
> (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) ||
> "substring"((rb.msisdn)::text, 2)))
>
> I'm not an expert on reading plans, but it seems to me that it uses indices
> on both tables that participate in this query, so it shouldn't take such a
> long time.
>
> The number of records in the table rb for the past three days is 386833. On
> the sms.billing table it seems to select the index on the msisdn and
> sme_reference fields and use it partially (only using the msisdn field).
> Looking at that table, the frequency of each value in the msisdn field is at
> most 17678 for the current data, where mostly it's a couple of thousands.
> How can this take so long?
>
1. which postgres version?
2. can you post results of EXPLAIN ANALYZE (please note it actually executes
the query)?
--
Filip Rembiałkowski
From | Date | Subject | |
---|---|---|---|
Next Message | Marc Mamin | 2009-01-21 11:36:08 | Re: Slow update |
Previous Message | Roger Leigh | 2009-01-21 10:48:09 | Re: Custom type, operators and operator class not sorting/indexing correctly |