Re: [MASSMAIL]long running delete

From: Mark Steben <mark(dot)steben(at)drivedominion(dot)com>
To: gilberto(dot)castillo(at)etecsa(dot)cu
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: [MASSMAIL]long running delete
Date: 2016-06-28 16:31:28
Message-ID: CADyzmyyi66MSSJ=SS=Q8V0+n3OkzwKmkxhT8=NX_vhYfVM5HpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thank you for your quick response Gilberto. Much appreciated.
I may very well follow your recommendation if I can determine if the
current delete statement is not doing any work.
Here is the statement and an explain:

DELETE FROM contents USING contents AS c LEFT JOIN
contents_social_posts csp ON csp.content_id = c.id LEFT JOIN social_posts
sp ON sp.id = csp.social_post_id WHERE c.type = 'Content::Text::News'
AND (csp.id IS NULL OR sp.id IS NULL);
QUERY PLAN

--------------------------------------------------------------------------------------------------------
Delete on contents (cost=1150608.37..2295927.20 rows=1 width=24)
-> Nested Loop (cost=1150608.37..2295927.20 rows=1 width=24)
-> Hash Right Join (cost=1150608.37..1173423.64 rows=1 width=18)
Hash Cond: (csp.content_id = c.id)
Filter: ((csp.id IS NULL) OR (sp.id IS NULL))
-> Hash Left Join (cost=6247.16..8286.32 rows=54372
width=24)
Hash Cond: (csp.social_post_id = sp.id)
-> Seq Scan on contents_social_posts csp
(cost=0.00..951.72 rows=54372 width=18)
-> Hash (cost=4978.18..4978.18 rows=101518 width=10)
-> Seq Scan on social_posts sp
(cost=0.00..4978.18 rows=101518 width=10)
-> Hash (cost=1075343.73..1075343.73 rows=3970439 width=10)
-> Seq Scan on contents c (cost=0.00..1075343.73
rows=3970439 width=10)
Filter: ((type)::text =
'Content::Text::News'::text)
-> Seq Scan on contents (cost=0.00..1059623.78 rows=6287978
width=6)
(14 rows)

As you can see, many sequential scans especially 2 on the table we are
performing the delete on

Description of contents:

\d contents
Table "public.contents"
Column | Type |
Modifiers
---------------+-----------------------------+-------------------------------------------------------
id | integer | not null default
nextval('contents_id_seq'::regclass)
raw_content | text |
title | text |
description | text |
source_url | text |
published_at | timestamp without time zone |
guid | text |
type | character varying(255) |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
image_url | text |
original_id | character varying(255) |
refined_date | date |
thumbnail_url | text |
user_id | integer |
flagged_at | timestamp without time zone |
flagged_by | integer |
removed_at | timestamp without time zone |
removed_by | integer |
category_id | integer |
parent_id | integer |
parent_type | character varying(255) |
processing | boolean | default false
sharable_id | integer |
sharable_type | character varying(255) |
meta_data | text |
medium_url | text |
pinned_at | date |
pinned_until | date |
banner_url | text |
deleted_at | timestamp without time zone |
Indexes:
"primets_contents_pkey_id" PRIMARY KEY, btree (id), tablespace
"prime2indexes"
"primets_content_parent" btree (parent_id, parent_type), tablespace
"prime2indexes"
"primets_contents_category" btree (category_id), tablespace
"prime2indexes"
"primets_contents_desc_gin" gin (to_tsvector('english'::regconfig,
description)), tablespace "prime2indexes"
"primets_contents_guid" btree (guid), tablespace "prime2indexes"
"primets_contents_pin_priority" btree
((GREATEST(refined_date::timestamp without time zone, pinned_until + '1
day'::interval)), id), tablespace "prime2indexes"
"primets_contents_refined_date_id" btree (refined_date, id), tablespace
"prime2indexes"
"primets_contents_sharable_id_sharable_type" btree (sharable_id,
sharable_type), tablespace "prime2indexes"
"primets_contents_source_gin" gin (to_tsvector('english'::regconfig,
source_url)), tablespace "prime2indexes"
"primets_contents_title_gin" gin (to_tsvector('english'::regconfig,
title)), tablespace "prime2indexes"
"primets_contents_type" btree (type), tablespace "prime2indexes"
"primets_contents_user_id" btree (user_id), tablespace "prime2indexes"
Triggers:
_replication_logtrigger AFTER INSERT OR DELETE OR UPDATE ON contents
FOR EACH ROW EXECUTE PROCEDURE _replication.logtrigger('_replication',
'26', 'k')
_replication_truncatetrigger BEFORE TRUNCATE ON contents FOR EACH
STATEMENT EXECUTE PROCEDURE _replication.log_truncate('26')
Disabled triggers:
_replication_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON contents
FOR EACH ROW EXECUTE PROCEDURE _replication.denyaccess('_replication')
_replication_truncatedeny BEFORE TRUNCATE ON contents FOR EACH
STATEMENT EXECUTE PROCEDURE _replication.deny_truncate()

On Tue, Jun 28, 2016 at 1:15 PM, Gilberto Castillo <
gilberto(dot)castillo(at)etecsa(dot)cu> wrote:

>
> > Good morning,
> >
> > We have been running a delete for nearly 24 hours now. I would like to
> > verify that it is either doing what it is supposed to do or 'spinning its
> > wheels'.
> >
> > We are running postgres 9.2.12.
> >
> > The delete statement is not waiting on any other transaction.
> >
> > I have run straces on the pid and I see lots of 'reads, lseeks, and an
> > occasional semop. I have also looked in the base directory at the file
> > matched by the oid of the table (as defined in pg_class) and have seen no
> > change in size.
> >
> > Is there somewhere else I can verify that work is / is not being done?
> > Perhaps looking for something else in strace?
>
>
>
> My recomendation,
>
> El DELETE es prohibitivo en Cualquier gestor de BD para ello ponga a su
> concideración dos formas de como mejorar su comportamiento en PostgreSQL
>
> --Solución 1
>
> DELETE FROM string s WHERE NOT EXISTS (SELECT 1 FROM data d WHERE
> d.object_id = s.id OR d.property_id = s.id OR d.value_id = s.id);
>
> --Solución 2
>
> (1) CREATE TABLE copia AS SELECT (...) WHERE (...)
> (2) TRUNCATE en la tabla original.
> (3) INSERT (...) SELECT (...) --actualizar desde la copia la tabla original
> (4) DROP TABLE copia.
>
>
> >
> > Thanks for your time.
> >
> > --
> > *Mark Steben*
> > Database Administrator
> > @utoRevenue <http://www.autorevenue.com/> | Autobase
> > <http://www.autobase.net/>
> > CRM division of Dominion Dealer Solutions
> > 95D Ashley Ave.
> > West Springfield, MA 01089
> > t: 413.327-3045
> > f: 413.383-9567
> >
> > www.fb.com/DominionDealerSolutions
> > www.twitter.com/DominionDealer
> > www.drivedominion.com <http://www.autorevenue.com/>
> >
> > <http://autobasedigital.net/marketing/DD12_sig.jpg>
> >
>
>
> --
> Saludos,
> Gilberto Castillo
> ETECSA, La Habana, Cuba
>
>

--
*Mark Steben*
Database Administrator
@utoRevenue <http://www.autorevenue.com/> | Autobase
<http://www.autobase.net/>
CRM division of Dominion Dealer Solutions
95D Ashley Ave.
West Springfield, MA 01089
t: 413.327-3045
f: 413.383-9567

www.fb.com/DominionDealerSolutions
www.twitter.com/DominionDealer
www.drivedominion.com <http://www.autorevenue.com/>

<http://autobasedigital.net/marketing/DD12_sig.jpg>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Gilberto Castillo 2016-06-28 17:15:08 Re: [MASSMAIL]long running delete
Previous Message Mark Steben 2016-06-28 16:10:35 long running delete