Re: Performance implications of adding a "disabled" column to a table

From: "David Johnston" <polobo(at)yahoo(dot)com>
To: "'Seref Arikan'" <serefarikan(at)kurumsalteknoloji(dot)com>, "'PG-General Mailing List'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance implications of adding a "disabled" column to a table
Date: 2012-08-30 19:09:31
Message-ID: 07e501cd86e2$fcdc4f80$f694ee80$@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Seref Arikan
Sent: Thursday, August 30, 2012 12:48 PM
To: PG-General Mailing List
Subject: [GENERAL] Performance implications of adding a "disabled" column to
a table

Greetings,
I have a large number of rows (up to 3-4 millions) that I'll either be
fetching into ram (if it is a few thousand rows), or scrolling through a
cursor.
Deletions or updates on content of these rows lead to expensive operations
in my business logic, so I simply need to mark them as disabled.
I have two options here: to exclude the disabled rows from my query results
via ..where not disabled or to not to process disabled rows in business
logic.
There will be very few disabled rows compared to total rows I'll have to
return. This table will have very few queries, so including disabled = false
or enabled = true in every query would not be an issue

I have a feeling that including this criteria in the queries may add 
unnecessary processing to my queries, since the table is expected to get up
to 100 million or more rows. So should I deal with this in DB, or at the
application layer?

Kind regards
Seref

============================================================================
=

So you are willing to live with the fact that whatever, apparently
important, business logic processing occurs when a row is updated or deleted
is in-fact not important since now you are considering a class of data that
is not useful but still present and reflected elsewhere in the system.

If the proportion of rows that are disabled is small, and you don't intend
to query "where disabled = TRUE", then adding the column is sufficient. No
index required since it will not be used. Whether you alter queries to add
the "disabled = FALSE" or read and deal with it in the application doesn't
really matter (unless individual records are large). Personally I would
suggest creating VIEWs that do not output "disabled" and that always use
"disabled = FALSE". Call it something like "tablename_active" and then use
that view thus only sending enabled records to the application layer.

I would also make sure I am very confident on the implications of doing this
versus deleting the data outright (though possibly moving it to an "archive"
table)

David J.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mike Orr 2012-08-30 19:18:11 Baseline configurations
Previous Message Seref Arikan 2012-08-30 16:48:12 Performance implications of adding a "disabled" column to a table