Re: BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancels autovacuum even if the index already exists.

From: Marcin Barczyński <mba(dot)ogolny(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marcin Barczyński <mba(dot)ogolny(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancels autovacuum even if the index already exists.
Date: 2017-08-24 07:32:19
Message-ID: CAP3o3PdRbOk0dkiQcH7JJxhHa6-9RV8vsWCtiPSPjAkdUfeSWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Aug 24, 2017 at 7:00 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I'm afraid this complaint is just wishful/sloppy thinking.

Cancelling autovacuum was a problem for two reasons:

1. We ran a bunch of CREATE INDEX CONCURRENTLY IF NOT EXISTS queries on our
service start up. We thought it's a no-op if an index already exists, and
wanted to have some kind of background migrations: the service works
normally, but some operations are slower until the index is created.

2. Autovacuum takes days/weeks in our scale (billions of rows), so
effectively it never completed due to service restarts. We investigated the
problem, and it turned out that most of the time was spent on vacuuming a
GIST index on timestamp range. I took a look at the code, and during vacuum
GIST index is traversed in a logical order which translates into random
disk accesses (function gistbulkdelete in gistvacuum.c). Our index has
almost 800 GB, so random accesses affect us badly. On the other hand, btree
indexes are vacuumed in physical order (function btvacuumscan in nbtree.c).

We replaced CREATE INDEX CONCURRENTLY IF NOT EXISTS with proper migrations.
By the way, CREATE INDEX IF NOT EXISTS also cancels autovacuum task. I get
your point with locks, but from a user's perspective, I don't understand
why I have to resort to the following code to avoid cancelling autovacuum:

DO $$
BEGIN
IF NOT EXISTS (
SELECT 1
FROM pg_class c JOIN pg_namespace n ON n.oid =
c.relnamespace
WHERE n.nspname = 'my_namespace' AND c.relname =
'my_index') THEN
CREATE INDEX my_index ...;
END IF;
END$$;

As for the long autovacuum, maybe I should report it as a separate bug. For
now, I'm planning to replace all uses of 'contains' operator with the
following function:

CREATE OR REPLACE FUNCTION tstzrange_contains(
range tstzrange,
ts timestamptz)
RETURNS bool AS
$$
SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range)))
AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range)))
$$ LANGUAGE SQL IMMUTABLE;

and create btree indexes on lower and upper bound:

CREATE INDEX my_table_time_range_lower_idx ON my_table
(lower(time_range));
CREATE INDEX my_table_time_range_upper_idx ON my_table
(upper(time_range));

Is it the best approach?

--
Regards,
Marcin

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message soleuu 2017-08-24 08:57:27 BUG #14790: pg_restore - segfault
Previous Message Tom Lane 2017-08-24 05:00:01 Re: BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancels autovacuum even if the index already exists.