From: | "Kevin Grittner" <kgrittn(at)mail(dot)com> |
---|---|
To: | "Tim Uckun" <timuckun(at)gmail(dot)com> |
Cc: | "Richard Huxton" <dev(at)archonet(dot)com>,pgsql-general(at)postgresql(dot)org |
Subject: | Re: Running update in chunks? |
Date: | 2013-01-22 02:21:50 |
Message-ID: | 20130122022150.120590@gmx.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Kevin Grittner wrote:
> update imports set make_id = 0
>
> Query returned successfully: 98834 rows affected, 45860 ms execution time.
For difficult problems, there is nothing like a self-contained test
case, that someone else can run to see the issue. Here's a starting
point:
create extension if not exists hstore;
create schema cars;
drop table if exists cars.imports;
CREATE TABLE cars.imports
(
id serial NOT NULL,
target_id integer,
batch_id integer,
make_id integer,
model_id integer,
date timestamp without time zone,
division_id integer,
dealer_id integer,
data hstore,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
CONSTRAINT imports_pkey PRIMARY KEY (id)
);
CREATE INDEX index_imports_on_data ON cars.imports USING gin (data);
CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree (dealer_id);
CREATE INDEX index_imports_on_division_id ON cars.imports USING btree (division_id);
CREATE INDEX index_imports_on_make_id ON cars.imports USING btree (make_id);
CREATE INDEX index_imports_on_model_id ON cars.imports USING btree (model_id);
insert into cars.imports (created_at, updated_at) select now(), now() from (select generate_series(1, 100000)) x;
vacuum freeze analyze;
\timing on
update cars.imports set make_id = 0;
vacuum analyze;
update cars.imports set make_id = 0;
... and here's what I get when I run it on my desktop computer with
ordinary disk drives and a completely default configuration:
test=# create extension if not exists hstore;
CREATE EXTENSION
Time: 48.032 ms
test=# create schema cars;
CREATE SCHEMA
Time: 8.150 ms
test=# drop table if exists cars.imports;
NOTICE: table "imports" does not exist, skipping
DROP TABLE
Time: 0.205 ms
test=# CREATE TABLE cars.imports
test-# (
test(# id serial NOT NULL,
test(# target_id integer,
test(# batch_id integer,
test(# make_id integer,
test(# model_id integer,
test(# date timestamp without time zone,
test(# division_id integer,
test(# dealer_id integer,
test(# data hstore,
test(# created_at timestamp without time zone NOT NULL,
test(# updated_at timestamp without time zone NOT NULL,
test(# CONSTRAINT imports_pkey PRIMARY KEY (id)
test(# );
NOTICE: CREATE TABLE will create implicit sequence "imports_id_seq" for serial column "imports.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "imports_pkey" for table "imports"
CREATE TABLE
Time: 152.677 ms
test=# CREATE INDEX index_imports_on_data ON cars.imports USING gin (data);
CREATE INDEX
Time: 6.391 ms
test=# CREATE INDEX index_imports_on_dealer_id ON cars.imports USING btree (dealer_id);
CREATE INDEX
Time: 64.668 ms
test=# CREATE INDEX index_imports_on_division_id ON cars.imports USING btree (division_id);
CREATE INDEX
Time: 65.573 ms
test=# CREATE INDEX index_imports_on_make_id ON cars.imports USING btree (make_id);
CREATE INDEX
Time: 64.959 ms
test=# CREATE INDEX index_imports_on_model_id ON cars.imports USING btree (model_id);
CREATE INDEX
Time: 64.906 ms
test=# insert into cars.imports (created_at, updated_at) select now(), now() from (select generate_series(1, 100000)) x;
INSERT 0 100000
Time: 2516.559 ms
test=# vacuum freeze analyze;
VACUUM
Time: 3357.778 ms
test=# \timing on
Timing is on.
test=# update cars.imports set make_id = 0;
UPDATE 100000
Time: 2937.241 ms
test=# vacuum analyze;
VACUUM
Time: 2097.426 ms
test=# update cars.imports set make_id = 0;
UPDATE 100000
Time: 3935.939 ms
Ubuntu 12.10
i7-3770 CPU @ 3.40GHz with 16GB RAM
Linux Kevin-Desktop 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
A pair of WDC WD10EALX-009BA0 (15.01H51) 7200 RPM drives in RAID 1.
PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, 64-bit
test=# SELECT name, current_setting(name), source
test-# FROM pg_settings
test-# WHERE source NOT IN ('default', 'override');
name | current_setting | source
----------------------------+--------------------+----------------------
application_name | psql | client
client_encoding | UTF8 | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
lc_messages | en_US.UTF-8 | configuration file
lc_monetary | en_US.UTF-8 | configuration file
lc_numeric | en_US.UTF-8 | configuration file
lc_time | en_US.UTF-8 | configuration file
log_timezone | US/Central | configuration file
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
shared_buffers | 32MB | configuration file
TimeZone | US/Central | configuration file
(13 rows)
I did absolutely no tuning from the default configuration.
So, what timings do you get if you run the identical script? Is
there something you can do to the above script (maybe in terms of
populating data) which will cause the performance you see?
-Kevin
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2013-01-22 02:42:23 | Re: Yet Another Timestamp Question: Time Defaults |
Previous Message | Adrian Klaver | 2013-01-22 01:19:06 | Re: Yet Another Timestamp Question: Time Defaults |