From: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
To: Webb Sprague <webb(dot)sprague(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database slowness -- my design, hardware, or both?
Date: 2007-03-08 15:43:47
Message-ID: 45F02F33.2060904@lerner.co.il
Lists: pgsql-general
Hi, Webb Sprague. You wrote:
>
> Do you have new \timings?
Yup. It just finished executing a little while ago. With the
explicitly interpolated array in place, I got the following:
LOG: statement: UPDATE Transactions
        SET previous_value = previous_value(id)
        WHERE new_value IS NOT NULL
        AND new_value <> ''
        AND node_id IN (351, 169, 664, 240)
LOG: duration: 16842710.469 ms
The previous version, which included lots of calls to RAISE NOTICE and
also used a subselect, had the following timing:
LOG: statement: UPDATE Transactions
        SET previous_value = previous_value(id)
        WHERE new_value IS NOT NULL
        AND new_value <> ''
        AND node_id IN (SELECT node_id FROM NodeSegments)
LOG: duration: 16687993.067 ms
(I keep timing information in the logfile, rather than using \timing.)
So it looks like this change made almost no timing difference at all,
and what little difference there was went in the wrong direction -- the
new version was slightly slower. Bleah.
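(In case it's useful to anyone: the LOG lines above come from having
PostgreSQL log statements and durations itself. Settings along these
lines in postgresql.conf produce them -- treat the exact values here as
illustrative:)

    log_statement = 'mod'  # log all INSERT/UPDATE/DELETE statements
    log_duration = on      # log each statement's duration on completion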
> What you or I think is a minor change isn't necessarily what the
> planner thinks is a minor change, especially when you change data from
> something that requires a query to something that is determinate. I
> would suggest changing your function to remove as many such queries as
> possible too (I am thinking of the order by limit 1). This would be a
> good move also in that you narrow down the number of moving parts to
> diagnose and it just makes the whole thing cleaner.
Good idea. I'll see if I can get the function to be a bit cleaner,
although I'm not sure how much cleaner it can get, given the
constraints of the problem.
That's one of the reasons why I've been adding these "hints" to the
database -- so that I can have many small queries, rather than one large
one.
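To give a sense of its shape, previous_value() boils down to a lookup
along these lines -- a simplified sketch, not my exact code, and the
schema details here are illustrative:

    CREATE OR REPLACE FUNCTION previous_value(INTEGER)
    RETURNS TEXT AS $$
        -- Most recent earlier transaction on the same node that
        -- actually set a value; this is the ORDER BY ... LIMIT 1
        -- lookup you mentioned.
        SELECT t.new_value
          FROM Transactions t
         WHERE t.node_id = (SELECT node_id FROM Transactions
                             WHERE id = $1)
           AND t.id < $1
           AND t.new_value IS NOT NULL
           AND t.new_value <> ''
      ORDER BY t.id DESC
         LIMIT 1;
    $$ LANGUAGE sql STABLE;

One such lookup per updated row adds up quickly, which is presumably
where the hours are going.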
> I would also try amortizing the analysis with triggers, rather than
> building the table all at once; this may be better or worse, depending
> on the on-line character of the application (e.g., if they are waiting
> at an ATM, it might be a deal breaker to add two seconds to do an
> insert / update, but not if you are tailing off a log file that gets
> updated every minute or so.)
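If I follow you, the trigger version would look something like this (a
rough sketch with hypothetical names, just to make sure I understand
the suggestion):

    CREATE FUNCTION set_previous_value() RETURNS trigger AS $$
    BEGIN
        -- Compute the hint as each row arrives, rather than in one
        -- giant UPDATE after the fact.
        IF NEW.new_value IS NOT NULL AND NEW.new_value <> '' THEN
            NEW.previous_value := previous_value(NEW.id);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER transactions_previous_value
        BEFORE INSERT ON Transactions
        FOR EACH ROW EXECUTE PROCEDURE set_previous_value();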
The data that I'm dealing with is not changing over time. So I've been
trying to build the transactions table (aka my data warehouse) slowly,
adding one or two columns at a time with hints that will make extracting
the data easier. Unfortunately, building those hints has proven to be
very slow going.
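Concretely, each pass is along the lines of the following (using the
previous_value column from above as the example):

    ALTER TABLE Transactions ADD COLUMN previous_value TEXT;

    UPDATE Transactions
       SET previous_value = previous_value(id)
     WHERE new_value IS NOT NULL
       AND new_value <> '';

and it's the UPDATE step that takes hours each time.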
Reuven