Re: Index impact on update?

From: Israel Brewster <israel(at)ravnalaska(dot)net>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Index impact on update?
Date: 2017-01-04 17:16:05
Message-ID: FE59AAF7-3A3E-44BD-B714-3B25A49EA7C0@ravnalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Jan 4, 2017, at 8:08 AM, Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:
>
> You'd be better off forcing the table to write in bulk with something like
>
> CREATE TABLE mynewtable AS
> SELECT *, geography(ST_SetSRID(ST_MakePoint(lng, lat), 4326)) AS geog
> FROM myoldtable;
>
> Then index the new table, rename, etc. Bulk update will, in addition to being slow, use 2x the amount of space on disk, as all the old tuples are left behind from the update until you cluster or vacuum full the table.
>
> P

Thanks for the suggestion, info, and MUCH nicer looking syntax (perhaps more efficient as well?) for populating the column. I'll give it a shot, and see how it goes!

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

>
>
>
> On Wed, Jan 4, 2017 at 8:59 AM, Israel Brewster <israel(at)ravnalaska(dot)net <mailto:israel(at)ravnalaska(dot)net>> wrote:
> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns aren't changing?
>
> Details:
> I have a table containing geographical data (Latitude, longitude, and elevation) with 406,833,705 records. The Latitude and Longitude columns are indexed. In order to better utilize the data, I've been looking into PostGIS, and decided I wanted to add a "Location" column with PostGIS type "GEOGRAPHY(point)". I then tried to populate it from the existing latitude/longitude data using the following query:
>
> UPDATE data SET location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||' '||lat::text||')');
>
> I expected this update to take quite a while, since it has 406 million rows to update, but at this point it's been over 19 hours since I started the query, and it still hasn't completed.
>
> I'm wondering if the presence of the indexes could be slowing things down even though the indexed columns aren't being updated? Would I be better off canceling the update query, dropping the indexes, and trying again? Or is more likely that the update query is "almost" done, and it would be better to just let it run it's course? Or is there an even better option, such as perhaps exporting the data, adding the additional column in a text editor, and re-importing the data with a COPY command?
>
> Thanks for any feedback/advice you can offer!
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293 <tel:(907)%20450-7293>
> -----------------------------------------------
>
>
>
>
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2017-01-04 17:24:30 Re: replication slot to be used in the future
Previous Message Rob Sargent 2017-01-04 17:10:17 Re: Index impact on update?