Re: Turning off transactions completely.

From: "Arsalan Zaidi" <azaidi(at)directi(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Turning off transactions completely.
Date: 2002-01-07 08:47:09
Message-ID: 010d01c19757$e6ac8760$4301a8c0@directi.com
Lists: pgsql-general

> Well, no transactions is not possible. The whole data storage system is
> built around it almost. Besides, as long as everything is in one
> transaction, there is *no* overhead IIRC.
>

How can that be? You're doing *some* record keeping after all, so there has
to be some overhead!
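
One way to check how much of the cost comes from committing, rather than from the transactional record keeping itself, is to time the same inserts both ways. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, the table `t` and the function names are hypothetical, and an in-memory database will understate the commit cost of a real on-disk pg server.

```python
import sqlite3
import time


def insert_rows(conn, n, commit_every_row):
    """Insert n rows into t, committing per row or once at the end.

    Returns the elapsed wall-clock time in seconds.
    """
    cur = conn.cursor()
    start = time.perf_counter()
    for i in range(n):
        cur.execute("INSERT INTO t (id, val) VALUES (?, ?)", (i, "row %d" % i))
        if commit_every_row:
            conn.commit()  # one transaction per row
    conn.commit()  # closes the single big transaction (no-op otherwise)
    return time.perf_counter() - start


def run_comparison(n=1000):
    """Run both strategies on fresh in-memory databases (sqlite3 stand-in)."""
    results = {}
    for label, per_row in (("commit per row", True), ("single transaction", False)):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
        conn.commit()
        elapsed = insert_rows(conn, n, per_row)
        count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        results[label] = (count, elapsed)
        conn.close()
    return results


if __name__ == "__main__":
    for label, (count, elapsed) in run_comparison().items():
        print("%-20s %d rows in %.4f s" % (label, count, elapsed))
```

Both runs insert the same rows; only the number of commits differs, so any timing gap is attributable to commit frequency on that particular engine and storage.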

> > However, as it currently stands, the app takes around 30 hrs to finish its
> > run. I wish to reduce that to 24 hrs or less.
>
> Wow. I can insert hundreds of rows per second within a transaction and my
> hardware is not even particularly good. That would be 21 million rows in
> that time. How big is your data set? Are you using COPY or INSERT to insert
> the data?
>

Using COPY for the initial load and INSERT for subsequent handling. I can't
use COPY there because the statements need WHERE clauses.
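
A common pattern when a bulk load needs filtering is to bulk-load everything into a staging table and then move the wanted rows with a single `INSERT ... SELECT ... WHERE`. Whether that fits the queries in this application is not known from the thread, so treat the following as a sketch only. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, uses `executemany` in place of COPY, and the table names `staging` and `target`, the columns, and the filter are all hypothetical.

```python
import sqlite3


def load_with_filter(rows, min_val):
    """Bulk-load rows into a staging table, then copy only the rows that
    pass the WHERE condition into the target table in one statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER, val INTEGER)")
    conn.execute("CREATE TABLE target (id INTEGER, val INTEGER)")

    # Stand-in for the bulk load; on PostgreSQL this step would be COPY.
    conn.executemany("INSERT INTO staging (id, val) VALUES (?, ?)", rows)

    # The filtering happens here, set-based, instead of row by row.
    conn.execute(
        "INSERT INTO target (id, val) "
        "SELECT id, val FROM staging WHERE val >= ?",
        (min_val,),
    )
    conn.commit()

    result = conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(load_with_filter([(1, 5), (2, 15), (3, 25)], 10))
```

The point of the design is that the unfiltered load stays a plain bulk operation, and the condition is applied once by the database rather than in one INSERT per row.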

I start off with 3.3GB of data and things just grow from that, with a fresh
3GB file every day! Plenty of dup data in it though. Lots of complex queries
get run on this data, which generates even more info... At the end of the
initial run, I get ~50GB of data in the pg data dir (of course that holds a
lot more info than just a plain text file...), with ~10GB per
subsequent run.

Haven't checked yet, but I think I get ~30 million rows in the init run.
BTW, I've got a dual proc machine with a RAID-0 array and 1 GB of RAM, but
pg only uses one CPU at a time. Would have been great if it had been
multi-threaded or something.
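
The figures quoted in this thread can be put side by side with a little arithmetic. A small sketch, assuming the ~30 million row count above is roughly right (it is explicitly unchecked) and ignoring the fact that the run also executes complex queries, not just inserts; the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600.0


def rows_per_second(total_rows, hours):
    """Average sustained insert rate needed to load total_rows in hours."""
    return total_rows / (hours * SECONDS_PER_HOUR)


def hours_needed(total_rows, rate):
    """Hours needed to load total_rows at a sustained rate (rows/second)."""
    return total_rows / (rate * SECONDS_PER_HOUR)


if __name__ == "__main__":
    # Rate implied by "21 million rows" in the current 30-hour run.
    print("21M rows in 30 h: %.1f rows/s" % rows_per_second(21e6, 30))
    # Rate the unchecked ~30 million row estimate implies today.
    print("30M rows in 30 h: %.1f rows/s" % rows_per_second(30e6, 30))
    # Rate required to reach the 24-hour goal with the same row count.
    print("30M rows in 24 h: %.1f rows/s" % rows_per_second(30e6, 24))
```

All three rates fall in the "hundreds of rows per second" range mentioned in the quoted reply, which suggests the inserts alone need not account for the whole 30 hours.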

> > I found a comment from 1999 where someone asked a similar question and
> > Momjian responded that that was not possible in pg. Is that still true?
> > Can it be easily changed in the code?
>
> I think you're working under the assumption that transactions == overhead
> whereas I don't believe that's true. Work out where the bottleneck is.

I've got too much data? :-)

--Arsalan.
