Re: performance issues with DBI module when data too big

From: Andrew Perrin <clists(at)perrin(dot)socsci(dot)unc(dot)edu>
To: Nicolas Nolst <nnolst(at)hotmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: performance issues with DBI module when data too big
Date: 2002-06-03 18:21:08
Message-ID: Pine.LNX.4.21.0206031420380.20154-100000@perrin.socsci.unc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-general

Are you using the {AutoCommit => 0} argument to DBI->connect()? If not,
do so, and then add a $dbh->commit; line when you're done with the
inserts. Should help a lot.

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists(at)perrin(dot)socsci(dot)unc(dot)edu * andrew_perrin (at) unc.edu

On Mon, 3 Jun 2002, Nicolas Nolst wrote:

>
> Hi all,
>
> I have developped a perl script to populate a database with two tables:
> sessions and actions.
>
> the table actions contains the following columns: session_id, url,
> timestamp. The column session_id references to the table sessions. 
>
> the table sessions contains the following columns: session_id, remote_ip,
> phone_type, phone_number. The column session_id is serial.
>
> The lines of the table actions which are part of the same session have
> the same session_id.
>
> There are then more lines in the table actions than in the table session.
>
>
> To fill the two tables, I first need to know if the session already
> exists for a certain phone_type, a certain remote_ip and a certain
> phone_number:
>
> SELECT session_id FROM sessions WHERE (phone_number = ?) AND (remote_ip 
> = ?) AND (phone_type = ?) ORDER BY session_id;
>
> I also need to apply a criteria to know if I have to add a new entry in
> the table sessions or not:
>
> SELECT (max(timestamp) + ?)<? FROM actions WHERE (session_id = ?);  
>
>  
>
> If the session already exists I add a line in the table actions with a
> INSERT
>
> If the session doesn't exist or if the criteria is true, I add a line in
> the table sessions with an INSERT, retrieve the session_id of the line
> just added with the following request
>
> SELECT session_id FROM sessions where (msisdn=?) AND (remote_ip=?) AND
> (user_agent=?) ORDER BY session_id DESC LIMIT 1
>
> and the add with a INSERT a line in the table actions.
>
> I have put indexes on sessions(session_id), sessions(msisdn),
> actions(session_id).
>
> My problem is that populating my database is slower  when the data gets
> bigger and the performance falls dramatically. I thought that is would be
> improve with my indexes but the problem still persists.
>
> Could you please give me some clues that could solve this issue.
>
> Thanks.
>
>
>  
>
> Nicolas Nolst
>
> [belgium_gs.gif]
>
> ________________________________________________________________________________
> Join the world’s largest e-mail service with MSN Hotmail. Click Here
>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Tim Ellis 2002-06-03 18:30:49 multiple instances on one box?
Previous Message Jan Wieck 2002-06-03 17:50:17 Re: [GENERAL] performance issues with DBI module when data too big

Browse pgsql-general by date

  From Date Subject
Next Message Johnson, Shaunn 2002-06-03 18:23:55 invalid type error
Previous Message Jan Wieck 2002-06-03 17:50:17 Re: [GENERAL] performance issues with DBI module when data too big