Re: huge price database question..

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Jim Green <student(dot)northwestern(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: huge price database question..
Date: 2012-03-21 02:43:36
Message-ID: 4F694058.5030701@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 03/20/2012 09:35 PM, Jim Green wrote:
> On 20 March 2012 22:25, Andy Colson<andy(at)squeakycode(dot)net> wrote:
>> I think the decisions:
>>
>> 1) one big table
>> 2) one big partitioned table
>> 3) many little tables
>>
>> would probably depend on how you want to read the data. Writing would be
>> very similar.
>>
>> I tried to read through the thread but didnt see how you're going to read.
>>
>> I have apache logs in a database. Single table, about 18 million rows. I
>> have an index on hittime (its a timestamp), and I can pull a few hundred
>> records based on a time, very fast. On the other hand, a count(*) on the
>> entire table takes a while. If you are going to hit lots and lots of
>> records, I think the multi-table (which include partitioning) would be
>> faster. If you can pull out records based on index, and be very selective,
>> then one big table works fine.
>> On the perl side, use copy. I have code in perl that uses it (and reads
>> from .gz as well), and its very fast. I can post some if you'd like.
>
> my queries would mostly consider select for one symbol for one
> particular day or a few hours in a particular day, occasionally I
> would do select on multiple symbols for some timestamp range. you code
> sample would be appreciated, Thanks!
>
> Jim.
>
>>
>> -Andy
>>

Here is some copy/pasted parts:

my @list = glob('*.gz');
for my $fname (@list)
{
$db->do('copy access from stdin');
open my $fh, "-|", "/usr/bin/zcat $fname" or die "$fname: $!";
while (<$fh>)
{
# bunch of stuff to format sniped here
# if you have comma separated or something you might be able
# to just feed it in
$db->pg_putcopydata("$county\t$ip\t$time\t$status\t$size\t$url\t$ua\n");
}
$db->pg_endcopy;
$db->commit;
}

Do you ever plan on batch deleted a BUNCH of records?

Do you ever want to do read all of one symbol (like, select avg(high) from stocks where symbol = 'bob')?

-Andy

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jim Green 2012-03-21 02:49:08 Re: huge price database question..
Previous Message Jim Green 2012-03-21 02:35:34 Re: huge price database question..