| From: | Andy Colson <andy(at)squeakycode(dot)net> |
|---|---|
| To: | Jim Green <student(dot)northwestern(at)gmail(dot)com> |
| Cc: | pgsql-general(at)postgresql(dot)org |
| Subject: | Re: huge price database question.. |
| Date: | 2012-03-21 02:43:36 |
| Message-ID: | 4F694058.5030701@squeakycode.net |
| Lists: | pgsql-general |
On 03/20/2012 09:35 PM, Jim Green wrote:
> On 20 March 2012 22:25, Andy Colson<andy(at)squeakycode(dot)net> wrote:
>> I think the decisions:
>>
>> 1) one big table
>> 2) one big partitioned table
>> 3) many little tables
>>
>> would probably depend on how you want to read the data. Writing would be
>> very similar.
>>
>> I tried to read through the thread but didn't see how you're going to read it.
>>
>> I have apache logs in a database. Single table, about 18 million rows. I
>> have an index on hittime (it's a timestamp), and I can pull a few hundred
>> records based on a time, very fast. On the other hand, a count(*) on the
>> entire table takes a while. If you are going to hit lots and lots of
>> records, I think the multi-table (which include partitioning) would be
>> faster. If you can pull out records based on index, and be very selective,
>> then one big table works fine.
>> On the perl side, use copy. I have code in perl that uses it (and reads
>> from .gz as well), and it's very fast. I can post some if you'd like.
>
> my queries would mostly be selects for one symbol for one
> particular day, or a few hours within a particular day; occasionally I
> would select on multiple symbols for some timestamp range. Your code
> sample would be appreciated, thanks!
>
> Jim.
>
>>
>> -Andy
>>
Here are some copy/pasted parts:

```perl
my @list = glob('*.gz');
for my $fname (@list)
{
    $db->do('copy access from stdin');
    open my $fh, "-|", "/usr/bin/zcat $fname" or die "$fname: $!";
    while (<$fh>)
    {
        # bunch of stuff to format snipped here
        # if you have comma separated or something you might be able
        # to just feed it in
        $db->pg_putcopydata("$county\t$ip\t$time\t$status\t$size\t$url\t$ua\n");
    }
    $db->pg_endcopy;
    $db->commit;
}
```
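One thing to watch with pg_putcopydata: COPY's default text format treats backslash, tab, carriage return, and newline specially, so raw fields (URLs, user agents) need escaping before you join them with tabs. A minimal sketch of that escaping in Python (hypothetical helper names, not from the code above):

```python
def copy_escape(field: str) -> str:
    """Escape one raw field for PostgreSQL COPY text format.

    Backslash must be doubled first, then the delimiter (tab) and
    line terminators are replaced with their backslash sequences.
    """
    return (field.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def copy_line(*fields: str) -> str:
    """Join escaped fields into one tab-separated COPY input line."""
    return "\t".join(copy_escape(f) for f in fields) + "\n"

# a tab inside a field survives as the two characters backslash-t,
# so it can't be mistaken for a column delimiter:
print(copy_line("US", "1.2.3.4", "GET /a\tb"))
```

The same escaping applies whatever language feeds the COPY stream; skipping it is the usual cause of "extra data after last expected column" errors on messy log data.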
Do you ever plan on batch-deleting a BUNCH of records?
Do you ever want to read all of one symbol (like, select avg(high) from stocks where symbol = 'bob')?
-Andy
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jim Green | 2012-03-21 02:49:08 | Re: huge price database question.. |
| Previous Message | Jim Green | 2012-03-21 02:35:34 | Re: huge price database question.. |