From: | "Timasmith" <timasmith(at)hotmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: which is more scalable for the database? |
Date: | 2007-03-13 03:39:10 |
Message-ID: | 1173757150.216710.220780@30g2000cwc.googlegroups.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mar 8, 2:26 pm, p(dot)(dot)(dot)(at)Sheeky(dot)Biz (Shane Ambler) wrote:
> Timasmithwrote:
> > Suppose I have a database table with 20 fields which are lookups to a
> > single table.
>
> > configtable(configtable_id, a_field, something_lookup_id,
> > another_lookup_id, ...)
> > lookup(lookup_id, value, description, ...)
>
> > what is going to be faster to map the rows to an Object which needs
> > the 'value' for every field ending in lookup_id
>
> How long is ball of string 'a' compared to ball of string 'b'?
>
> That will depend on a lot on how large the lookup table will be (100
> rows or 10 million rows?)
100-500,000 rows
and on how long your descriptions are.
10-60, perhaps averaging around 20 characters
Will
> you want all 20 descriptions for each query you run?
Yes, it is the price I pay with Hibernate for vastly simplifying
data access
If your
> descriptions are 200 characters each then maybe you can reduce some data
> transfer by caching these at the client instead of transferring them
> each time you retrieve a result. But then how will you keep your local
> cache up to date?
The data I would cache would change infrequently in the production
environment and there would be
no expectation it would be 'immediate' on change.
I would provide services to flush on demand, or on save (of
reference data), and/or through a scheduler etc.
>
> I would suggest the view for a couple of reasons - first it will
> simplify your queries as you won't need all the join details in every
> query you run. Second the select for the query can be parsed and stored
> whereas separate selects will be parsed each time.
If I was caching on the client, I would select all data only once
and there would be no queries, it would only be
select * from sometable
and the client (actually server side bean), would populate the
object with the missing lookups.
>
> A lot of this will depend on what you are doing and what you are using
> to do it. If you are using php then you have less options for caching
> than say a client program written in c. php may run pretty fast but it
> won't outrun a compiled c program.
Java hashtable.
>
>
>
>
>
> > a) View
>
> > select c.configtable_id, l1.value as something_lookup, l2.value as
> > another_lookup
> > from configtable c,
> > lookup l1,
> > lookup l2
> > where c.something_lookup_id = l1.lookup_id
> > and c.another_lookup_id = l2.lookup_id
>
> > foreach row
> > map values to object
> > end
>
> > b) Cache all lookup values and populate
>
> > select c.* from configtable
>
> > foreach row
> > map values to object
> > if lookup_id
> > find value from hashtable and map value to object
> > endif
> > end
>
> > It seems that the latter *might* be better to scale outward better,
> > as
> > you could add application servers to do the caching/mapping and you
> > only select from a single table?
>
> Maybe but then postgresql may do the lookups quicker than what you have
> available at the client end.
Right, but could it keep up, how much extra effort is it to do the
indexed lookups (almost all in memory), for the client.
>
> The only way you will really know is to load up some sample data and
> test each method yourself.
Its very hard to simulate though due to production hardware having 16
cpus, Gigs of memory, huge databases, hundereds of concurrent users -
I just dont have access to that kind of environment.
>
> How far are you expecting to scale? The simplest method may just work
> fast enough that all the extra caching and setup/maintenance of this
> will outweigh any benefit.
I would like to scale to 10,000s of power users on a complex OLTP
system. Adding servers to expand out is generally feasible, but one
can only scale upward so far before reaching a hardware peak.
>
> --
>
> Shane Ambler
> p(dot)(dot)(dot)(at)Sheeky(dot)Biz
>
> Get Sheeky @http://Sheeky.Biz
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match- Hide quoted text -
>
> - Show quoted text -
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-03-13 03:57:08 | Re: Tracking disk writes? (again) |
Previous Message | Alvaro Herrera | 2007-03-13 03:30:27 | Re: Tracking disk writes? (again) |