Re: Dreaming About Redesigning SQL

From: Lauri Pietarinen <lauri(dot)pietarinen(at)atbusiness(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dreaming About Redesigning SQL
Date: 2003-10-21 22:34:28
Message-ID: bn4cca$dj0$1@nyytiset.pp.htv.fi
Lists: pgsql-hackers

Anthony W. Youngman wrote:

>In article <3F94BCBB(dot)7030001(at)atbusiness(dot)com>, Lauri Pietarinen
><lauri(dot)pietarinen(at)atbusiness(dot)com> writes
>
>
>>>Okay. Give me a FORMULA that returns a time in seconds for your query.
>>>
>>>Let's assume I want to print a statement of how many invoices were sent
>>>to a customer, along with various details of those invoices. My invoice
>>>file is indexed by company/month, and we can reasonably assume that the
>>>time taken to produce the statement is infinitesimal compared to the
>>>time taken to retrieve the invoice data from disk. For MV
>>>
>>>T = (2 + N) * ST * 1.05
>>>
>>>Where T is the time taken to produce the report, N is the number of
>>>invoices, and ST is the hard disk seek time.
>>>
>>>
>>>
>>First of all it is important to note that an important component of all modern
>>SQL-DBMS's is the buffer pool (or cache), meaning that in a reasonably well
>>tuned database you get very few disk I/O's, even when *writing* data into
>>tables.
>>
>>
>
>Fine. But MV *doesn't* *need* much of a cache. Let's assume both SQL and
>MV have the same amount of RAM to cache in - i.e. *not* *much*. I did
>say the spec said "extract maximum performance from the hardware
>available".
>
So what's wrong with getting a machine with lots of memory? How much does 2G
of memory for an Intel box cost nowadays? Is this some kind of new ultimate
sport, trying to get along with as little memory as possible?
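
Just to keep the quoted formula concrete while we argue about it, here is a
quick sketch in Python (the seek time and invoice counts are my own
illustrative assumptions, not figures from your post):

    # Wol's quoted MV estimate: T = (2 + N) * ST * 1.05
    # N = number of invoices, ST = average disk seek time in seconds.
    # The values below are assumptions for illustration only.
    def mv_report_time(n_invoices, seek_time_s=0.008):
        return (2 + n_invoices) * seek_time_s * 1.05

    print(mv_report_time(50))    # ~0.44 s for 50 invoices at 8 ms per seek
    print(mv_report_time(500))   # ~4.2 s for 500 invoices

So the whole argument is really about how many head movements each design
needs, and that is the number I will keep coming back to below.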

>You're assuming that you can throw hardware at the problem - fine, but
>that's not always possible. You might have already maxed out the ram,
>you might have a "huge" database, you might be sharing your db server
>with other programs (BIND really likes to chew up every available drop
>of ram, doesn't it :-).
>
>I'm not saying that you shouldn't throw hardware at it, but what if you
>can't?
>
>
>>SQL-DBMS's also are very clever at using indexes, i.e. if they can find all
>>the necessary data from an index they will not even look at the table, so to
>>speak.
>
>Same with MV
>
>
>>And, even when presuming conservatively that there is no data in the cache,
>>depending on how the data is clustered you will get more than one row per
>>disk read (a block is 8K in most(?) systems).
>>
>>
>
>Same with MV
>
>
>>I could now denormalise OrderDetail so that it contains cust_id also
>>and cluster by cust_id
>>(might cause you trouble down the road, if you can change the customer
>>of an order), in which case, with 3 I/O's I would get
>>- 8 customer rows
>>- 16 order rows
>>- 24 order detail rows (which would all apply to one customer)
>>
>>Now the amount of I/O's would depend on how many detail rows
>>we have per customer.
>>
>>And, of course, because we are using sequential prefetch, we would be
>>getting more than one I/O block (8?, 16?) per disk seek, so it's a hard
>>comparison to make, but I suspect it would be about equal to your example.
>>
>>
>
>Except my example was an *average* case, and yours is a *best* case. Oh,
>and my data is still normalised - I haven't had to denormalise it! AND I
>haven't run an optimiser over it :-)
>
Are you hiding your optimiser behind the curtain? ;-)

Well, if it is normalised, how easy is it for you to change the customer_id
of an order? Anyway, if we stick to your example, even if we don't
denormalise but instead use e.g. the clustering features of Oracle, as Bob
pointed out we still need at most the same number of I/O's. So, the answer
to your question: our formula is at least as good as yours.
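
To show what I mean by "at most the same number of I/O's", here is a rough
sketch in Python (the 8K page size is the one I quoted earlier; the row
widths are my own assumptions, picked only to illustrate the figures above):

    # The clustered example quoted above: one 8K page holds roughly
    # 8 customer rows, 16 order rows or 24 order detail rows.
    # Row widths are assumptions chosen to match those figures.
    import math

    PAGE_BYTES = 8192

    def pages_needed(n_rows, row_bytes):
        rows_per_page = PAGE_BYTES // row_bytes
        return math.ceil(n_rows / rows_per_page)

    io = (pages_needed(8, 1024)      # 8 customer rows      -> 1 page
          + pages_needed(16, 512)    # 16 order rows        -> 1 page
          + pages_needed(24, 340))   # 24 order detail rows -> 1 page
    print(io)                        # 3 pages = 3 I/O's, before any prefetch

Three head movements for the whole customer, which is where "at most the same
number of I/O's" comes from.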

>>Now, that was a *conservative* estimate, and we assumed that we did not have
>>any rows lying around in the (global!) cache. As the size of the cache grows in
>>proportion to the size of the total database we can assume less and less disk
>>I/O.
>>
>>
>
>You're relying on the hardware to bale you out :-) We can do the same!
>
Well why don't you?

>>Note also that the cache can be configured many ways, you can put different
>>tables (or indexes) in different caches, and even change the size of the cache
>>on the fly (you might want a bigger cache during evening and night when your
>>batch programs are running) so you can rig your system to favour certain
>>types of queries.
>>
>>I haven't even gone into the topic of using thick indexes, so that table
>>access can be avoided entirely (= we read into memory only the interesting
>>columns).
>>
>>Now, in your example, what if the product department comes along and
>>wants to make a report of sales per product? What would be your formula
>>in that case?
>>
>>
>
>I'm not quite sure what you're trying to do. I'll assume you want a
>report of all invoices which refer to a given product. Assuming I've got
>the relevant indices defined, I can simply read a list of invoices from
>the product code index, a second list of invoices from the month index,
>and do an intersect of the two lists.
>
I want a list of all products with their corresponding total sales, read
from OrderDetail, e.g.

Hammer 10000$
Nail 5000$
Screw 1200$

How many disk reads (or head movements)?
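
In other words, something like this sketch (the Python and the field names
are only placeholders of mine; the figures match the sample output above):

    # Total sales per product, summed over every OrderDetail row.
    from collections import defaultdict

    def sales_per_product(order_detail_rows):
        totals = defaultdict(float)
        for row in order_detail_rows:
            totals[row["product"]] += row["price"] * row["quantity"]
        return totals

    rows = [
        {"product": "Hammer", "price": 10.00, "quantity": 1000},
        {"product": "Nail",   "price": 0.05,  "quantity": 100000},
        {"product": "Screw",  "price": 0.12,  "quantity": 10000},
    ]
    for product, total in sales_per_product(rows).items():
        print(product, total)    # Hammer 10000.0, Nail 5000.0, Screw 1200.0

The point is that every OrderDetail row in the database has to be visited, so
the interesting number is how many disk reads that scan costs in each system.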

>So again, T = (2+N) * ST * 1.05 where N is the number of invoices that
>reference that product. And now ALL the invoice data has been retrieved
>from disk to ram ...
>
>
>>And: what if I was just reading customer-data. Would the same formula
>>apply (= (2+N)*ST*1.05)?
>>
>>
>
>Nope. If I understand you correctly, you want attributes that belong to
>the entity "customer", not the entity "invoice". T = ST * 1.05. (By the
>way, billing and/or invoice address (for example) are invoice
>attributes, not company attributes.)
>
No, I want you to give me a list of all your customers. How many disk
reads?
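
And to give that question some scale (the customer count and row width are,
again, assumptions of mine for illustration): reading every customer is a
sequential scan, so the cost is pages read, not one seek per row.

    # Full scan of a customer table: cost is pages, not rows.
    # 10 000 customers at ~200 bytes per row are illustrative assumptions.
    import math

    PAGE_BYTES = 8192
    customers  = 10000
    row_bytes  = 200

    pages = math.ceil(customers / (PAGE_BYTES // row_bytes))
    print(pages)                   # 250 pages
    print(math.ceil(pages / 16))   # ~16 seeks with 16-page sequential prefetch

That is a very different access pattern from one seek per record.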

>>>But as I understand relational theory, such a question is completely
>>>outside the scope of the theory. Seeing as it tries to divorce the
>>>database logic from the practical implementation ...
>>>
>>>
>>>
>>The theory, indeed, does not say anything about buffer pools, but by
>>decoupling logic from implementation we leave the implementor (DBMS) to do
>>as it sees fit.
>
>
>>As DBMS technology advances, we get faster systems without having to change our
>>programs.
>>
>>
>
>But with MV, if our database is too large for current technology, we
>kick the shit out of relational for speed ...
>
>Don't forget. You've already said that, if nothing is cached, my average
>case exceeds your best. And my case is *already* assuming that the
>system is seriously stressed and struggling ...
>
>
>>When we design databases we can decouple logical planning from performance
>>considerations, which, you must agree, are two separate issues.
>>
>>
>>
>>>And you know it's been proven that Huffman coding is the most efficient
>>>compression algorithm? (Actually, it isn't - it's been proven it can't
>>>be improved upon, which isn't the same thing...). Can you improve on the
>>>formula I've just given you? Given that if we could change the 1.05 to 1
>>>then we can prove it can't be improved upon ... again - I've taken the
>>>liberty of assuming that a MV FILE is equivalent to an entity if we
>>>assume the relational designer has been thinking in an entity-attribute-
>>>relation sort of way. My maths isn't good enough to prove it, but I
>>>think it would be pretty easy to prove that accessing data as "one and
>>>only one complete entity" at a time is the most efficient way.
>>>
>>>
>>>
>>I think that in a typical system your cache hit ratio would approach 90%
>>so that could mean 0.1 disk seeks.
>>
>>
>
>That improves our performance just as much as it improves yours. What
>happens to your response time if you just DON'T HAVE the cache
>available, for whatever reason?
>
>I can't find the post now :-( but is Christopher reading this? You know
>I compared that relational system on a twin Xeon 800, to an MV system
>running on a P90? Christopher made the (reasonable in the circumstances)
>assumption that the relational consultants must be crap, and the MV guy
>a guru. Actually, I'd come to exactly the OPPOSITE conclusion. My MV
>experience tells me that MV query was probably thrown together, by an
>average programmer, in 30 seconds. On the other hand, those SQL
>consultants had an axe to grind and a point to prove. They couldn't
>afford to let this "old fashioned" system beat them. That SQL query
>would have been optimised to within an inch of its life over weeks.
>Don't forget how proud they were to beat this MV system! Yet with
>hardware that was so much more powerful and a query that was heavily
>optimised, they had great difficulty beating a query that was thrown
>together in seconds by an average MV guy (or even just a luser!).
>
>Don't forget. I said I am a database *engineer*. Engineers believe in
>elegance, they believe in beauty. And when I look at relational, all I
>see is the theorists pleading "power", "hardware", "brute force", to get
>them out of trouble. And then all these people, who believe in maths
>over reality, are surprised when I turn round and say I despise their
>beliefs.
>
>Note, I did NOT say I despise relational theory. I despise the belief
>that it is the answer to life, the database universe, and everything
>data related. (By the way, 6 times 9 DOES equal 42 :-)
>
>
>
>>best regards,
>>Lauri Pietarinen
>>
>>
>>
>Cheers,
>Wol
>
>
