Re: Scaleable DB structure for counters...

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Scaleable DB structure for counters...
Date: 2006-07-17 04:59:56
Message-ID: 44BB194C.5060604@cox.net
Lists: pgsql-general


IOW, files. No problem.

The # of files is known. That's a start. Is there any existing
metric as to how often they are accessed? That's what you need to
know before deciding on a design.

This simple design might be perfectly feasible:

CREATE TABLE T_USAGE_TXN (
    BOOK_ID       INTEGER,
    USER_ID       INTEGER,
    REFERENCED_DT DATE,
    REFERENCED_TM TIME);

*All* the columns (in field order) would be the PK, and I'd then add
secondary indexes on (see the statements sketched below):
  USER_ID/BOOK_ID
  REFERENCED_DT/BOOK_ID
  REFERENCED_DT/USER_ID
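
In PostgreSQL that would look something like this (the index names are
just placeholders, pick your own convention):

ALTER TABLE T_USAGE_TXN
    ADD PRIMARY KEY (BOOK_ID, USER_ID, REFERENCED_DT, REFERENCED_TM);

CREATE INDEX IX_USAGE_USER_BOOK ON T_USAGE_TXN (USER_ID, BOOK_ID);
CREATE INDEX IX_USAGE_DT_BOOK   ON T_USAGE_TXN (REFERENCED_DT, BOOK_ID);
CREATE INDEX IX_USAGE_DT_USER   ON T_USAGE_TXN (REFERENCED_DT, USER_ID);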

Lastly, create and algorithmically *pre-populate* this table:

CREATE TABLE T_CALENDAR (
    DATE_ANSI    DATE,
    YEARNUM      SMALLINT,
    MONTH_NUM    SMALLINT,
    DAY_OF_MONTH SMALLINT,
    DAY_OF_WEEK  SMALLINT,
    JULIAN_DAY   SMALLINT);
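
A minimal sketch of that pre-population, assuming an arbitrary 20-year
range starting 2000-01-01 and treating JULIAN_DAY as day-of-year
(adjust the range and columns to taste):

INSERT INTO T_CALENDAR
SELECT d,
       EXTRACT(YEAR  FROM d)::SMALLINT,
       EXTRACT(MONTH FROM d)::SMALLINT,
       EXTRACT(DAY   FROM d)::SMALLINT,
       EXTRACT(DOW   FROM d)::SMALLINT,   -- 0 = Sunday, matching the query below
       EXTRACT(DOY   FROM d)::SMALLINT
  FROM (SELECT DATE '2000-01-01' + i AS d
          FROM generate_series(0, 365 * 20) AS i) AS days;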

So, if you want a list and count of all books that were referenced
on Sundays in 2006:

SELECT UT.BOOK_ID, COUNT(*)
  FROM T_USAGE_TXN UT,
       T_CALENDAR  C
 WHERE C.YEARNUM = 2006
   AND C.DAY_OF_WEEK = 0
   AND C.DATE_ANSI = UT.REFERENCED_DT
 GROUP BY UT.BOOK_ID;

Eci Souji wrote:
> I think "books" may have thrown everyone for a loop. These are
> not physical books, but rather complete scanned collections that
> would be available for search and reference online. One of the
> most important features required would be keeping track of how
> often each book was referenced and when. Time of day, days of
> week, etc. This is why I was looking into how to construct some
> form of counter system that would allow us to keep track of
> accesses.
>
> Although I would love to see a robot librarian at work. :-)
>
> - E
>
> Ron Johnson wrote:
>> Eci Souji wrote:
>
>>>> What if instead of book checkouts we were looking at how
>>>> often a book was referenced? In which case we're talking
>>>> multiple times an hour, and we could easily have each book
>>>> requiring hundreds of thousands of rows. Multiply that by
>>>> hundreds of thousands of books and the table seems to
>>>> become huge quite quickly. Would breaking up the table by
>>>> year still make sense? I'm just not familiar with having
>>>> to deal with a table that could easily hit millions of
>>>> records.
>
>
> Are all 200000 books accessed every hour? What kind of library
> is this? Do you have robot librarians moving at hyperspeed?
> Wouldn't a more reasonable value be 5000 books per *day*?
>
> It's easy to know when a book is checked out. How do you know
> when a book is referenced? Are all books only accessed by the
> librarians?

--
Ron Johnson, Jr.
Jefferson LA USA

Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
