From: | "Joris Dobbelsteen" <Joris(at)familiedobbelsteen(dot)nl> |
---|---|
To: | "garry saddington" <garry(at)schoolteachers(dot)co(dot)uk> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: counting query |
Date: | 2007-01-28 14:11:45 |
Message-ID: | 73427AD314CC364C8DF0FFF9C4D693FF5527@nehemiah.joris2k.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
>-----Original Message-----
>From: pgsql-general-owner(at)postgresql(dot)org
>[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of garry
>saddington
>Sent: zondag 28 januari 2007 14:06
>To: pgsql-general(at)postgresql(dot)org
>Subject: [GENERAL] counting query
>
>I have a table definition such as:
>
>CREATE TABLE attendance
>(
> attendanceid serial primary key,
Why you have this??? You already have (entered,timeperiod,studentid)
that you can use, since that must be unique too. Try to avoid surrogate
keys as much as possible (it really increases performance and
ease-of-writing for complex queries!
> entered date DEFAULT current_date NOT NULL,
> absent boolean,
> authorization text default 'N',
> timeperiod char(2) check(timeperiod in('AM','PM')),
> days varchar(10),
> studentid int,
> unique(entered,timeperiod,studentid)
>)
Guessing the meaning a bit, not too self-explaining.
>Which is used to record school attendance data. I am now
>trying to write a query to identify trends in absences by
>counting the days column and returning any student that has
>repeated absences on certain days. I am struggling to return
>anything that does not need further manipulation in Python
>before being useful.
>Does anyone have any ideas?
Yes, before starting you must have a well-defined idea on what you want
to know. What should the result look like? In most situations start
simple things, and eventually combine these to something more complex,
but always know what you are querying. You should have some idea of what
results you will be getting.
Of course, this is a proces of discovery rather than following set
rules. Some background on probabilities and statistics really helps.
Personally I like visualization quite a lot to help me with this.
Personally I've found nothing that will beat Excel for doing data
analysis. Learn to use the pivot table and pivot charts. They are
extremely powerful.
However, it can be a bit tricky to transform the input into something
the tool can use. A good starting point is to split the dates into
seperate year, month, day, week values.
Some idea's that might work to get it started:
* A graph with days vs occurrences (count).
* Graph of total sick days per student vs occurrences.
* Graph of Check the occurrences per month/day/week
Next try filtering of the data to form sequential periods and more funny
things.
I believe all this can be done with plain SQL and you don't need any
python or plsql or other languages.
- Joris Dobbelsteen
From | Date | Subject | |
---|---|---|---|
Next Message | John Meyer | 2007-01-28 14:36:15 | Re: counting query |
Previous Message | Shane Ambler | 2007-01-28 14:02:03 | Re: Password issue revisited |