From: | "Luigi N(dot) Puleio" <npuleio(at)rocketmail(dot)com> |
---|---|
To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: EXPLAIN detail |
Date: | 2008-04-09 14:44:23 |
Message-ID: | 981759.47577.qm@web33504.mail.mud.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
>> With all that casting, is it possible that appropriate indexes aren't
>> being used because your WHERE / ON clauses aren't an exact type match
>> for the index?
>
> You mean to put an index on date with timestamptz datatype column?...
> Er ... I'm not quite sure what you mean. Do you mean an index on a cast
> of the column, eg:
> CREATE INDEX some_idx_name ON some_table ( some_timestamp_field::date )
> then ... maybe. It's hard to be sure when there is so little information
> available. It shouldn't be necessary, but there are certainly uses for
> that sort of thing - for example, I use a couple of functional indexes
> in the schema I'm working on at the moment. It's probably a good idea to
> look at ways to avoid doing that first, though.
>> Can you post the full EXPLAIN ANALYZE from the query? This snippet
>> doesn't even show how records are being looked up.
>
> HashAggregate (cost=52236.31..52236.33 rows=1 width=34) (actual time=7004.779...7004.782 rows=1 loops=1)
> -> Nested Loop (cost=0.00..52236.30 rows=1 width=34) (actual time=3939.450..7004.592 rows=1 loops=1)
> Join filter: (("inner".calldate)::time without time zone => ("outer".calldate)::time without time zone)
> -> Seq Scan on table a (cost=0.00..27444.03 rows=1 width=26) (actual time=2479.199..2485.266 rows=3 loops=1)
> Filter: (((calldate)::date = '2008-04-09'::date) AND ((src)::text = '410'::text) AND (substr((dst)::text, 1, 4)='*100'::text) AND ((lastdata)::text ='/dati/ita/loginok'::text))
> ->Seq Scan on table b (cost=0.00..24792.22 rows=3 width=16) (actual time=1504.508..1506.374 rows=1 loops=3)
> Filter: ((((lastdata)::text ='/dati/ita/logoutok'::text) AND ('410'::text=(src)::text) AND ('2008-04-09'::date = (calldate)::date))
> Total runtime: 7005.706 ms
> Personally, I'd want to get rid of all those casts first. Once that's
> cleaned up I'd want to look at creating appropriate indexes on your
> tables. If necessary, I might even create a composite index on
> (lastdata,src,calldate) .
>> What about a \d of the table from psql, or at least a summary of the
>> involved column data types and associated indexes?
>
> this table has an acctid column which is PK then most of the other columns are varchar(80) or so....
> Do you mean that the columns involved in your WHERE and ON clauses, the
> ones you're casting to date, timestamp, etc, are stored as VARCHAR? If
> so, it's no surprise that the query is slow because you're forcing
> PostgreSQL to convert a string to a date, timestamp, or time datatype to
> do anything with it ... and you're doing it many times in every query.
> That will be VERY slow, and prevent the use of (simple) indexes on those
> columns.
> If you're really storing dates/times as VARCHAR, you should probably
> look at some changes to your database design, starting with the use of
> appropriate data types.
> That's all guesswork, because you have not provided enough information.
> Can you please post the output of psql's \d command on the table in
> question?
> If for some reason you cannot do that, please at least include the data
> type of the primary key and all fields involved in the query, as well as
> a list of all the indexes on both tables.
> The easy way to do that is to just launch "psql" then run:
> \d table
> and paste the output to an email.
> So for 4 records result, 7 seconds are too way a lot I guess... but as I said before I'm gonna wait if > the responsible of the server did a VACUUM on the table...
>
> What do you think?...
> If you're really casting VARCHAR to DATE, TIME, TIMESTAMP, etc on demand
> then personally I really doubt that dead rows are your problem.
Well, this table has a primary key index on first column called acctid which is an integer; instead the calldate column is a TIMESTAMPTZ and in fact I'm using to do (calldate)::date in the ON clause because since the time part of that column is always different and in the nesting I have to identificate the date is the same...
the other two columns (src and lastdata) are both VARCHAR(80) and the query is this one:
EXPLAIN ANALYZE
SELECT
(a.calldate)::date,
a.src,
a.dst,
MIN(e.calldate) - a.calldate
FROM
cdr a
INNER JOIN cdr e
ON ((e.calldate)::date = (a.calldate)::date AND e.src = a.src
AND e.lastdata = '/dati/ita/logoutok' AND e.calldate >= a.calldate)
WHERE
(a.calldate)::date = '2008-04-09'
AND a.src = '410'
AND substr(a.dst, 1, 4) = '*100'
AND a.lastdata = '/dati/ita/loginok'
GROUP BY
a.calldate, a.src, a.dst
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
From | Date | Subject | |
---|---|---|---|
Next Message | Craig Ringer | 2008-04-09 15:28:51 | Re: EXPLAIN detail |
Previous Message | Craig Ringer | 2008-04-09 14:15:06 | Re: EXPLAIN detail |