Re: [HACKERS] VIEWS, DISTINCT and COUNT

From: wieck(at)debis(dot)com (Jan Wieck)
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane)
Cc: bhirt(at)mobygames(dot)com, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] VIEWS, DISTINCT and COUNT
Date: 1999-11-04 03:59:39
Message-ID: m11jE3n-0003kLC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Brian Hirt <bhirt(at)mobygames(dot)com> writes:
> > "select count(distinct id) from table" is not supported.
>
> Yup. It's on the TODO list:
> * Allow COUNT(DISTINCT col)
>
> > For both count(distinct) and distinct in views, I have this question: Is
> > this something that needs to be supported but just never got implemented?
>
> I'm not sure what Jan has in mind for views, but certainly
> aggregate(DISTINCT ...) is an SQL-standard feature that we ought to
> support. I don't think it's a simple addition though :-(

All these DISTINCT, AGGREGATE etc. problems on views are
based on the fact, that the planner still requires that the
rewriters output is equivalent to a regular, allowed query.

I would like to be able to place a complete querytree (just
an entire SELECT's Query node) into a range table entry.
AFAIK, from the callers point of view there is not much
difference between the join-, group-, sort-, aggregate- and
scan-nodes. They are all just nodes returing some amount of
tuples. All of them could be the toplevel executor node of a
SELECT - just something returning tuples.

Unfortunately my knowledge in the planner is very limited, so
I would need help to go for it. Who has that knowledge?

The basic idea is this:

Let's have a view defined as

CREATE VIEW v1 AS SELECT a, count(*) AS n FROM t1 GROUP BY a;

The plan for such a query would be a

Aggregate
-> Group
-> Sort
-> Seq Scan on t1

Thus doing a

SELECT t2.a, v1.n FROM t2, v1 WHERE t2.a = v1.a;

could finally result in a

Merge Join
-> Sort
-> Seq Scan on t2
-> Sort
-> Aggregate
-> Group
-> Sort
-> Seq Scan on t1

It's impossible to cause such an execution plan from a
standard SQL statement. But why should it be impossible for
the rewriter too? If v1 where a regular table (not a view),
the generated plan would have been a

Merge Join
-> Sort
-> Seq Scan on t2
-> Sort
-> Seq Scan on v1

so oviously the only difference is that the scan over the v1
relation has been replaced by the more complicated plan for
the plain definition of the view. If the planner could do
that, I think we would get rid of all the limitations for
views very soon.

Again, who knows enough about the planner to be able to do
this kind of stuff?

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 1999-11-04 04:02:27 Re: [HACKERS] PostgreSQL 6.5.3 built, but not released ...
Previous Message Bruce Momjian 1999-11-04 03:45:03 Re: [HACKERS] PostgreSQL 6.5.3 built, but not released ...