Re: [HACKERS] [hackers]development suggestion needed

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Xun Cheng <xun(at)cs(dot)ucsb(dot)edu>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: [HACKERS] [hackers]development suggestion needed
Date:	2000-01-14 01:09:40
Message-ID:	200001140109.UAA23412@candle.pha.pa.us
Lists:	pgsql-hackers

> I have a background in relational database management system
> research and I want to try to be a developer for PostgreSQL.
> Right now I am only trying to get familiar with your code base. I
> plan to start with a specific function module in the backend.
> I'm thinking of /docs/pgsql/src/backend/executor because
> I want to experiment with some new fast join algorithms.
> My long term objective is to introduce a materialized view
> subsystem into PostgreSQL. Could anyone tell me if
> the directory /docs/pgsql/src/backend/executor is the
> right place to start or just give me some general suggestions
> which are not in the FAQs? Oh one more thing I want to
> mention is that those join algorithms I want to experiment
> with may have some special data access paths similar to an index.

Good.

>
> Further if it doesn't bother you much, could someone
> answer the following question(s) for me? (Sorry if
> some are already in the docs)
> 1. Does postgresql do raw storage device management, or does it rely
> on the file system? My impression is no raw device. If not,
> is it difficult to add it, and possibly how?

No, only the file system. We don't see much advantage in raw I/O.

> 2. Do you have standard benchmark results for postgresql?
> I guess not since it only implements a subset of SQL'92.
> What about subset of a benchmark or something repeatable?

We run the Wisconsin benchmark. I think it is in the source tree.

> 3. Suppose I have added a new two-relation join algorithm, how
> would I proceed to compare its performance with
> the existing two-relation join algorithms under
> different scenarios? Are there any existing facilities
> in the current code base for this purpose? Am I right
> that the available join algos implemented are nested loop
> join (including index-based), hash join (which one? hybrid),
> sort-merge join?

You can control the join types used with flags to postgres. Very easy.
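
For reference, the three join families named in the question can be sketched in a few lines of Python. This is only an illustrative toy over in-memory lists of tuples, not the PostgreSQL executor code; the function names, the tuple layout, and the `key_outer`/`key_inner` parameters are assumptions made for the example. It can serve as a baseline for checking that a new join algorithm returns the same rows as the existing ones.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key_outer=0, key_inner=0):
    """Compare every outer row with every inner row (quadratic work)."""
    return [o + i for o in outer for i in inner
            if o[key_outer] == i[key_inner]]


def hash_join(outer, inner, key_outer=0, key_inner=0):
    """Build a hash table on the inner input, then probe it per outer row."""
    table = defaultdict(list)
    for i in inner:
        table[i[key_inner]].append(i)
    return [o + i for o in outer for i in table.get(o[key_outer], [])]


def sort_merge_join(outer, inner, key_outer=0, key_inner=0):
    """Sort both inputs on the join key, then merge them in one pass."""
    outer = sorted(outer, key=lambda r: r[key_outer])
    inner = sorted(inner, key=lambda r: r[key_inner])
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        ko, ki = outer[i][key_outer], inner[j][key_inner]
        if ko < ki:
            i += 1
        elif ko > ki:
            j += 1
        else:
            # Find the run of inner rows sharing this key.
            j_end = j
            while j_end < len(inner) and inner[j_end][key_inner] == ko:
                j_end += 1
            # Pair every outer row with this key against that run.
            while i < len(outer) and outer[i][key_outer] == ko:
                for k in range(j, j_end):
                    result.append(outer[i] + inner[k])
                i += 1
            j = j_end
    return result
```

All three functions should produce the same set of joined rows for an equality join, so sorting their outputs and comparing them is a cheap correctness check before any timing is done.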

> 4. Usually a single sequential pass of a large joining relation
> is preferred to random access in a large join operation.
> It's mostly because of the current disk access characteristics.
> Is it possible for me to do some benchmarking about this
> using postgresql? What I'm actually asking about are the issues of
> how to control the flow of data from disk to buffers,
> how to stop file system interference and how to arrange
> actual data placement on the disk.

Good idea. We deal with this regularly in deciding to use an index in
the optimizer or a sequential scan. Our optimizer is quite good.
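
The index-versus-sequential-scan decision can be illustrated with a toy cost comparison. The sketch below is not the actual optimizer: the function names, the cost constants (a random page fetch assumed to cost 4 times a sequential one), and the one-page-per-matching-row assumption are all hypothetical, chosen only to show the shape of the trade-off.

```python
def seq_scan_cost(pages, seq_page_cost=1.0):
    """Cost of reading every page of the relation once, in order."""
    return pages * seq_page_cost


def index_scan_cost(matching_rows, index_pages_read=3, random_page_cost=4.0):
    """Cost of descending the index, then one random fetch per matching row.

    Assumes (pessimistically) that each matching row is on a different page.
    """
    return (index_pages_read + matching_rows) * random_page_cost


def choose_scan(pages, rows, selectivity):
    """Return 'index' or 'seq', whichever the toy model estimates is cheaper."""
    matching_rows = rows * selectivity
    if index_scan_cost(matching_rows) < seq_scan_cost(pages):
        return "index"
    return "seq"
```

Under these assumed constants, a highly selective predicate on a large table favors the index, while a predicate matching half the rows favors a single sequential pass, which is the effect described in the question.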

--
Bruce Momjian | http://www.op.net/~candle
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

[hackers]development suggestion needed at 2000-01-14 00:46:52 from Xun Cheng
