Re: [HACKERS] [hackers]development suggestion needed

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Xun Cheng <xun(at)cs(dot)ucsb(dot)edu>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] [hackers]development suggestion needed
Date: 2000-01-14 01:09:40
Message-ID: 200001140109.UAA23412@candle.pha.pa.us
Lists: pgsql-hackers

> I have a background in relational database management system
> research and I want to become a developer for PostgreSQL.
> Right now I am only trying to get familiar with your code base. I
> plan to start with a specific functional module in the backend.
> I'm thinking of /docs/pgsql/src/backend/executor because
> I want to experiment with some new fast join algorithms.
> My long-term objective is to introduce a materialized view
> subsystem into PostgreSQL. Could anyone tell me if
> the directory /docs/pgsql/src/backend/executor is the
> right place to start or just give me some general suggestions
> which are not in the FAQs? Oh one more thing I want to
> mention is that those join algorithms I want to experiment
> with may have some special data access paths similar to an index.

Good.

>
> Further if it doesn't bother you much, could someone
> answer the following question(s) for me? (Sorry if
> some are already in the docs)
> 1. Does PostgreSQL do raw storage device management, or does it
> rely on the file system? My impression is no raw device. If not,
> is it difficult to add, and how might it be done?

No, only file system. We don't see much advantage to raw i/o.

> 2. Do you have standard benchmark results for postgresql?
> I guess not since it only implements a subset of SQL'92.
> What about subset of a benchmark or something repeatable?

We run the Wisconsin benchmark. I think it is in the source tree.

> 3. Suppose I have added a new two-relation join algorithm; how
> would I proceed to compare its performance with
> the existing two-relation join algorithms under
> different scenarios? Are there any existing facilities
> in the current code base for this purpose? Am I right
> that the available join algos implemented are nested loop
> join (including index-based), hash join (which one? hybrid),
> sort-merge join?

You can control the join types used with flags to postgres. Very easy.
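
As a rough illustration of the three join strategies named in the question
(nested loop, hash, and sort-merge), here is a minimal Python sketch of each
on in-memory lists of rows. It is not PostgreSQL code; the function names and
the convention that the join key is the first field of each row are
assumptions made only for this example.

```python
from collections import defaultdict


def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row (quadratic work)."""
    return [(o, i) for o in outer for i in inner if o[0] == i[0]]


def hash_join(outer, inner):
    """Build an in-memory hash table on the inner rows, then probe it."""
    table = defaultdict(list)
    for i in inner:
        table[i[0]].append(i)
    return [(o, i) for o in outer for i in table.get(o[0], [])]


def sort_merge_join(outer, inner):
    """Sort both inputs on the key, then merge, pairing runs of equal keys."""
    left = sorted(outer, key=lambda r: r[0])
    right = sorted(inner, key=lambda r: r[0])
    result, li, ri = [], 0, 0
    while li < len(left) and ri < len(right):
        lk, rk = left[li][0], right[ri][0]
        if lk < rk:
            li += 1
        elif lk > rk:
            ri += 1
        else:
            # Find the end of the run of equal keys on each side.
            le = li
            while le < len(left) and left[le][0] == lk:
                le += 1
            re_ = ri
            while re_ < len(right) and right[re_][0] == lk:
                re_ += 1
            # Emit the cross product of the two runs.
            result.extend((l, r) for l in left[li:le] for r in right[ri:re_])
            li, ri = le, re_
    return result
```

All three should return the same set of row pairs for an equality join, so a
new algorithm can be checked against them for correctness before its timing
is compared.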

> 4. Usually a single sequential pass of a large joining relation
> is preferred to random access in large join operation.
> It's mostly because of the current disk access characteristics.
> Is it possible for me to do some benchmarking about this
> using postgresql? What I'm actually asking are the issues about
> how to control the flow of data from disk to buffers,
> how to stop file system interference and how to arrange
> actual data placement on the disk.

Good idea. We deal with this regularly in deciding to use an index in
the optimizer or a sequential scan. Our optimizer is quite good.

--
Bruce Momjian | http://www.op.net/~candle
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
