[hackers]development suggestion needed

From: xun(at)cs(dot)ucsb(dot)edu (Xun Cheng)
To: pgsql-hackers(at)postgresql(dot)org
Cc: xun(at)cs(dot)ucsb(dot)edu
Subject: [hackers]development suggestion needed
Date: 2000-01-14 00:46:52
Message-ID: 200001140046.QAA17082@brubeck.cs.ucsb.edu
Lists: pgsql-hackers

Hi everyone, I like your work very much and hope PostgreSQL can
grow into something competitive with Oracle, just as Linux has
against Windows.

I have a background in relational database management system
research and would like to become a PostgreSQL developer.
Right now I am only trying to get familiar with your code base. I
plan to start with a specific functional module in the backend.
I'm thinking of /docs/pgsql/src/backend/executor because
I want to experiment with some new fast join algorithms.
My long-term objective is to introduce a materialized view
subsystem into PostgreSQL. Could anyone tell me whether
the directory /docs/pgsql/src/backend/executor is the
right place to start, or give me some general suggestions
that are not in the FAQs? One more thing I want to
mention is that the join algorithms I want to experiment
with may need special data access paths similar to an index.

Further, if it isn't too much trouble, could someone
answer the following questions for me? (Sorry if
some are already covered in the docs.)
1. Does PostgreSQL manage raw storage devices itself, or does it rely
on the file system? My impression is that it does not use raw devices.
If not, how difficult would it be to add that, and how might it be done?
2. Do you have standard benchmark results for PostgreSQL?
I guess not, since it only implements a subset of SQL'92.
What about a subset of a benchmark, or something else repeatable?
3. Suppose I have added a new two-relation join algorithm. How
would I go about comparing its performance with
the existing two-relation join algorithms under
different scenarios? Are there any existing facilities
in the current code base for this purpose? Am I right
that the join algorithms currently implemented are nested loop
join (including index-based), hash join (which variant? hybrid?),
and sort-merge join?
4. A single sequential pass over a large relation being joined
is usually preferred to random access in a large join operation,
mostly because of current disk access characteristics.
Is it possible for me to do some benchmarking of this
using PostgreSQL? What I'm really asking about is
how to control the flow of data from disk to buffers,
how to prevent file system interference, and how to arrange
the actual placement of data on disk.
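
To make question 3 more concrete, here is a minimal sketch of the kind
of side-by-side comparison I have in mind. It is written in Python on
in-memory lists of (key, payload) tuples, not in backend C, and it is
not PostgreSQL code: the function names, the make_relation helper, and
the row and key counts are all hypothetical choices for illustration.
It only shows the three textbook algorithms producing the same result
while being timed separately.

```python
import random
import time
from collections import defaultdict


def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row on the key (field 0)."""
    return [(o, i) for o in outer for i in inner if o[0] == i[0]]


def hash_join(outer, inner):
    """Build an in-memory hash table on inner, then probe it per outer row."""
    table = defaultdict(list)
    for i in inner:
        table[i[0]].append(i)
    return [(o, i) for o in outer for i in table.get(o[0], ())]


def sort_merge_join(outer, inner):
    """Sort both inputs on the key, then merge runs of equal keys."""
    left = sorted(outer, key=lambda row: row[0])
    right = sorted(inner, key=lambda row: row[0])
    result = []
    li = ri = 0
    while li < len(left) and ri < len(right):
        lk, rk = left[li][0], right[ri][0]
        if lk < rk:
            li += 1
        elif lk > rk:
            ri += 1
        else:
            # Find the end of the run of equal keys on each side.
            le = li
            while le < len(left) and left[le][0] == lk:
                le += 1
            re = ri
            while re < len(right) and right[re][0] == lk:
                re += 1
            # Emit the cross product of the two runs.
            for l_row in left[li:le]:
                for r_row in right[ri:re]:
                    result.append((l_row, r_row))
            li, ri = le, re
    return result


def make_relation(n_rows, n_keys, seed):
    """Hypothetical test data: n_rows tuples with keys drawn from n_keys values."""
    rng = random.Random(seed)
    return [(rng.randrange(n_keys), row_id) for row_id in range(n_rows)]


def compare(n_rows=500, n_keys=100):
    """Time each join on the same inputs and check that the results agree."""
    outer = make_relation(n_rows, n_keys, seed=1)
    inner = make_relation(n_rows, n_keys, seed=2)
    timings = {}
    reference = None
    for join in (nested_loop_join, hash_join, sort_merge_join):
        start = time.perf_counter()
        rows = join(outer, inner)
        timings[join.__name__] = time.perf_counter() - start
        rows = sorted(rows)  # output order differs between algorithms
        if reference is None:
            reference = rows
        assert rows == reference, join.__name__ + " disagrees"
    return timings


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(name, round(seconds, 6))
```

Varying n_rows and n_keys in compare() changes the relation sizes and
the join selectivity, which is the kind of "different scenarios" knob
I would want inside the real backend as well. Being in-memory, this
sketch says nothing about the disk access behaviour raised in
question 4.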

Sorry again if my questions are not clear. I'd be glad
to explain them further if necessary.

thanks for any help
xun
