Re: [HACKERS] gperf anyone?

From: Alfred Perlstein <bright(at)wintelcom(dot)net>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] gperf anyone?
Date: 2000-01-19 01:29:34
Message-ID: 20000118172933.N8736@fw.wintelcom.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> [000118 17:14] wrote:
> [Charset ISO-8859-1 unsupported, filtering to ASCII...]
> > A while ago I played around with gperf (GNU perfect hash function
> > generator), abusing the keyword lookup in parser/keyword.c as playground.
> > Now before I delete this I was wondering if this would perhaps be of use
> > to the general public. I don't know how huge the speed advantage of this
> > is, I'm sure the parser/scanner speed is the least of our problems. But I
> > thunk especially ecpg could benefit from this. Btw., gperf is used by GCC,
> > so it's not a toy.
>
> keywords are a fixed array, with a binary search to find a match. Could
> gperf be faster?

yes:

~ % gperf
/* starting time is 21:10:49 */
postgresql
really
kicks
butt
/* C code produced by gperf version 2.1 (K&R C version) */
/* Command-line: gperf */

#define MIN_WORD_LENGTH 4
#define MAX_WORD_LENGTH 10
#define MIN_HASH_VALUE 4
#define MAX_HASH_VALUE 10
/*
4 keywords
7 is the maximum key range
*/

static int
hash (str, len)
register char *str;
register unsigned int len;
{
static unsigned char hash_table[] =
{
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 0, 10,
10, 10, 10, 10, 10, 10, 10, 0, 0, 10,
10, 10, 0, 10, 0, 0, 0, 10, 10, 10,
10, 0, 10, 10, 10, 10, 10, 10,
};
return len + hash_table[str[len - 1]] + hash_table[str[0]];
}

char *
in_word_set (str, len)
register char *str;
register unsigned int len;
{

static char * wordlist[] =
{
"", "", "", "",
"butt",
"kicks",
"really",
"", "", "",
"postgresql",
};

if (len <= MAX_WORD_LENGTH && len >= MIN_WORD_LENGTH)
{
register int key = hash (str, len);

if (key <= MAX_HASH_VALUE && key >= MIN_HASH_VALUE)
{
register char *s = wordlist[key];

if (*s == *str && !strcmp (str + 1, s + 1))
return s;
}
}
return 0;
}
/* ending time is 21:10:58 */

A perfect hash should be much faster at the trivial expense of some space.

>From the distribution:
While teaching a data structures course at University of California,
Irvine, I developed a program called GPERF that generates perfect hash
functions for sets of key words. A perfect hash function is simply:

A hash function and a data structure that allows
recognition of a key word in a set of words using
exactly 1 probe into the data structure.

> We also can not distribute GNU code.

I'm pretty sure that the code the gperf outputs is not covered under the
GPL, just gperf itself.

--
-Alfred Perlstein - [bright(at)wintelcom(dot)net|alfred(at)freebsd(dot)org]

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2000-01-19 01:35:18 Re: [HACKERS] date/time problem in v6.5.3 and 7.0.0 ...
Previous Message Hiroshi Inoue 2000-01-19 01:13:40 RE: [HACKERS] Index recreation in vacuum