From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Mike Mascari <mascarm(at)mascari(dot)com> |
Cc: | postgresql <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: pl/R questions |
Date: | 2003-08-02 17:13:40 |
Message-ID: | 3F2BF144.6010307@joeconway.com |
Lists: | pgsql-general |
Mike Mascari wrote:
> (A) The function r_resetlm() must be called to reset the global values
> before each invocation. Not a big problem, but I would like to avoid
> globals, if possible. The relations supplying the data are temporary
> tables and thus I cannot refer to their names in static pl/R. I can't
> figure out a way to use pg.spi.prepare()/pg.spi.execp() to initialize
> R variables with the result of the executed queries. I would like to
> do something like this, instead:
>
> CREATE OR REPLACE FUNCTION r_predict(text, text)
> RETURNS SETOF RECORD AS '
>
> sql <- paste("SELECT x, y FROM", arg1, "ORDER BY x")
> plan <- pg.spi.prepare(sql, NA)
> pg.spi.execp(plan, NA)
>
> ??? Read results into appropriate vectors
>
> samples <- data.frame(xs=nxs)
> result <- predict(lm(ys ~ xs), samples)
> return (result)
>
> ' LANGUAGE 'plr' WITH (isStrict);
I don't think you can use a prepared plan if the table itself is going to
change, only when the parameters change. Maybe something like this works:
CREATE OR REPLACE FUNCTION r_predict(text, text)
RETURNS SETOF RECORD AS '
sql <- paste("SELECT x, y FROM", arg1, "ORDER BY x")
xyknowns <- pg.spi.exec(sql)
xs <- as.numeric(xyknowns[,1])
ys <- as.numeric(xyknowns[,2])
sql <- paste("SELECT x FROM", arg2, "ORDER BY x")
xypred <- pg.spi.exec(sql)
nxs <- as.numeric(xypred[,1])
samples <- data.frame(xs=nxs)
result <- predict(lm(ys ~ xs), samples)
return (result)
' LANGUAGE 'plr' WITH (isStrict);
regression=# select * from r_predict('entries', 'predictions') as
trend(ny float8);
ny
------------------
146171.515151515
147189.696969697
148207.878787879
149226.060606061
150244.242424242
(5 rows)
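If you want to check the R logic outside the database first, here is a minimal sketch in plain R. It assumes two small made-up data frames stand in for what pg.spi.exec() returns; the names xyknowns and xypred are the ones from the function above, and the numbers are purely illustrative. The thing to watch is that the column in samples must be named xs, the same name used in the lm() formula, otherwise predict() will not use the new values.

```r
# Made-up stand-ins for the data frames returned by pg.spi.exec()
# (assumption: illustrative numbers only, lying exactly on y = 2x + 1)
xyknowns <- data.frame(x = c(1, 2, 3, 4), y = c(3, 5, 7, 9))
xypred <- data.frame(x = c(5, 6))

# Same steps as in the PL/R function body
xs <- as.numeric(xyknowns[, 1])
ys <- as.numeric(xyknowns[, 2])
nxs <- as.numeric(xypred[, 1])

# Fit on the known points, then predict at the new x values;
# the column name "xs" must match the predictor name in the formula
fit <- lm(ys ~ xs)
samples <- data.frame(xs = nxs)
result <- as.numeric(predict(fit, samples))
print(result)
```

Since the sample points lie exactly on a line, the predictions should land on that same line, which makes it easy to eyeball whether the column naming is right.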
> (B) I suppose an unqualified SELECT will always invoke r_initknowns()
> and r_initpredicts() but is this guaranteed? And guaranteed to only be
> executed once for each tuple? If so, then I'm somewhat less bothered
> by the use of R globals. Is using the VOLATILE attribute in the CREATE
> FUNCTION statement sufficient to guarantee that the call will always be
> made?
Use the above -- I think your original multistep process is not the way
to go anyway.
> (C) For the life of me, and this is an R question, I cannot figure out
> how to get R to perform predictions on multivariate data:
I'm sure there is support for multivariate linear regression in R, but
I'm still too new at R to know the answer myself. You should try posting
that one to R-help.
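As a possible starting point in the meantime, and only a sketch under assumptions: lm() accepts more than one predictor on the right-hand side of the formula, and predict() takes a data frame whose column names match those predictors. The column names x1, x2 and y and all the numbers below are made up for illustration; whether this is the kind of multivariate model you need is something R-help can confirm.

```r
# Hypothetical known data: y built as 1 + 2*x1 + 3*x2 (illustrative only)
known <- data.frame(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(2, 1, 4, 3, 6),
  y  = c(9, 8, 19, 18, 29)
)

# Fit a linear model with two predictors
fit <- lm(y ~ x1 + x2, data = known)

# New points to predict; column names must match the formula's predictors
newdata <- data.frame(x1 = c(6, 7), x2 = c(5, 8))
pred <- as.numeric(predict(fit, newdata))
print(pred)
```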
BTW, I created a PL/R-specific mailing list on gborg, but no one is
subscribed currently. If people on this list find PL/R-specific
questions too off-topic, perhaps we should move there. R-specific
questions should definitely be posted to R-help though.
Regards,
Joe