Quick Links

== PostgreSQL Weekly News - August 10 2008 ==

From:	David Fetter <david(at)fetter(dot)org>
To:	PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject:	== PostgreSQL Weekly News - August 10 2008 ==
Date:	2008-08-11 05:00:30
Message-ID:	20080811050030.GA15314@fetter.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-announce

== PostgreSQL Product News ==

BitNami LAPPStack 1.0 released.
http://bitnami.org/stack/lappstack

pgbouncer 1.2.3 released.
http://pgfoundry.org/projects/pgbouncer/

phpPgAdmin 4.2.1 released.
http://sourceforge.net/project/showfiles.php?group_id=37132

PyReplica 1.0.3 released.
http://pgfoundry.org/projects/pyreplica/

Another PostgreSQL Diff Tool 1.2 released.
http://pgfoundry.org/projects/apgdiff/

pgSphere 1.0.1 released.
http://pgfoundry.org/projects/pgsphere/

PostgreSQL Toolbox 1 released.
http://pgfoundry.org/projects/pg-toolbox/

== PostgreSQL Jobs for August ==

http://archives.postgresql.org/pgsql-jobs/2008-08/threads.php

== PostgreSQL Local ==

The Prato Linux User Group will be having PostgreSQL talks in
September. The schedule in Italian is:
http://www.prato.linux.it/serate_a_tema_2008

PGCon Brazil 2008 will be on September 26-27 at Unicamp in Campinas.
http://pgcon.postgresql.org.br/index.en.html

PgDay.fr will be October 4 in Toulouse. The Call for Papers is open:
http://www.postgresqlfr.org/?q=node/1686
Registration:
http://www.pgday.fr/doku.php/inscription

Sponsor the European PGDay!
http://www.pgday.org/en/sponsors/campaign

The Call for Papers for European PGDay has begun.
http://www.pgday.org/en/call4papers

PGDay.(IT|EU) 2008 will be October 17 and 18 in Prato.
http://www.pgday.org/it/

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter and
Devrim GUNDUZ.

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david(at)fetter(dot)org, German language
to pwn(at)pgug(dot)de, Italian language to pwn(at)itpug(dot)org(dot)

== Applied Patches ==

Tom Lane committed:

- Improve CREATE/DROP/RENAME DATABASE so that when failing because the
source or target database is being accessed by other users, it tells
you whether the "other users" are live sessions or uncommitted
prepared transactions. (Indeed, it tells you exactly how many of
each, but that's mostly just because it was easy to do so.) This
should help forestall the gotcha of not realizing that a prepared
transaction is what's blocking the command. Per discussion.

- Improve SELECT DISTINCT to consider hash aggregation, as well as
sort/uniq, as methods for implementing the DISTINCT step. This
eliminates the former performance gap between DISTINCT and GROUP BY,
and also makes it possible to do SELECT DISTINCT on datatypes that
only support hashing not sorting. SELECT DISTINCT ON is still
always implemented by sorting; it would take executor changes to
support hashing that, and it's not clear it's worth the trouble.
This is a release-note-worthy incompatibility from previous PG
versions, since SELECT DISTINCT can no longer be counted on to
deliver sorted output without explicitly saying ORDER BY. (Anyone
who can't cope with that can consider turning off enable_hashagg.)
Several regression test queries needed to have ORDER BY added to
preserve stable output order. I fixed the ones that manifested
here, but there might be some other cases that show up on other
platforms.

- In pgsql/src/test/regress/pg_regress.c, fix some message style
guideline violations in pg_regress, as well as some failures to
expose messages for translation.

- In pgsql/src/backend/storage/buffer/bufmgr.c, in ReadOrZeroBuffer
(and related entry points), don't bother to call PageHeaderIsValid
when we zero the buffer instead of reading the page in. The actual
performance improvement is probably marginal since this function
isn't very heavily used, but a cycle saved is a cycle earned Zdenek
Kotala

- Add an ORDER BY to one more SELECT DISTINCT test case, per buildfarm
results.

- In pgsql/src/backend/optimizer/plan/planner.c, department of second
thoughts: fix newly-added code in planner.c to make real sure that
DISTINCT ON does what it's supposed to, ie, sort by the full ORDER
BY list before unique-ifying. The error seems masked in simple
cases by the fact that query_planner won't return query pathkeys
that only partially match the requested sort order, but I wouldn't
want to bet that it couldn't be exposed in some way or other.

- Do not allow Unique nodes to be scanned backwards. The code claimed
that it would work, but in fact it didn't return the same rows when
moving backwards as when moving forwards. This would have no
visible effect in a DISTINCT query (at least assuming the column
datatypes use a strong definition of equality), but it gave entirely
wrong answers for DISTINCT ON queries.

- Teach the system how to use hashing for UNION. (INTERSECT/EXCEPT
will follow, but seem like a separate patch since most of the
remaining work is on the executor side.) I took the opportunity to
push selection of the grouping operators for set operations into the
parser where it belongs. Otherwise this is just a small exercise in
making prepunion.c consider both alternatives. As with the recent
DISTINCT patch, this means we can UNION on datatypes that can hash
but not sort, and it means that UNION without ORDER BY is no longer
certain to produce sorted output.

- Support hashing for duplicate-elimination in INTERSECT and EXCEPT
queries. This completes my project of improving usage of hashing
for duplicate elimination (aggregate functions with DISTINCT remain
undone, but that's for some other day). As with the previous
patches, this means we can INTERSECT/EXCEPT on datatypes that can
hash but not sort, and it means that INTERSECT/EXCEPT without ORDER
BY are no longer certain to produce sorted output.

- Improve INTERSECT/EXCEPT hashing by realizing that we don't need to
make any hashtable entries for tuples that are found only in the
second input: they can never contribute to the output. Furthermore,
this implies that the planner should endeavor to put first the
smaller (in number of groups) input relation for an INTERSECT.
Implement that, and upgrade prepunion's estimation of the number of
rows returned by setops so that there's some amount of sanity in the
estimate of which one is smaller.

- In pgsql/src/backend/executor/execMain.c, install checks in executor
startup to ensure that the tuples produced by an INSERT or UPDATE
will match the target table's current rowtype. In pre-8.3 releases
inconsistency can arise with stale cached plans, as reported by
Merlin Moncure. (We patched the equivalent hazard on the SELECT
side in Feb 2007; I'm not sure why we thought there was no risk on
the insertion side.) In 8.3 and HEAD this problem should be
impossible due to plan cache invalidation management, but it seems
prudent to make the check anyway. Back-patch as far as 8.0. 7.x
versions lack ALTER COLUMN TYPE, so there seems no way to abuse a
stale plan comparably.

- Fix corner-case bug introduced with HOT: if REINDEX TABLE pg_class
(or a REINDEX DATABASE including same) is done before a session has
done any other update on pg_class, the pg_class relcache entry was
left with an incorrect setting of rd_indexattr, because the
indexed-attributes set would be first demanded at a time when we'd
forced a partial list of indexes into the pg_class entry, and it
would remain cached after that. This could result in incorrect
decisions about HOT-update safety later in the same session. In
practice, since only pg_class_relname_nsp_index would be missed out,
only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a
problem. Per report and test case from Ondrej Jirman.

Magnus Hagander committed:

- Move pgstat.tmp into a temporary directory under $PGDATA named
pg_stat_tmp. This allows the use of a ramdrive (either through
mount or symlink) for the temporary file that's written every half
second, which should reduce I/O. On server shutdown/startup, the
file is written to the old location in the global directory, to
preserve data across restarts. Bump catversion since the $PGDATA
directory layout changed.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

ITAGAKI Takahiro sent in a patch to user NDirectFileRead/Write
counters to get I/O counts in BufFile the module. These counters are
visible when log_statement_stats is on.

Pavel Stehule sent in a patch to implement GROUPING SETS.

Tom Lane sent in a patch to use hashes for set operations.

Martin Pihlak sent in a patch to make dropping and re-creating
functions work more nicely with plan invalidation.

Simon Riggs sent in a patch which adds a hook for stats plugins.

Abhijit Menon-Sen sent in a patch which extend has_table_privilege()
to include sequence information.

Robert Haas sent in a patch to implement CREATE OR REPLACE VIEW.

Simon Riggs sent two revisions of a patch to fix pg_stop_backup per
suggestion by Fujii Masao. pg_stop_backup now tests XLogArchiveCheckDone()
for both stopxlogfilename and history file and then stats the stop
WAL.

Marko Kreen sent in a patch to fix a security issue in dblink.

Alvaro Herrera sent in two revisions of a patch to make autovacuum
process TOAST tables separately from main tables.

Volkan YAZICI sent in three revisions of a patch to allow people to
increase the verbosity of set-returning functions.

David Wheeler sent in some touch-ups for his citext patch.

Euler Taveira de Oliveira sent in a patch which allows symlinking
statistics files at initdb time.

Browse pgsql-announce by date

	From	Date	Subject
Next Message	David Fetter	2008-08-18 02:02:27	== PostgreSQL Weekly News - August 17 2008 ==
Previous Message	Robert Treat	2008-08-09 03:02:50	phpPgAdmin 4.2.1 released