Bundle of patches

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Bundle of patches
Date: 2006-12-04 14:10:25
Message-ID: 45742C51.9020602@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches
we (Oleg and me) developed during our work on porting of the '1C:Enterprise'
system to support the PostgreSQL database.

We would like to suggest to commit they to HEAD.

1) Typmod for user-defined types
http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
Patch is based on ideas from
http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php

Patch adds to type definition two optional function: typmodinput and
typmodoutput. That allows to develop user-defined types with type's
modificators. Built-in types use typmod input/output functions too.
Typmod internally is represented as non-negative int4 value,
but patch allows to point list of integer in type definition. So,
NUMERIC type works with a help of typmodin/typmodout function.

2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
http://www.sigaev.ru/misc/NULLS_82-0.5.gz
Allow to sort NULLs as greater or lesser than any value. The goal was to
simplificate migrations from MySQL/MS SQL which think that NULL is less.
Also, syntax conforms to SQL2003. It operate on gram.y level, and
adds 'field is [not] null' qualification to sortClause.
Note, to allow queries like 'select .. union .. order by f nulls first'
pgsql now can rewrite that query to
'select * from (select .. union ..) order by f nulls first'. This solves the
problem with 'resjunk' column in SelectStmt->sortClause.

3) Allow to use index for IS [NOT] NULL
http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
Initially patch was developed by Martijn van Oosterhout <kleptog(at)svana(dot)org>.
But it's reworked and support of searching NULLS to GiST too. Patch
adds new column named amsearchnull to pg_am. To recognize IS NULL clause
ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and
ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are
treated greater than any value. It might be look some odd that
for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST,
but if sk_flags contains SK_ISNULL then we never call user-defined functions.

4) OR clauses optimizations
http://www.sigaev.ru/misc/OR_82-0.6.gz
Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses
generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries

4.1) Allow any useful common restriction clauses to be extracted from
OR-of-AND quals. Also, it allows to combine several different
operations to one which can be used in index scan.
SELECT
a, b
FROM
tst
WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 )
ORDER BY a, b
LIMIT 20;
Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20
loops=1)
-> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8)
(actual time=0.265..0.611 rows=20 loops=1)
Index Cond: (a >= 50000)
Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000)))
4.2) When OR clauses aren't intersect and use the same index, it's possible
to just concatenate results of indexscans. For that, now postgres may use
Append node. Append node is modified to have a pathkeys.

SELECT
a
FROM
tst
WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
ORDER BY ASC
LIMIT 20;
Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20
loops=1)
-> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824
rows=20 loops=1)
-> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
time=0.349..0.742 rows=20 loops=1)
-> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990
width=4) (actual time=0.346..0.684 rows=20 loops=1)
Index Cond: ((a > 20000) AND (a < 21000))
-> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018
width=4) (never executed)
Index Cond: ((a > 60000) AND (a < 61000))

Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys
ranges (compare plan with previous one).
SELECT
a
FROM
tst
WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
ORDER BY a DESC
LIMIT 20;
Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20
loops=1)
-> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589
rows=20 loops=1)
-> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
time=0.149..0.511 rows=20 loops=1)
-> Index Scan Backward using aidx on tst (cost=0.00..2001.12
rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1)
Index Cond: ((a > 60000) AND (a < 61000))
-> Index Scan Backward using aidx on tst (cost=0.00..2000.42
rows=990 width=4) (never executed)
Index Cond: ((a > 20000) AND (a < 21000))

4.3) As side effect of previous point, overlapped clauses can be eliminated:

SELECT
a
FROM
tst
WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 )
ORDER BY a
LIMIT 20;
Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20
loops=1)
-> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4)
(actual time=0.162..0.935 rows=20 loops=1)
Index Cond: ((a > 50000) AND (a < 61000))

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2006-12-04 14:33:11 Re: Notify enhancement
Previous Message Andrew Dunstan 2006-12-04 14:05:22 Re: Notify enhancement

Browse pgsql-patches by date

  From Date Subject
Next Message Simon Riggs 2006-12-04 14:15:23 Re: On-disk bitmap index implementation
Previous Message cmo1@libero.it 2006-12-04 14:02:42 zope connection string