From: | Thorsten Schöning <tschoening(at)am-soft(dot)de> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: How to properly query lots of rows based on timestamps? |
Date: | 2020-08-30 13:54:11 |
Message-ID: | 1683596875.20200830155411@am-soft.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Guten Tag Alban Hertroys,
am Samstag, 29. August 2020 um 11:29 schrieben Sie:
> [...]Next, it loops over those to match the other condition of
> your query (id = clt_rec.oms_rec). You didn’t tell whether there’s
> an index on that column.
Which column, oms_rec.id or clt_rec.oms_rec? The former has one
because it's the unique key and the plan says so as well:
> Index Scan using pk_oms_rec on oms_rec
clt_rec.oms_rec OTOH is only marked as unique, without explicitly
creating an index.
> CONSTRAINT uk_clt_rec_oms_rec UNIQUE (oms_rec)
But doesn't that mean an index is available behind the scenes,
maintained by Postgres? So it should have been used if it would
provide any benefit?
> Adding a unique constraint will automatically create a unique B-tree
> index on the column or group of columns listed in the constraint.
https://www.postgresql.org/docs/9.4/ddl-constraints.html
> You’d probably see a performance improvement were you to create an
> index on (captured_at, id). If your Postgres version is somewhat
> recent, that could even lead to an Index Only Scan.
Tried that with your statement at the bottom and it didn't seem to
change anything even when using Postgres 11:
> -> Nested Loop (cost=1.14..343169.49 rows=43543 width=20) (actual time=0.228..95.554 rows=34266 loops=3)
> -> Parallel Index Scan using idx_clt_rec_captured_at on clt_rec (cost=0.57..3437.90 rows=43543 width=24) (actual time=0.119..16.895 rows=34266 loops=3)
> Index Cond: ((captured_at >= ('2020-08-01 00:00:00+02'::timestamp with time zone - '00:00:00'::interval)) AND (captured_at <= ('2020-08-01 00:00:00+02'::timestamp with time zone + '1 day'::interval)))
> -> Index Scan using pk_oms_rec on oms_rec (cost=0.57..7.80 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=102799)
> Index Cond: (id = clt_rec.oms_rec)
vs. with your suggested change:
> -> Nested Loop (cost=1.14..513397.11 rows=43543 width=20) (actual time=0.236..97.044 rows=34266 loops=3)
> -> Parallel Index Scan using idx_clt_rec_captured_at on clt_rec (cost=0.57..173665.52 rows=43543 width=24) (actual time=0.183..17.464 rows=34266 loops=3)
> Index Cond: ((captured_at >= ('2020-08-01 00:00:00+02'::timestamp with time zone - '00:00:00'::interval)) AND (captured_at <= ('2020-08-01 00:00:00+02'::timestamp with time zone + '1 day'::interval)))
> -> Index Scan using pk_oms_rec on oms_rec (cost=0.57..7.80 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=102799)
> Index Cond: (id = clt_rec.oms_rec)
Good news is that having one of those indexes in place at all makes a
huge difference compared to having neither of both. :-D
Mit freundlichen Grüßen,
Thorsten Schöning
--
Thorsten Schöning E-Mail: Thorsten(dot)Schoening(at)AM-SoFT(dot)de
AM-SoFT IT-Systeme http://www.AM-SoFT.de/
Telefon...........05151- 9468- 55
Fax...............05151- 9468- 88
Mobil..............0178-8 9468- 04
AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow
From | Date | Subject | |
---|---|---|---|
Next Message | Thorsten Schöning | 2020-08-30 14:00:21 | Re: How to properly query lots of rows based on timestamps? |
Previous Message | Thorsten Schöning | 2020-08-30 13:36:22 | Re: How to properly query lots of rows based on timestamps? |