Re: Faster distinct query?

From: Israel Brewster <ijbrewster(at)alaska(dot)edu>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: PostgreSQL Mailing Lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: Faster distinct query?
Date: 2021-09-22 20:41:15
Message-ID: 6D98FC61-3916-4841-9A5F-E23E981E73DA@alaska.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> On Sep 22, 2021, at 12:20 PM, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Wed, Sep 22, 2021 at 1:05 PM Israel Brewster <ijbrewster(at)alaska(dot)edu <mailto:ijbrewster(at)alaska(dot)edu>> wrote:
> To work around the issue, I created a materialized view that I can update periodically, and of course I can query said view in no time flat. However, I’m concerned that as the dataset grows, the time it takes to refresh the view will also grow (correct me if I am wrong there).
>
> I'd probably turn that index into a foreign key that just ensures that every (station,channel) that appears in the data table also appears on the lookup table. Grouping and array-ifying the lookup table would be trivial. Either modify the application code or add a trigger to populate the lookup table as needed.

Makes sense. I was actually considering this approach (albeit without the foreign key - that’s a nice additional safety measure), but was concerned about the overhead that adding said trigger would have on inserts - thus my thought to try the materialized view. As a reference, this database is receiving 1Hz data from around 170 stations, with up to three channels of data per station. So something like 350-500 inserts per second, although the data is “grouped” into 10 minute batches. I’ll give it another look.

> The parentheses around channel in "array_agg(distinct(channel))" are unnecessary - you are invoking composite-type syntax, which is ignored in the single column case unless you write the optional ROW keyword, i.e., distinct ROW(channel)

Good information, thanks!

> David J.

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-09-22 20:48:42 Re: Faster distinct query?
Previous Message David G. Johnston 2021-09-22 20:26:55 Re: Faster distinct query?