| From: | Scott Ribe <scott_ribe(at)elevated-dev(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Paul Smith* <paul(at)pscs(dot)co(dot)uk>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-admin(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: small temp files | 
| Date: | 2024-07-22 15:46:29 | 
| Message-ID: | 16C7116A-2CB2-4909-8830-2FA16470EE27@elevated-dev.com | 
| Lists: | pgsql-admin | 
> On Jul 22, 2024, at 8:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 
> You would get more specific answers if you provided an example of the
> queries that cause this, with EXPLAIN ANALYZE output.  But I think a
> likely bet is that it's doing a hash join that overruns work_mem.
> What will happen is that the join gets divided into batches based on
> hash codes, and each batch gets dumped into its own temp files (one
> per batch for each side of the join).  It would not be too surprising
> if some of the batches are small, thanks to the vagaries of hash
> values.  Certainly they could be less than work_mem, since the only
> thing we can say for sure is that the sum of the temp file sizes for
> the inner side of the join should exceed work_mem.
OK, that makes total sense, and fits our usage patterns. (Lots of complex queries, lots of hash joins.)
thanks
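(For anyone else hitting this thread: here's a rough Python sketch of the batching effect Tom describes. This is *not* PostgreSQL's actual hash join code or hash function — just an illustration of why, with skewed join keys, individual per-batch spill files can be much smaller than work_mem even though their total exceeds it.)

```python
# Illustration only: PostgreSQL's hybrid hash join picks a batch number from
# bits of the join key's hash; this sketch uses a stand-in hash to show how
# per-batch spill sizes can vary widely when keys are skewed.
import hashlib
from collections import Counter

def batch_of(key, nbatch):
    # Stand-in hash (sha256); Postgres uses its own internal hash functions.
    h = int(hashlib.sha256(str(key).encode()).hexdigest(), 16)
    return h % nbatch

def batch_sizes(keys, nbatch):
    counts = Counter(batch_of(k, nbatch) for k in keys)
    return [counts.get(b, 0) for b in range(nbatch)]

# Inner side with a skewed distribution: one "hot" key plus many distinct ones.
keys = list(range(1000)) * 10 + [7] * 20000
sizes = batch_sizes(keys, 8)
# All rows with the hot key hash to the same batch, so one batch file is
# large while the others stay small -- each individual file can be well
# under work_mem even though the sum of all of them exceeds it.
```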