| From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
|---|---|
| To: | Jonathan Vanasco <postgres(at)2xlp(dot)com> |
| Cc: | pgsql-general general <pgsql-general(at)postgresql(dot)org> |
| Subject: | Re: recursive query too big to complete. are there any strategies to limit/partition? |
| Date: | 2017-01-27 00:07:00 |
| Message-ID: | CAKFQuwZ1_NAkAU3hRqXJFNAxpXUD-e8fkJYBLiZO=1FbN0yCEA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
On Thu, Jan 26, 2017 at 4:37 PM, Jonathan Vanasco <postgres(at)2xlp(dot)com> wrote:
> There are over 20 million records in a self-referential database table,
> where one record may point to another record as a descendant.
>
> Because of a bug in application code, there was no limit on recursion.
> The max was supposed to be 4. A few outlier records have between 5 and
> 5000 descendants (there could be more. I manually found one chain of 5000.
>
> I need to find all the chains of 5+ and mark them for update/deletion.
> While the database is about 10GB, the recursive search is maxing out on
> diskspace and causing a failure (there was over over 100GB of workspace
> free)
>
> Is there any way to make a recursive query work, or will I have to use
> another means and just iterate over the entire dataset (either in postgres
> or an external service)
>
Thinking aloud - why doesn't just finding every record with 5 descendants
not work? Any chain longer than 5 would have at least 5 items.
Even without recursion you could build out a five-way self-join and any
records that make it that far are guilty. I suppose this assumes your
setup is non-cyclic.
David J.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jonathan Vanasco | 2017-01-27 01:42:42 | Re: recursive query too big to complete. are there any strategies to limit/partition? |
| Previous Message | Jonathan Vanasco | 2017-01-26 23:37:21 | recursive query too big to complete. are there any strategies to limit/partition? |