From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Jonathan Vanasco <postgres(at)2xlp(dot)com> |
Cc: | pgsql-general general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: recursive query too big to complete. are there any strategies to limit/partition? |
Date: | 2017-01-27 00:07:00 |
Message-ID: | CAKFQuwZ1_NAkAU3hRqXJFNAxpXUD-e8fkJYBLiZO=1FbN0yCEA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Jan 26, 2017 at 4:37 PM, Jonathan Vanasco <postgres(at)2xlp(dot)com> wrote:
> There are over 20 million records in a self-referential database table,
> where one record may point to another record as a descendant.
>
> Because of a bug in application code, there was no limit on recursion.
> The max was supposed to be 4. A few outlier records have between 5 and
> 5000 descendants (there could be more. I manually found one chain of 5000.
>
> I need to find all the chains of 5+ and mark them for update/deletion.
> While the database is about 10GB, the recursive search is maxing out on
> diskspace and causing a failure (there was over over 100GB of workspace
> free)
>
> Is there any way to make a recursive query work, or will I have to use
> another means and just iterate over the entire dataset (either in postgres
> or an external service)
>
Thinking aloud - why doesn't just finding every record with 5 descendants
not work? Any chain longer than 5 would have at least 5 items.
Even without recursion you could build out a five-way self-join and any
records that make it that far are guilty. I suppose this assumes your
setup is non-cyclic.
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Jonathan Vanasco | 2017-01-27 01:42:42 | Re: recursive query too big to complete. are there any strategies to limit/partition? |
Previous Message | Jonathan Vanasco | 2017-01-26 23:37:21 | recursive query too big to complete. are there any strategies to limit/partition? |