Re: Question about querying distributed databases

From: me nefcanto <sn(dot)1361(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Question about querying distributed databases
Date: 2025-03-05 06:42:13
Message-ID: CAEHBEOD969YrbPH_z9OEmThWx3-w4sMMaHLhZLOQwqCwE8Y58Q@mail.gmail.com
Lists: pgsql-general

Adrian Klaver, thank you for the link. I asked the AI to create a query for
me using FDW.

This is the sample query:

with filtered_products as (
    select p.product_id
    from products.product p
    where p.title ilike '%search_term%'
), category_filtered as (
    select ic.product_id
    from taxonomy.item_categories ic
    where ic.category_id = any(array['category_id_1', 'category_id_2'])
), attribute_filtered as (
    select ia.product_id
    from attributes.item_attributes ia
    where ia.attribute_id = any(array['attribute_id_1', 'attribute_id_2'])
), final_products as (
    select f.product_id
    from filtered_products f
    join category_filtered c on f.product_id = c.product_id
    join attribute_filtered a on f.product_id = a.product_id
    order by f.product_id -- replace with relevant sorting column
    limit 50 offset 0
)
select p.*
from products.product p
join final_products fp on p.product_id = fp.product_id;
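
The query above assumes that taxonomy.item_categories and
attributes.item_attributes are foreign tables defined on the Products
server. A minimal sketch of that setup for the taxonomy side, assuming
postgres_fdw; the server name, host, database name, credentials, and the
remote schema "public" are all hypothetical placeholders, not values from
my actual environment:

```sql
-- Run on the Products database server.
-- Every name, host, and credential below is a hypothetical placeholder.
create extension if not exists postgres_fdw;

-- Describe how to reach the remote Taxonomy database.
create server taxonomy_srv
    foreign data wrapper postgres_fdw
    options (host 'taxonomy.example.internal', port '5432', dbname 'taxonomy');

-- Map the local role to a role on the remote server.
create user mapping for current_user
    server taxonomy_srv
    options (user 'app_reader', password 'secret');

-- Local schema that will hold the foreign table definitions.
create schema if not exists taxonomy;

-- Assumes the remote table lives in the remote schema "public".
import foreign schema public
    limit to (item_categories)
    from server taxonomy_srv
    into taxonomy;
```

The attributes side would follow the same pattern with its own server and
user mapping.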

The problem here is that it collects all of the product_id values from the
ItemCategories table. Let's say each product is put in one category only.
This means that we have 100 thousand records in the ItemCategories table.
Thus, to show a list of 20 products on the website, this query first
fetches 100 thousand product_id values from the remote server.
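
One way to check how many rows really cross the network, assuming the
foreign tables are defined through postgres_fdw, is to run the remote part
of the query under EXPLAIN. This is only a sketch; the category values are
the same placeholders as in the query above:

```sql
-- ANALYZE actually executes the statement, so the row counts are real.
-- VERBOSE makes postgres_fdw print the statement it sends to the remote
-- server on a "Remote SQL:" line under the Foreign Scan node.
explain (analyze, verbose)
select ic.product_id
from taxonomy.item_categories ic
where ic.category_id = any(array['category_id_1', 'category_id_2']);
```

The actual row count on the Foreign Scan node shows how many rows were
fetched, and the Remote SQL line shows whether the category filter was
pushed down to the remote server or applied locally after the fetch.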

That's not scalable. Is there a workaround for this?

Thank you
Saeed

On Wed, Mar 5, 2025 at 8:12 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 3/4/25 20:40, me nefcanto wrote:
> > Hello
> >
> > Consider this scenario:
> >
> > * 3 servers, 3 databases, each on a separate server:
> >     o *Products database*: Contains the *Products* table (with over
> >       100,000 records).
> >     o *Taxonomy database*: Contains the *Categories* and
> >       *ItemCategories (EAV)* tables.
> >     o *Attributes database*: Contains the *Attributes* and
> >       *ItemAttributes (EAV)* tables.
> >
> > How do you find products based on the following criteria?
>
> https://www.postgresql.org/docs/current/postgres-fdw.html
>
> >
> > 1. A search in the title (e.g., "awesome shirts").
> > 2. Selected categories (e.g., "casual" and "sports").
> > 3. Selected attributes (e.g., "color: blue" and "size: large")
> >
> >
> > Regards
> > Saeed
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>
