From: Giuseppe Broccolo <g(dot)broccolo(dot)7(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Postgres Spark connector
Date: 2020-12-24 10:33:28
Message-ID: CAFtuf8Cnp5e=M_F5LQFq2+Y_9qegBv7uJRzmGGvE=LJ0xtBptw@mail.gmail.com
Lists: pgsql-hackers
Hi Zhihong,
On Wed, 23 Dec 2020, 17:55 Zhihong Yu, <zyu(at)yugabyte(dot)com> wrote:
> Hi,
> I searched for Postgres support in Apache Spark.
> I found Spark doc related to JDBC.
>
> I wonder if the community is aware of Spark connector for Postgres
> (hopefully open source) where predicate involving jsonb columns can be
> pushed down.
>
The JDBC driver is indeed the best option if you have to persist your
Spark dataframes in PostgreSQL, IMO.
It's a connector that supports just plain SQL (no mapping between your
Scala/Java classes and the DB schema, unlike ORM frameworks such as
Hibernate), so it works at a lower level, allowing you to use directly
the queries you would write to handle jsonb data.
You may need to pass plain JSON between Spark objects and the DB, but the
communication over the JDBC driver can be based entirely on jsonb.
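To make this concrete, here is a minimal sketch of the idea: by wrapping the jsonb predicate in a subquery passed as Spark's JDBC "dbtable" option, the filter is evaluated inside PostgreSQL rather than in Spark. The table name `events`, the jsonb column `payload`, and the connection details are all assumptions for illustration, not anything from a real schema.

```java
// Sketch, assuming a hypothetical table `events` with a jsonb column `payload`.
public class JsonbPushdown {

    // Build a subquery suitable for Spark's JDBC "dbtable" option, so the
    // jsonb containment predicate (@>) runs inside PostgreSQL, not in Spark.
    static String pushdownSubquery(String table, String jsonbCol, String containsJson) {
        return "(SELECT * FROM " + table
             + " WHERE " + jsonbCol + " @> '" + containsJson + "'::jsonb) AS pushed";
    }

    public static void main(String[] args) {
        String subquery = pushdownSubquery("events", "payload", "{\"status\": \"active\"}");
        System.out.println(subquery);

        // With a SparkSession in scope (spark-sql dependency), the subquery
        // would be used roughly like this (hedged, not executed here):
        //   spark.read().format("jdbc")
        //       .option("url", "jdbc:postgresql://localhost:5432/mydb")
        //       .option("dbtable", subquery)
        //       .option("user", "postgres")
        //       .load();
    }
}
```

Note that for real input you would bind the JSON value safely rather than concatenate it into the string; this is only to show where the pushdown happens.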
Giuseppe.