Slow ext'd query via client native implementation vs. libpq & simple psql

From: Richard Michael <rmichael(at)edgeofthenet(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Slow ext'd query via client native implementation vs. libpq & simple psql
Date: 2020-03-11 19:31:48
Message-ID: CABR0jERv5L7U0eYyNxv3ZES9HbJj9r2KPK3_=oPMBXTbTC+oXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

On PG 12.2, I am analyzing a performance problem when using a client
(Elixir/postgrex) querying via the extended query protocol. I am comparing
with
psql and a C program. Logs for all three follow this short explanation.

The query is trivial: `SELECT [cols] FROM t WHERE id = X` on a 65K row
table.

The Elixir client executes this as an extended query in >500-700ms, very
slow.
If relevant, the client does not use libpq, it is a native implementation.

A simple query via psql `PREPARE q AS ... ; EXECUTE q;` executes in ~130ms.
(Aside, please let me know if psql can execute extended queries.)

To compare another extended query protocol client, I wrote a tiny C program
using libpq PQprepare()/PQexecPrepared() executes in ~100ms.

I have tried to make the three tests as similar as possible; all are via
localhost and use named statements.

All use an identical query plan. There is a btree index on the WHERE col,
but
the table is small enough it is not used.

The above durations are consistent across server restart, reboot, and
repetition
(i.e. still >500ms if run multiple times), so it seems independent of
filesystem caching, buffers, etc.

Obviously the client's query execution is somehow different, but I do not
know
what/why.

I have enabled auto_explain min_duration (0), analyze, buffers, verbose and
settings. What more can I log or do to postgres to understand why the
client is
behaving poorly? Would wireshark on client messages reveal anything
postgres
can't log?

Other suggestions much appreciated as well.

(I happen to be on OSX, but I can test elsewhere if necessary.)

Regards,
richard

Elixir/postgrex extended query: (always >500ms)
-------------------------------
2020-03-11 13:46:20.090 EDT [3401] LOG: connection received:
host=127.0.0.1 port=50128
2020-03-11 13:46:20.096 EDT [3401] LOG: connection authorized:
user=testuser database=biodata
2020-03-11 13:46:20.141 EDT [3401] LOG: duration: 1.138 ms parse ecto_98:
SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature",
g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute"
FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.144 EDT [3401] LOG: duration: 2.292 ms bind ecto_98:
SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature",
g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute"
FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.658 EDT [3401] LOG: duration: 513.791 ms execute
ecto_98: SELECT g0."id", g0."genome_id", g0."seqid", g0."source",
g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase",
g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.658 EDT [3401] LOG: duration: 513.792 ms plan:
Query Text: SELECT g0."id", g0."genome_id", g0."seqid", g0."source",
g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase",
g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
Seq Scan on public.genes g0 (cost=0.00..6591.62 rows=60632 width=224)
(actual time=0.523..33.229 rows=60623 loops=1)
Output: id, genome_id, seqid, source, feature, start, "end", score,
strand, phase, attribute
Filter: (g0.genome_id = 1)
Rows Removed by Filter: 3907
Buffers: shared read=5785
2020-03-11 13:46:20.887 EDT [3401] LOG: disconnection: session time:
0:00:00.796 user=testuser database=biodata host=127.0.0.1 port=50128

psql simple PREPARE/EXECUTE:
----------------------------
2020-03-11 13:46:40.021 EDT [3438] LOG: connection received: host=::1
port=50129
2020-03-11 13:46:40.044 EDT [3438] LOG: connection authorized:
user=testuser database=biodata application_name=psql
2020-03-11 13:46:40.106 EDT [3438] LOG: duration: 58.071 ms plan:
Query Text: PREPARE q AS SELECT g0."id", g0."genome_id", g0."seqid",
g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand",
g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1);
EXECUTE q;
Seq Scan on public.genes g0 (cost=0.00..6591.62 rows=60632 width=224)
(actual time=0.060..25.542 rows=60623 loops=1)
Output: id, genome_id, seqid, source, feature, start, "end", score,
strand, phase, attribute
Filter: (g0.genome_id = 1)
Rows Removed by Filter: 3907
Buffers: shared hit=42 read=5743
2020-03-11 13:46:40.182 EDT [3438] LOG: duration: 136.173 ms statement:
PREPARE q AS SELECT g0."id", g0."genome_id", g0."seqid", g0."source",
g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase",
g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1); EXECUTE q;
2020-03-11 13:46:40.182 EDT [3438] DETAIL: prepare: PREPARE q AS SELECT
g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start",
g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes"
AS g0 WHERE (g0."genome_id" = 1); EXECUTE q;
2020-03-11 13:46:41.140 EDT [3438] LOG: disconnection: session time:
0:00:01.119 user=testuser database=biodata host=::1 port=50129

C libpq extended query:
-----------------------
2020-03-11 13:50:00.220 EDT [4299] LOG: connection received: host=::1
port=50137
2020-03-11 13:50:00.232 EDT [4299] LOG: connection authorized:
user=testuser database=biodata
2020-03-11 13:50:00.234 EDT [4299] LOG: duration: 0.437 ms parse foo:
SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature",
g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute"
FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.235 EDT [4299] LOG: duration: 0.489 ms bind foo:
SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature",
g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute"
FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.342 EDT [4299] LOG: duration: 106.874 ms execute foo:
SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature",
g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute"
FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.342 EDT [4299] LOG: duration: 106.861 ms plan:
Query Text: SELECT g0."id", g0."genome_id", g0."seqid", g0."source",
g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase",
g0."attribute" FROM genes AS g0 WHERE (g0."genome_id" = 1)
Seq Scan on public.genes g0 (cost=0.00..6591.62 rows=60632 width=224)
(actual time=0.056..25.049 rows=60623 loops=1)
Output: id, genome_id, seqid, source, feature, start, "end", score,
strand, phase, attribute
Filter: (g0.genome_id = 1)
Rows Removed by Filter: 3907
Buffers: shared hit=74 read=5711
2020-03-11 13:50:00.345 EDT [4299] LOG: disconnection: session time:
0:00:00.125 user=testuser database=biodata host=::1 port=50137

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2020-03-11 21:59:21 Re: Slow ext'd query via client native implementation vs. libpq & simple psql
Previous Message David Rowley 2020-03-09 21:43:46 Re: pg12 partitions show bad performance vs pg96