OFFSET/LIMIT - Disparate Performance w/ Go application

From: Matt Silverlock <matt(at)eatsleeprepeat(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: OFFSET/LIMIT - Disparate Performance w/ Go application
Date: 2014-06-12 07:08:27
Message-ID: 1D92FBC7-76DD-4B96-86E5-4989878C2847@eatsleprepeat.net
Lists: pgsql-performance

Hi all. This might be tricky in that there are a few moving parts (when aren't there?), but I've tried to test the postgres side on its own as much as possible.

Trying to work out a potential database bottleneck with an HTTP application (written in Go):
- Pages that render HTML templates but don't perform DB queries can hit 36k+ req/s
- Pages that perform a SELECT on a single row net about 6.6k req/s: db.Get(l, "SELECT * FROM listings WHERE id = $1 AND expiry_date > current_date", l.Id)
- Pages that SELECT multiple rows with OFFSET and LIMIT conditions struggle to top 1.3k req/s (see the sketch after this list)
There's very little "extra" logic around these queries; the code (about 39 lines for both functions) is here: https://gist.github.com/elithrar/b2497b0b473da64932b5
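For reference, here's a minimal sketch of the shape of the paginated query (illustrative, not the exact code from the gist: the trimmed-down struct, the ORDER BY column, and the page size are all assumptions on my part):

    package main

    import (
        "log"

        "github.com/jmoiron/sqlx"
        _ "github.com/lib/pq"
    )

    // Listing is cut down to two columns to keep the sketch short; the real
    // struct mirrors the full table below.
    type Listing struct {
        Id    string `db:"id"`
        Title string `db:"title"`
    }

    // getListings is an illustrative stand-in for the paginated query in the
    // gist: the same WHERE clause as the single-row query above, plus
    // LIMIT/OFFSET for paging.
    func getListings(db *sqlx.DB, limit, offset int) ([]Listing, error) {
        listings := []Listing{}
        err := db.Select(&listings,
            `SELECT id, title FROM listings
             WHERE expiry_date > current_date
             ORDER BY posted_date DESC
             LIMIT $1 OFFSET $2`,
            limit, offset)
        return listings, err
    }

    func main() {
        db, err := sqlx.Connect("postgres", "dbname=listings sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        page, err := getListings(db, 20, 0) // first page of 20
        if err != nil {
            log.Fatal(err)
        }
        log.Printf("fetched %d listings", len(page))
    }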
Other pertinent details:
- It's always been about this slow, to my knowledge.
- The table holds test data, about 40 rows; performance doesn't change noticeably even with a few hundred (and it's unlikely to ever exceed 10,000 rows over its lifetime).
- Running PostgreSQL 9.3.4 on OS X with a 3.5GHz i7, 12GB RAM, and a 128GB PCI-E SSD.
- The Go application peaks at about 40MB of memory when hitting 37k req/s, so it doesn't appear to be eating into the available RAM on the machine.
- I'm aware that HTTP benchmarks aren't the most reliable thing, but I'm using wrk -c 400 -t 32 -d 15s to stress it out.
- The application has a connection pool via the lib/pq driver (https://github.com/lib/pq) with MaxOpen set to 256 connections (sketched below). Stack size is 8GB and max socket connections are set to 1024 (running out of FDs isn't the problem here, from what I can see).
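In code terms the pool setup amounts to something like this, reusing the imports from the sketch above (the DSN and the idle-connection cap are illustrative, not the real config):

    // newDB opens the pool described above: lib/pq under database/sql via
    // sqlx, capped at 256 open connections.
    func newDB() (*sqlx.DB, error) {
        db, err := sqlx.Connect("postgres", "dbname=listings sslmode=disable")
        if err != nil {
            return nil, err
        }
        db.SetMaxOpenConns(256) // the MaxOpen = 256 mentioned above
        db.SetMaxIdleConns(128) // illustrative: keep idle conns around to avoid reconnect churn
        return db, nil
    }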

Relevant postgresql.conf settings (everything else should be default, including fsync and synchronous_commit, both on for obvious reasons):
max_connections = 512
shared_buffers = 2048MB
temp_buffers = 16MB
work_mem = 4MB
wal_buffers = 16
checkpoint_segments = 16
random_page_cost = 2.0
effective_cache_size = 8192MB
The plan for the query in question is here: http://explain.depesz.com/s/7g8, and the table schema is as below:

Table "public.listings"
┌───────────────┬──────────────────────────┬───────────┐
│ Column │ Type │ Modifiers │
├───────────────┼──────────────────────────┼───────────┤
│ id │ character varying(17) │ not null │
│ title │ text │ │
│ company │ text │ │
│ location │ text │ │
│ description │ text │ │
│ rendered_desc │ text │ │
│ term │ text │ │
│ commute │ text │ │
│ company_url │ text │ │
│ rep │ text │ │
│ rep_email │ text │ │
│ app_method │ text │ │
│ app_email │ text │ │
│ app_url │ text │ │
│ posted_date │ timestamp with time zone │ │
│ edited_date │ timestamp with time zone │ │
│ renewed_date │ timestamp with time zone │ │
│ expiry_date │ timestamp with time zone │ │
│ slug │ text │ │
│ charge_id │ text │ │
│ sponsor_id │ text │ │
│ tsv │ tsvector │ │
└───────────────┴──────────────────────────┴───────────┘
Indexes:
"listings_pkey" PRIMARY KEY, btree (id)
"fts" gin (tsv)
"listings_expiry_date_idx" btree (expiry_date)
"listings_fts_idx" gin (to_tsvector('english'::regconfig, (((((((title || ' '::text) || company) || ' '::text) || location) || ' '::text) || term) || ' '::text) || commute))
Triggers:
tsvectorupdate BEFORE INSERT OR UPDATE ON listings FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('tsv', 'pg_catalog.english', 'title', 'company', 'location', 'term', 'commute’)

The single-row query has a query plan here: http://explain.depesz.com/s/1Np (this is where I see 6.6k req/s at the application level).

Some pgbench results from this machine as well:
$ pgbench -c 128 -C -j 4 -T 15 -M extended -S
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
query mode: extended
number of clients: 128
number of threads: 4
duration: 15 s
number of transactions actually processed: 17040
tps = 1134.481459 (including connections establishing)
tps = 56884.093652 (excluding connections establishing)
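For completeness, a direct timing loop along these lines (a sketch reusing getListings and newDB from above, plus a "time" import; it's not in the gist) takes HTTP and the template layer out of the picture entirely:

    // timeQueries runs the paginated query n times against the pool and
    // reports a rough queries-per-second figure. Note it's serial, one
    // query at a time, unlike the concurrent load wrk generates.
    func timeQueries(db *sqlx.DB, n int) {
        start := time.Now()
        for i := 0; i < n; i++ {
            if _, err := getListings(db, 20, 0); err != nil {
                log.Fatal(err)
            }
        }
        elapsed := time.Since(start)
        log.Printf("%d queries in %s (%.0f q/s)", n, elapsed,
            float64(n)/elapsed.Seconds())
    }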
Ultimately I'm not expecting a miracle—database ops are nearly always the slowest part of a web server outside the latency to the client itself—but I'd expect something a little closer (even 10% of 33k would be a lot better). And of course, this is somewhat "academic" because I don't expect to see four million hits an hour—but I'd also like to catch problems for future reference.

Thanks in advance.
