Re: Performance of query

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Cindy Makarowsky <cindymakarowsky(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance of query
Date: 2013-03-23 21:53:20
Message-ID: CAMkU=1yjnokhcJx9eEeec50AS6iYYJ_vtXywxwg1-F=5AqGeBA@mail.gmail.com
Lists: pgsql-performance

On Friday, March 22, 2013, Cindy Makarowsky wrote:

> I have two tables in Postgres 9.2 on a Linux server with 8GB of RAM. The
> first table has 60 million records:

You have over 40GB of data in that table, so there is no way you are going
to get it into 8GB RAM without some major reorganization.

> company character(35),
> address character(35),
> city character(20),
> contact character(35),
> title character(20),
>

All of those fixed-width fields are probably taking up needless space, and
in your case, space is time. Varchar would probably be better, and probably
with longer maximum lengths as well: most people don't need more than 35
characters for their address, but the people who do are going to be cheesed
off when you inform them that you deem their address unreasonable (unless
your mailing labels only hold 35 characters).
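
For what it's worth, a sketch of the conversion (untested; I'm assuming
the table is the "busbase" from your plan, and the new lengths are just
examples). The rtrim strips the char(n) pad spaces, which is where the
space savings come from, and the whole thing rewrites the table, so plan
for a maintenance window:

    ALTER TABLE busbase
      ALTER COLUMN company TYPE varchar(100) USING rtrim(company),
      ALTER COLUMN address TYPE varchar(100) USING rtrim(address),
      ALTER COLUMN city    TYPE varchar(50)  USING rtrim(city),
      ALTER COLUMN contact TYPE varchar(100) USING rtrim(contact),
      ALTER COLUMN title   TYPE varchar(50)  USING rtrim(title);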

> When I run this query:
>
> select state.state, count(table1.id) from state,table1 where
> table1.state = state.state group by state.state
>

The join to the "state" table is not necessary. Between the foreign key
and the primary key, you know that every state exists, and that every state
exists only once. But, that will not solve your problem, as the join to
the state table is not where the time goes.
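
For reference, the join-free equivalent would be something like (untested):

    SELECT state, count(*)
    FROM table1
    GROUP BY state;

(count(*) instead of count(table1.id) is marginally cheaper too; the two
are equivalent as long as id is NOT NULL, which it is if id is the
primary key.)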

> " -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728
> width=7) (actual time=0.004..250046.673 rows=60057057 loops=1)"
>

Assuming that your cost parameters are all default (seq_page_cost = 1,
cpu_tuple_cost = 0.01), this means you have (6378172.28 - 0.01 * 55402728)
/ 1 = 5.8e6 pages; at 8 kB per page, that is 44.4 GB of table. That is,
fewer than 10 tuples per page.
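
You can sanity-check that arithmetic against the catalogs (relpages and
reltuples are the planner's estimates, refreshed by VACUUM and ANALYZE;
"busbase" is the relation name from your plan):

    SELECT relpages,
           reltuples,
           reltuples / relpages AS tuples_per_page,
           pg_size_pretty(pg_relation_size('busbase')) AS table_size
    FROM pg_class
    WHERE relname = 'busbase';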

Tightly packed, you should be able to hold over 30 tuples per page. You
are probably not vacuuming aggressively enough, or you were not doing so in
the past and never did a "vacuum full" to reclaim the bloated space.
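
If it is bloat, a one-time rewrite reclaims the space, at the cost of an
exclusive lock on the table for the duration:

    VACUUM FULL VERBOSE busbase;

and then make sure autovacuum is keeping up afterwards, so the table
doesn't just bloat again.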

In any event, your sequential scan is running at about 181 MB/s (44.4 GB
read in 250 seconds). Is that what you would expect given your IO
hardware?
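
Running the scan in isolation with buffer counts would tell you whether
it really is IO-bound (the BUFFERS option to EXPLAIN is available in 9.2):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM busbase;

If nearly all the buffers show up as "read" rather than "hit", the time
is genuinely going to disk rather than to the buffer cache.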

>
> I've tried playing around with the settings in the config file for
> shared_buffers, work_mem, etc restarting Postgres each time and nothing
> seems to help.
>

How fast do you think it should run? How fast do you need it to run? This
seems like the type of query that would get run once per financial quarter,
or maybe once per day at off-peak times.

Cheers,

Jeff
