Re: postgres 8 settings

From: "vinita bansal" <sagivini(at)hotmail(dot)com>
To: mrylander(at)gmail(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: postgres 8 settings
Date: 2005-03-12 13:18:30
Message-ID: BAY20-F313F2A4CBA4E1384D91718CB540@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

- Oops, forgot to mention that!!
It is Redhat Linux and storage system is SCSI. Also it has 4 processors.

- I would say that there is no particular reason for using a particular
setting except that there were no memory constraints in my case and
they seem to work fine with Postgres 7.4.
I had set them to values that worked while I was benchmarking some data
with Postgres 7.4.

- I have 4 processes which query the database at one point in time. There
are some 5-6 tables which have approx 6-8 million rows and they are the
ones which are the major bottlenecks.

The only reason for setting fsync to false was that the updates to these
tables were taking major time and just setting fysnc to false saved a lot of
time for me. Also, I am not using pg_autovacuum but instead run vaccum for
tables as and when required. Btw, Can running pg_autovaccum as a daemon
process affect performance and is it effective??

Also, a particular query which was taking a min or less with postgres 7 is
taking 15 mins now.
Here is the query plan for the query both for postgres 7.4.6 and postgres 8:

Query plan with Postgres 7.4.6:

explain SELECT cm_AllocData1_temp.allocation, cm_QuotaState.tril_gid,
cm_QuotaState.Period, cm_Quota.tril_gid FROM cm_AllocData1_temp,
cm_QuotaState, cm_Quota WHERE ( ( ( ( ( cm_AllocData1_temp.recipient =
cm_QuotaState.SalesTeam ) AND ( ( cm_AllocData1_temp.transDate ) >= (
cm_QuotaState.StartDate ) ) ) AND ( ( cm_AllocData1_temp.transDate ) < (
cm_QuotaState.EndDate ) ) ) AND ( cm_Quota.tril_gid = cm_QuotaState.Quota
) ) AND ( cm_Quota.Model = 'N151VZWTJN0DUVPBUVZSU31JQ2OUZZZZ' ) ) AND (
cm_Quota.Status = 'Enabled' )
QUERY PLAN
---------------

Hash Join (cost=30.69..18800.11 rows=459 width=109)
Hash Cond: (("outer".quota)::text = ("inner".tril_gid)::text)
-> Nested Loop (cost=0.00..18720.38 rows=8889 width=109)
-> Seq Scan on cm_allocdata1_temp (cost=0.00..20.00 rows=1000
width=7
8)
-> Index Scan using statesbysalesteam_i on cm_quotastate
(cost=0.00..
18.54 rows=9 width=126)
Index Cond: ((("outer".recipient)::text =
(cm_quotastate.salestea
m)::text) AND ("outer".transdate >= cm_quotastate.startdate) AND
("outer".transd
ate < cm_quotastate.enddate))
-> Hash (cost=30.65..30.65 rows=16 width=36)
-> Seq Scan on cm_quota (cost=0.00..30.65 rows=16 width=36)
Filter: (((model)::text =
'N151VZWTJN0DUVPBUVZSU31JQ2OUZZZZ'::tex
t) AND ((status)::text = 'Enabled'::text))
(9 rows)

Query plan with Postgres 8:

explain SELECT cm_AllocData1_temp.allocation, cm_QuotaState.tril_gid,
cm_QuotaState.Period, cm_Quota.tril_gid FROM cm_AllocData1_temp,
cm_QuotaState, cm_Quota WHERE ( ( ( ( ( cm_AllocData1_temp.recipient =
cm_QuotaState.SalesTeam ) AND ( ( cm_AllocData1_temp.transDate ) >= (
cm_QuotaState.StartDate ) ) ) AND ( ( cm_AllocData1_temp.transDate ) < (
cm_QuotaState.EndDate ) ) ) AND ( cm_Quota.tril_gid = cm_QuotaState.Quota
) ) AND ( cm_Quota.Model = 'N151VZWTJN0DUVPBUVZSU31JQ2OUZZZZ' ) ) AND (
cm_Quota.Status = 'Enabled' )

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Merge Join (cost=333271.40..379482.09 rows=255889 width=143)
Merge Cond: ("outer"."?column4?" = "inner"."?column7?")
Join Filter: (("outer".transdate >= "inner".startdate) AND
("outer".transdate < "inner".enddate))
-> Sort (cost=3214.64..3290.02 rows=30150 width=146)
Sort Key: (cm_allocdata1_temp.recipient)::text
-> Seq Scan on cm_allocdata1_temp (cost=0.00..971.50 rows=30150
width=146)
-> Sort (cost=330056.76..330764.27 rows=283005 width=126)
Sort Key: (cm_quotastate.salesteam)::text
-> Hash Join (cost=22.69..304429.99 rows=283005 width=126)
Hash Cond: (("outer".quota)::text = ("inner".tril_gid)::text)
-> Seq Scan on cm_quotastate (cost=0.00..274161.17
rows=5483217 width=126)
-> Hash (cost=22.65..22.65 rows=16 width=36)
-> Seq Scan on cm_quota (cost=0.00..22.65 rows=16
width=36)
Filter: (((model)::text =
'N151VZWTJN0DUVPBUVZSU31JQ2OUZZZZ'::text) AND ((status)::text =
'Enabled'::text))

Just setting enable_sort and enable_hashjoin to off gives a good plan
(Settings for both are the same).

Plan with sort and hash join turned off:

Nested Loop (cost=22.67..7070120.99 rows=255889 width=143)
Join Filter: (("inner".tril_gid)::text = ("outer".quota)::text)
-> Nested Loop (cost=0.00..5285277.72 rows=4957835 width=143)
-> Seq Scan on cm_allocdata1_temp (cost=0.00..971.50 rows=30150
width=146)
-> Index Scan using statesbysalesteam_i on cm_quotastate
(cost=0.00..172.38 rows=165 width=126)
Index Cond: ((("outer".recipient)::text =
(cm_quotastate.salesteam)::text) AND ("outer".transdate >=
cm_quotastate.startdate) AND ("outer".transdate < cm_quotastate.enddate))
-> Materialize (cost=22.67..22.83 rows=16 width=36)
-> Seq Scan on cm_quota (cost=0.00..22.65 rows=16 width=36)
Filter: (((model)::text =
'N151VZWTJN0DUVPBUVZSU31JQ2OUZZZZ'::text) AND ((status)::text =
'Enabled'::text))
(9 rows)

Why is this happening. Do any of the settings I have mentioned before
causing this??

As regards to setting a high value for wall settings, I am not too sure on
why it should be a problem and how in any way it can affect me?

Also can you clarify what max_fsm_pages,bg_writer actually is since it is
not clear from the documentation. Won't turning bg_writer on affect my
overall run. I am too bothered about that since I have migrated a db2
database to postgres and am working on performance improvement, essentially
trying to get the same no's for the run as db2.

Regards,
Vinita Bansal

>From: Mike Rylander <mrylander(at)gmail(dot)com>
>Reply-To: Mike Rylander <mrylander(at)gmail(dot)com>
>To: vinita bansal <sagivini(at)hotmail(dot)com>, PostgreSQL general
><pgsql-general(at)postgresql(dot)org>
>Subject: Re: [GENERAL] postgres 8 settings
>Date: Thu, 10 Mar 2005 13:14:19 +0000
>
>On Thu, 10 Mar 2005 09:58:02 +0000, vinita bansal <sagivini(at)hotmail(dot)com>
>wrote:
> > Hi,
> >
> > I have a 64 bit Opteron m/c with 32GB RAM and ~500GB HardDrive. The
>database
> > size is ~45GB.
> >
>
>I've got a similar box, but with only 16G RAM. What is the storage
>subsystem, fibre channel or SCSI? Also, what OS?
>
> > I am using the following values in postgresql.conf:
> >
> > shared_buffers = 100000
>
>Seems high. I did some testing with my real data and found that
>anything of 15000 wasn't really gaining my anything. This is an 8.x
>config file (maintenance_work_mem vs. sort_mem), and pg 8+ can
>actually make do with smaller shared_buffers because of the ARC (soon
>to be 2Q) buffer management algorithm. Unless your working set PER
>QUERY is enormous I would suggest lowering this.
>
> > work_mem = 128000
>
>That's fine, but you may need to bump it up if (as above) you have
>individual queries that sort/group huge rowsets.
>
> > maintenance_work_mem = 100000
>
>Remember to pump this way up when building very large indexes.
>
> > max_fsm_pages = 200000
>
>Should probably be bigger. Mine is 2000000 (2 million).
>
> > bgwriter_percent = 0
> > bgwriter_maxpages = 0
> > fsync = false
>
>*KLAXON SOUNDS* Unless you dislike having your data around after
>power/hardware anomalies you'd better turn that on!
>
> > wal_buffers = 1000
> > checkpoint_segments = 2048
>
>This will require 16 * ((2 * 1000) + 1) MB of drive space... 100
>should be fine, really. (next setting related)
>
> > checkpoint_timeout = 3600
>
>Conventional wisdom is not to set this higher than 1800. It also
>means that you can cut your checkpoint_segments in half (more or
>less).
>
> > effective_cache_size = 1840000
> > random_page_cost = 2
>
>This might be low, but it depends on your storage subsystem. Is it
>fibre channel?
>
> > geqo_threshold = 25
>
>Wide queries, eh?
>
> > geqo_effort = 1
> > stats_start_collector = false
> > stats_command_string = false
> >
>
>If you want to use pg_autovacuum then you will need to turn the stats
>stuff back on, including row statistics. Plus it's a big help in
>debugging.
>
> > Do these settings seem fine or I am making some mistake. These settings
>when
> > used with Postgres 7.4 gave me good results but they don't seem to work
>with
> > Postgres 8.0. Am I missing out on something??
> >
> > Regards,
> > Vinita Bansal
> >
> > _________________________________________________________________
> > Click, Upload, Print. http://www.kodakexpress.co.in?soe=4956 Deliver in
> > India.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >
>
>
>--
>Mike Rylander
>mrylander(at)gmail(dot)com
>GPLS -- PINES Development
>Database Developer
>http://open-ils.org

_________________________________________________________________
Is your PC infected? Get a FREE online computer virus scan from McAfee
Security. http://clinic.mcafee.com/clinic/ibuy/campaign.asp?cid=3963

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mike Rylander 2005-03-12 14:22:41 Re: partitionning
Previous Message Kaloyan Iliev Iliev 2005-03-12 11:29:08 Re: Hash problem