From: | Lefteris <lsidir(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Air-traffic benchmark |
Date: | 2010-01-07 12:38:41 |
Message-ID: | 852badbc1001070438j36eb10d3qa22398ffcbef405f@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi all,
following the simple but interesting air-traffic benchmark published at:
http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/
I decided to run the benchmark over postgres to get some more
experience and insights. Unfortunately, the query times I got from
postgres were not the expected ones: the fast ones were in the order
of 5 minutes while the slow ones in the order of 40 minutes. I came to
the conclusion that some parameters are not set correctly. I give all
the information I can think of in the sequel of this email, to
hopefully get some useful suggestions to improve query times. Thank
you very much for your time!
All of the schema, data and query information can be found on the
original blog post given above.
The machine runs Fedora 10:
uname -a:
Linux 2.6.27.30-170.2.82.fc10.x86_64 #1 SMP Mon Aug 17 08:18:34 EDT
2009 x86_64 x86_64 x86_64 GNU/Linux
The hardware characteristics are:
Platform Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz with 8GB RAM and
ample disk space (2x 500 GB SATA disk @ 7200 RPM as SW-RAID-0)
I used postgresql-8.4.2. Source download and compiled by hand, no
special parameters where passed to the configure phase.
-- DETAILS on loading, analyze and index creation on postgresql-8.4.2
-- loading time was 77m15.976s
-- total rows: 119790558
-- total size of pgdata/base: 46G
ANALYZE;
-- 219698.365ms aka 3m39.698s
CREATE INDEX year_month_idx ON ontime ("Year", "Month");
-- 517481.632ms aka 8m37.481s
CREATE INDEX DepDelay_idx ON ontime ("DepDelay");
-- 1257051.668ms aka 20m57.051s
Query 1:
SELECT "DayOfWeek", count(*) AS c FROM ontime WHERE "Year" BETWEEN
2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC;
Reported query times are (in sec):
MonetDB 7.9s
InfoBright 12.13s
LucidDB 54.8s
For pg-8.4.2 I got with 3 consecutive runs on the server:
5m52.384s
5m55.885s
5m54.309s
For more complex queries, postgres will become even more slower, that
is for queries from Q2 to Q8 will need over 40 minutes, while on the
rest ones around 5 minutes. By examining the explain analyze in these
queries, the main slow down factor is the HashAggregate and mainly
Sort operations. For query 1 the explain analyze returns:
airtraffic=# EXPLAIN ANALYZE SELECT "DayOfWeek", count(*) AS c FROM
ontime WHERE "Year" BETWEEN 2000 AND 2008 GROUP BY "DayOfWeek" ORDER
BY c DESC;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=7407754.12..7407754.13 rows=4 width=2) (actual
time=371188.821..371188.823 rows=7 loops=1)
Sort Key: (count(*))
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=7407754.04..7407754.08 rows=4 width=2)
(actual time=371163.316..371163.320 rows=7 loops=1)
-> Seq Scan on ontime (cost=0.00..7143875.40 rows=52775727
width=2) (actual time=190938.959..346180.079 rows=52484047 loops=1)
Filter: (("Year" >= 2000) AND ("Year" <= 2008))
Total runtime: 371201.156 ms
(7 rows)
I understand that the problem here is the memory used by the sort
method. *But*, I already changed the work_mem parameter to 6gigs:)
airtraffic=# show work_mem;
work_mem
----------
6GB
(1 row)
And when I saw that this did not work, I also changed the 8.3 style
(?) parameter sort_mem:
airtraffic=# show sort_mem;
work_mem
----------
6GB
(1 row)
which means that of course work_mem == sort_mem, as
such, shouldn't be the case that the sort algorithm should have used
much more memory?
I also attach the output of 'show all;' so you can advice me in any
other configuration settings that I might need to change to perform
better.
Thank you very much for your feedback and all of your help!
lefteris sidirourgos
P.S. sorry if you get this message twice but the first one does not
appear to have reached the list
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
airtraffic=# show all;
name | setting
|
description
---------------------------------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------
add_missing_from | off
| Automatically adds missing table references to
FROM clauses.
allow_system_table_mods | off
| Allows modifications of the structure of system
tables.
archive_command | (disabled)
| Sets the shell command that will be called to
archive a WAL file.
archive_mode | off
| Allows archiving of WAL files using
archive_command.
archive_timeout | 0
| Forces a switch to the next xlog file if a new
file has not been started within N seconds.
array_nulls | on
| Enable input of NULL elements in arrays.
authentication_timeout | 1min
| Sets the maximum allowed time to complete client
authentication.
autovacuum | on
| Starts the autovacuum subprocess.
autovacuum_analyze_scale_factor | 0.1
| Number of tuple inserts, updates or deletes
prior to analyze as a fraction of reltuples.
autovacuum_analyze_threshold | 50
| Minimum number of tuple inserts, updates or
deletes prior to analyze.
autovacuum_freeze_max_age | 200000000
| Age at which to autovacuum a table to prevent
transaction ID wraparound.
autovacuum_max_workers | 3
| Sets the maximum number of simultaneously
running autovacuum worker processes.
autovacuum_naptime | 1min
| Time to sleep between autovacuum runs.
autovacuum_vacuum_cost_delay | 20ms
| Vacuum cost delay in milliseconds, for
autovacuum.
autovacuum_vacuum_cost_limit | -1
| Vacuum cost amount available before napping, for
autovacuum.
autovacuum_vacuum_scale_factor | 0.2
| Number of tuple updates or deletes prior to
vacuum as a fraction of reltuples.
autovacuum_vacuum_threshold | 50
| Minimum number of tuple updates or deletes prior
to vacuum.
backslash_quote | safe_encoding
| Sets whether "\'" is allowed in string literals.
bgwriter_delay | 200ms
| Background writer sleep time between rounds.
bgwriter_lru_maxpages | 100
| Background writer maximum number of LRU pages to
flush per round.
bgwriter_lru_multiplier | 2
| Multiple of the average buffer usage to free per
round.
block_size | 8192
| Shows the size of a disk block.
bonjour_name |
| Sets the Bonjour broadcast service name.
check_function_bodies | on
| Check function bodies during CREATE FUNCTION.
checkpoint_completion_target | 0.5
| Time spent flushing dirty buffers during
checkpoint, as fraction of checkpoint interval.
checkpoint_segments | 30
| Sets the maximum distance in log segments
between automatic WAL checkpoints.
checkpoint_timeout | 5min
| Sets the maximum time between automatic WAL
checkpoints.
checkpoint_warning | 30s
| Enables warnings if checkpoint segments are
filled more frequently than this.
client_encoding | UTF8
| Sets the client's character set encoding.
client_min_messages | notice
| Sets the message levels that are sent to the
client.
commit_delay | 0
| Sets the delay in microseconds between
transaction commit and flushing WAL to disk.
commit_siblings | 5
| Sets the minimum concurrent open transactions
before performing commit_delay.
config_file |
/export/scratch0/lsidir/postgres/pgdata/postgresql.conf | Sets the
server's main configuration file.
constraint_exclusion | partition
| Enables the planner to use constraints to
optimize queries.
cpu_index_tuple_cost | 0.005
| Sets the planner's estimate of the cost of
processing each index entry during an index scan.
cpu_operator_cost | 0.0025
| Sets the planner's estimate of the cost of
processing each operator or function call.
cpu_tuple_cost | 0.01
| Sets the planner's estimate of the cost of
processing each tuple (row).
cursor_tuple_fraction | 0.1
| Sets the planner's estimate of the fraction of a
cursor's rows that will be retrieved.
custom_variable_classes |
| Sets the list of known custom variable classes.
data_directory |
/export/scratch0/lsidir/postgres/pgdata | Sets the
server's data directory.
DateStyle | ISO, DMY
| Sets the display format for date and time
values.
db_user_namespace | off
| Enables per-database user names.
deadlock_timeout | 1s
| Sets the time to wait on a lock before checking
for deadlock.
debug_assertions | off
| Turns on various assertion checks.
debug_pretty_print | on
| Indents parse and plan tree displays.
debug_print_parse | off
| Logs each query's parse tree.
debug_print_plan | off
| Logs each query's execution plan.
debug_print_rewritten | off
| Logs each query's rewritten parse tree.
default_statistics_target | 100
| Sets the default statistics target.
default_tablespace |
| Sets the default tablespace to create tables and
indexes in.
default_text_search_config | pg_catalog.english
| Sets default text search configuration.
default_transaction_isolation | read committed
| Sets the transaction isolation level of each new
transaction.
default_transaction_read_only | off
| Sets the default read-only status of new
transactions.
default_with_oids | off
| Create new tables with OIDs by default.
dynamic_library_path | $libdir
| Sets the path for dynamically loadable modules.
effective_cache_size | 8MB
| Sets the planner's assumption about the size of
the disk cache.
effective_io_concurrency | 1
| Number of simultaneous requests that can be
handled efficiently by the disk subsystem.
enable_bitmapscan | on
| Enables the planner's use of bitmap-scan plans.
enable_hashagg | on
| Enables the planner's use of hashed aggregation
plans.
enable_hashjoin | on
| Enables the planner's use of hash join plans.
enable_indexscan | on
| Enables the planner's use of index-scan plans.
enable_mergejoin | on
| Enables the planner's use of merge join plans.
enable_nestloop | on
| Enables the planner's use of nested-loop join
plans.
enable_seqscan | on
| Enables the planner's use of sequential-scan
plans.
enable_sort | on
| Enables the planner's use of explicit sort
steps.
enable_tidscan | on
| Enables the planner's use of TID scan plans.
escape_string_warning | on
| Warn about backslash escapes in ordinary string
literals.
external_pid_file |
| Writes the postmaster PID to the specified file.
extra_float_digits | 0
| Sets the number of digits displayed for
floating-point values.
from_collapse_limit | 8
| Sets the FROM-list size beyond which subqueries
are not collapsed.
fsync | on
| Forces synchronization of updates to disk.
full_page_writes | on
| Writes full pages to WAL when first modified
after a checkpoint.
geqo | on
| Enables genetic query optimization.
geqo_effort | 5
| GEQO: effort is used to set the default for
other GEQO parameters.
geqo_generations | 0
| GEQO: number of iterations of the algorithm.
geqo_pool_size | 0
| GEQO: number of individuals in the population.
geqo_selection_bias | 2
| GEQO: selective pressure within the population.
geqo_threshold | 12
| Sets the threshold of FROM items beyond which
GEQO is used.
gin_fuzzy_search_limit | 0
| Sets the maximum allowed result for exact search
by GIN.
hba_file |
/export/scratch0/lsidir/postgres/pgdata/pg_hba.conf | Sets the
server's "hba" configuration file.
ident_file |
/export/scratch0/lsidir/postgres/pgdata/pg_ident.conf | Sets the
server's "ident" configuration file.
ignore_system_indexes | off
| Disables reading from system indexes.
integer_datetimes | on
| Datetimes are integer based.
IntervalStyle | postgres
| Sets the display format for interval values.
join_collapse_limit | 8
| Sets the FROM-list size beyond which JOIN
constructs are not flattened.
krb_caseins_users | off
| Sets whether Kerberos and GSSAPI user names
should be treated as case-insensitive.
krb_server_keyfile |
| Sets the location of the Kerberos server key
file.
krb_srvname | postgres
| Sets the name of the Kerberos service.
lc_collate | en_GB.UTF-8
| Shows the collation order locale.
lc_ctype | en_GB.UTF-8
| Shows the character classification and case
conversion locale.
lc_messages | en_GB.UTF-8
| Sets the language in which messages are
displayed.
lc_monetary | en_GB.UTF-8
| Sets the locale for formatting monetary amounts.
lc_numeric | en_GB.UTF-8
| Sets the locale for formatting numbers.
lc_time | en_GB.UTF-8
| Sets the locale for formatting date and time
values.
listen_addresses | localhost
| Sets the host name or IP address(es) to listen
to.
local_preload_libraries |
| Lists shared libraries to preload into each
backend.
log_autovacuum_min_duration | -1
| Sets the minimum execution time above which
autovacuum actions will be logged.
log_checkpoints | off
| Logs each checkpoint.
log_connections | off
| Logs each successful connection.
log_destination | stderr
| Sets the destination for server log output.
log_directory | pg_log
| Sets the destination directory for log files.
log_disconnections | off
| Logs end of a session, including duration.
log_duration | off
| Logs the duration of each completed SQL
statement.
log_error_verbosity | default
| Sets the verbosity of logged messages.
log_executor_stats | off
| Writes executor performance statistics to the
server log.
log_filename | postgresql-%Y-%m-%d_%H%M%S.log
| Sets the file name pattern for log files.
log_hostname | off
| Logs the host name in the connection logs.
log_line_prefix |
| Controls information prefixed to each log line.
log_lock_waits | off
| Logs long lock waits.
log_min_duration_statement | -1
| Sets the minimum execution time above which
statements will be logged.
log_min_error_statement | error
| Causes all statements generating error at or
above this level to be logged.
log_min_messages | warning
| Sets the message levels that are logged.
log_parser_stats | off
| Writes parser performance statistics to the
server log.
log_planner_stats | off
| Writes planner performance statistics to the
server log.
log_rotation_age | 1d
| Automatic log file rotation will occur after N
minutes.
log_rotation_size | 10MB
| Automatic log file rotation will occur after N
kilobytes.
log_statement | none
| Sets the type of statements logged.
log_statement_stats | off
| Writes cumulative performance statistics to the
server log.
log_temp_files | -1
| Log the use of temporary files larger than this
number of kilobytes.
log_timezone | Europe/Amsterdam
| Sets the time zone to use in log messages.
log_truncate_on_rotation | off
| Truncate existing log files of same name during
log rotation.
logging_collector | off
| Start a subprocess to capture stderr output
and/or csvlogs into log files.
maintenance_work_mem | 2GB
| Sets the maximum memory to be used for
maintenance operations.
max_connections | 100
| Sets the maximum number of concurrent
connections.
max_files_per_process | 1000
| Sets the maximum number of simultaneously open
files for each server process.
max_function_args | 100
| Shows the maximum number of function arguments.
max_identifier_length | 63
| Shows the maximum identifier length.
max_index_keys | 32
| Shows the maximum number of index keys.
max_locks_per_transaction | 64
| Sets the maximum number of locks per
transaction.
max_prepared_transactions | 0
| Sets the maximum number of simultaneously
prepared transactions.
max_stack_depth | 2MB
| Sets the maximum stack depth, in kilobytes.
password_encryption | on
| Encrypt passwords.
port | 5432
| Sets the TCP port the server listens on.
post_auth_delay | 0
| Waits N seconds on connection startup after
authentication.
pre_auth_delay | 0
| Waits N seconds on connection startup before
authentication.
random_page_cost | 4
| Sets the planner's estimate of the cost of a
nonsequentially fetched disk page.
regex_flavor | advanced
| Sets the regular expression "flavor".
search_path | "$user",public
| Sets the schema search order for names that are
not schema-qualified.
segment_size | 1GB
| Shows the number of pages per disk file.
seq_page_cost | 1
| Sets the planner's estimate of the cost of a
sequentially fetched disk page.
server_encoding | UTF8
| Sets the server (database) character set
encoding.
server_version | 8.4.2
| Shows the server version.
server_version_num | 80402
| Shows the server version as an integer.
session_replication_role | origin
| Sets the session's behavior for triggers and
rewrite rules.
shared_buffers | 32MB
| Sets the number of shared memory buffers used by
the server.
shared_preload_libraries |
| Lists shared libraries to preload into server.
silent_mode | off
| Runs the server silently.
sql_inheritance | on
| Causes subtables to be included by default in
various commands.
ssl | off
| Enables SSL connections.
standard_conforming_strings | off
| Causes '...' strings to treat backslashes
literally.
statement_timeout | 0
| Sets the maximum allowed duration of any
statement.
stats_temp_directory | pg_stat_tmp
| Writes temporary statistics files to the
specified directory.
superuser_reserved_connections | 3
| Sets the number of connection slots reserved for
superusers.
synchronize_seqscans | on
| Enable synchronized sequential scans.
synchronous_commit | on
| Sets immediate fsync at commit.
syslog_facility | local0
| Sets the syslog "facility" to be used when
syslog enabled.
syslog_ident | postgres
| Sets the program name used to identify
PostgreSQL messages in syslog.
tcp_keepalives_count | 0
| Maximum number of TCP keepalive retransmits.
tcp_keepalives_idle | 0
| Time between issuing TCP keepalives.
tcp_keepalives_interval | 0
| Time between TCP keepalive retransmits.
temp_buffers | 1024
| Sets the maximum number of temporary buffers
used by each session.
temp_tablespaces |
| Sets the tablespace(s) to use for temporary
tables and sort files.
TimeZone | Europe/Amsterdam
| Sets the time zone for displaying and
interpreting time stamps.
timezone_abbreviations | Default
| Selects a file of time zone abbreviations.
trace_notify | off
| Generates debugging output for LISTEN and
NOTIFY.
trace_sort | off
| Emit information about resource usage in
sorting.
track_activities | on
| Collects information about executing commands.
track_activity_query_size | 1024
| Sets the size reserved for
pg_stat_activity.current_query, in bytes.
track_counts | on
| Collects statistics on database activity.
track_functions | none
| Collects function-level statistics on database
activity.
transaction_isolation | read committed
| Sets the current transaction's isolation level.
transaction_read_only | off
| Sets the current transaction's read-only status.
transform_null_equals | off
| Treats "expr=NULL" as "expr IS NULL".
unix_socket_directory |
| Sets the directory where the Unix-domain socket
will be created.
unix_socket_group |
| Sets the owning group of the Unix-domain socket.
unix_socket_permissions | 511
| Sets the access permissions of the Unix-domain
socket.
update_process_title | on
| Updates the process title to show the active SQL
command.
vacuum_cost_delay | 0
| Vacuum cost delay in milliseconds.
vacuum_cost_limit | 200
| Vacuum cost amount available before napping.
vacuum_cost_page_dirty | 20
| Vacuum cost for a page dirtied by vacuum.
vacuum_cost_page_hit | 1
| Vacuum cost for a page found in the buffer
cache.
vacuum_cost_page_miss | 10
| Vacuum cost for a page not found in the buffer
cache.
vacuum_freeze_min_age | 50000000
| Minimum age at which VACUUM should freeze a
table row.
vacuum_freeze_table_age | 150000000
| Age at which VACUUM should scan whole table to
freeze tuples.
wal_block_size | 8192
| Shows the block size in the write ahead log.
wal_buffers | 64kB
| Sets the number of disk-page buffers in shared
memory for WAL.
wal_segment_size | 16MB
| Shows the number of pages per write ahead log
segment.
wal_sync_method | fdatasync
| Selects the method used for forcing WAL updates
to disk.
wal_writer_delay | 200ms
| WAL writer sleep time between WAL flushes.
work_mem | 6GB
| Sets the maximum memory to be used for query
workspaces.
xmlbinary | base64
| Sets how binary values are to be encoded in XML.
xmloption | content
| Sets whether XML data in implicit parsing and
serialization operations is to be considered as documents or content
fragments.
zero_damaged_pages | off
| Continues processing past damaged page headers.
(193 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Leo Mannhart | 2010-01-07 13:14:52 | Re: Massive table (500M rows) update nightmare |
Previous Message | Matthew Wakeling | 2010-01-07 11:23:54 | Re: Digesting explain analyze |