Re: postgresql + apache under heavy load

From: Alex Madon <alex(dot)madon(at)bestlinuxjobs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Richard Huxton <dev(at)archonet(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: postgresql + apache under heavy load
Date: 2004-01-21 22:20:24
Message-ID: 400EFB28.70500@bestlinuxjobs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Well the top issued was corresponding to a concurrency of 20 (so a level
where swap is not sollicited).
With a concurrency of 60, swap is very high

Here is a "movie" of how swap is maanged (a snapshot every 5 sec)
Swap: 265064k av, 110592k used, 154472k free 48384k
cached
Swap: 265064k av, 110592k used, 154472k free 48404k
cached
Swap: 265064k av, 107064k used, 158000k free 48452k
cached
Swap: 265064k av, 104816k used, 160248k free 48484k
cached
Swap: 265064k av, 246496k used, 18568k free 31044k
cached
Swap: 265064k av, 190364k used, 74700k free 31356k
cached
Swap: 265064k av, 186512k used, 78552k free 31480k
cached
Swap: 265064k av, 181592k used, 83472k free 31588k
cached
Swap: 265064k av, 173924k used, 91140k free 31692k
cached
Swap: 265064k av, 168384k used, 96680k free 31864k
cached
Swap: 265064k av, 165444k used, 99620k free 31872k
cached
Swap: 265064k av, 163084k used, 101980k free 32004k
cached
Swap: 265064k av, 160592k used, 104472k free 32124k
cached
Swap: 265064k av, 158116k used, 106948k free 32136k
cached
Swap: 265064k av, 155692k used, 109372k free 32212k
cached
Swap: 265064k av, 153008k used, 112056k free 32252k
cached

and the parallel with Mem
Mem: 384580k av, 202952k used, 181628k free, 0k shrd, 20876k
buff
Swap: 265064k av, 110592k used, 154472k free 48384k
cached
Mem: 384580k av, 219072k used, 165508k free, 0k shrd, 20880k
buff
Swap: 265064k av, 110592k used, 154472k free 48404k
cached
Mem: 384580k av, 284076k used, 100504k free, 0k shrd, 20888k
buff
Swap: 265064k av, 107064k used, 158000k free 48452k
cached
Mem: 384580k av, 353520k used, 31060k free, 0k shrd, 20892k
buff
Swap: 265064k av, 104816k used, 160248k free 48484k
cached
Mem: 384580k av, 380108k used, 4472k free, 0k shrd, 6352k
buff
Swap: 265064k av, 246496k used, 18568k free 31044k
cached
Mem: 384580k av, 210008k used, 174572k free, 0k shrd, 6372k
buff
Swap: 265064k av, 190364k used, 74700k free 31356k
cached
Mem: 384580k av, 206024k used, 178556k free, 0k shrd, 6380k
buff
Swap: 265064k av, 186512k used, 78552k free 31480k
cached
Mem: 384580k av, 210008k used, 174572k free, 0k shrd, 6396k
buff
Swap: 265064k av, 181592k used, 83472k free 31588k
cached
Mem: 384580k av, 201612k used, 182968k free, 0k shrd, 6408k
buff
Swap: 265064k av, 173924k used, 91140k free 31692k
cached
Mem: 384580k av, 192464k used, 192116k free, 0k shrd, 6420k
buff
Swap: 265064k av, 168384k used, 96680k free 31864k
cached
Mem: 384580k av, 190504k used, 194076k free, 0k shrd, 6428k
buff
Swap: 265064k av, 165444k used, 99620k free 31872k
cached
Mem: 384580k av, 190956k used, 193624k free, 0k shrd, 6444k
buff
Swap: 265064k av, 163084k used, 101980k free 32004k
cached
Mem: 384580k av, 193244k used, 191336k free, 0k shrd, 6452k
buff
Swap: 265064k av, 160592k used, 104472k free 32124k
cached
Mem: 384580k av, 195924k used, 188656k free, 0k shrd, 6460k
buff
Swap: 265064k av, 158116k used, 106948k free 32136k
cached
Mem: 384580k av, 198428k used, 186152k free, 0k shrd, 6468k
buff
Swap: 265064k av, 155692k used, 109372k free 32212k
cached

One can see that at the maximum feeling of swap (74700k free swap), the
full picture is:

22:51:54 up 3:58, 6 users, load average: 47.38, 18.53, 7.79
131 processes: 130 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 5.3% user 3.0% system 0.0% nice 0.0% iowait 91.6% idle
Mem: 384580k av, 210008k used, 174572k free, 0k shrd, 6372k
buff
158748k actv, 14556k in_d, 1412k in_c
Swap: 265064k av, 190364k used, 74700k free 31356k
cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
7 root 16 0 0 0 0 SW 1.2 0.0 0:07 0
kscand/Normal
5 root 15 0 0 0 0 SW 1.0 0.0 0:01 0 kswapd
7050 apache 15 0 8016 5896 1580 D 1.0 1.5 0:00 0 httpd
3870 madona 15 0 6540 1440 472 D 0.6 0.3 0:07 0 xterm
7032 apache 15 0 8336 7568 1980 S 0.6 1.9 0:00 0 httpd
7051 apache 15 0 4784 1640 280 D 0.6 0.4 0:00 0 httpd
2581 root 15 0 15928 1452 704 S 0.5 0.3 5:40 0 X
6985 madona 16 0 788 732 476 R 0.5 0.1 0:00 0 top
7023 apache 15 0 7956 7160 1736 S 0.4 1.8 0:00 0 httpd
7025 apache 15 0 7944 6816 1584 D 0.4 1.7 0:00 0 httpd
7027 apache 15 0 7808 6976 1588 D 0.4 1.8 0:00 0 httpd
7052 apache 15 0 6616 3584 404 D 0.3 0.9 0:00 0 httpd
7142 postgres 15 0 3420 3268 2460 S 0.3 0.8 0:00 0 postmaster
7020 apache 15 0 8092 5880 1980 D 0.2 1.5 0:00 0 httpd
7034 apache 15 0 8380 7660 2036 S 0.2 1.9 0:00 0 httpd
2626 madona 15 0 1316 396 292 S 0.1 0.1 0:00 0 magicdev
7001 apache 15 0 8016 5892 1580 D 0.1 1.5 0:00 0 httpd
7008 apache 15 0 8016 5784 1580 D 0.1 1.5 0:00 0 httpd
7024 apache 15 0 7944 6664 1584 D 0.1 1.7 0:00 0 httpd
7054 apache 15 0 8016 5636 1580 D 0.1 1.4 0:00 0 httpd
1 root 15 0 88 60 40 S 0.0 0.0 0:04 0 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0
ksoftirqd_CPU0
9 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
6 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kscand/DMA
8 root 15 0 0 0 0 SW 0.0 0.0 0:00 0
kscand/HighMem
10 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0
mdrecoveryd
15 root 15 0 0 0 0 SW 0.0 0.0 0:04 0 kjournald
72 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
2066 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 eth1
2117 root 15 0 112 60 36 S 0.0 0.0 0:00 0 syslogd
2121 root 15 0 52 0 0 SW 0.0 0.0 0:00 0 klogd
2139 rpc 15 0 72 0 0 SW 0.0 0.0 0:00 0 portmap
2158 rpcuser 25 0 84 0 0 SW 0.0 0.0 0:00 0 rpc.statd
2208 root 25 0 52 4 0 S 0.0 0.0 0:00 0 apmd
2246 root 25 0 240 4 0 S 0.0 0.0 0:00 0 sshd
2260 root 15 0 120 4 0 S 0.0 0.0 0:00 0 xinetd
2269 root 15 0 88 56 36 S 0.0 0.0 0:00 0 gpm
2279 root 15 0 1436 216 152 S 0.0 0.0 0:00 0 httpd
2348 root 15 0 124 108 64 S 0.0 0.0 0:00 0 crond
2359 root 15 0 852 496 228 S 0.0 0.1 0:00 0 cupsd
2419 xfs 15 0 2696 188 188 S 0.0 0.0 0:32 0 xfs
2428 root 24 0 616 4 0 S 0.0 0.0 0:00 0 smbd
2432 root 15 0 824 356 260 S 0.0 0.0 0:00 0 nmbd
2450 daemon 15 0 64 12 12 S 0.0 0.0 0:00 0 atd
2458 root 16 0 224 0 0 SW 0.0 0.0 0:00 0 login
2459 root 22 0 48 4 0 S 0.0 0.0 0:00 0 mingetty
2460 root 22 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2461 root 22 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2462 root 22 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2463 root 22 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2464 madona 16 0 356 0 0 SW 0.0 0.0 0:00 0 bash
2569 madona 21 0 132 4 0 S 0.0 0.0 0:00 0 startx
2580 madona 20 0 72 0 0 SW 0.0 0.0 0:00 0 xinit
2590 madona 15 0 3412 240 236 S 0.0 0.0 0:01 0
gnome-session
2592 madona 15 0 5452 424 160 S 0.0 0.1 0:00 0 gconfd-2
2604 madona 15 0 236 72 40 S 0.0 0.0 0:00 0 ssh-agent
2608 madona 17 0 488 4 0 S 0.0 0.0 0:00 0
bonobo-activati
2610 madona 15 0 384 204 124 S 0.0 0.0 0:00 0 fam
2612 madona 16 0 4032 1916 1912 S 0.0 0.4 0:38 0 metacity
2614 madona 15 0 1620 268 264 S 0.0 0.0 0:00 0
gnome-settings-
2623 madona 15 0 720 396 288 S 0.0 0.1 0:01 0
xscreensaver
2633 madona 15 0 1276 688 528 S 0.0 0.1 0:00 0
pam-panel-icon
2634 root 15 0 152 136 88 S 0.0 0.0 0:00 0
pam_timestamp_c
2655 madona 15 0 5284 2428 1916 D 0.0 0.6 0:15 0
gnome-panel
2657 madona 15 0 2556 1176 804 S 0.0 0.3 0:01 0
multiload-apple
2659 madona 15 0 1808 452 448 S 0.0 0.1 0:00 0
notification-ar
2662 madona 15 0 5444 164 160 S 0.0 0.0 0:01 0 xterm
2664 madona 15 0 360 4 0 S 0.0 0.0 0:00 0 bash
3015 madona 15 0 9100 248 244 S 0.0 0.0 0:03 0 xterm
3017 madona 15 0 384 28 24 S 0.0 0.0 0:00 0 bash
3812 madona 15 0 35292 4908 4568 S 0.0 1.2 8:02 0
mozilla-bin
3872 madona 22 0 352 4 0 S 0.0 0.0 0:00 0 bash
3902 root 22 0 168 0 0 SW 0.0 0.0 0:00 0 su
3905 root 15 0 392 32 28 S 0.0 0.0 0:00 0 bash
4306 madona 15 0 5604 252 248 S 0.0 0.0 0:01 0 xterm
4308 madona 22 0 352 4 0 S 0.0 0.0 0:00 0 bash
4338 root 21 0 168 0 0 SW 0.0 0.0 0:00 0 su
4341 root 22 0 360 4 0 S 0.0 0.0 0:00 0 bash
4385 root 24 0 172 0 0 SW 0.0 0.0 0:00 0 su
4386 postgres 15 0 312 28 24 S 0.0 0.0 0:00 0 bash
4610 madona 15 0 5728 384 380 S 0.0 0.0 0:00 0 xterm
4612 madona 15 0 384 28 24 S 0.0 0.0 0:00 0 bash
6636 apache 15 0 6868 840 404 D 0.0 0.2 0:03 0 httpd
6820 postgres 15 0 808 560 468 S 0.0 0.1 0:00 0 postmaster
6821 postgres 15 0 1296 496 276 S 0.0 0.1 0:00 0 postmaster
6822 postgres 15 0 744 460 372 S 0.0 0.1 0:00 0 postmaster
6828 root 15 0 140 100 64 S 0.0 0.0 0:00 0 tail
6854 apache 15 0 6888 652 424 S 0.0 0.1 0:01 0 httpd
6867 apache 15 0 6816 740 352 D 0.0 0.1 0:01 0 httpd
6869 apache 15 0 6960 1084 640 D 0.0 0.2 0:00 0 httpd
6875 apache 15 0 6856 780 392 D 0.0 0.2 0:01 0 httpd
6882 apache 15 0 6700 436 236 S 0.0 0.1 0:01 0 httpd
6890 apache 15 0 7020 1184 700 D 0.0 0.3 0:01 0 httpd
6896 apache 15 0 6744 1084 424 D 0.0 0.2 0:01 0 httpd
6986 madona 15 0 816 592 180 S 0.0 0.1 0:00 0 ab
6993 apache 15 0 7084 1404 860 D 0.0 0.3 0:00 0 httpd
7002 apache 15 0 8016 2576 1580 D 0.0 0.6 0:00 0 httpd
7004 apache 15 0 8016 3104 1580 D 0.0 0.8 0:00 0 httpd
7007 postgres 15 0 3000 2576 2048 D 0.0 0.6 0:00 0 postmaster
7009 apache 15 0 8016 2956 1580 D 0.0 0.7 0:00 0 httpd
7011 apache 16 0 7956 7164 1736 S 0.0 1.8 0:00 0 httpd
7012 apache 15 0 8024 5364 1588 D 0.0 1.3 0:00 0 httpd
7021 apache 16 0 8376 7676 2032 S 0.0 1.9 0:00 0 httpd
7022 apache 15 0 8016 5140 1580 D 0.0 1.3 0:00 0 httpd
7029 apache 15 0 8336 7584 1980 S 0.0 1.9 0:00 0 httpd
7036 apache 15 0 8380 7632 2036 S 0.0 1.9 0:00 0 httpd
7053 apache 15 0 8016 4420 1580 D 0.0 1.1 0:00 0 httpd
7056 apache 15 0 8148 6292 1712 D 0.0 1.6 0:00 0 httpd
7059 apache 15 0 8240 7192 2020 S 0.0 1.8 0:00 0 httpd
7061 apache 15 0 8236 6436 1800 D 0.0 1.6 0:00 0 httpd
7062 apache 15 0 8240 7604 2020 S 0.0 1.9 0:00 0 httpd
7066 apache 16 0 8240 7544 2020 S 0.0 1.9 0:00 0 httpd
7068 postgres 15 0 3216 2580 2272 D 0.0 0.6 0:00 0 postmaster
7069 postgres 15 0 2908 2108 1960 S 0.0 0.5 0:00 0 postmaster
7070 postgres 15 0 2436 2104 1740 S 0.0 0.5 0:00 0 postmaster
7071 postgres 15 0 1884 1212 1188 S 0.0 0.3 0:00 0 postmaster
7079 postgres 15 0 3396 3244 2448 S 0.0 0.8 0:00 0 postmaster
7082 postgres 17 0 3412 3260 2460 S 0.0 0.8 0:00 0 postmaster
7086 postgres 16 0 3412 3232 2460 S 0.0 0.8 0:00 0 postmaster
7087 postgres 15 0 3424 3160 2468 S 0.0 0.8 0:00 0 postmaster
7091 postgres 15 0 3420 3264 2464 S 0.0 0.8 0:00 0 postmaster
7096 postgres 17 0 3412 3140 2456 S 0.0 0.8 0:00 0 postmaster
7097 postgres 15 0 3436 3196 2476 S 0.0 0.8 0:00 0 postmaster
7102 postgres 15 0 3436 3220 2480 S 0.0 0.8 0:00 0 postmaster
7112 postgres 15 0 3432 3280 2472 S 0.0 0.8 0:00 0 postmaster
7123 postgres 15 0 3440 3288 2476 S 0.0 0.8 0:00 0 postmaster
7124 postgres 15 0 3428 3276 2468 S 0.0 0.8 0:00 0 postmaster
7125 postgres 15 0 3424 3272 2464 S 0.0 0.8 0:00 0 postmaster

After that swap and ram usage decrease.... and the ab ends up with a
time out.
For the gnome and X, this is just my development machine. On the web
server itself, no X of course.

I had also a closer look at postgresql logs and make logs more verbose.
I saw a query taking 16 sec (it take normally few millisec):
[7104] LOG: connection received: host=[local]
[7104] LOG: connection authorized: user=myuser database=mydb
[7104] LOG: query: begin; select getdatabaseencoding(); commit
[7104] LOG: duration: 0.002152 sec
[7104] LOG: query: select ---- myselect instruction -----
[7104] LOG: duration: 16.561325 sec
[7104] LOG: query: select oid,typname from pg_type
[7104] LOG: duration: 15.498828 sec
[6820] DEBUG: child process (pid 7104) exited with exit code 0

Thanks everybody,
Alex

Tom Lane wrote:

>"scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
>
>
>>OK, the memory usage is NOT your problem. You have crashing postgresql
>>backends, see the <defunct> and the listing of 3 zombie processes in top?
>>That's bad. You very likely have bad memory in your box, or possibly
>>other hardware problems.
>>
>>
>
>While that could be true, the presence of a few zombie processes hardly
>proves it. What that says to me is that the box is so loaded that the
>postmaster isn't able to reap dead children instantaneously --- that is,
>what we see is top running between the time that a backend quits and the
>time the postmaster next gets to run.
>
>While we don't have a lot of data for determining why the overload,
>the top output:
>
>
>
>>>CPU states: 91.8% user 8.1% system 0.0% nice 0.0% iowait 0.0% idle
>>>
>>>
>
>sure looks like the bottleneck is CPU cycles, not disk (and not RAM
>either, since RAM shortage would lead to swapping and hence disk waits).
>Furthermore, the listing looks like it is the httpd processes that are
>sucking CPU, not Postgres. I think this is actually not Postgres'
>problem at all, but some inefficiency in the site's Web code.
>
> regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message lnd 2004-01-21 22:29:19 Lost plpgsql function
Previous Message Alex Satrapa 2004-01-21 22:07:52 Re: SCO Extortion