Re: need suggestion on querying big tables

From: Mladen Gogala <gogala(dot)mladen(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: need suggestion on querying big tables
Date: 2022-12-06 02:08:39
Message-ID: 21a62556-f7dd-ee9f-3a9b-d4221f87d3e2@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On 12/5/22 01:29, Ebin Jozer wrote:
> Hi Team,
> in postgresql 11 version we have two tables of size 435 GB and 347 GB.
> if we query on single table or if we are doing inner join on both the
> big tables, it is not displacing any output, it keeps running
>
> We can see the wait event is IO and directDatafile .
>
> Server Spec : 8 cores and  64GB RAM
> PG config : 53 GB(effective_cache), 12 GB(shared buffer)
>
> can you please suggest some ideas , how we can query on big tables and
> fasten them to get the output??
>
> Thanks & Regards,
> /Ebin/

The first thing you should realize is that your server is really small.
Things like NVME, more memory and more cores should help. It's a
business decision. If your application is important to your business,
you should invest some money in the appropriate hardware. You should
also consider partitioning both tables and upgrading to at least
Postgres 14. The fear of the latest version is understandable and I feel
it myself. Newer versions behave much better with respect to
partitioning. Be aware, though, that you will have to add partitioning
key as a leading column to any global index. Also, parallelism in a
query helps.

64 GB and 8 cores is a desktop class machine. Here is my desktop machine:

[mgogala(at)umajor ~]$ tail -20 /proc/cpuinfo
cache size    : 512 KB
physical id    : 0
siblings    : 16
_*core id        : 7*__*
*__*cpu cores    : 8*_
apicid        : 15
initial apicid    : 15
fpu        : yes
fpu_exception    : yes
cpuid level    : 13
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid
extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1
sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt
tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap
clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf
xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean
flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload
vgif overflow_recov succor smca sev sev_es
bugs        : sysret_ss_attrs null_seg spectre_v1 spectre_v2
spec_store_bypass retbleed
bogomips    : 7385.36
TLB size    : 2560 4K pages
clflush size    : 64
cache_alignment    : 64
address sizes    : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

[mgogala(at)umajor ~]$ head /proc/meminfo
MemTotal:       65781164 kB
MemFree:        48364252 kB
MemAvailable:   47110532 kB
Buffers:           14812 kB
Cached:         10199584 kB
SwapCached:            0 kB
Active:          3949732 kB
Inactive:        9434600 kB
Active(anon):       4304 kB
Inactive(anon):  3211636 kB

It's faster than your DB server and it has a NVME drive. And this is a
$2K machine from the late 2018. Today, I would go for a DDR5 machine,
with at least 64 cores, 1TB RAM and at least 20TB of NVME drives. Such a
machine would cost around $50K.

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Mladen Gogala 2022-12-06 04:59:35 Re: need suggestion on querying big tables
Previous Message John Wiencek 2022-12-05 14:50:17 Re: need suggestion on querying big tables