Re: Best system for a data warehouse application

From: Steven Chang <stevenchang1213(at)gmail(dot)com>
To: Jaime Soler <jaime(dot)soler(at)gmail(dot)com>
Cc: Fernando Hevia <fhevia(at)gmail(dot)com>, John Scalia <jayknowsunix(at)gmail(dot)com>, Postgres <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Best system for a data warehouse application
Date: 2017-03-25 04:53:15
Message-ID: CAEJt7k3+VKQti-C7_m6BcnC1zK528jBsRt6yGen6Nds0cRPKvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

hello Jaim,

Regarding cache-line-contention issue , I think you can refer this
page

https://www.postgresql.org/message-id/5C92D015D36C6248AFA3806E72C04A06067B7320@usoldt-e3mb-008.PEROOT.COM
.
Once I resized my shared_buffer from 15GB to 45GB, AP team
reported performance issue after 2 days ,
and we found swap in/out activities increased vastly
At that time, the only thing I can do was returning to the original
shared_buffer setting -- 15GB,
and performance recovered to normal.
I consulted EDB for memory issue, because their slides introduced
at most 8GB for shared_buffer.
And I got positive response from them at that time.
Now I think it maybe an numa issue.
I will try the same evualtion size of data buffer size as Oracle's
SGA in my future projects.
Or anybody can share his/her experiences with large shared_buffer
size with postgres db.

Regards,
Steven

2017-02-22 22:23 GMT+08:00 Jaime Soler <jaime(dot)soler(at)gmail(dot)com>:

> Well I prefer 1 choice, but first of choose a cpu model, I would check how
> many concurrent connections use your DW system. Also if you prefer a
> many-cores cpu, please check some previous errors related with cache line
> conetention in large numa system:
> https://www.enterprisedb.com/blog/solving-cache-line-
> contention-large-numa-systems
>
> 2017-02-22 3:15 GMT+01:00 Fernando Hevia <fhevia(at)gmail(dot)com>:
>
>> Option 1 is best if you will handle but a few connections to postgres
>> (~<=4), since the higher cpu clock will prevail in this scenario.
>> But if you expect having more than a few simultaneous connections go with
>> option 2 without a doubt.
>> Check how many simultaneous connections your DW system typically uses and
>> go from there. Although more cores would in general be the correct choice,
>> keep in mind that common DW in-database computations will greatly profit
>> from the extra cpu cycles of the 3.5 GHz cpu.
>>
>> Cheers.
>>
>>
>> El 21 feb. 2017 2:39 PM, "John Scalia" <jayknowsunix(at)gmail(dot)com> escribió:
>>
>>> Hi all,
>>>
>>> One of my system admins has approached me about replacing our production
>>> data warehouse system. I believe I know the answer, but would just like
>>> more opinions. He's giving me two options so far:
>>>
>>> 1) CPU = Intel e5-2637 3.5GHz 4 core with hyperthreading
>>>
>>> vs.
>>>
>>> 2) CPU = Intel e5-2650v4 2.2GHz 14 core with hyperthreading
>>>
>>> The rest of the specs are identical with 512Gb RAM and more than 2 Tb
>>> SSD drives. Our data warehouse is just shy of 2Tb now. I suspect the the
>>> first option would be more optimal as I don't think PostgreSQL can make
>>> effective use of a lot of processor cores. So fewer cores would be better
>>> optimized with the O/S. Can anyone weigh in on whether my assumptions are
>>> correct?
>>> --
>>> Jay
>>>
>>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Steven Chang 2017-03-25 05:01:35 Re: The same role for multiple databases
Previous Message John Scalia 2017-03-24 12:02:42 Re: [ADMIN] Replication mode Master-Slave - maintenance question.