From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Relation extension scalability
Date: 2016-01-12 09:11:47
Message-ID: CAFiTN-uY7kF0RC8MR07sbmUbZQ91bHLmjiUc64AOM4G=VJCeLg@mail.gmail.com
Lists: pgsql-hackers
On Thu, Jan 7, 2016 at 4:53 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2016-01-07 16:48:53 +0530, Amit Kapila wrote:
>
> I think it's a worthwhile approach to pursue. But until it actually
> fixes the problem of leaving around uninitialized pages I don't think
> it's very meaningful to do performance comparisons.
>
The attached patch solves this issue: I allocate the buffer for each page
and initialize the page, and only after that add it to the FSM.
> > a. Extend the relation page by page and add it to FSM without
> > initializing it. I think this is what the current patch of Dilip
> > seems to be doing. If we
>
> I think that's pretty much unacceptable, for the non-error path at
> least.
>
Performance results:
----------------------------
Test Case:
------------
./psql -d postgres -c "COPY (select g.i::text FROM generate_series(1,
10000) g(i)) TO '/tmp/copybinary' WITH BINARY";
echo "COPY data FROM '/tmp/copybinary' WITH BINARY;" > copy_script
./psql -d postgres -c "truncate table data"
./psql -d postgres -c "checkpoint"
./pgbench -f copy_script -T 120 -c$ -j$ postgres
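For completeness, the steps above can be wrapped in a small driver script.
This is only a sketch: the client/thread counts are assumptions taken from
the result tables below (the original command elides them as "-c$ -j$"),
and the pgbench invocations are echoed rather than executed so the sketch
runs without a live server.

```shell
# Recreate the pgbench script used in the test.
echo "COPY data FROM '/tmp/copybinary' WITH BINARY;" > copy_script

# Client counts are assumed from the result tables (1..32). The commands
# are printed rather than run; drop the echo to execute them for real
# (truncate + checkpoint between runs, as in the steps above).
for clients in 1 2 4 8 16 32; do
    echo "./pgbench -f copy_script -T 120 -c $clients -j $clients postgres"
done
```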
Test Summary:
--------------------
1. I have measured the performance of the base and patched builds.
2. With the patch there are multiple results, for different values of
"extend_num_pages" (a parameter that says how many extra blocks to extend).
Test with Data on magnetic Disk and WAL on SSD
--------------------------------------------------------------------
Shared Buffer : 48GB
max_wal_size : 10GB
Storage : Magnetic Disk
WAL : SSD
TPS with different values of extend_num_pages
------------------------------------------------------------
Client    Base    10-Page    20-Page    50-Page
1          105        103        157        129
2          217        219        255        288
4          210        421        494        486
8          166        605        702        701
16         145        484        563        686
32         124        477        480        745
Test with Data and WAL on SSD
-----------------------------------------------
Shared Buffer : 48GB
Max Wal Size : 10GB
Storage : SSD
TPS with different values of extend_num_pages
------------------------------------------------------------
Client    Base    10-Page    20-Page    50-Page    100-Page
1          152        153        155        147         157
2          281        281        292        275         287
4          236        505        502        508         514
8          171        662        687        767         764
16         145        527        639        826         907
Note: Test with both data and WAL on magnetic disk: no significant
improvement visible.
-- I think WAL write is becoming the bottleneck in this case.
Currently I have kept extend_num_pages as a session-level parameter, but I
think later we can make this a table property.
Any suggestions on this?
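For illustration, the two variants would look something like this (the GUC
name comes from the patch; the table-level form is hypothetical syntax,
shown as a storage parameter):

```
-- session level, as in the current patch
SET extend_num_pages = 20;

-- possible table-level form, if it later becomes a table property
ALTER TABLE data SET (extend_num_pages = 20);
```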
Apart from this approach, I also tried extending the file by multiple
blocks in a single extend call, but this approach (extending one page at a
time) performs better.
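In pseudocode, the per-page extend/initialize/FSM sequence described above
looks roughly like the following. This is a sketch, not the actual patch:
the function names follow PostgreSQL's buffer-manager and FSM APIs, and
extend_num_pages is the proposed parameter.

```c
/* Sketch only: extend by extend_num_pages blocks, initializing each page
 * before advertising it in the FSM. */
for (i = 0; i < extend_num_pages; i++)
{
    /* P_NEW extends the relation by one block and returns its buffer */
    Buffer      buf = ReadBuffer(relation, P_NEW);

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

    /* initialize the page while we hold the exclusive lock */
    Page        page = BufferGetPage(buf);
    PageInit(page, BufferGetPageSize(buf), 0);
    MarkBufferDirty(buf);

    BlockNumber blkno = BufferGetBlockNumber(buf);
    Size        freespace = PageGetHeapFreeSpace(page);

    UnlockReleaseBuffer(buf);

    /* only now make the (initialized) page visible to other backends */
    RecordPageWithFreeSpace(relation, blkno, freespace);
}
```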
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment: multi_extend_v2.patch (text/x-diff, 3.8 KB)