Re: Substituting Checksum Algorithm (was: Enabling Checksums)

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Florian Pflug <fgp(at)phlo(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: Substituting Checksum Algorithm (was: Enabling Checksums)
Date: 2013-04-23 09:40:47
Message-ID: CA+CSw_vU5nKAL+UDQAELK8G--sfCn17oX1R4jFCMboTovFmvYA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 23, 2013 at 11:47 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-04-23 00:17:28 -0700, Jeff Davis wrote:
>> + # important optimization flags for checksum.c
>> + ifeq ($(GCC),yes)
>> + checksum.o: CFLAGS += -msse4.1 -funroll-loops -ftree-vectorize
>> + endif
>
> I am pretty sure we can't do those unconditionally:
> - -funroll-loops and -ftree-vectorize weren't always part of gcc afair,
> so we would need a configure check for those

-funroll-loops is available from at least GCC 2.95. -ftree-vectorize
is GCC 4.0+. From what I read in the ICC documentation, -axSSE4.1
should generate both a plain and an accelerated version and do a
runtime check. I don't know if ICC vectorizes the specific loop in the
patch, but I would expect it to, given that Intel's vectorization has
generally been better than GCC's and the loop is about as simple as it
gets. I don't know the relevant options for other compilers.
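
For illustration, the plain-plus-accelerated arrangement with a runtime
check can be approximated by hand on GCC. The following is only a rough
sketch under several assumptions: it relies on the target("sse4.1")
function attribute and on __builtin_cpu_supports() (GCC 4.8 or later, or
clang), the names checksum_body, checksum_plain, checksum_sse41 and
N_LANES are hypothetical, and the loop is a simple stand-in rather than
the loop from the patch. Whether the compiler actually vectorizes the
SSE4.1 copy still depends on the optimization flags used for the file.

```c
#include <stddef.h>
#include <stdint.h>

#define N_LANES 4               /* hypothetical lane count, for illustration */

/*
 * Stand-in loop body: N_LANES independent multiply/XOR sums folded
 * together with XOR. This is NOT the algorithm from the patch.
 * Assumes nwords is a multiple of N_LANES; trailing words are ignored.
 */
static inline uint32_t
checksum_body(const uint32_t *data, size_t nwords)
{
    uint32_t sums[N_LANES] = {0};
    uint32_t result = 0;
    size_t   i, j;

    for (i = 0; i + N_LANES <= nwords; i += N_LANES)
        for (j = 0; j < N_LANES; j++)
            sums[j] = (sums[j] ^ data[i + j]) * 16777619u;

    for (j = 0; j < N_LANES; j++)
        result ^= sums[j];
    return result;
}

/* Baseline copy, compiled with whatever the default target allows. */
static uint32_t
checksum_plain(const uint32_t *data, size_t nwords)
{
    return checksum_body(data, nwords);
}

/* Only build the SSE4.1 copy where the needed compiler support exists. */
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__clang__) || \
     (defined(__GNUC__) && \
      (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))))
#define HAVE_SSE41_DISPATCH 1

/* Same source, but the compiler may use SSE4.1 instructions here. */
__attribute__((target("sse4.1")))
static uint32_t
checksum_sse41(const uint32_t *data, size_t nwords)
{
    return checksum_body(data, nwords);
}
#endif

/* Picks the accelerated copy only if the running CPU reports SSE4.1. */
uint32_t
checksum(const uint32_t *data, size_t nwords)
{
#ifdef HAVE_SSE41_DISPATCH
    if (__builtin_cpu_supports("sse4.1"))
        return checksum_sse41(data, nwords);
#endif
    return checksum_plain(data, nwords);
}
```

Both copies compute the same value, so the check only affects speed. The
test on every call is cheap but not free; caching the choice in a
function pointer would be the obvious refinement.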

> - SSE4.1 looks like a total no-go, it's not available everywhere. We
> *can* add runtime detection of that with gcc fairly easily and
> one-time if we want to go there (later?) using 'ifunc's, but that
> needs a fair amount of infrastructure work.
> - We can rely on SSE1/2 on amd64, but I think that's automatically
> enabled there.

This is why I initially went for the lower strength 16-bit checksum
calculation - requiring only SSE2 would have made supporting the
vectorized version on amd64 trivial. By now my feeling is that it's
not prudent to compromise on quality to save some infrastructure
complexity. If we set a hypothetical VECTORIZATION_FLAGS variable at
configure time, the performance is still there for those who need it
and can afford CPU-specific builds.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
