From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alex Tokarev <dwalin(at)dwalin(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Faster str to int conversion (was Table with large number of int columns, very slow COPY FROM) |
Date: | 2018-07-19 20:32:12 |
Message-ID: | 20180719203212.qso3vgljwns75oho@alap3.anarazel.de |
Lists: | pgsql-hackers pgsql-performance |
Hi,
On 2018-07-18 14:34:34 -0400, Robert Haas wrote:
> On Sat, Jul 7, 2018 at 4:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > FWIW, here's a rebased version of this patch. Could probably be polished
> > further. One might argue that we should do a bit more wide ranging
> > changes, to convert scanint8 and pg_atoi to be also unified. But it
> > might also just be worthwhile to apply without those, given the
> > performance benefit.
>
> Wouldn't hurt to do that one too, but might be OK to just do this
> much. Questions:
>
> 1. Why the error message changes? If there's a good reason, it should
> be done as a separate commit, or at least well-documented in the
> commit message.
Because there are a lot of "invalid input syntax for type %s: \"%s\""
error messages, and we shouldn't force translators to maintain a separate
version that inlines the first %s. But you're right, it'd be worthwhile
to point that out in the commit message.
> 2. Does the likely/unlikely stuff make a noticeable difference?
Yes. It's also largely a copy of existing code (scanint8), so I don't
really want to diverge from it here.
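For readers unfamiliar with the branch hints under discussion, here is a minimal sketch of how such macros are commonly defined on top of GCC/Clang's `__builtin_expect`, with a fallback for other compilers. The function `all_digits_sketch` is a hypothetical example and is not taken from the patch or from scanint8.

```c
#include <stdbool.h>

/* Branch-prediction hints; plain conditions on compilers without the builtin. */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect((x) != 0, 1)
#define unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define likely(x)   ((x) != 0)
#define unlikely(x) ((x) != 0)
#endif

/*
 * Hypothetical example: returns true if s is a non-empty string made up
 * only of ASCII digits. The error exits are marked unlikely so the compiler
 * can lay out the common (valid input) path as the fall-through path.
 */
bool
all_digits_sketch(const char *s)
{
    if (unlikely(*s == '\0'))
        return false;

    for (; *s != '\0'; s++)
    {
        if (unlikely(*s < '0' || *s > '9'))
            return false;
    }
    return true;
}
```

The hints do not change behavior; they only influence code layout, which is why their effect has to be measured rather than assumed.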
> 3. If this is a drop-in replacement for pg_atoi, why not just recode
> pg_atoi this way -- or have it call this -- and leave the callers
> unchanged?
Because pg_atoi supports a variable 'terminator'. Supporting that would
make the code a bit slower, without being particularly useful. I think
there's only a single in-core caller left after the patch
(int2vectorin). There's a fair argument that that should just be
open-coded to handle the weird space parsing, but given there's probably
external pg_atoi() callers, I'm not sure it's worth doing so?
I don't think it's a good idea to continue to have pg_atoi as a wrapper
- it takes a size argument, which makes efficient code hard.
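To illustrate the point about fixed width and a fixed terminator, the following is a rough sketch of a parser specialized for 32-bit integers that always stops at the terminating NUL. It is not the function from the patch: the name `parse_int32_sketch`, the bool return value (instead of raising an error) and the whitespace handling are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASCII whitespace test that does not depend on the locale. */
static bool
is_space_sketch(char c)
{
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/*
 * Hypothetical helper: parse a decimal int32 from a NUL-terminated string.
 * Returns false on syntax errors or overflow, true (and sets *result)
 * otherwise. No size argument and no caller-chosen terminator.
 */
bool
parse_int32_sketch(const char *s, int32_t *result)
{
    const char *ptr = s;
    int32_t     tmp = 0;
    bool        neg = false;

    while (is_space_sketch(*ptr))
        ptr++;

    if (*ptr == '-')
    {
        neg = true;
        ptr++;
    }
    else if (*ptr == '+')
        ptr++;

    /* require at least one digit */
    if (*ptr < '0' || *ptr > '9')
        return false;

    /* accumulate as a negative number so that INT32_MIN is representable */
    while (*ptr >= '0' && *ptr <= '9')
    {
        int         digit = *ptr++ - '0';

        /* would tmp * 10 - digit fall below INT32_MIN? */
        if (tmp < (INT32_MIN + digit) / 10)
            return false;
        tmp = tmp * 10 - digit;
    }

    /* allow trailing whitespace, but nothing else */
    while (is_space_sketch(*ptr))
        ptr++;
    if (*ptr != '\0')
        return false;

    if (!neg)
    {
        /* +2147483648 does not fit in int32 */
        if (tmp == INT32_MIN)
            return false;
        tmp = -tmp;
    }

    *result = tmp;
    return true;
}
```

Because the width is known at compile time, the range check is a single comparison against a constant, which a generic routine taking a size argument would have to select at run time.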
> 4. Are we sure this is faster on all platforms, or could it work out
> the other way on, say, BSD?
I'd be *VERY* surprised if it worked out the other way on any platform.
It's not easy to write a faster implementation than what I've proposed,
especially not if you use strtol() as the API (variable bases, a bit of
locale support).
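For comparison, here is a sketch of what an int32 conversion built on strtol() typically has to do. The wrapper name `strtol_int32_sketch` is hypothetical and this is not the existing pg_atoi code; it only shows the extra steps the generic API imposes on the caller: passing a base, resetting and checking errno, inspecting the end pointer, and re-checking the range because long may be wider than 32 bits.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: convert a NUL-terminated decimal string to int32
 * using strtol(). Returns false on syntax errors or out-of-range values.
 * Unlike a hand-written parser, trailing whitespace is rejected here.
 */
bool
strtol_int32_sketch(const char *s, int32_t *result)
{
    char       *end;
    long        val;

    errno = 0;
    val = strtol(s, &end, 10);  /* base is a run-time argument */

    if (end == s)               /* no digits consumed */
        return false;
    if (errno == ERANGE)        /* did not fit in long */
        return false;
    if (val < INT32_MIN || val > INT32_MAX) /* long may be 64 bits wide */
        return false;
    if (*end != '\0')           /* trailing garbage */
        return false;

    *result = (int32_t) val;
    return true;
}
```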
Greetings,
Andres Freund