Re: Portability issues in shm_mq

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Portability issues in shm_mq
Date: 2014-03-16 01:24:40
Message-ID: CA+TgmobPUiDh6+ciqzbo9_vnGxa7AdYYkSt=uZc+buwTp41_Qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 14, 2014 at 4:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Whilst setting up a buildfarm member on an old, now-spare Mac, I was
> somewhat astonished to discover that contrib/test_shm_mq crashes thus:
> TRAP: FailedAssertion("!(rb >= sizeof(uint64))", File: "shm_mq.c", Line: 429)
> but only in UTF8 locales, not in C locale. You'd have bet your last
> dollar that that code was locale-independent, right?
>
> The reason appears to be that in the payload string generated with
> (select string_agg(chr(32+(random()*96)::int), '') from generate_series(1,400))
> the chr() argument rounds up to 128 every so often. In UTF8 encoding,
> that causes chr() to return a multibyte character instead of a single
> byte. So, instead of always having a fixed payload string length of
> 400 bytes, the payload length moves around a bit --- in a few trials
> I see anywhere from 400 to 409 bytes.
>
> How is that leading to a crash? Well, this machine is 32-bit, so MAXALIGN
> is only 4. This means it is possible for an odd-length message cum
> message length word to not exactly divide the size of the shared memory
> ring buffer, resulting in cases where an 8-byte message length word is
> wrapped around the end of the buffer. shm_mq_receive_bytes makes no
> attempt to hide that situation from its caller, and happily returns just
> 4 bytes with SHM_MQ_SUCCESS. shm_mq_receive, on the other hand, is so
> confident that it will always get an indivisible length word that it just
> Asserts that that's the case.

Argh. I think I forced the size of the buffer to be MAXALIGN'd, but
what it really needs is to be a multiple of the size of uint64.

> Recommendations:
>
> 1. Reduce the random() multiplier from 96 to 95. In multibyte encodings
> other than UTF8, chr() would flat out reject values of 128, so this test
> case is unportable.

Agreed.

> 2. Why in the world is the test case testing exactly one message length
> that happens to be a multiple of 8? Put some randomness into that,
> instead.

Good idea. I think that started out as a performance test rather than
an integrity test, and I didn't think hard enough when revising it
about what would make a good integrity test.

> 3. Either you need to work a bit harder at forcing alignment, or you need
> to fix shm_mq_receive to cope with split message length words.

The first one is what is intended. I will look at it.

> 4. The header comment for shm_mq_receive_bytes may once have described its
> API accurately, but that appears to have been a long long time ago in a
> galaxy far far away. Please fix.

Ugh, looks like I forgot to update that when I introduced the
shm_mq_result return type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Niels Hoogeveen 2014-03-16 02:05:27 Re: Proposed feature: Selective Foreign Key
Previous Message Tom Lane 2014-03-16 00:55:03 Minimum supported version of Python?