Portability issues in shm_mq

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Portability issues in shm_mq
Date: 2014-03-14 20:43:39
Message-ID: 25993.1394829819@sss.pgh.pa.us
Lists: pgsql-hackers

Whilst setting up a buildfarm member on an old, now-spare Mac, I was
somewhat astonished to discover that contrib/test_shm_mq crashes thus:
    TRAP: FailedAssertion("!(rb >= sizeof(uint64))", File: "shm_mq.c", Line: 429)
but only in UTF8 locales, not in C locale. You'd have bet your last
dollar that that code was locale-independent, right?

The reason appears to be that in the payload string generated with
    (select string_agg(chr(32+(random()*96)::int), '') from generate_series(1,400))
the chr() argument rounds up to 128 every so often. In UTF8 encoding,
that causes chr() to return a multibyte character instead of a single
byte. So, instead of always having a fixed payload string length of
400 bytes, the payload length moves around a bit --- in a few trials
I see anywhere from 400 to 409 bytes.
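
The arithmetic can be checked in isolation. The following is only a sketch, not PostgreSQL code: chr_arg and utf8_len are hypothetical helpers that approximate the round-to-nearest behaviour of the ::int cast and the number of bytes UTF-8 needs for a code point.

```c
#include <assert.h>

/*
 * Hypothetical helper: approximates 32 + (r * multiplier)::int for a
 * random() result r in [0, 1).  The cast rounds to nearest instead of
 * truncating; adding 0.5 and truncating gives the same result here
 * except at exact ties, which do not matter for this argument.
 */
static int chr_arg(double r, int multiplier)
{
    assert(r >= 0.0 && r < 1.0);
    return 32 + (int) (r * multiplier + 0.5);
}

/* Number of bytes UTF-8 uses to encode the code point cp. */
static int utf8_len(unsigned int cp)
{
    if (cp < 0x80)
        return 1;               /* 7-bit ASCII */
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;
    return 4;
}
```

With a multiplier of 96, any r above roughly 0.9948 rounds to 96, so the argument becomes 128 and utf8_len(128) is 2. With a multiplier of 95 the argument tops out at 127, which is still a single byte.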

How does that lead to a crash? Well, this machine is 32-bit, so MAXALIGN
is only 4. This means a message plus its length word can occupy a number
of bytes that is a multiple of 4 but not of 8, and so need not exactly
divide the size of the shared memory ring buffer. The result is cases
where an 8-byte message length word is wrapped around the end of the
buffer. shm_mq_receive_bytes makes no attempt to hide that situation from
its caller, and happily returns just 4 bytes with SHM_MQ_SUCCESS.
shm_mq_receive, on the other hand, is so confident that it will always get
an indivisible length word that it just Asserts that that's the case.
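
A simplified model makes the wraparound concrete. This is a sketch under stated assumptions, not the shm_mq implementation: it assumes each message advances the write position by the MAXALIGN-rounded size of an 8-byte length word plus the MAXALIGN-rounded payload, and that the ring size is a multiple of MAXALIGN. The function name and the ring size used in the note below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round len up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(len, a) (((size_t) (len) + ((a) - 1)) & ~((size_t) (a) - 1))

/*
 * Hypothetical model: write nmsgs messages of payload_len bytes into a
 * ring of ring_size bytes and report whether any 8-byte length word
 * would start with fewer than 8 contiguous bytes left before the end
 * of the ring, i.e. would be split across the wraparound point.
 */
static int length_word_splits(size_t ring_size, size_t payload_len,
                              size_t maxalign, int nmsgs)
{
    size_t offset = 0;

    assert(ring_size % maxalign == 0);
    for (int i = 0; i < nmsgs; i++)
    {
        size_t contiguous = ring_size - offset;

        if (contiguous < sizeof(uint64_t))
            return 1;           /* length word straddles the ring end */
        offset = (offset
                  + ALIGN_UP(sizeof(uint64_t), maxalign)
                  + ALIGN_UP(payload_len, maxalign)) % ring_size;
    }
    return 0;
}
```

In this model, with a 1024-byte ring and MAXALIGN of 4, a 400-byte payload never splits the length word because every offset stays a multiple of 8, while a 401-byte payload eventually puts a length word 4 bytes before the end of the ring. With MAXALIGN of 8 neither case splits.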

Recommendations:

1. Reduce the random() multiplier from 96 to 95. In multibyte encodings
other than UTF8, chr() would flat out reject values of 128, so this test
case is unportable.

2. Why in the world is the test case testing exactly one message length
that happens to be a multiple of 8? Put some randomness into that,
instead.

3. Either you need to work a bit harder at forcing alignment, or you need
to fix shm_mq_receive to cope with split message length words.

4. The header comment for shm_mq_receive_bytes may once have described its
API accurately, but that appears to have been a long long time ago in a
galaxy far far away. Please fix.
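
For recommendation 3, the "cope with split message length words" option could look roughly like the sketch below. It is not the shm_mq API and not a proposed patch: read_length_word is a hypothetical helper that works on a plain byte array, and it ignores the locking and flow control a real queue needs. It only shows the idea of assembling the 8 bytes in a local buffer instead of assuming they are contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fetch an 8-byte length word that starts at
 * offset in a ring of ring_size bytes, even if it wraps around the
 * end of the ring.
 */
static uint64_t read_length_word(const unsigned char *ring,
                                 size_t ring_size, size_t offset)
{
    unsigned char buf[sizeof(uint64_t)];
    uint64_t    nbytes;
    size_t      first;

    assert(offset < ring_size && ring_size >= sizeof(buf));

    /* Bytes available before the end of the ring, capped at 8. */
    first = ring_size - offset;
    if (first > sizeof(buf))
        first = sizeof(buf);

    memcpy(buf, ring + offset, first);              /* tail of the ring */
    memcpy(buf + first, ring, sizeof(buf) - first); /* wrapped remainder */

    /* Copy into an aligned local variable before interpreting it. */
    memcpy(&nbytes, buf, sizeof(nbytes));
    return nbytes;
}
```

The alternative in the same recommendation, forcing 8-byte alignment of both the ring size and every write, would make the contiguous case the only case, at the cost of some padding on 32-bit platforms.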

Also, while this is not directly your problem, it's becoming clear that we
don't have enough buildfarm coverage of not-64-bit platforms; this problem
would have been spotted a while ago if we did. I'm going to spin up a
couple of critters on old machines lying around my office. We should
probably also encourage owners of existing critters to expand their test
coverage a bit, e.g., try locales other than C.

regards, tom lane
