Re: Weird problems with C extension and bytea as input type

From: Adrian Schreyer <adrian(at)schreyer(dot)me>
To: David W Noon <dwnoon(at)ntlworld(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Weird problems with C extension and bytea as input type
Date: 2011-03-23 10:06:50
Message-ID: AANLkTinPo7d1PGDsivpH6yiz=NUHT-zyAfEvre_h=2Q7@mail.gmail.com
Lists: pgsql-general

On Tue, Mar 22, 2011 at 22:21, David W Noon <dwnoon(at)ntlworld(dot)com> wrote:
> On Tue, 22 Mar 2011 16:14:47 -0500, Merlin Moncure wrote about Re:
> [GENERAL] Weird problems with C extension and bytea as input type:
>
> [snip]
>>>> On Tue, Mar 22, 2011 at 8:22 AM, Adrian Schreyer <ams214(at)cam(dot)ac(dot)uk>
>>>> wrote:
> [snip]
>>>>> bytea *b = PG_GETARG_BYTEA_P(0);
>>>>> char *ism;
>>>>>
>>>>> ism = function(b);
>>>>>
>>>>> PG_RETURN_CSTRING(ism);
>
> What is the prototype for function()?  If it returns a char * then you
> will likely have either scope problems, reentrancy problems or memory
> leaks. If you are going to buy the C++ religion then you usually need to
> buy it wholesale: do all of your string processing as std::string
> objects and only return to char * when you revert to C.

You are right, it returns a char *.

The prototype:

char *function(bytea *b);

The actual C++ function looks roughly like this:

extern "C"
char *function(bytea *b)
{
    string ism;
    [...]
    return ism.c_str();
}

The PostgreSQL wrapper in C looks like this:

PG_FUNCTION_INFO_V1(bin_to_string);
Datum bin_to_string(PG_FUNCTION_ARGS)
{
    bytea *b = PG_GETARG_BYTEA_P(0);

    char *ism = function(b);

    PG_RETURN_CSTRING(ism);
}

I have another function in C++ that parses the binary string (file)
into an object that is then further processed. This works for all
functions returning boolean or numeric values; only the string methods
produce these odd results. So, as you said, the way in which strings
are passed between C++ and C in my code must be horribly wrong. What
would be the correct way?
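
One likely culprit in the code above: ism.c_str() points into a local
std::string that is destroyed when function() returns, so the wrapper
ends up reading freed memory. Below is a minimal sketch of one way
around that, not a definitive implementation. The names describe() and
function_copy() are hypothetical, the processing step is a placeholder,
and std::malloc() stands in for palloc() so that the snippet compiles
outside a PostgreSQL build.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the real processing step: it only reports the
// input length, which is enough to show that embedded NUL bytes survive.
static std::string describe(const std::string &bytes)
{
    return std::to_string(bytes.size()) + " bytes";
}

// C-callable entry point (the name is an assumption for this sketch).
// In the wrapper, data/len would come from VARDATA(b) and
// VARSIZE(b) - VARHDRSZ. The caller owns the returned buffer; a real
// extension would allocate it with palloc() rather than malloc() so that
// PostgreSQL's memory context releases it.
extern "C" char *function_copy(const char *data, std::size_t len)
{
    try {
        // The (pointer, length) constructor copies exactly len bytes and
        // does not stop at the first NUL.
        const std::string input(data, len);
        const std::string ism = describe(input);

        // Copy the result, terminator included, before ism is destroyed.
        char *out = static_cast<char *>(std::malloc(ism.size() + 1));
        if (out == nullptr)
            return nullptr;
        std::memcpy(out, ism.c_str(), ism.size() + 1);
        return out;
    } catch (...) {
        // Do not let a C++ exception propagate into C code.
        return nullptr;
    }
}
```

With something along these lines, the wrapper would pass the returned
pointer to PG_RETURN_CSTRING(), or convert it with cstring_to_text() if
the SQL function is declared to return text.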

> As a rough example:
>
>  bytea *b = PG_GETARG_BYTEA_P(0);
>  std::string ism;
>
>  ism = function(std::string(VARDATA(b), VARSIZE(b)-VARHDRSZ));
>
>  PG_RETURN_CSTRING(ism.c_str());
>
> Note that this returns an ASCIIZ string, which is not necessarily the
> same as the C++ string.  You would be better off creating a
> PostgreSQL text object and then return that.
>
>>Well, C++ string constructor is proper in the sense it makes copy of
>>the source data.  however, it's a little weird that you are passing
>>bytea like this...bytea can contain null and c++ string initialization
>>stops at any 0 byte.
>
> Not so.  If the constructor also specifies a length then the data
> pointer's area is not assumed to be NUL-terminated.
>
>>Maybe you should be encoding the data to text (say, to hex) first?
>
> Better to use the supplied length in the varlena descriptor.
> --
> Regards,
>
> Dave  [RLU #314465]
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
> dwnoon(at)ntlworld(dot)com (David W Noon)
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>
