PerlMonks  

Re^8: Inline::C and NULL pointers

by syphilis (Archbishop)
on Dec 21, 2021 at 01:27 UTC ( [id://11139782]=note )


in reply to Re^7: Inline::C and NULL pointers
in thread Inline::C and NULL pointers

If we agree that NULL pointers are a common feature of the C libraries, then I would expect transparent support for those by default.

That "transparent support" can only be enacted by altering ExtUtils/typemap. The way to address this is to file an Issue with perl5. (Thanks etj for supplying the appropriate link.)
I certainly agree that NULL pointers are a common feature of C libraries ... I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

it seems the original author was drunk when he wrote the logic behind the whole "local typemap" option :P

It seemed that way partly because of the way that things unfolded as I muddled my way through it. (You'd be excused for thinking that *I* was drunk. I wasn't, but I felt like I was by the end of it.)
Based on what I know at the moment, I think the succinct way to describe the bug is:

"User-supplied typemaps other than ./typemap and ../typemap cannot override the type settings specified in ExtUtils/typemap"

I believe this is an ExtUtils::ParseXS bug that has been around for a long time - I see the same behaviour on perl-5.8.8.
I would think it's very rare that a user wants to override an ExtUtils/typemap setting, and even rarer that a user would try to do that in a typemap other than ./typemap or ../typemap. So I'm not at all surprised that it has taken a long time to surface.

I'll file a bug report, and update this thread with a link to it. Might take a day or two. I'll spend some time trying to work out how to fix it first.
It might simply be that these problematic user typemaps are being prepended (instead of appended) to the list of typemaps. That would be consistent with what we're seeing, though the output I see with verbose (BUILD_NOISY) builds of the Inline::C scripts suggests that they are being appended.

UPDATE: In ExtUtils::ParseXS::Utilities::process_typemaps() we find the following line of code:
    push @tm, standard_typemap_locations( \@INC );
Prior to this push() any typemap specified by the user that is not in one of the standard typemap locations is already included in @tm.
And this push() ensures that ExtUtils/typemap can be found *after* (and will override any conflicts with) that user-specified typemap.
Changing the push() to an unshift() fixes that problem, enabling the user-specified typemap to take precedence - which is exactly what we want.
However, I just need to determine what would be broken by that change. Another option is to stay with the push() and simply remove ExtUtils/typemap from the list returned by standard_typemap_locations(\@INC) - which I think might be a better approach if I can ascertain that ExtUtils/typemap is guaranteed to already be in @tm.
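The effect of push() versus unshift() can be seen in a small stand-alone model. This is only a sketch in plain C, not ParseXS code: it assumes, as described above, that typemap files are read in list order and that a later entry for a type replaces an earlier one. The names tm_entry and resolve_xs_type are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "C type -> XS type" line from some typemap file (hypothetical model). */
struct tm_entry {
    const char *ctype;   /* e.g. "unsigned char *" */
    const char *xstype;  /* e.g. "T_PV" */
};

/*
 * Walk the entries in the order the typemap files were listed.
 * A later match replaces an earlier one, so whichever file comes
 * last in the list wins any conflict.
 */
const char *resolve_xs_type(const struct tm_entry *entries, size_t n,
                            const char *ctype)
{
    const char *found = NULL;

    assert(ctype != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(entries[i].ctype, ctype) == 0)
            found = entries[i].xstype;
    }
    return found;  /* NULL if no file mentions the type */
}
```

With the user's entry listed first and the ExtUtils/typemap entry listed last (the push() order), the lookup returns the ExtUtils/typemap setting. With the order reversed (the unshift() order), the user's setting is the one returned.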

Cheers,
Rob

Replies are listed 'Best First'.
Re^9: Inline::C and NULL pointers
by syphilis (Archbishop) on Dec 23, 2021 at 04:04 UTC
Re^9: Inline::C and NULL pointers
by Marshall (Canon) on Dec 21, 2021 at 06:18 UTC
    Rob, I am glad that other Monks jumped in on this.

    I agree with this: I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

    As I mentioned in previous posts, the OP is confronted with some poorly written C code that he can't change. This NULL pointer idea arises from an I/F that says: "if you give a pointer to memory, I will use that memory for output, assuming without question that you have given me enough memory for my yet to be generated output. If you don't give me such a pointer, I will give you a pointer to my non-thread-safe non-recursion-safe static memory." This is a bullshit I/F. But the OP can't change that.

    NULL as the 2nd param is a weird situation. If there could be one arg and an optional second arg, then normally this would be implemented with a variable number of args - you don't put NULL for that second arg - it is simply not there at all for the caller. This of course requires different C code than what the OP is dealing with. printf() for example uses a variable number of arguments.
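For comparison, here is a rough sketch of what a variadic version could look like. It is hypothetical throughout (make_label_v and its "name=" output are invented). In C the callee cannot tell by itself whether the optional argument was passed, so the sketch assumes an explicit count parameter.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * `nopt` is the number of optional arguments that follow (0 or 1).
 * C gives the callee no other way to find this out, which is why
 * printf() relies on its format string.
 */
char *make_label_v(const char *name, int nopt, ...)
{
    static char internal[64];
    char *out = NULL;
    va_list ap;

    assert(name != NULL);
    assert(nopt == 0 || nopt == 1);

    va_start(ap, nopt);
    if (nopt == 1)
        out = va_arg(ap, char *);        /* caller-supplied buffer */
    va_end(ap);

    if (out != NULL) {
        sprintf(out, "name=%s", name);   /* caller's size is still unknown */
        return out;
    }
    snprintf(internal, sizeof internal, "name=%s", name);
    return internal;
}
```

A call such as make_label_v("a", 0) omits the buffer entirely, while make_label_v("a", 1, buf) supplies it. The caller still has to state which case applies, so the count argument plays much the same role that NULL plays in the original interface.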

    I did have fun with Inline::C and found it to be "easy to use" for all the "heavy lifting" that it does.

      In general C programming, one doesn't have optional arguments. And generally, being able to give a pointer value that is clearly not intended as valid (a NULL) is a valuable thing. That maps very nicely for a Perl interface, to turning an undef input (or, conceivably, no input at all) into a NULL in C terms.

      tl;dr: C and Perl have different idioms, and while XS should probably take an SV* and treat it idiomatically, actual C has different needs.

        Yes, you are correct in that in general, C programming doesn't use optional arguments! I know how to do it, but I can't remember any production code that I've written that uses that idea - a much more normal way is to explicitly pass a single pointer to an array of variable length.

        Again, what is bizarre about this I/F is that it returns a pointer. But sometimes that pointer is to internal static memory and sometimes that pointer is to the memory that you just gave the I/F! In the first case, you had better use that info right away or make your own copy of it for later use - this idea is also non-thread-safe and non-reentrant. To fix that problem, I would dynamically allocate memory for the result and return a pointer to that. If we go with option 2, then the user is responsible for allocating and passing a pointer to enough memory for the result. The problem here is that the sub doesn't know how big of a buffer you have given it. And in general the guy who produces the result is in a much better position to estimate the size of the result than the caller. It is unusual to say the least for option 1 and option 2 to both be available depending upon the calling parms.
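The two alternatives can be sketched as separate functions, each with a single, unambiguous ownership rule. This is only an illustration under assumptions: the names make_label_alloc and make_label_buf are invented, and passing the buffer size explicitly (in the style of snprintf()) is one possible remedy for the unknown-size problem, not something the interface under discussion offers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: the function allocates the result; the caller must free() it. */
char *make_label_alloc(const char *name)
{
    assert(name != NULL);
    size_t len = strlen("name=") + strlen(name) + 1;
    char *result = malloc(len);
    if (result != NULL)
        snprintf(result, len, "name=%s", name);
    return result;   /* NULL only if allocation failed */
}

/*
 * Option 2: the caller supplies the buffer *and* its size.
 * Returns the length the full result needs (excluding the '\0'),
 * so the caller can detect truncation, as with snprintf().
 */
int make_label_buf(const char *name, char *out, size_t outsize)
{
    assert(name != NULL);
    return snprintf(out, outsize, "name=%s", name);
}
```

In the first form the producer sizes the result itself. In the second the output is never written past outsize bytes, and a return value greater than or equal to outsize tells the caller that the result was truncated.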

        Back to Perl. When was the last time you saw an I/F like this? If I pass the sub a reference for the output, say \$output, I would expect that the result goes into my $output and the sub perhaps returns some success or failure code. More likely is that the sub returns an array, a scalar or a reference to an array and perhaps undef on failure. I can't think of a Perl I/F where I am required to pass a reference to an output structure in all cases and undef when I don't want the sub to use that reference for output when instead I want to look for the output as the return value. There is usually one way to give the input and one way to get the output. Subs that transform data via a reference to an input structure are quite common.

        Back to C. Memory allocation in C is a major issue and the source of many, many bugs. Ambiguity about who is allocating memory and who is responsible for it after allocation is "a big deal".

Re^9: Inline::C and NULL pointers
by markong (Pilgrim) on Dec 23, 2021 at 15:00 UTC
    I certainly agree that NULL pointers are a common feature of C libraries ... I'm not convinced that the need to pass NULL pointers from perl to C functions is all that common.

    There are many discussions about what the right approach is, and from what I have read there seems to be no established consensus. But!
    Considering that the purpose of the tool is to glue to a different language's interface (i.e. a collection of functions), any source type you map into C/C++ will be used in the majority of cases as a function argument.

    Consider also that passing NULL as an argument is not simply poor C programming practice: often you need a "special" value to signal a condition, and sometimes NULL is even required(!), e.g. strtok(3)/snprintf(3)/gettimeofday(2).
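The three functions named above can be shown with NULL in its usual role. This is a small sketch with invented wrapper names (count_words, formatted_length, seconds_now), and gettimeofday() assumes a POSIX system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>   /* POSIX, for gettimeofday() */

/* Count space-separated words; strtok() needs NULL to continue a scan. */
int count_words(char *text)
{
    int n = 0;

    assert(text != NULL);
    for (char *tok = strtok(text, " "); tok != NULL; tok = strtok(NULL, " "))
        n++;
    return n;
}

/* snprintf(NULL, 0, ...) writes nothing and returns the length needed. */
int formatted_length(int value)
{
    return snprintf(NULL, 0, "value=%d", value);
}

/* The timezone argument of gettimeofday() is normally passed as NULL. */
long seconds_now(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return (long)tv.tv_sec;
}
```

Only the strtok() case strictly requires NULL (there is no other way to ask for the next token). For snprintf() and gettimeofday() NULL is the conventional way of saying "I don't want this output".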

    As SWIG already does, undef ==> NULL feels completely natural, and I personally consider its absence from Inline a bug. I've already solved my problem by using SWIG as usual; maybe some Inline::C users could take from here and consider opening a related enhancement request.

      maybe some Inline::C users could take from here and consider opening a related enhancement request

      This is the bit I don't quite get.
      Precisely what, IYO, should that "enhancement request" be seeking ?
      At the moment, all I've got is that "undef, when passed from perl to unsigned char * C argument, should be NULL". (That problem, at least, is solved.)
      What if we're passing something other than "undef" ? Should that be T_PV or T_PTR ? ... or something else ?

      What happens when the C function unsigned char * foo() returns a NULL to perl ? Should that come back as undef ?
      With T_PV it returns undef; with T_PTR it returns the IV zero. But this behaviour can be manipulated to whatever we want in ExtUtils/typemap (or user-provided typemap).

      Just give me a clear spec, I'll write a patch to ExtUtils/typemap that enacts that spec, and, if it passes review here, I'll see if I can get perl porters to accept it.
      That's where the change would best be made.

      If the proposed change is unacceptable to them, then we can look at making the change in Inline::C by use of a customized typemap. (No guarantees that it will be accepted there, either ... we'll just have to wait and see.)
      But I first need to see a clear spec of the requirement, telling me exactly what needs to be changed.

      Cheers,
      Rob

      UPDATE: If the only thing we want to do is to ensure that "undef" is passed as NULL to a char * (either signed or unsigned) then I think we need to change the "INPUT" setting in ExtUtils/typemap for T_PV from:
      $var = ($type)SvPV_nolen($arg)
      to
      if (SvOK($arg)) $var = ($type)SvPV_nolen($arg); else $var = INT2PTR($type,SvIV($arg))
      We can also achieve the same effect by creating a file named "typemap" that contains:
      INPUT
      T_PV
          if (SvOK($arg)) $var = ($type)SvPV_nolen($arg); else $var = INT2PTR($type,SvIV($arg))
      That file needs to be placed in a location where it will automatically be recognized as a typemap.

      For example, place that typemap file in the same directory as this little Inline::C script:

      use strict;
      use warnings;
      use Devel::Peek;

      use Inline C => Config =>
        BUILD_NOISY => 1,       # verbose build
        FORCE_BUILD => 1,       # re-build whenever the script is run
        CLEAN_AFTER_BUILD => 0, # don't clean up the build directory
      ;

      use Inline C =><<'EOC';

      unsigned char * foo(unsigned char *name) {
        if(name) printf("name is: %s\n", name);
        else printf("NULL input (undef) was detected\n");
        return(name);
      }

      EOC

      my $x = foo(undef);
      Dump($x);

      my $y = foo("hello world");
      Dump($y);

      my $z = foo('');
      Dump $z;

      Then cd to that directory, run the script and tell me if it's doing something undesirable.

      Of course unsigned char* is not the only thing that maps to T_PV - char*, const char*, caddr_t, wchar_t* and Time_t* all map to T_PV, and will therefore all be affected by that typemap.
      But, if need be, we could always create a new and distinct type for those that need to use this revised setting.

      To run that script with the settings specified by ExtUtils/typemap, just rename "typemap" to something else, and it will be ignored.
