My C is rusty, and my C++ is laughable. (Or is it the other way 'round?) And now, with that minor disclaimer out of the way, I'll tell you about my most recent adventure in Rubyland.
I'm developing a gem which I'm tentatively dubbing "Tomato". This gem has an unfortunate dependency on some C++ code, meaning I have to interface with that C++. Now, I've done a fair amount of coding (most of it experimental) in the world of C extensions for Ruby, and I was hoping to fall back on what I remember of that. It was actually pretty easy, so long as all of your code was in C. C++ has made things a bit more complicated, I'm afraid.
I took a look at Rice. And to its credit, Rice looks like an awesome framework if you're going to do some hardcore C++ interfacing. Unfortunately, it also replaces virtually everything I thought I knew about C extensions (with C++ extensions). I haven't had the time to sit down and learn a new API, particularly using a language I haven't touched in 5 years or so; so I shelved the Rice idea. Instead, I went the C route.
For Extension Developers (All Others, Move Along)
I'll veer off-topic for a moment so that I can dump the contents of my brain regarding C Extensions with C++. If you're not an extension developer, then scroll down! This should probably be a separate article, but I'm feeling lazy.
There are only really a few things you have to know to get your C++ extension to compile using the regular C API for Ruby -- but those few things are notoriously difficult to discover, so I'm posting them here for your convenience. The first of these is knowing which library to link against. If you've done a lot of C++, you've probably figured that part out by now. Your extconf.rb should look like this:
    require 'mkmf'

    $LIBS << " -lstdc++"
    # . . .

    create_makefile("tomato/tomato")
If that's done, the battle's half over. The extension itself needs to be written in C++ (stored in .cpp files and built with the C++ compiler) but, at the same time, the functions Ruby calls must avoid the C++ name mangling that exists to support function overloading. So instead of defining the function thusly:
    void Init_tomato(void)
    { /* . . . */ }
You should use the extern "C" directive, making it look more like:
    extern "C"
    void Init_tomato(void)
    { /* . . . */ }
Keep in mind that the above syntax is only required for methods which you need Ruby to interface with. Methods internal to your extension can be written however you like.
This turns out to be a frequently misunderstood fact in C++ in general: the above directive does not say "this method body will be written in C." What it does, in fact, say, is: "The name of this method must not be modified. The contents are still C++, but the rest of the world should treat it as if written in C."
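To make that concrete, here's a minimal sketch of an extern "C" function whose body is plain C++. The function name `tomato_count_words` is made up for illustration; it isn't part of Tomato or of Ruby's API. Only the exported symbol name is affected by the directive; the body is free to use C++ classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical function for illustration only.
// extern "C" fixes the exported symbol name (no mangling);
// the body below is still compiled as ordinary C++.
extern "C" int tomato_count_words(const char *text) {
    std::string input(text);              // C++ class inside an extern "C" function
    std::vector<std::string> words;
    std::string current;

    for (char c : input) {
        if (c == ' ') {
            // A space ends the current word, if there is one.
            if (!current.empty()) words.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) words.push_back(current);

    return static_cast<int>(words.size());
}
```

A C caller (or Ruby's loader) only ever sees the unmangled name `tomato_count_words`. The price is that such a function can't be overloaded, since overloading is what the mangling was for.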
The last thing you need to change is how you pass function references around, because the C style won't work in C++. Change things like this:
    rb_define_method(cTomato, "method_name", fTomato_method_name, 0);
To this:
    typedef VALUE (ruby_method_vararg)(...);
    /* . . . */
    rb_define_method(cTomato, "method_name", (ruby_method_vararg *)fTomato_method_name, 0);
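Why the cast? In C, an empty parameter list means "arguments unspecified", so one pointer type can hold functions of any arity. In C++ an empty list means "no arguments", so the generic slot becomes `VALUE (*)(...)` and every real function has to be cast into it explicitly. The sketch below imitates that arrangement without Ruby. `VALUE`, `define_method` and `call_method` are stand-ins I made up for illustration, not the real API, and I'm assuming Ruby does something broadly similar internally: store the generic pointer along with the arity, then cast back to the true signature before calling.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for illustration only; the real VALUE comes from ruby.h.
typedef unsigned long VALUE;
typedef VALUE (ruby_method_vararg)(...);

struct MethodEntry {
    ruby_method_vararg *func;  // every method is stored under one generic pointer type
    int arity;                 // the arity remembers what the real signature was
};

static std::map<std::string, MethodEntry> method_table;

// Hypothetical registrar with the same shape as rb_define_method.
static void define_method(const char *name, ruby_method_vararg *func, int arity) {
    method_table[name] = MethodEntry{func, arity};
}

// Cast the pointer back to its true type before calling through it.
static VALUE call_method(const char *name, VALUE self) {
    const MethodEntry &entry = method_table.at(name);
    assert(entry.arity == 0);  // this sketch only handles zero-argument methods
    typedef VALUE (*arity0_fn)(VALUE);
    return reinterpret_cast<arity0_fn>(entry.func)(self);
}

// A method implementation with its real signature: just the receiver.
static VALUE fTomato_method_name(VALUE self) { return self + 1; }

static void register_methods() {
    // Same cast as in the snippet above; without it, C++ rejects the call.
    define_method("method_name", (ruby_method_vararg *)fTomato_method_name, 0);
}
```

Converting a function pointer to another function pointer type and back is well defined; calling through the wrong type is not, which is why the arity travels with the pointer. If memory serves, ruby.h also provides a `RUBY_METHOD_FUNC` macro that performs this kind of cast, which would save you the typedef.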
With those caveats out of the way, you're ready to interface your C code with C++. Not that doing so is safe, however. There are the obvious pitfalls, and it's generally considered bad form to jumble malloc, new, free and delete together all in one application. However, it will work.
The Rest of the Story
There are a few things that you need to be made aware of, though. I'll post them here as I learn them, but one issue that's had me banging my head against the keyboard for the better part of 5 hours now is what happens when you rb_raise an error.
What I didn't realize at the time was that rb_raise actually stops executing your C code. Internally, Ruby uses setjmp before your Ruby method is called, and longjmp in the case of an exception. So when my code was raising a Ruby exception, Ruby was jumping out of the C++ code without a care for what would happen to all of my memory allocation.
But it gets better than that. In the world of C, you'd have to remember to free your malloc'ed objects before calling rb_raise. In the world of C++, if you use rb_raise anywhere you have created an object with a destructor, your program will break.
That's right. longjmp doesn't give a damn about your C++ objects. In my case, I had a bunch of RSpec tests that I was using to drive the gem's development; all of a sudden, the tests themselves would still run to completion -- but I'd get a segmentation fault just before RSpec would have finished. Eventually I tracked it down to, obviously, the call to rb_raise that was bypassing the destruction of many C++ objects.
For a more detailed analysis from people smarter and more experienced than I, look no further than Ruby Talk:
> > Second, ruby's exception handling via setjmp/longjmp effectively means
> > you should never construct a C++ object with a nontrivial destructor
> > on the stack. If ruby longjmps out of your code, your destructors
> > will not be called.
>
> It's worse than that; if ruby longjmps over the destruction of an automatic object, the program has undefined behavior. And if a C++ exception ever leaves C++ code and goes into Ruby code, the result will also be undefined. C99 programmers also have to be careful; longjmping over the destruction of a variable-length array can result in a memory leak.
So there you have it. Don't longjmp around C++. In the end, it's probably safer to use Rice and let those guys worry about what to do when C++ and Ruby collide. But if you're like me, and your business constraints (or personal ambitions, whichever are driving you forward with your evil plans for galactic domination!) force you to use C for the extension and C++ everywhere else, then you can do what I did: Nest the C++ code safely within another function, and wrap your Ruby exceptions in throw and catch. My code now looks a little something like this:
    /* . . . */

    /* We have to do things this way because rb_raise does a longjmp,
       which causes C++ destructors not to fire. */
    try
    {
      /* and it's fTomato_execute that contains the C++ objects whose
         destructors are so dangerous here. */
      VALUE return_value = fTomato_execute(self);
      return return_value;
    }
    catch (std::string const &exception_message)
    {
      rb_raise(cTomatoError, "%s", exception_message.c_str());
    }
It works for me, at least. Hopefully it'll work for someone else out there.
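One possible refinement, offered as a sketch rather than a tested fix: in the snippet above, rb_raise is called while still inside the catch block, so the longjmp leaves while the caught std::string is still alive. That may be the same problem in miniature. A more cautious variant copies the message into a plain char buffer, lets the catch block finish, and only then raises. The code below simulates this without Ruby. `fake_rb_raise`, `call_from_ruby`, `execute`, `wrapper` and `Resource` are all made-up stand-ins; the real rb_raise is assumed to behave like the longjmp shown here.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdio>
#include <cstring>
#include <string>

static std::jmp_buf ruby_jump_point;  // stand-in for Ruby's internal setjmp target
static char last_error[256];          // stand-in for the Ruby exception message
static int destructors_run = 0;

// Hypothetical stand-in for rb_raise: record the message, then longjmp away.
static void fake_rb_raise(const char *message) {
    std::snprintf(last_error, sizeof last_error, "%s", message);
    std::longjmp(ruby_jump_point, 1);
}

// An object with a nontrivial destructor, like the ones that caused the crash.
struct Resource {
    ~Resource() { ++destructors_run; }
};

// All C++ objects live in here. Errors leave via throw, so normal
// stack unwinding runs the destructors.
static int execute(bool fail) {
    Resource guard;
    if (fail) throw std::string("tomato exploded");
    return 42;
}

// Shape of the Ruby-facing wrapper: only trivially destructible locals.
static int wrapper(bool fail) {
    char message[256];
    bool failed = false;
    int result = 0;
    try {
        result = execute(fail);
    } catch (std::string const &exception_message) {
        // Copy the text out; the exception object is destroyed when this block ends.
        std::snprintf(message, sizeof message, "%s", exception_message.c_str());
        failed = true;
    }
    // No C++ objects with destructors are alive here, so jumping is safe.
    if (failed) fake_rb_raise(message);
    return result;
}

// Plays the role of Ruby calling into the extension.
static int call_from_ruby(bool fail) {
    if (setjmp(ruby_jump_point) != 0) return -1;  // we land here after a "raise"
    return wrapper(fail);
}
```

In the failure path the `Resource` destructor still runs, because the error leaves `execute` through ordinary C++ unwinding; the longjmp only happens once nothing with a destructor remains on the stack between the raise and Ruby.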
So yeah, this article was a pretty obscure edge case, but I'd be remiss in my duties if I didn't post it anyways. Hell, maybe this way, I'll remember it even if no one else needs to.


