Example #5 README

This is an example of how to embed a multithreaded Perl interpreter into an already-multithreaded C++ program.
The key trick demonstrated is how to create valid Perl threads yourself, without resorting to Thread::new().

This example also shows how to include a statically linked extension in our proggie and bootstrap it automatically from within C space. That way, Perl-space user scripts don't have to explicitly `use' the extension, which they would otherwise always have to do when run in our custom embedded interpreter.

The C++ program with embedded Perl is statically linked with the custom extension Emb_lib, with the Thread module and also with the DynaLoader module, to provide for autoloading of any other modules.
The following measures need to be taken in order to properly link-in and bootstrap the three aforementioned modules:

1. Add the respective .o files to the linking run of gcc (one file per module).
2. Do the xsinit stuff:
   - auto-generate xsinit.c based on the three modules
   - pass extern xsinit() to perl_parse()
   - add xsinit.o to the linking run of gcc
3. Bootstrap the modules (the bootstrap function of each module is registered by xsinit.o):
   - simple custom extensions can be bootstrapped directly using the bootstrap keyword in Perl space
   - most out-of-the-box modules are quite complex and should be bootstrapped using the `use' or `require' keywords in Perl space
Both methods can be performed from within C++ by evaluating a snippet of Perl code, e.g. using perl_eval_pv() or perl_eval_sv().

This example also uses a "private" build of Perl, rather than the system-wide build. Said "private" build resides in a dedicated subdirectory of the current directory of this README.
I developed this scenario when I needed to build a custom Perl with support for threads (5.005-style posixish stuff).

More precisely, to save space, you have to provide the private Perl build yourself. The Makefile in this directory expects a subdirectory called


  perl5.005_03

containing a built Perl, which can be produced like this:


  tar xvzf perl5.005_03.tgz
  cd perl5.005_03
  ./Configure -Dusethreads -des
  make

(-d = use defaults, -e = proceed past the production of config.sh, -s = silent). This way, Configure will use defaults and will not bother you with progress messages.

Note that patch_xs_makemaker now does some heavy modifications to the resulting $(Ext_Dir)/Makefile to force make to use the custom Perl build rather than the system-wide install.

The emb_main.cc in this example contains a lot of code cannibalized out of Thread.c. For easy comparison with Thread.c, there are a myriad of lines containing the original code, commented out with `//-' which means that this stuff can safely be cut out. Large sections of code in irrelevant #ifdefs have been cut out completely.

The relevant contents of Thread.c can be described in the following way:

static THREAD_RET_TYPE threadstart(void *arg)
{
   // Running in the child thread - the thread body itself.
   // This is what Perl passes to pthread_create() as the thread code function.

   does some initialization (perlembed-style macros) - namely, POPs a reference to the
         Perl-space sub to run as Perl-space thread body
   calls the sub indicated to `new Thread' as the thing to run (using perl_call_sv())
         so that the sub just run finds the array of arguments as the topmost
         item on the Perl stack and POPs it
   does some cleanup (perlembed-style macros) - pops results off the stack and
         transforms them into a new AV*, ready to be returned upon
         raw pthread_join() to whoever waits for the thread to join.

   // Essentially this seems to be a wrapper between raw pthread_create() and
   // the perl-space sub, that should run as a separate thread.
   // This is analogous to the C style of Posix Threads - only some wrapping is
   // apparently necessary to make it work `seamlessly' the same way in Perl.
   //
   // When control is returned by the Perl-space sub to this function, the thread
   // is about to end. Hopefully someone is waiting for it to join(), so that it
   // doesn't become a zombie.
}

static SV *newthread (SV *startsv, AV *initargs, char *classname)
{
   // Running in the parent thread - the thread that acts to split off a child.

   does some initialization - creates some crude C structs to hold per-thread
         bootstrap data, XPUSHes on stack a reference to the array of arguments
         and XPUSHes on stack the reference to the Perl-space sub to run as
         Perl-space thread body.
   calls pthread_create() and passes threadstart() as an argument.
   does a basic error check on the return value of pthread_create() and finishes
         some stuff on behalf of the child thread.

   // Obviously, when this function finishes, the thread just spawned keeps
   // running - thus, this function is not the one to collect the return
   // values of the Perl-space sub comprising the thread body. To collect data
   // from a finished child thread, the parent thread has to join() the
   // child.
}

// Please note how the parent thread locks the thread-specific bootstrap
// struct (using MUTEX_LOCK on a mutex within the struct), then sets the
// child airborne using pthread_create(), and if no problem is signaled,
// it performs a few final polishing strokes on the child's bootstrap
// struct - while the child pauses for a while, waiting to lock the mutex,
// until the master thread is done with its creator business and releases
// the lock (using MUTEX_UNLOCK). After that, the child goes on with its
// threadstart() business, rushing to finally launch the Perl-space sub -
// while newthread() returns within the parent thread, purring with joy
// looking back at the good job it has done.
//
// If you look for traces of the typical perlembed sequence of macros,
// you'll find that such a sequence is indeed there (the "sub to run"
// parameter being somewhat of a deviation) - it's just that the input
// parameters are XPUSH'ed within the master thread, the sub is run
// within the child thread and the return data is collected in the child
// thread as well - or, more precisely, the array of return values
// can be returned by join() called within the master thread...

static void remove_thread(struct perl_thread *t)
{
   destroys the thread-specific bootstrap struct.

   // Detached threads call this themselves just before they die,
   // from within threadstart() - after the called Perl-space sub
   // returns control.
   // With joinable threads, this cleanup is done by the thread that
   // wait()s for the children to join().
}
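
The locking handshake between newthread() and threadstart() noted above can be tried out in isolation. What follows is only a minimal sketch using raw pthreads, not code taken from Thread.c: BootstrapData, child_start() and spawn_with_handshake() are made-up names, and the int fields merely stand in for whatever the real per-thread bootstrap struct holds. The parent locks the mutex before pthread_create(), finishes filling in the struct, and only then lets the child proceed.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for the per-thread bootstrap struct
// (this is NOT Perl's struct perl_thread).
struct BootstrapData {
    pthread_mutex_t mutex;
    int thread_id;      // filled in by the parent AFTER pthread_create()
    int seen_by_child;  // what the child observed once it got the lock
};

// Child side, analogous in role to threadstart().
static void *child_start(void *arg) {
    assert(arg != nullptr);
    BootstrapData *boot = static_cast<BootstrapData *>(arg);
    // Blocks here until the parent is done with its post-create fix-ups.
    pthread_mutex_lock(&boot->mutex);
    boot->seen_by_child = boot->thread_id;
    pthread_mutex_unlock(&boot->mutex);
    // ... a real child would now go on to call the Perl-space sub ...
    return nullptr;
}

// Parent side, analogous in role to newthread().
// Returns the id the child saw, or -1 if the thread could not be created.
static int spawn_with_handshake(int id_to_assign) {
    BootstrapData boot;
    pthread_mutex_init(&boot.mutex, nullptr);
    boot.thread_id = -1;
    boot.seen_by_child = -1;

    pthread_mutex_lock(&boot.mutex);    // hold the child back
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, child_start, &boot);
    if (rc != 0) {
        pthread_mutex_unlock(&boot.mutex);
        pthread_mutex_destroy(&boot.mutex);
        return -1;
    }
    boot.thread_id = id_to_assign;      // the final polishing strokes
    pthread_mutex_unlock(&boot.mutex);  // set the child free

    pthread_join(tid, nullptr);         // joinable thread: the parent cleans up
    pthread_mutex_destroy(&boot.mutex);
    return boot.seen_by_child;
}
```

Calling spawn_with_handshake(42) should hand back 42: the child cannot read thread_id before the parent has written it, because it is stuck on the mutex until then.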

If you want to run multi-threaded Perl in threads that you create yourself, all you need to do is dissect the standard Thread::new().

There are two key functions that take care of that - one runs in the parent thread, the other in the child thread.
The parent function calls pthread_create() roughly in its middle; the child function calls the Perl sub passed as the Perl thread function - again, just about in the middle of its length.

All we need to do is split the parent and child lead-in and lead-out into separate functions (or perhaps macros) that can be called easily from your C++ code - while you handle pthread_create() and the Perl sub calls yourself.

Note that if you dissect the lead-in and lead-out into separate functions, you need to pass a couple of variables from the lead-in to the respective lead-out. Not a big problem though.
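
To illustrate the shape of such a split, here is a rough sketch with no Perl in it at all. Every name in it (ChildState, ParentState, parent_lead_in() and so on) is invented for this illustration, and a std::vector<int> only imitates the Perl stack. The point is merely to show which bits of state have to travel from a lead-in to the matching lead-out, while pthread_create() and the "sub call" stay in your own code.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-child state; the vector only imitates the Perl stack.
struct ChildState {
    std::vector<int> stack;    // stand-in for the Perl stack
    std::size_t mark;          // set by the lead-in, needed by the lead-out
    std::vector<int> results;  // stand-in for the AV* of return values
};

// Hypothetical parent-side state carried from lead-in to lead-out.
struct ParentState {
    std::size_t pushed_args;   // how many arguments the lead-in pushed
    bool created;              // filled in by the lead-out
};

// Parent lead-in: prepare the child's state before the thread exists.
static ParentState parent_lead_in(ChildState &c, int arg) {
    c.stack.clear();
    c.results.clear();
    c.mark = 0;
    c.stack.push_back(arg);
    ParentState p;
    p.pushed_args = 1;
    p.created = false;
    return p;
}

// Parent lead-out: check the return value of *your* pthread_create() call.
static void parent_lead_out(ParentState &p, ChildState &c, int create_rc) {
    p.created = (create_rc == 0);
    if (!p.created)
        c.stack.resize(c.stack.size() - p.pushed_args);  // undo the pushes
}

// Child lead-in: pop the argument and remember where results will start.
static int child_lead_in(ChildState &c) {
    assert(!c.stack.empty());
    int arg = c.stack.back();
    c.stack.pop_back();
    c.mark = c.stack.size();
    return arg;
}

// Child lead-out: move everything above the mark into the results.
static void child_lead_out(ChildState &c) {
    c.results.assign(c.stack.begin() + static_cast<std::ptrdiff_t>(c.mark),
                     c.stack.end());
    c.stack.resize(c.mark);
}

// Your own thread function; the two pushes stand in for calling the Perl sub.
static void *my_thread_body(void *p) {
    ChildState *c = static_cast<ChildState *>(p);
    int arg = child_lead_in(*c);
    c->stack.push_back(arg + 1);
    c->stack.push_back(arg + 2);
    child_lead_out(*c);
    return nullptr;
}

// Your own code owns pthread_create() and pthread_join().
static std::vector<int> run_in_own_thread(int arg) {
    ChildState child;
    ParentState parent = parent_lead_in(child, arg);
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, my_thread_body, &child);
    parent_lead_out(parent, child, rc);
    if (!parent.created) return std::vector<int>();
    pthread_join(tid, nullptr);
    return child.results;
}
```

Here the variables handed over are pushed_args on the parent side and mark on the child side; in the real thing they would be whatever the cut-up halves of newthread() and threadstart() share.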

If you include/bootstrap Thread.pm, you can use the out-of-the-box Thread::methods and stuff within such in-vitro child threads :)

I haven't tested the whole thing extensively - it has only started to behave somewhat properly in this state. The example shows a basic concurrency and reentrancy test.
Perhaps I should add some more defensive error checking in the lead-outs, so that there are no zombies and the children die immediately when the parent dies.
Also, some stress tests for multithreaded operation over a shared interpreter instance could be useful. If some operations prove troublesome with concurrent access, additional explicit locking (serialization) could be employed.
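
If such explicit serialization turns out to be necessary, the crudest form would be one global lock around every troublesome operation. The sketch below is an assumption about how that could look, not something this example currently does: g_interp_lock, InterpGuard and run_serialized() are invented names, and a plain counter stands in for the shared interpreter state.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical global lock guarding every call into the shared interpreter.
static pthread_mutex_t g_interp_lock = PTHREAD_MUTEX_INITIALIZER;

// Stand-in for interpreter state that is not safe to touch concurrently.
static long g_shared_state = 0;

// RAII guard: locks on construction, unlocks when it goes out of scope.
class InterpGuard {
public:
    InterpGuard()  { pthread_mutex_lock(&g_interp_lock); }
    ~InterpGuard() { pthread_mutex_unlock(&g_interp_lock); }
    InterpGuard(const InterpGuard &) = delete;
    InterpGuard &operator=(const InterpGuard &) = delete;
};

// Each worker performs its "troublesome" operation one at a time.
static void *serialized_worker(void *arg) {
    const int iterations = *static_cast<int *>(arg);
    for (int i = 0; i < iterations; ++i) {
        InterpGuard guard;   // held only for the duration of one operation
        ++g_shared_state;    // a real program would call into Perl here
    }
    return nullptr;
}

// Runs n_threads workers and returns the final value of the shared state.
static long run_serialized(int n_threads, int iterations) {
    assert(n_threads >= 0 && iterations >= 0);
    {
        InterpGuard guard;
        g_shared_state = 0;
    }
    std::vector<pthread_t> tids;
    for (int i = 0; i < n_threads; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, nullptr, serialized_worker, &iterations) == 0)
            tids.push_back(tid);
    }
    for (std::size_t i = 0; i < tids.size(); ++i)
        pthread_join(tids[i], nullptr);
    InterpGuard guard;
    return g_shared_state;
}
```

A lock this coarse costs concurrency, of course, so it would make sense to narrow it down to the operations that actually misbehave once the stress tests have shown which ones those are.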

References & credits:

`man perlembed`
`man perlxstut`
`man perlxs`
`man perlguts`
`man perlthrtut`
"Advanced Perl Programming", "Programming Perl", "Perl in a Nutshell", "Perl Cookbook" - all four from O'Reilly & Associates
the sources of Perl 5.005_03 and Perl 5.6.1, in particular ext/Thread/Thread.c (Damn. 5.8.0 is out just now - gotta take a look.)
"$PERL_SOURCE"/README.threads
`man gcc`
USENET newsgroups - several hints on how to compile Perl using a C++ compiler, what to do about -DBOOL=char, how to avoid the assert redefinition warning etc.