July 01, 2005
Wrapping Subroutinesbrian d foy
brian uses an unobtrusive, low-impact technique for tracing code execution.
brian has been a Perl user since 1994. He is founder of the first Perl Users Group, NY.pm, and Perl Mongers, the Perl advocacy organization. He has been teaching Perl through Stonehenge Consulting for the past five years, and has been a featured speaker at The Perl Conference, Perl University, YAPC, COMDEX, and Builder.com. Contact brian at comdog@panix.com.
I was having Sunday brunch with a colleague a couple of weeks ago, and we started talking about tracing the execution of a program. I wanted to figure out which subroutines the program was calling and which subroutines those subroutines were calling, and so on. Most of you have probably been given the Big Ball of Mud that the previous developer left behind right after he burned his hand-written documentation and right before he walked out the door with all of his personal possessions (and maybe a Springline stapler) never to be seen again. Where do you start?
My first requirement is that I don't want to modify the code to see what's going on. Truthfully, I don't want to ever modify the code and I wish I had never seen it, but back here on planet Earth a lot of people have had this same problem and you can find a lot of tools to help you tear apart a program while leaving its source intact. Look in the Devel::* namespace. Some people have even used things like GraphViz to make pictures out of the results.
After using some of those tools, I figure out which subroutines I need to pay attention to. Instead of working over the program flow with a sledge hammer, I want a scalpel. Consider this little program which makes a subroutine call (and pretend its in the middle of a Big Ball of Mud). I want to watch what happens when I pass those arguments to that subroutine, and I want to see what it returns. I don't want to modify the subroutine though.
#!/usr/bin/perl
cant_touch_this( qw( Fred Barney Betty Wilma ) );
sub cant_touch_this
{
...do stuff...
}
The way that we already know involves a lot of code, and it's a lot of
code that I have to repeat everywhere that I want to see what's going
on (and yes, debuggers do exist, and I could set all sorts of
breakpoints, but I'm going to pretend that doesn't exist since there
is some other cool stuff coming up). I have one line to store the
arguments ahead of the subroutine call so I can print them and then
pass them to the subroutine, and then I have to save the result of the
subroutine so I can print that afterwards. I could make this a bit
shorter with some Perl golf, but things are already getting to look
ugly.
my @args = qw( Fred Barney Betty Wilma ); print "Args are [@args]\n"; my $result = cant_touch_this( @args ); print "Result was $result\n";One possibility is Damian Conway's Hook::LexWrap. I can define
handlers that execute before and after the actual subroutine call, and
those will work for every call, not just one (more on that coming up).
I load the Hook::LexWrap module and use its wrap() function to tell it which subroutine I want to examine. Now my program is only a bit
longer
#!/usr/bin/perl
use Hook::LexWrap;
wrap 'cant_touch_this',
pre => sub { print "The arguments are [@_]\n" },
post => sub { print "Result was [$_[-1]]\n" };
cant_touch_this( qw( Fred Barney Betty Wilma ) );
sub cant_touch_this
{
return 42;
}
The Hook::LexWrap wrappers get the original argument list in their @_,
but with and additional value at the end. That extra value is the
return value. In the pre-wrapper the return value will be undefined
(we wouldn't have to call the subroutine if we already knew the
value!), and that element could have some value after the original
subroutine runs. In my post-wrapper, I just want to see the result, so
I only look at $_[-1].
However, when I run this, I don't see the a result. Why not? I called
cant_touch_this() in void context, so there is no return value.
I only get the return value in the post wrapper when I would actually save the result. I have to call the routine in either scalar or list context.The arguments are [Fred Barney Betty Wilma ] Result was []
#!/usr/bin/perl
use Hook::LexWrap;
wrap 'cant_touch_this',
pre => sub { print "The arguments are [@_]\n" },
post => sub { print "Result was [$_[-1]]\n" };
my $result = cant_touch_this( qw( Fred Barney Betty Wilma ) );
sub cant_touch_this
{
return 42;
}
Now I see the return value.
There are other modules that do this, such asThe arguments are [Fred Barney Betty Wilma ] Result was [42] Hook::PreAndPost and
Hook::WrapSub, but if Damian is writing a module, there must be some
cool trick to it. In Hook::LexWrap, caller() keeps working.
#!/usr/bin/perl
use Hook::LexWrap;
wrap 'who_called_me',
pre => sub { print "pre caller is @{[caller]}\n" },
post => sub { print "post caller is @{[caller]}\n" };
who_called_me();
sub who_called_me
{
print "caller is @{[caller]}\n"
}
When I run this, I get the same output for all three print statements
because all of the subroutines think they are being called from the
same place: not only the same file and package, but they also think
they come from the same line.
Think about what has to actually happen behind the scenes to make this happen: we have to construct a brand new subroutine that gets the original argument list, calls the pre-subroutine, calls the original subroutine, and then calls the post-subroutine. Not only does this brand new subroutine have to do all that, it has to insert itself into the named slot where the original subroutine lives. Damian uses a lot of typeglob magic to move the original subroutine out of the way (more on this coming up), and more typeglob magic to move the replacement subroutine into its place. Let me give you just a little taste of what is going on in there. Here's the bit frompre caller is main /Users/brian/Desktop/wrap-caller.pl 9 caller is main /Users/brian/Desktop/wrap-caller.pl 9 post caller is main /Users/brian/Desktop/wrap-caller.pl 9 Hook::LexWrap that stores the original subroutine
right before the module replaces it. Simple, right? Grab the
subroutine definition out of the symbol table and store it in
$original. The $typeglob is the first argument to wrap(), which is the
subroutine name. You noticed the lack of strict at the top of the
module, right? For extra credit, which of the three strict checks does
this violate? It's okay to do this stuff if you know why you need to
do it, but don't try this at home.
my $original = ref $typeglob eq 'CODE' && $typeglob
|| *$typeglob{CODE}
|| croak "Can't wrap non-existent subroutine ", $typeglob;
Now that the original subroutine is out of the way, we can make the
new one. I'll spare you the details on the inside. If you have a hour
(or a weekend) free, call up the source of Hook::LexWrap and take a
look between these two braces.
$imposter = sub {
...
};
Once we have the $imposter subroutine, we shove it into the place where
the original subroutine use to live.
*{$typeglob} = $imposter;
It's actually not that much code if you care to peek under the hood.
Still, with all that magic, the replacement subroutine is virtually
invisible to caller(). It's like its not there even though it is.
That's not enough though. As I've done it so far in this article,
every instance of the target subroutine gets the new behavior. Check
out the name of the module: it's got that "Lex" in it, and just like
other things in Perl that start with those letters, I can limit the
scope of this effect. In this example, I have to wrap the subroutine
differently. Previously, I called wrap() in void context (meaning I
did not use the result for anything, so wantarray returns undef) so
Hook::LexWrap made the effect global. If I store the result of wrap()
in a lexical variable, the effect disappears when that lexical
variable goes out of scope. That is, I've effectively unwrapped the
subroutine at the end of the naked block which I used to define the
scope.
#!/usr/bin/perl
use Hook::LexWrap;
{
my $lexical = wrap 'who_called_me',
pre => sub { print "pre caller is @{[caller]}\n" },
post => sub { print "post caller is @{[caller]}\n" };
who_called_me();
}
who_called_me();
sub who_called_me
{
print "caller is @{[caller]}\n"
}
If
TPJ
| |||||||||