=head1 NAME

Proc::FastSpawn - fork+exec, or spawn, a subprocess as quickly as possible

=head1 SYNOPSIS

   use Proc::FastSpawn;

   # simple use
   my $pid = spawn "/bin/echo", ["echo", "hello, world"];
   ...
   waitpid $pid, 0;

   # with environment
   my $pid = spawn "/bin/echo", ["echo", "hello, world"], ["PATH=/bin", "HOME=/tmp"];

   # inheriting file descriptors
   pipe R, W or die;
   fd_inherit fileno W;
   my $pid = spawn "/bin/sh", ["sh", "-c", "echo a pipe >&" . fileno W];
   close W;
   print <R>;
=head1 DESCRIPTION

The purpose of this small (in scope and footprint) module is simple:
spawn a subprocess asynchronously as efficiently and/or fast as
possible. Basically the same as calling fork+exec (on POSIX), but
hopefully faster than those two syscalls.

Apart from avoiding fork overhead, this module also allows you to fork+exec
programs when otherwise you couldn't - for example, when you use POSIX
threads in your perl process then it generally isn't safe to call
fork from perl, but it is safe to use this module to execute external
processes.

If neither of these is a problem for you, you can safely ignore this
module.

So when is fork+exec not fast enough, how can you do it faster, and why
would it matter?
Forking a process requires making a complete copy of a process. Even
though almost every implementation only copies page tables and not the
memory itself, this is still not free. For example, on my 3.6GHz amd64
box, I can fork a 5GB process only twenty times a second. For a real-time
process that must meet stricter deadlines, this is too slow. For a busy
and big web server, starting CGI scripts might mean unacceptable overhead.

A workaround is to use C<vfork> - this function isn't very portable, but
it avoids the memory copy that C<fork> has to do. Some systems have an
optimised implementation of C<spawn>, and some systems have nothing.

This module tries to abstract these differences away.

As for what improvements to expect - on the 3.6GHz amd64 box that this
module was originally developed on, a 3MB perl process (basically just
perl + Proc::FastSpawn) takes 3.6s to run /bin/true 10000 times using
fork+exec, and only 2.6s when using vfork+exec. In a 22MB process, the
difference is already 5.0s vs 2.6s, and so on.
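
A comparison like the one above can be reproduced with a small timing
loop. The following is only a sketch, not the benchmark that produced the
numbers quoted here: the helper names C<via_fork_exec> and C<time_runs>,
the run count and the search for a C<true> binary are all assumptions made
for illustration, and the C<spawn> half only runs when Proc::FastSpawn is
installed.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(time);

# Assumption: a "true" binary exists in one of these two places.
my ($prog) = grep { -x } qw(/bin/true /usr/bin/true);
defined $prog or die "no 'true' binary found\n";

# Hypothetical helper: launch $prog the classic way, via fork+exec.
sub via_fork_exec {
   my $pid = fork;
   defined $pid or die "fork failed: $!";

   unless ($pid) {
      # child: replace ourselves, exit with 127 if exec fails
      exec { $prog } "true" or POSIX::_exit (127);
   }

   $pid
}

# Hypothetical helper: run a launcher $count times, reaping each child,
# and return the elapsed wall-clock time in seconds.
sub time_runs {
   my ($count, $launch) = @_;

   my $start = time;

   for (1 .. $count) {
      my $pid = $launch->();
      defined $pid or die "launching $prog failed";
      waitpid $pid, 0;
   }

   time - $start
}

my $count = 200; # arbitrary, raise it for more stable numbers

printf "fork+exec: %.3fs\n", time_runs ($count, \&via_fork_exec);

if (eval { require Proc::FastSpawn; 1 }) {
   printf "spawn:     %.3fs\n",
      time_runs ($count, sub { Proc::FastSpawn::spawn ($prog, ["true"]) });
}
```

The gap between the two should grow with the size of the parent process,
so allocating a large string before the loop is an easy way to see the
effect. Absolute numbers will differ from machine to machine.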
=head1 FUNCTIONS

All the following functions are currently exported by default.

=over 4

=cut

package Proc::FastSpawn;
# only used on WIN32 - maddeningly complex and doesn't even work
sub _quote {
   $_[0] = [@{ $_[0] }]; # make copy

   for (@{ $_[0] }) {
      if (/[\x01-\x20"]/) { # some sources say only space, "\t\n\v need to be escaped, microsoft says space and tab
         s/(\\*)"/$1$1\\"/g; # double + extra escape before "
         s/(\\+)$/$1$1/; # just double at end
         $_ = '"' . $_ . '"';
      }
   }
}

BEGIN {
   $VERSION = '1.2';

   our @ISA = qw(Exporter);
   our @EXPORT = qw(spawn spawnp fd_inherit);
   require Exporter;

   require XSLoader;
   XSLoader::load (__PACKAGE__, $VERSION);
}

=item $pid = spawn $path, \@argv[, \@envp]

Creates a new process and tries to make it execute C<$path>, with the given
arguments and optionally the given environment variables, similar to
calling fork + execv, or execve.

Returns the PID of the new process if successful. On any error, C<undef>
is currently returned. A failure to execute C<$path> might be reported
either as C<undef> or via a subprocess exit status of C<127>.
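
Because an exec failure can surface in either of these two ways, callers
that care need to check both. The following is a minimal sketch of that
check. C<describe_status> is a hypothetical helper, not part of this
module, and the path F</nonexistent/program> is made up to provoke a
failure. The C<spawn> call only runs when Proc::FastSpawn is installed.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (as found in $?) into text.
sub describe_status {
   my ($status) = @_;

   # the low 7 bits hold the signal number, if the child was killed
   return "killed by signal " . ($status & 127)
      if $status & 127;

   my $code = $status >> 8;

   return "exec probably failed (exit status 127)"
      if $code == 127;

   "exited with status $code"
}

if (eval { require Proc::FastSpawn; 1 }) {
   my $pid = Proc::FastSpawn::spawn ("/nonexistent/program", ["program"]);

   if (defined $pid) {
      # a process was created, the exec itself might still have failed
      waitpid $pid, 0;
      print describe_status ($?), "\n";
   } else {
      print "spawn returned undef\n";
   }
}
```

Note that a program can also exit with status 127 on its own, so that
status is a hint rather than proof of an exec failure.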

=item $pid = spawnp $file, \@argv[, \@envp]

Like C<spawn>, but searches for C<$file> in C<$ENV{PATH}> like the shell
would do.

=item fd_inherit $fileno[, $on]

File descriptors can be inherited by the spawned processes or not. This is
decided on a per file descriptor basis. This module does nothing to any
preexisting handles, but with this call, you can change the state of a
single file descriptor to either be inherited (C<$on> is true or missing)
or not (C<$on> is false).

Free portability pro-tip: it seems native win32 perls ignore C<$^F> and set
all file handles to be inherited by default - but this function can switch
it off.

=back
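
On POSIX systems, whether a file descriptor survives an exec is governed
by its close-on-exec flag (C<FD_CLOEXEC>). The following pure-Perl sketch
toggles that flag with C<fcntl>, which is roughly what I would expect
C<fd_inherit> to amount to there. That is an assumption for illustration,
not a description of the XS code. C<set_inherit> is a hypothetical helper,
and unlike C<fd_inherit> it takes a file handle rather than a file number.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# Hypothetical helper: mark $fh as inherited across exec ($on true or
# missing) or as close-on-exec ($on false).
sub set_inherit {
   my $fh = shift;
   my $on = @_ ? shift : 1;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "F_GETFD failed: $!";
   $flags += 0; # fcntl returns "0 but true" for zero

   $flags = $on ? $flags & ~FD_CLOEXEC
                : $flags |  FD_CLOEXEC;

   fcntl $fh, F_SETFD, $flags
      or die "F_SETFD failed: $!";
}

# Hypothetical helper: true if $fh would be inherited across exec.
sub is_inherited {
   my ($fh) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "F_GETFD failed: $!";

   !($flags & FD_CLOEXEC)
}

pipe my $r, my $w or die "pipe failed: $!";

set_inherit ($w);
print "write end inherited: ", (is_inherited ($w) ? "yes" : "no"), "\n";

set_inherit ($w, 0);
print "write end inherited: ", (is_inherited ($w) ? "yes" : "no"), "\n";
```

Perl itself sets close-on-exec on newly opened descriptors above C<$^F>
(2 by default), which is why the pipe in the SYNOPSIS needs an explicit
C<fd_inherit> call before it is visible to the child.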

=head1 PORTABILITY NOTES

On POSIX systems, this module currently calls vfork+exec, spawn, or
fork+exec, depending on the platform. If your platform has a good vfork or
spawn but is misdetected and falls back to slow fork+exec, drop me a note.

On win32, the C<_spawn> family of functions is used, and the module tries
hard to patch the new process into perl's internal pid table, so the pid
returned should work with other Perl functions such as waitpid. Also,
win32 doesn't have a meaningful way to quote arguments containing
"special" characters, so this module tries its best to quote those
strings itself. Other typical platform limitations (such as being able to
only have 64 or so subprocesses) are not worked around.
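
To give an idea of what that quoting does, here is a standalone sketch
that applies the same two substitutions as the internal C<_quote>
function to a single string. C<quote_arg> is a hypothetical name used
only for this example, and unlike C<_quote> it returns a new string
instead of replacing the array reference passed to it.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one argument for a win32 command line,
# using the same rules as _quote.
sub quote_arg {
   my ($arg) = @_;

   if ($arg =~ /[\x01-\x20"]/) {
      $arg =~ s/(\\*)"/$1$1\\"/g; # double backslashes before ", then escape the "
      $arg =~ s/(\\+)$/$1$1/;     # double trailing backslashes
      $arg = '"' . $arg . '"';
   }

   $arg
}

print quote_arg ($_), "\n"
   for 'hello',                 # hello
       'hello, world',          # "hello, world"
       'say "hi"',              # "say \"hi\""
       'C:\dir with space\\';   # "C:\dir with space\\"
```

Arguments without whitespace, control characters or double quotes are
passed through unchanged, which also means that an empty string is not
quoted by these rules.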

=head1 AUTHOR

 Marc Lehmann <schmorp@schmorp.de>
 http://home.schmorp.de/

=cut

1