Handling open file descriptors across fork() and exec*() in a multithreaded application that uses file leases and/or fcntl() locks (record locks) is dicey.
In general, the O_CLOEXEC / fcntl(fd, F_SETFD, FD_CLOEXEC) option is preferable to explicitly closing the descriptors, as explicitly closing a descriptor has some undesirable side effects. In particular, if you have a lease on the descriptor, closing the descriptor in the child process will release the lease.
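As a minimal sketch of the two options named above: open_cloexec() and set_cloexec() are hypothetical helper names, not part of any library. The first sets the flag atomically at open time, which matters in a multithreaded program because another thread could fork() between an open() and a later fcntl(). The second is for descriptors you did not open yourself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Open a file read-only with close-on-exec set atomically at open time.
   Returns the descriptor, or -1 with errno set. */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Set close-on-exec on an already open descriptor, preserving any
   other descriptor flags. Returns 0 on success, -1 with errno set. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```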
Note that in Linux, fcntl() locks are not inherited across a fork(); see the Description section in man 2 fork.
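If you want to check this on your own system, something like the following sketch should do. The function name lock_not_inherited() and the temporary file template are made up for illustration. The parent takes a write lock, and the forked child asks with F_GETLK whether a write lock of its own would conflict. If the child had inherited the lock, there would be no conflict.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a forked child sees the parent's write lock as a
   conflicting lock (that is, the lock was not inherited), 0 if the
   child sees no conflict, and -1 on error. */
int lock_not_inherited(void)
{
    char path[] = "/tmp/lockdemo-XXXXXX";   /* hypothetical template */
    /* l_start = 0 and l_len = 0: lock the whole file. */
    struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd, status;
    pid_t child;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    if (fcntl(fd, F_SETLK, &lk) == -1) {
        close(fd);
        return -1;
    }

    child = fork();
    if (child == (pid_t)-1) {
        close(fd);
        return -1;
    }
    if (child == 0) {
        /* Child: would a write lock of our own conflict? */
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

        if (fcntl(fd, F_GETLK, &probe) == -1)
            _exit(2);
        _exit(probe.l_type == F_UNLCK ? 0 : 1);
    }

    if (waitpid(child, &status, 0) != child || !WIFEXITED(status)) {
        close(fd);
        return -1;
    }
    close(fd);
    status = WEXITSTATUS(status);
    return (status == 2) ? -1 : status;
}
```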
posix_spawn() is implemented in the C library, and the file actions can be managed by posix_spawn_file_actions_init(), posix_spawn_file_actions_addclose(), et cetera; look at the See also list in the man pages. Personally, I would not use this interface, as closing the descriptors in the child process prior to exec*() is at least as simple.
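For comparison, this is roughly what the posix_spawn() route looks like. It is only a sketch: spawn_closing_fd() is a hypothetical wrapper that closes one extra descriptor in the child, and it passes the parent's environment through unchanged. Note that the posix_spawn family returns the error number directly rather than setting errno.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Spawn path with argv, closing extra_fd in the child before the exec.
   Returns 0 and stores the child PID in *pid on success, or an error
   number on failure. */
int spawn_closing_fd(pid_t *pid, const char *path,
                     char *const argv[], int extra_fd)
{
    posix_spawn_file_actions_t actions;
    int err;

    err = posix_spawn_file_actions_init(&actions);
    if (err != 0)
        return err;

    /* Queue a close(extra_fd) to be done in the child. */
    err = posix_spawn_file_actions_addclose(&actions, extra_fd);
    if (err == 0)
        err = posix_spawn(pid, path, &actions, NULL, argv, environ);

    posix_spawn_file_actions_destroy(&actions);
    return err;
}
```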
Because of all of the above, I personally prefer to open files with O_CLOEXEC and/or use fcntl(fd, F_SETFD, FD_CLOEXEC) so that all descriptors are close-on-exec by default. Something like
    #define _GNU_SOURCE
    #define _POSIX_C_SOURCE 200809L
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    void set_all_close_on_exec(void)
    {
        struct rlimit rlim;
        long max;
        int fd;

        /* Resource limit? */
    #if defined(RLIMIT_NOFILE)
        if (getrlimit(RLIMIT_NOFILE, &rlim) != 0)
            rlim.rlim_max = 0;
    #elif defined(RLIMIT_OFILE)
        if (getrlimit(RLIMIT_OFILE, &rlim) != 0)
            rlim.rlim_max = 0;
    #else
        /* POSIX: 8 message queues, 20 files, 8 streams */
        rlim.rlim_max = 36;
    #endif

        /* An unlimited hard limit is useless as a loop bound. */
        if (rlim.rlim_max == RLIM_INFINITY)
            rlim.rlim_max = 0;

        /* Configured limit? */
    #if defined(_SC_OPEN_MAX)
        max = sysconf(_SC_OPEN_MAX);
    #else
        max = 36L;
    #endif

        /* Use the bigger of the two. */
        if ((int)max > (int)rlim.rlim_max)
            fd = max;
        else
            fd = rlim.rlim_max;

        /* Try every possible descriptor; unused ones fail with EBADF. */
        while (fd-- > 0)
            if (fd != STDIN_FILENO &&
                fd != STDOUT_FILENO &&
                fd != STDERR_FILENO)
                fcntl(fd, F_SETFD, FD_CLOEXEC);
    }
is a pretty portable way to quickly set all open descriptors (except the standard ones) to close-on-exec; libraries sometimes use descriptors internally, and may not set O_CLOEXEC. On my system, set_all_close_on_exec() takes 0.25 ms to run; the maximums are 4096 and 1024 respectively, so it ends up trying to set 4093 file descriptors.
(Note that fcntl(fd, F_SETFD, FD_CLOEXEC) should succeed for all valid descriptors, and fail with errno == EBADF for other (invalid/unused) descriptors.)
Note that it is much faster to simply try setting the flag on all possible descriptors than to try to find out which descriptors are actually open. (The latter is possible in Linux via e.g. /proc/self/fd/.)
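For completeness, here is a sketch of the /proc/self/fd/ variant, in case you want to compare the two on your own system. It is Linux-specific, and set_open_close_on_exec() is a hypothetical name. The directory stream has a descriptor of its own, which shows up in the listing and has to be skipped.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Linux-specific: set close-on-exec on every descriptor listed in
   /proc/self/fd, except the standard three and the directory stream's
   own descriptor. Returns the number of descriptors on which the flag
   was set, or -1 if /proc/self/fd cannot be opened. */
int set_open_close_on_exec(void)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *ent;
    int count = 0;

    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd;

        /* Entry names are decimal descriptor numbers. */
        errno = 0;
        fd = strtol(ent->d_name, &end, 10);
        if (errno != 0 || end == ent->d_name || *end != '\0')
            continue;           /* Skips "." and ".." too. */

        if (fd <= STDERR_FILENO || fd == dirfd(dir))
            continue;

        if (fcntl((int)fd, F_SETFD, FD_CLOEXEC) != -1)
            count++;
    }

    closedir(dir);
    return count;
}
```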
Second, I prefer to use a helper function to create a control pipe to the child process, move the file descriptors to their proper places (which is not always trivial), and fork the child process. The signature is usually similar to
    int do_exec(pid_t *const childptr,
                const char *const cmd,
                const char *const args[],
                const int stdin_fd,
                const int stdout_fd,
                const int stderr_fd);
My do_exec() function creates a close-on-exec control pipe, to differentiate between failure to execute the child binary, and child binary exit statuses. (If the child process fails to exec(), it writes errno as a signed char to the control pipe. The parent process tries to read a single signed char from the other end of the control pipe. If that succeeds, then the exec failed; the parent reaps the child using e.g. waitpid(), and returns the errno error. Otherwise, the pipe was closed due to the exec(), so the parent process knows the child execution has started, and can close the (last open end of the) control pipe.)
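A minimal sketch of such a function might look like the following. This is not my actual implementation, just an illustration of the control pipe idea under a few assumptions: it uses pipe2(), which is a Linux/glibc extension, it uses execvp() so cmd is looked up in PATH, and it assumes the errno value fits in a signed char. In the child, the sources are first copied to descriptors 3 or higher, so that moving them to 0, 1, and 2 cannot overwrite one that has not been moved yet.

```c
#define _GNU_SOURCE             /* For pipe2(); Linux/glibc assumption. */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 and stores the child PID in *childptr once the exec has
   succeeded; otherwise returns a nonzero errno value. */
int do_exec(pid_t *const childptr,
            const char *const cmd,
            const char *const args[],
            const int stdin_fd,
            const int stdout_fd,
            const int stderr_fd)
{
    const int want[3] = { stdin_fd, stdout_fd, stderr_fd };
    int ctl[2], src[3], wfd, i, saved;
    signed char err = 0;
    ssize_t n;
    pid_t child, p;

    /* Both ends close-on-exec: a successful exec closes the write end. */
    if (pipe2(ctl, O_CLOEXEC) == -1)
        return errno;

    child = fork();
    if (child == (pid_t)-1) {
        saved = errno;
        close(ctl[0]);
        close(ctl[1]);
        return saved;
    }

    if (child == 0) {
        /* Child process. */
        close(ctl[0]);

        /* Keep the control pipe out of the 0..2 range. */
        wfd = fcntl(ctl[1], F_DUPFD_CLOEXEC, 3);
        if (wfd == -1)
            wfd = ctl[1];       /* Best effort. */

        /* Copy the sources to descriptors >= 3, then move them into
           place; dup2() clears close-on-exec on the target. */
        for (i = 0; i < 3; i++)
            if ((src[i] = fcntl(want[i], F_DUPFD_CLOEXEC, 3)) == -1)
                break;
        if (i == 3)
            for (i = 0; i < 3; i++)
                if (dup2(src[i], i) == -1)
                    break;
        if (i == 3)
            execvp(cmd, (char *const *)args);

        /* Only reached on failure: report errno and exit. */
        err = (signed char)errno;
        while (write(wfd, &err, 1) == -1 && errno == EINTR)
            ;
        _exit(127);
    }

    /* Parent process. */
    close(ctl[1]);
    do {
        n = read(ctl[0], &err, 1);
    } while (n == -1 && errno == EINTR);
    close(ctl[0]);

    if (n == 1) {
        /* The child reported an error: reap it, return the error. */
        do {
            p = waitpid(child, NULL, 0);
        } while (p == (pid_t)-1 && errno == EINTR);
        return err;
    }

    /* End of input: the exec closed the write end; child is running. */
    if (childptr)
        *childptr = child;
    return 0;
}
```

The caller still has to reap a successfully started child with waitpid(), using the PID stored in *childptr.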
Finally, if you have a multithreaded server-type process that needs to spawn new child processes with minimum latency and resource use, start a single child process connected to the original process with a Unix domain socket (because you can use ancillary messages to transfer credentials and file descriptors over those), and have that child process start the actual children. This is exactly what e.g. Apache mod_cgid and most FastCGI implementations do.