author    Christian Grothoff <christian@grothoff.org>  2019-02-16 21:19:23 +0100
committer Christian Grothoff <christian@grothoff.org>  2019-02-16 21:19:50 +0100
commit    4611c473f1415ceee8f4da94d1ef0c878bca4e4e
tree      c0d6618e335b485be363de8132d278b3d6970c77
parent    e98a4e07e89e26cb24c68690fe9cf389e49de05c
Florian Weimer writes:

Christian Grothoff:
> I'm seeing some _very_ odd behavior with processes hanging on exit (?)
> with GNU libc 2.28-6 on Debian (amd64 threadripper). This seems to
> happen at random (for random tests, with very low frequency!) in the
> GNUnet (Git master) testsuite when a child process is about to exit.

It looks like you call exit from a signal handler, see
src/util/scheduler.c:

    /**
     * Signal handler called for signals that should cause us to shutdown.
     */
    static void
    sighandler_shutdown ()
    {
      static char c;
      int old_errno = errno; /* backup errno */

      if (getpid () != my_pid)
        exit (1); /* we have fork'ed since the signal handler was created,
                   * ignore the signal, see https://gnunet.org/vfork discussion */
      GNUNET_DISK_file_write (GNUNET_DISK_pipe_handle
                              (shutdown_pipe_handle, GNUNET_DISK_PIPE_END_WRITE),
                              &c, sizeof (c));
      errno = old_errno;
    }

In general, this results in undefined behavior because exit (unlike
_exit) is not an async-signal-safe function.

I suspect you either call the exit function while a fork is in progress,
or since you register this signal handler multiple times for different
signals:

    sh->shc_int = GNUNET_SIGNAL_handler_install (SIGINT,
                                                 &sighandler_shutdown);
    sh->shc_term = GNUNET_SIGNAL_handler_install (SIGTERM,
                                                  &sighandler_shutdown);

one call to exit might interrupt another call to exit if both signals
are delivered to the process.

The deadlock you see was introduced in commit
27761a1042daf01987e7d79636d0c41511c6df3c ("Refactor atfork handlers"),
first released in glibc 2.28. The fork deadlock will be gone (in the
single-threaded case) if Debian updates to the current
release/2.28/master branch because we backported commit
60f80624257ef84eacfd9b400bda1b5a5e8e7816 ("nptl: Avoid fork handler lock
for async-signal-safe fork [BZ #24161]") there.

But this will not help you. Even without the deadlock, I expect you
still experience some random corruption during exit, but it's going to
be difficult to spot.

Thanks,
Florian
-rw-r--r-- src/util/scheduler.c 4
1 file changed, 2 insertions, 2 deletions
diff --git a/src/util/scheduler.c b/src/util/scheduler.c
index dd0d5d5cf..3bd7ccec7 100644
--- a/src/util/scheduler.c
+++ b/src/util/scheduler.c
@@ -658,8 +658,8 @@ sighandler_shutdown ()
   int old_errno = errno; /* backup errno */
 
   if (getpid () != my_pid)
-    exit (1); /* we have fork'ed since the signal handler was created,
+    _exit (1); /* we have fork'ed since the signal handler was created,
      * ignore the signal, see https://gnunet.org/vfork discussion */
   GNUNET_DISK_file_write (GNUNET_DISK_pipe_handle
                           (shutdown_pipe_handle, GNUNET_DISK_PIPE_END_WRITE),
                           &c, sizeof (c));