All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Jim Carter <jimc@math.ucla.edu>
Cc: autofs@linux.kernel.org
Subject: Re: clients suddenly start hanging (was: (no subject))
Date: Fri, 23 May 2008 10:35:00 +0800	[thread overview]
Message-ID: <1211510100.15090.27.camel@raven.themaw.net> (raw)
In-Reply-To: <20080522214233.7F5D4211250@simba.math.ucla.edu>


On Thu, 2008-05-22 at 14:42 -0700, Jim Carter wrote:
> Sorry for the delay -- we have a crisis with KDE menus disappearing :-(
> Anyway: the new patch did not stop the hanging, unfortunately.  Again 
> upon the first hang gdb was not able to print out the individual thread
> tracebacks, but the second time around, it succeeded.  But strangely,
> the hung daemon thread from the first hang is easily identified, but
> strings pertaining to the second hang are not evident.
> 
> I'm calling the patch "take-submount-lock". 

OK. I've named the patch
autofs-5.0.3-lock-submount-before-state-change.patch but it doesn't
really matter because it will be folded into the patch tittled
"autofs-5.0.3 - fix submount shutdown handling" which currently has the
name autofs-5.0.3-submount-shutdown-recovery-5.patch and will likely end
up with a 6 when I do this. I'm going to retain this submount locking
even though it may not be the source of the problem as I think it's an
improvement.

However, I'll keep the patch below separate as it (possibly) addresses a
different issue. I've named it
autofs-5.0.3-expire-thread-create-cond-handling.patch.

I'm sorry for all this hassle and I appreciate your patience and efforts
to help out very much.

But onto the issue.

I've looked at this again, on a broader front.
I think there are two possible causes.

One is that umount(8) is faulting and the expire thread is disappearing
because of it. I've seen that happen before so it is certainly possible.
But I don't want to add code to work around that yet because it would
mask a possible bug in the daemon.

The other possibility is my poor pthreads condition handling at expire
thread (the thread that does the actual umount) creation. In the early
development I changed the code, for both mount and umount thread
creation, time and again and ended up with something that appeared to
work for around 18 months. But then, recently, I suddenly started seeing
a hang with heavy mount activity. So I changed the code again, and
although I can understand why it started working with this change, I'm
still a little puzzled because I think it really should have continued
to function. Anyway, I've made this same alteration to the expire thread
creation in case the same thing is happening. The code is very similar
in both cases so it's a good candidate.

autofs-5.0.3 - fix incorrect pthreads condition handling for expire requests.

From: Ian Kent <raven@themaw.net>


---

 daemon/direct.c   |   40 +++++++++++++++++++++-------------------
 daemon/indirect.c |   28 +++++++++++++++-------------
 2 files changed, 36 insertions(+), 32 deletions(-)


diff --git a/daemon/direct.c b/daemon/direct.c
index 86c817c..4b35aba 100644
--- a/daemon/direct.c
+++ b/daemon/direct.c
@@ -1052,55 +1052,53 @@ static void expire_mutex_unlock(void *arg)
 
 static void *do_expire_direct(void *arg)
 {
-	struct pending_args *mt;
+	struct pending_args *args, mt;
 	struct autofs_point *ap;
 	size_t len;
 	int status, state;
 
-	mt = (struct pending_args *) arg;
+	args = (struct pending_args *) arg;
 
 	status = pthread_mutex_lock(&ea_mutex);
 	if (status)
 		fatal(status);
 
-	ap = mt->ap;
+	memcpy(&mt, args, sizeof(struct pending_args));
 
-	mt->signaled = 1;
-	status = pthread_cond_signal(&mt->cond);
+	ap = mt.ap;
+
+	args->signaled = 1;
+	status = pthread_cond_signal(&args->cond);
 	if (status)
 		fatal(status);
 
 	expire_mutex_unlock(NULL);
 
-	pthread_cleanup_push(free_pending_args, mt);
-	pthread_cleanup_push(pending_cond_destroy, mt);
-	pthread_cleanup_push(expire_send_fail, mt);
+	pthread_cleanup_push(expire_send_fail, &mt);
 
-	len = _strlen(mt->name, KEY_MAX_LEN);
+	len = _strlen(mt.name, KEY_MAX_LEN);
 	if (!len) {
-		warn(ap->logopt, "direct key path too long %s", mt->name);
+		warn(ap->logopt, "direct key path too long %s", mt.name);
 		/* TODO: force umount ?? */
 		pthread_exit(NULL);
 	}
 
-	status = do_expire(ap, mt->name, len);
+	status = do_expire(ap, mt.name, len);
 	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &state);
 	if (status)
-		send_fail(ap->logopt, mt->ioctlfd, mt->wait_queue_token);
+		send_fail(ap->logopt, mt.ioctlfd, mt.wait_queue_token);
 	else {
 		struct mapent *me;
-		cache_readlock(mt->mc);
-		me = cache_lookup_distinct(mt->mc, mt->name);
+		cache_readlock(mt.mc);
+		me = cache_lookup_distinct(mt.mc, mt.name);
 		me->ioctlfd = -1;
-		cache_unlock(mt->mc);
-		send_ready(ap->logopt, mt->ioctlfd, mt->wait_queue_token);
-		close(mt->ioctlfd);
+		cache_unlock(mt.mc);
+		send_ready(ap->logopt, mt.ioctlfd, mt.wait_queue_token);
+		close(mt.ioctlfd);
 	}
 	pthread_setcancelstate(state, NULL);
 
 	pthread_cleanup_pop(0);
-	pthread_cleanup_pop(1);
-	pthread_cleanup_pop(1);
 
 	return NULL;
 }
@@ -1196,6 +1194,8 @@ int handle_packet_expire_direct(struct autofs_point *ap, autofs_packet_expire_di
 
 	cache_unlock(mc);
 
+	pthread_cleanup_push(free_pending_args, mt);
+	pthread_cleanup_push(pending_cond_destroy, mt);
 	pthread_cleanup_push(expire_mutex_unlock, NULL);
 	pthread_setcancelstate(state, NULL);
 
@@ -1207,6 +1207,8 @@ int handle_packet_expire_direct(struct autofs_point *ap, autofs_packet_expire_di
 	}
 
 	pthread_cleanup_pop(1);
+	pthread_cleanup_pop(1);
+	pthread_cleanup_pop(1);
 
 	return 0;
 }
diff --git a/daemon/indirect.c b/daemon/indirect.c
index c1bd3f2..81c7c84 100644
--- a/daemon/indirect.c
+++ b/daemon/indirect.c
@@ -595,40 +595,38 @@ static void expire_mutex_unlock(void *arg)
 
 static void *do_expire_indirect(void *arg)
 {
-	struct pending_args *mt;
+	struct pending_args *args, mt;
 	struct autofs_point *ap;
 	int status, state;
 
-	mt = (struct pending_args *) arg;
+	args = (struct pending_args *) arg;
 
 	status = pthread_mutex_lock(&ea_mutex);
 	if (status)
 		fatal(status);
 
-	ap = mt->ap;
+	memcpy(&mt, args, sizeof(struct pending_args));
+
+	ap = mt.ap;
 
-	mt->signaled = 1;
-	status = pthread_cond_signal(&mt->cond);
+	args->signaled = 1;
+	status = pthread_cond_signal(&args->cond);
 	if (status)
 		fatal(status);
 
 	expire_mutex_unlock(NULL);
 
-	pthread_cleanup_push(free_pending_args, mt);
-	pthread_cleanup_push(pending_cond_destroy, mt);
-	pthread_cleanup_push(expire_send_fail, mt);
+	pthread_cleanup_push(expire_send_fail, &mt);
 
-	status = do_expire(mt->ap, mt->name, mt->len);
+	status = do_expire(mt.ap, mt.name, mt.len);
 	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &state);
 	if (status)
-		send_fail(ap->logopt, ap->ioctlfd, mt->wait_queue_token);
+		send_fail(ap->logopt, ap->ioctlfd, mt.wait_queue_token);
 	else
-		send_ready(ap->logopt, ap->ioctlfd, mt->wait_queue_token);
+		send_ready(ap->logopt, ap->ioctlfd, mt.wait_queue_token);
 	pthread_setcancelstate(state, NULL);
 
 	pthread_cleanup_pop(0);
-	pthread_cleanup_pop(1);
-	pthread_cleanup_pop(1);
 
 	return NULL;
 }
@@ -679,6 +677,8 @@ int handle_packet_expire_indirect(struct autofs_point *ap, autofs_packet_expire_
 		return 1;
 	}
 
+	pthread_cleanup_push(free_pending_args, mt);
+	pthread_cleanup_push(pending_cond_destroy, mt);
 	pthread_cleanup_push(expire_mutex_unlock, NULL);
 	pthread_setcancelstate(state, NULL);
 
@@ -690,6 +690,8 @@ int handle_packet_expire_indirect(struct autofs_point *ap, autofs_packet_expire_
 	}
 
 	pthread_cleanup_pop(1);
+	pthread_cleanup_pop(1);
+	pthread_cleanup_pop(1);
 
 	return 0;
 }

  reply	other threads:[~2008-05-23  2:35 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-23 18:50 (no subject) Jim Carter
2008-04-23 20:04 ` Jeff Moyer
2008-04-24  3:10   ` Ian Kent
2008-04-24 16:52   ` clients suddenly start hanging (was: (no subject)) Jim Carter
2008-04-26  1:17   ` Jim Carter
2008-04-26  5:34     ` Ian Kent
2008-04-26 18:48       ` Jim Carter
2008-04-27  5:52         ` Ian Kent
2008-04-26 22:16       ` Jim Carter
2008-04-28  6:26 ` [PATCH 1/2] autofs4 - fix execution order race in mount request code Ian Kent
2008-05-08  4:52   ` clients suddenly start hanging (was: (no subject)) Jim Carter
2008-05-08  6:13     ` Ian Kent
2008-05-11  4:14       ` Jim Carter
2008-05-11  7:57         ` Ian Kent
2008-05-15 21:59           ` Jim Carter
2008-05-16  3:00             ` Ian Kent
2008-05-18  4:07             ` Ian Kent
2008-05-21  6:58               ` Ian Kent
2008-05-22 21:42               ` Jim Carter
2008-05-23  2:35                 ` Ian Kent [this message]
2008-05-26  0:34                   ` Jim Carter
2008-06-12  3:20                     ` Ian Kent
2008-06-12  4:50 [PATCH 00/10] Kernel patch series Ian Kent
2008-06-12  4:50 ` [PATCH 01/10] autofs4 - check for invalid dentry in getpath Ian Kent
2008-06-12  4:50 ` [PATCH 02/10] autofs4 - fix sparse warning in waitq.c:autofs4_expire_indirect() Ian Kent
2008-06-12  4:50 ` [PATCH 03/10] autofs4 - fix incorrect return from root.c:try_to_fill_dentry() Ian Kent
2008-06-12  4:51 ` [PATCH 04/10] autofs4 - fix mntput, dput order bug Ian Kent
2008-06-12  4:51 ` [PATCH 05/10] autofs4 - don't make expiring dentry negative Ian Kent
2008-06-12  4:51 ` [PATCH 06/10] autofs4 - use look aside list for lookups Ian Kent
2008-06-12  4:51 ` [PATCH 07/10] autofs4 - don't release directory mutex if called in oz_mode Ian Kent
2008-06-12  4:51 ` [PATCH 08/10] autofs4 - use lookup intent flags to trigger mounts Ian Kent
2008-06-12  4:51 ` [PATCH 09/10] autofs4 - use struct qstr in waitq.c Ian Kent
2008-06-12  4:51 ` [PATCH 10/10] autofs4 - fix pending mount race Ian Kent
2008-06-14  1:13 ` [PATCH 00/10] Kernel patch series Jim Carter
2008-06-14  3:30   ` Ian Kent
2008-06-14  3:42     ` Ian Kent
2008-06-19  0:40       ` clients suddenly start hanging (was: (no subject)) Jim Carter
2008-06-19  3:14         ` Ian Kent
2008-06-19 17:08           ` Jim Carter
2008-06-19 18:34           ` Jim Carter
2008-06-20  4:09             ` Ian Kent
2008-06-21  1:02               ` Jim Carter
2008-06-21  3:12                 ` Ian Kent
2008-06-23  3:49                   ` Jim Carter
2008-06-23  4:46                     ` Ian Kent
2008-06-24  3:08                       ` Ian Kent
2008-06-24 17:02                         ` Stephen Biggs
2008-06-24 23:39                         ` Jim Carter
2008-06-25  3:33                           ` Ian Kent
2008-06-25  5:00                             ` Ian Kent
2008-06-23  4:15                   ` Ian Kent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1211510100.15090.27.camel@raven.themaw.net \
    --to=raven@themaw.net \
    --cc=autofs@linux.kernel.org \
    --cc=jimc@math.ucla.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.