* [RFC PATCH] Using pthreads to handle upcalls.
@ 2015-11-09 13:05 Steve Dickson
  2015-11-09 13:05 ` [PATCH] gssd: use " Steve Dickson
  2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields
  0 siblings, 2 replies; 5+ messages in thread
From: Steve Dickson @ 2015-11-09 13:05 UTC (permalink / raw)
  To: Linux NFS Mailing list

Recently a bug was found that was causing a
TGT fetch for every mount upcall. The bug was
caused when forking for every mount was introduced:
the global memory for the TGT cache was being
freed when the forked process exited.

The fix we came up with was to only fork on non-root
upcalls; basically, mount upcalls would no longer fork.
In debugging the patch it became apparent that if the
process hung, all NFS mounts on that client would be blocked.
So at this point rpc.gssd is a single point of failure.

This patch replaces the forking/non-forking logic with
creating pthreads for every upcall, which I think is a better
solution to the original problem since pthreads can share
global data.

I was also hoping that using pthreads would bring more
asynchrony to rpc.gssd. I was thinking rpc.gssd could take
an upcall, fire off a thread to handle it, then go back and
listen for more upcalls.

Unfortunately this is not the case. It seems, maybe due to
my lack of pthreads understanding, that each
pthread_create() call needs a matching pthread_join() call,
which waits for the created thread to finish, similar to
fork/wait.
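
For reference, a minimal sketch of the create/join pattern
being described (hypothetical handler and function names,
not the actual gssd code):

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    static void *handle_upcall(void *arg)
    {
        /* ... process one upcall using arg ... */
        return NULL;
    }

    static void dispatch(void *clp)
    {
        pthread_t th;
        int ret = pthread_create(&th, NULL, handle_upcall, clp);

        if (ret != 0) {
            /* pthread_create() returns the error number directly;
             * it does not set errno. */
            fprintf(stderr, "pthread_create: %s\n", strerror(ret));
            return;
        }
        /* Blocks until the thread exits, like wait(2).  Calling
         * pthread_detach(th) instead would let the caller continue,
         * at the cost of never learning when the upcall finished. */
        pthread_join(th, NULL);
    }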

This means that if an upcall pthread gets hung, the daemon
is also hung... the same single point of failure.

I do believe using threads is a better solution than
the non-fork solution, but rpc.gssd is still a single
point of failure. Plus, I'm hoping moving to pthreads will
allow us to solve that problem.

Steve Dickson (1):
  gssd: use pthreads to handle upcalls

 aclocal/libpthread.m4  | 13 +++++++++++++
 configure.ac           |  3 +++
 utils/gssd/Makefile.am |  3 ++-
 utils/gssd/gssd.c      | 19 +++++++++++++++++--
 utils/gssd/gssd.h      |  2 ++
 utils/gssd/gssd_proc.c | 30 ------------------------------
 utils/gssd/krb5_util.c |  3 +++
 7 files changed, 40 insertions(+), 33 deletions(-)
 create mode 100644 aclocal/libpthread.m4

-- 
2.4.3



* [PATCH] gssd: use pthreads to handle upcalls
  2015-11-09 13:05 [RFC PATCH] Using pthreads to handle upcalls Steve Dickson
@ 2015-11-09 13:05 ` Steve Dickson
  2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields
  1 sibling, 0 replies; 5+ messages in thread
From: Steve Dickson @ 2015-11-09 13:05 UTC (permalink / raw)
  To: Linux NFS Mailing list

Currently, to preserve global data over multiple mounts,
the root process does not fork when handling an upcall.
Instead of not forking, create a pthread to handle
all upcalls, since global data can be shared among threads.

Signed-off-by: Steve Dickson <steved@redhat.com>
---
 aclocal/libpthread.m4  | 13 +++++++++++++
 configure.ac           |  3 +++
 utils/gssd/Makefile.am |  3 ++-
 utils/gssd/gssd.c      | 19 +++++++++++++++++--
 utils/gssd/gssd.h      |  2 ++
 utils/gssd/gssd_proc.c | 30 ------------------------------
 utils/gssd/krb5_util.c |  3 +++
 7 files changed, 40 insertions(+), 33 deletions(-)
 create mode 100644 aclocal/libpthread.m4

diff --git a/aclocal/libpthread.m4 b/aclocal/libpthread.m4
new file mode 100644
index 0000000..e87d2a0
--- /dev/null
+++ b/aclocal/libpthread.m4
@@ -0,0 +1,13 @@
+dnl Checks for pthreads library and headers
+dnl
+AC_DEFUN([AC_LIBPTHREAD], [
+
+    dnl Check for library, but do not add -lpthread to LIBS
+    AC_CHECK_LIB([pthread], [pthread_create], [LIBPTHREAD=-lpthread],
+                 [AC_MSG_ERROR([libpthread not found.])])
+  AC_SUBST(LIBPTHREAD)
+
+  AC_CHECK_HEADERS([pthread.h], ,
+                   [AC_MSG_ERROR([libpthread headers not found.])])
+
+])dnl
diff --git a/configure.ac b/configure.ac
index 25d2ba4..e0ecd61 100644
--- a/configure.ac
+++ b/configure.ac
@@ -385,6 +385,9 @@ if test "$enable_gss" = yes; then
   dnl Invoked after AC_KERBEROS_V5; AC_LIBRPCSECGSS needs to have KRBLIBS set
   AC_LIBRPCSECGSS
 
+  dnl Check for pthreads
+  AC_LIBPTHREAD
+
   dnl librpcsecgss already has a dependency on libgssapi,
   dnl but we need to make sure we get the right version
   if test "$enable_gss" = yes; then
diff --git a/utils/gssd/Makefile.am b/utils/gssd/Makefile.am
index cb040b3..3f5f59a 100644
--- a/utils/gssd/Makefile.am
+++ b/utils/gssd/Makefile.am
@@ -49,7 +49,8 @@ gssd_LDADD = \
 	$(RPCSECGSS_LIBS) \
 	$(KRBLIBS) \
 	$(GSSAPI_LIBS) \
-	$(LIBTIRPC)
+	$(LIBTIRPC) \
+	$(LIBPTHREAD)
 
 gssd_LDFLAGS = \
 	$(KRBLDFLAGS)
diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
index e7cb07f..3c425ce 100644
--- a/utils/gssd/gssd.c
+++ b/utils/gssd/gssd.c
@@ -365,16 +365,31 @@ static void
 gssd_clnt_gssd_cb(int UNUSED(fd), short UNUSED(which), void *data)
 {
 	struct clnt_info *clp = data;
+	pthread_t th;
+	int ret; 
 
-	handle_gssd_upcall(clp);
+	ret = pthread_create(&th, NULL, (void *)handle_gssd_upcall, (void *)clp);
+	if (ret != 0) {
+		printerr(0, "pthread_create failed: ret %d: %s\n", ret, strerror(ret));
+		return;
+	}
+	pthread_join(th, NULL);
 }
 
 static void
 gssd_clnt_krb5_cb(int UNUSED(fd), short UNUSED(which), void *data)
 {
 	struct clnt_info *clp = data;
+	pthread_t th;
+	int ret;
+
+	ret = pthread_create(&th, NULL, (void *)handle_krb5_upcall, (void *)clp);
+	if (ret != 0) {
+		printerr(0, "pthread_create failed: ret %d: %s\n", ret, strerror(ret));
+		return;
+	}
 
-	handle_krb5_upcall(clp);
+	pthread_join(th, NULL);
 }
 
 static struct clnt_info *
diff --git a/utils/gssd/gssd.h b/utils/gssd/gssd.h
index c6937c5..8677329 100644
--- a/utils/gssd/gssd.h
+++ b/utils/gssd/gssd.h
@@ -36,6 +36,7 @@
 #include <gssapi/gssapi.h>
 #include <event.h>
 #include <stdbool.h>
+#include <pthread.h>
 
 #ifndef GSSD_PIPEFS_DIR
 #define GSSD_PIPEFS_DIR		"/var/lib/nfs/rpc_pipefs"
@@ -61,6 +62,7 @@ extern int			root_uses_machine_creds;
 extern unsigned int 		context_timeout;
 extern unsigned int rpc_timeout;
 extern char			*preferred_realm;
+extern pthread_mutex_t ple_lock;
 
 struct clnt_info {
 	TAILQ_ENTRY(clnt_info)	list;
diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
index 1ef68d8..616082e 100644
--- a/utils/gssd/gssd_proc.c
+++ b/utils/gssd/gssd_proc.c
@@ -629,36 +629,6 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
 	if (uid != 0 || (uid == 0 && root_uses_machine_creds == 0 &&
 				service == NULL)) {
 
-		/* already running as uid 0 */
-		if (uid == 0)
-			goto no_fork;
-
-		pid = fork();
-		switch(pid) {
-		case 0:
-			/* Child: fall through to rest of function */
-			childpid = getpid();
-			unsetenv("KRB5CCNAME");
-			printerr(2, "CHILD forked pid %d \n", childpid);
-			break;
-		case -1:
-			/* fork() failed! */
-			printerr(0, "WARNING: unable to fork() to handle"
-				"upcall: %s\n", strerror(errno));
-			return;
-		default:
-			/* Parent: just wait on child to exit and return */
-			do {
-				pid = wait(&err);
-			} while(pid == -1 && errno != -ECHILD);
-
-			if (WIFSIGNALED(err))
-				printerr(0, "WARNING: forked child was killed"
-					 "with signal %d\n", WTERMSIG(err));
-			return;
-		}
-no_fork:
-
 		auth = krb5_not_machine_creds(clp, uid, tgtname, &downcall_err,
 						&err, &rpc_clnt);
 		if (err)
diff --git a/utils/gssd/krb5_util.c b/utils/gssd/krb5_util.c
index 8ef8184..7354dc7 100644
--- a/utils/gssd/krb5_util.c
+++ b/utils/gssd/krb5_util.c
@@ -128,6 +128,7 @@
 
 /* Global list of principals/cache file names for machine credentials */
 struct gssd_k5_kt_princ *gssd_k5_kt_princ_list = NULL;
+pthread_mutex_t ple_lock = PTHREAD_MUTEX_INITIALIZER;
 
 #ifdef HAVE_SET_ALLOWABLE_ENCTYPES
 int limit_to_legacy_enctypes = 0;
@@ -586,10 +587,12 @@ get_ple_by_princ(krb5_context context, krb5_principal princ)
 
 	/* Need to serialize list if we ever become multi-threaded! */
 
+	pthread_mutex_lock(&ple_lock);
 	ple = find_ple_by_princ(context, princ);
 	if (ple == NULL) {
 		ple = new_ple(context, princ);
 	}
+	pthread_mutex_unlock(&ple_lock);
 
 	return ple;
 }
-- 
2.4.3



* Re: [RFC PATCH] Using pthreads to handle upcalls.
  2015-11-09 13:05 [RFC PATCH] Using pthreads to handle upcalls Steve Dickson
  2015-11-09 13:05 ` [PATCH] gssd: use " Steve Dickson
@ 2015-11-09 18:39 ` J. Bruce Fields
  2015-11-09 20:11   ` Steve Dickson
  1 sibling, 1 reply; 5+ messages in thread
From: J. Bruce Fields @ 2015-11-09 18:39 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Linux NFS Mailing list

On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote:
> Recently a bug was found that was causing a
> TGT fetch for every mount upcall. The bug was
> caused when forking for every mount was introduced:
> the global memory for the TGT cache was being
> freed when the forked process exited.
> 
> The fix we came up with was to only fork on non-root
> upcalls; basically, mount upcalls would no longer fork.
> In debugging the patch it became apparent that if the
> process hung, all NFS mounts on that client would be blocked.
> So at this point rpc.gssd is a single point of failure.
> 
> This patch replaces the forking/non-forking logic with
> creating pthreads for every upcall, which I think is a better
> solution to the original problem since pthreads can share
> global data.

I seem to recall the reason for the fork is to allow dropping some
privileges while processing the upcall, is that right?  But looking at
pthreads(7), it looks like those are probably shared (e.g., it says user
and group IDs are process-wide).
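
As a hedged illustration of that point (a standalone sketch, not
gssd code; the numeric UID is arbitrary and actually switching IDs
requires root): the glibc set*id wrappers apply to the whole
process, while the raw Linux syscall changes only the calling
thread.

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static void *worker(void *arg)
    {
        (void)arg;
        sleep(1);
        /* Prints the new UID: glibc broadcasts the credential
         * change to every thread in the process. */
        printf("worker euid: %d\n", (int)geteuid());
        return NULL;
    }

    int main(void)
    {
        pthread_t th;

        pthread_create(&th, NULL, worker, NULL);

        if (seteuid(1000) != 0)        /* process-wide (glibc) */
            perror("seteuid");

        /* By contrast, the raw syscall would change only the
         * calling thread's credentials (Linux-specific):
         *   syscall(SYS_setresuid, -1, 1000, -1);
         */

        pthread_join(th, NULL);
        return 0;
    }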

> I was also hoping that using pthreads would bring more
> asynchrony to rpc.gssd. I was thinking rpc.gssd could take
> an upcall, fire off a thread to handle it, then go back and
> listen for more upcalls.
> 
> Unfortunately this is not the case. It seems, maybe due to
> my lack of pthreads understanding, that each
> pthread_create() call needs a matching pthread_join() call,
> which waits for the created thread to finish, similar to
> fork/wait.

Actually making gssd thread-safe would be a significant effort.

> This means that if an upcall pthread gets hung, the daemon
> is also hung... the same single point of failure.
> 
> I do believe using threads is a better solution than
> the non-fork solution, but rpc.gssd is still a single
> point of failure. Plus, I'm hoping moving to pthreads will
> allow us to solve that problem.

So this doesn't actually fix anything right now?

--b.

> 
> Steve Dickson (1):
>   gssd: use pthreads to handle upcalls
> 
>  aclocal/libpthread.m4  | 13 +++++++++++++
>  configure.ac           |  3 +++
>  utils/gssd/Makefile.am |  3 ++-
>  utils/gssd/gssd.c      | 19 +++++++++++++++++--
>  utils/gssd/gssd.h      |  2 ++
>  utils/gssd/gssd_proc.c | 30 ------------------------------
>  utils/gssd/krb5_util.c |  3 +++
>  7 files changed, 40 insertions(+), 33 deletions(-)
>  create mode 100644 aclocal/libpthread.m4
> 
> -- 
> 2.4.3
> 


* Re: [RFC PATCH] Using pthreads to handle upcalls.
  2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields
@ 2015-11-09 20:11   ` Steve Dickson
  2015-11-09 21:31     ` J. Bruce Fields
  0 siblings, 1 reply; 5+ messages in thread
From: Steve Dickson @ 2015-11-09 20:11 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linux NFS Mailing list

Hey,

On 11/09/2015 01:39 PM, J. Bruce Fields wrote:
> On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote:
>> Recently a bug was found that was causing a
>> TGT fetch for every mount upcall. The bug was
>> caused when forking for every mount was introduced:
>> the global memory for the TGT cache was being
>> freed when the forked process exited.
>>
>> The fix we came up with was to only fork on non-root
>> upcalls; basically, mount upcalls would no longer fork.
>> In debugging the patch it became apparent that if the
>> process hung, all NFS mounts on that client would be blocked.
>> So at this point rpc.gssd is a single point of failure.
>>
>> This patch replaces the forking/non-forking logic with
>> creating pthreads for every upcall, which I think is a better
>> solution to the original problem since pthreads can share
>> global data.
> 
> I seem to recall the reason for the fork is to allow dropping some
> privileges while processing the upcall, is that right? 
I don't see where privileges are being dropped.

> But looking at pthreads(7), it looks like those are probably 
> shared (e.g., it says user and group IDs are process-wide).
I think they are, and that's also why threads can access the
same global data.

> 
>> I was also hoping that using pthreads would bring more
>> asynchrony to rpc.gssd. I was thinking rpc.gssd could take
>> an upcall, fire off a thread to handle it, then go back and
>> listen for more upcalls.
>>
>> Unfortunately this is not the case. It seems, maybe due to
>> my lack of pthreads understanding, that each
>> pthread_create() call needs a matching pthread_join() call,
>> which waits for the created thread to finish, similar to
>> fork/wait.
> 
> Actually making gssd thread-safe would be a significant effort.
Is it because the MIT libs are not thread-safe? Isn't the 
gssd_k5_kt_princ_list the only global list?

Something to do with the upcalls? 

> 
>> This means that if an upcall pthread gets hung, the daemon
>> is also hung... the same single point of failure.
>>
>> I do believe using threads is a better solution than
>> the non-fork solution, but rpc.gssd is still a single
>> point of failure. Plus, I'm hoping moving to pthreads will
>> allow us to solve that problem.
> 
> So this doesn't actually fix anything right now?
No... it does not... :-)  But I do think it's a clearer way
of handling global lists across multiple threads/processes.
It also makes the top of process_krb5_upcall easier to read,
IMHO...

Plus it does introduce pthreads to nfs-utils... So maybe some
day we can pull some of these daemons into the 21st century
by multi-threading them... or kill them! I'm good either way. 8-)

Thanks for the cycles!!! 

steved.


* Re: [RFC PATCH] Using pthreads to handle upcalls.
  2015-11-09 20:11   ` Steve Dickson
@ 2015-11-09 21:31     ` J. Bruce Fields
  0 siblings, 0 replies; 5+ messages in thread
From: J. Bruce Fields @ 2015-11-09 21:31 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Linux NFS Mailing list

On Mon, Nov 09, 2015 at 03:11:30PM -0500, Steve Dickson wrote:
> Hey,
> 
> On 11/09/2015 01:39 PM, J. Bruce Fields wrote:
> > On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote:
> >> Recently a bug was found that was causing a
> >> TGT fetch for every mount upcall. The bug was
> >> caused when forking for every mount was introduced:
> >> the global memory for the TGT cache was being
> >> freed when the forked process exited.
> >>
> >> The fix we came up with was to only fork on non-root
> >> upcalls; basically, mount upcalls would no longer fork.
> >> In debugging the patch it became apparent that if the
> >> process hung, all NFS mounts on that client would be blocked.
> >> So at this point rpc.gssd is a single point of failure.
> >>
> >> This patch replaces the forking/non-forking logic with
> >> creating pthreads for every upcall, which I think is a better
> >> solution to the original problem since pthreads can share
> >> global data.
> > 
> > I seem to recall the reason for the fork is to allow dropping some
> > privileges while processing the upcall, is that right? 
> I don't see this where privileges are being dropped. 

See 6b53fc9ce38ba6fff2fd5c2f6ed143747067a39d "gssd: do a more thorough
change of identity after forking".  I think the IDs it's switching there
are global to the process, not specific to a single pthread.

> >> I was also hoping that using pthreads would bring more
> >> asynchrony to rpc.gssd. I was thinking rpc.gssd could take
> >> an upcall, fire off a thread to handle it, then go back and
> >> listen for more upcalls.
> >>
> >> Unfortunately this is not the case. It seems, maybe due to
> >> my lack of pthreads understanding, that each
> >> pthread_create() call needs a matching pthread_join() call,
> >> which waits for the created thread to finish, similar to
> >> fork/wait.
> > 
> > Actually making gssd thread-safe would be a significant effort.
> Is it because the MIT libs are not thread-safe?

I don't know, it's one of many things we'd need to check.

> Isn't the gssd_k5_kt_princ_list the only global list?  Something to do
> with the upcalls? 

There's also some global data that keeps track of the state of the rpc
pipefs directory--topdir_list, etc.

It might be easier just to share that one list?  I don't know what would
be involved, though.
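
For what it's worth, serializing access to such a list would look
much like what the patch already does for ple_lock; a minimal
sketch (the struct and field names are illustrative, not the
actual gssd topdir_list structures):

    #include <pthread.h>
    #include <stddef.h>

    struct item {
        struct item *next;
        int key;
    };

    static struct item *shared_list;
    static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Every reader and writer takes the mutex, so a lookup can
     * never race with an insertion from another upcall thread. */
    static struct item *lookup(int key)
    {
        struct item *it;

        pthread_mutex_lock(&list_lock);
        for (it = shared_list; it != NULL; it = it->next)
            if (it->key == key)
                break;
        pthread_mutex_unlock(&list_lock);
        return it;
    }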

I'm not convinced this is a good way to help with hangs--a lot of the
cases that would cause a hang are going to cause further threads to hang
too.

I can believe the parallelism might be useful in some cases for
performance reasons, but I don't know if we've seen such a case yet.

--b.

