linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	devel@driverdev.osuosl.org,
	Andreas Dilger <andreas.dilger@intel.com>,
	Oleg Drokin <oleg.drokin@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lustre Development List <lustre-devel@lists.lustre.org>,
	Patrick Valentin <patrick.valentin@bull.net>,
	Gregoire Pichon <gregoire.pichon@bull.net>,
	James Simmons <jsimmons@infradead.org>
Subject: [PATCH 02/41] staging: lustre: obdclass: Add synchro in lu_context_key_degister()
Date: Sun,  2 Oct 2016 22:27:58 -0400	[thread overview]
Message-ID: <1475461717-21631-3-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1475461717-21631-1-git-send-email-jsimmons@infradead.org>

From: Patrick Valentin <patrick.valentin@bull.net>

When unloading a module, it may happen that lu_context_key_degister()
removes a key while a thread is either registering it in a new
context (lu_context_init(), lu_context_refill()), or using it when
exiting from a context (lu_context__exit(), lu_context__fini()).

In these cases, we reference a key which no longer exists, and
the system crashes either because we use a *POISON'ed* pointer
in key_fini() -> key->lct_fini(), or because one of the following
assertions fails:
 - lu_context_key_degister():
        ASSERTION(cfs_atomic_read(&key->lct_used) == 1)
                  failed: key has instances: 2

 - lu_context_exit():
        ASSERTION(key != NULL)

 - key_fini():
        ASSERTION(atomic_read(&key->lct_used) > 1)

This can also leads to SLAB objects which are not freed:
        slab error in kmem_cache_destroy(): cache `echo_thread_kmem':
                   Can't free all objects

Note: ptlrpc service threads need to call lu_context_init/fini in
each loop (for each RPC), and this could be a big performance issue
on fat SMP machines if we add serialization by a spinlock and need
to lock/unlock it for multiple times for each RPC.

So the aim of this patch, which only impacts some low frequently used
functions, is:
  1) to add a synchronization in lu_context_key_quiesce(), also called
     by lu_context_key_degister(), to wait until all key::lct_init()
     methods have completed, by serializing with keys_fill()
  2) to add a synchronization in lu_context_key_degister(), to wait
     until all transient contexts referencing this key have run
     key::lct_fini() method

Signed-off-by: Patrick Valentin <patrick.valentin@bull.net>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6049
Reviewed-on: http://review.whamcloud.com/13164
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   58 ++++++++++++++++++--
 1 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index f0e74c6..e031fd2 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -1311,6 +1311,7 @@ enum {
 static struct lu_context_key *lu_keys[LU_CONTEXT_KEY_NR] = { NULL, };
 
 static DEFINE_SPINLOCK(lu_keys_guard);
+static atomic_t lu_key_initing_cnt = ATOMIC_INIT(0);
 
 /**
  * Global counter incremented whenever key is registered, unregistered,
@@ -1385,6 +1386,19 @@ void lu_context_key_degister(struct lu_context_key *key)
 	++key_set_version;
 	spin_lock(&lu_keys_guard);
 	key_fini(&lu_shrink_env.le_ctx, key->lct_index);
+
+	/**
+	 * Wait until all transient contexts referencing this key have
+	 * run lu_context_key::lct_fini() method.
+	 */
+	while (atomic_read(&key->lct_used) > 1) {
+		spin_unlock(&lu_keys_guard);
+		CDEBUG(D_INFO, "lu_context_key_degister: \"%s\" %p, %d\n",
+		       key->lct_owner ? key->lct_owner->name : "", key,
+		       atomic_read(&key->lct_used));
+		schedule();
+		spin_lock(&lu_keys_guard);
+	}
 	if (lu_keys[key->lct_index]) {
 		lu_keys[key->lct_index] = NULL;
 		lu_ref_fini(&key->lct_reference);
@@ -1510,11 +1524,26 @@ void lu_context_key_quiesce(struct lu_context_key *key)
 		 * XXX layering violation.
 		 */
 		cl_env_cache_purge(~0);
-		key->lct_tags |= LCT_QUIESCENT;
 		/*
 		 * XXX memory barrier has to go here.
 		 */
 		spin_lock(&lu_keys_guard);
+		key->lct_tags |= LCT_QUIESCENT;
+
+		/**
+		 * Wait until all lu_context_key::lct_init() methods
+		 * have completed.
+		 */
+		while (atomic_read(&lu_key_initing_cnt) > 0) {
+			spin_unlock(&lu_keys_guard);
+			CDEBUG(D_INFO, "lu_context_key_quiesce: \"%s\" %p, %d (%d)\n",
+			       key->lct_owner ? key->lct_owner->name : "",
+			       key, atomic_read(&key->lct_used),
+			atomic_read(&lu_key_initing_cnt));
+			schedule();
+			spin_lock(&lu_keys_guard);
+		}
+
 		list_for_each_entry(ctx, &lu_context_remembered, lc_remember)
 			key_fini(ctx, key->lct_index);
 		spin_unlock(&lu_keys_guard);
@@ -1546,6 +1575,19 @@ static int keys_fill(struct lu_context *ctx)
 {
 	unsigned int i;
 
+	/*
+	 * A serialisation with lu_context_key_quiesce() is needed, but some
+	 * "key->lct_init()" are calling kernel memory allocation routine and
+	 * can't be called while holding a spin_lock.
+	 * "lu_keys_guard" is held while incrementing "lu_key_initing_cnt"
+	 * to ensure the start of the serialisation.
+	 * An atomic_t variable is still used, in order not to reacquire the
+	 * lock when decrementing the counter.
+	 */
+	spin_lock(&lu_keys_guard);
+	atomic_inc(&lu_key_initing_cnt);
+	spin_unlock(&lu_keys_guard);
+
 	LINVRNT(ctx->lc_value);
 	for (i = 0; i < ARRAY_SIZE(lu_keys); ++i) {
 		struct lu_context_key *key;
@@ -1563,12 +1605,19 @@ static int keys_fill(struct lu_context *ctx)
 			LINVRNT(key->lct_init);
 			LINVRNT(key->lct_index == i);
 
+			LASSERT(key->lct_owner);
+			if (!(ctx->lc_tags & LCT_NOREF) &&
+			    !try_module_get(key->lct_owner)) {
+				/* module is unloading, skip this key */
+				continue;
+			}
+
 			value = key->lct_init(ctx, key);
-			if (IS_ERR(value))
+			if (unlikely(IS_ERR(value))) {
+				atomic_dec(&lu_key_initing_cnt);
 				return PTR_ERR(value);
+			}
 
-			if (!(ctx->lc_tags & LCT_NOREF))
-				try_module_get(key->lct_owner);
 			lu_ref_add_atomic(&key->lct_reference, "ctx", ctx);
 			atomic_inc(&key->lct_used);
 			/*
@@ -1582,6 +1631,7 @@ static int keys_fill(struct lu_context *ctx)
 		}
 		ctx->lc_version = key_set_version;
 	}
+	atomic_dec(&lu_key_initing_cnt);
 	return 0;
 }
 
-- 
1.7.1

  parent reply	other threads:[~2016-10-03  2:29 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-03  2:27 [PATCH 00/41] missing patches for lustre 2.7.50 to 2.7.55 James Simmons
2016-10-03  2:27 ` [PATCH 01/41] staging: lustre: obdclass: fix race during key quiescency James Simmons
2016-10-03  2:27 ` James Simmons [this message]
2016-10-03  2:27 ` [PATCH 03/41] staging: lustre: llite: remove client Size on MDS support James Simmons
2016-10-03  2:28 ` [PATCH 04/41] staging: lustre: obd: " James Simmons
2016-10-03  2:28 ` [PATCH 05/41] staging: lustre: clio: Revise read ahead implementation James Simmons
2016-10-03  2:28 ` [PATCH 06/41] staging: lustre: ldlm: remove unnecessary EXPORT_SYMBOL James Simmons
2016-10-03  2:28 ` [PATCH 07/41] staging: lustre: llite: remove duplicate fiemap defines James Simmons
2016-10-03  2:28 ` [PATCH 08/41] staging: lustre: ptlrpc: ret -ECONNREFUSED if not context found in req James Simmons
2016-10-03  2:28 ` [PATCH 09/41] staging: lustre: llite: default dir stripe index only for mkdir James Simmons
2016-10-03  2:28 ` [PATCH 10/41] staging: lustre: libcfs: shortcut to create CPT from NUMA topology James Simmons
2016-10-03  2:28 ` [PATCH 11/41] staging: lustre: ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag James Simmons
2016-10-03  2:28 ` [PATCH 12/41] staging: lustre: clio: get rid of lov_stripe_md reference James Simmons
2016-10-03  2:28 ` [PATCH 13/41] staging: lustre: clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS James Simmons
2016-10-03  2:28 ` [PATCH 14/41] staging: lustre: ptlrpc: Add a tag field to ptlrpc messages James Simmons
2016-10-03  2:28 ` [PATCH 15/41] staging: lustre: osc: fix bug when setting max_pages_per_rpc James Simmons
2016-10-03  2:28 ` [PATCH 16/41] staging: lustre: ldlm: Do not use cbpending for group locks James Simmons
2016-10-03  2:28 ` [PATCH 17/41] staging: lustre: ptlrpc: remove old protocol compatibility James Simmons
2016-10-03  2:28 ` [PATCH 18/41] staging: lustre: llite: Report first encountered error James Simmons
2016-10-03  2:28 ` [PATCH 19/41] staging: lustre: ptlrpc: dont take unwrap in req_waittime calculation James Simmons
2016-10-03  2:28 ` [PATCH 20/41] staging: lustre: remove Size on MDS support James Simmons
2016-10-03  2:28 ` [PATCH 21/41] staging: lustre: mdc: Removed unneeded NULL check James Simmons
2016-10-03  2:28 ` [PATCH 22/41] staging: lustre: obd: remove unused LSM parameters James Simmons
2016-10-03  2:28 ` [PATCH 23/41] staging: lustre: mgc: MGC should retry for invalid import James Simmons
2016-10-03  2:28 ` [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO James Simmons
2018-02-22  3:06   ` NeilBrown
2016-10-03  2:28 ` [PATCH 25/41] staging: lustre: lov: add cl_object_layout_get() James Simmons
2016-10-03  2:28 ` [PATCH 26/41] staging: lustre: llite: remove lli_has_smd James Simmons
2016-10-03  2:28 ` [PATCH 27/41] staging: lustre: llite: add cl_object_maxbytes() James Simmons
2016-10-03  2:28 ` [PATCH 28/41] staging: lustre: hsm: make HSM modification requests replayable James Simmons
2016-10-03  2:28 ` [PATCH 29/41] staging: lustre: ptlrpc: Move NRS structures out of lustre_net.h James Simmons
2016-10-03  2:28 ` [PATCH 30/41] staging: lustre: quota: remove obsolete quota code James Simmons
2016-10-03  2:28 ` [PATCH 31/41] staging: lustre: obd: remove destroy cookie handling James Simmons
2016-10-03  2:28 ` [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO James Simmons
2016-10-09 14:16   ` Greg Kroah-Hartman
2016-10-11 23:22     ` James Simmons
2016-10-12  6:08       ` Greg Kroah-Hartman
2016-10-13 22:45         ` James Simmons
2016-10-14  7:41           ` Greg Kroah-Hartman
2016-10-03  2:28 ` [PATCH 33/41] staging: lustre: lov: use obd_get_info() to get def/max LOV EA sizes James Simmons
2016-10-03  2:28 ` [PATCH 34/41] staging: lustre: ldlm: cancel aged locks for LRUR James Simmons
2016-10-03  2:28 ` [PATCH 35/41] staging: lustre: hsm: Use file lease to implement migration James Simmons
2016-10-09 14:18   ` Greg Kroah-Hartman
2016-10-03  2:28 ` [PATCH 36/41] staging: lustre: ldlm: interval tree search in ldlm_lock_match() James Simmons
2016-10-03  2:28 ` [PATCH 37/41] staging: lustre: lov: copy_to_user uses wrong casting James Simmons
2016-10-03  2:28 ` [PATCH 38/41] staging: lustre: mdc: add max modify RPCs in flight variable James Simmons
2016-10-03  2:28 ` [PATCH 39/41] staging: lustre: osc: remove remaining bits for capa support James Simmons
2016-10-03  2:28 ` [PATCH 40/41] staging: lustre: lov: move LSM to LOV layer James Simmons
2016-10-03  2:28 ` [PATCH 41/41] staging: lustre: echo: request pages in batches James Simmons

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1475461717-21631-3-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=andreas.dilger@intel.com \
    --cc=devel@driverdev.osuosl.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=gregoire.pichon@bull.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lustre-devel@lists.lustre.org \
    --cc=oleg.drokin@intel.com \
    --cc=patrick.valentin@bull.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).