From: NeilBrown <neilb@suse.com>
To: James Simmons <jsimmons@infradead.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	devel@driverdev.osuosl.org,
	Andreas Dilger <andreas.dilger@intel.com>,
	Oleg Drokin <oleg.drokin@intel.com>,
	Lai Siyao <lai.siyao@intel.com>,
	Jinshan Xiong <jinshan.xiong@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lustre Development List <lustre-devel@lists.lustre.org>,
	James Simmons <jsimmons@infradead.org>
Subject: Re: [PATCH 4/4] staging: lustre: obdclass: change object lookup to no wait mode
Date: Fri, 04 May 2018 11:15:27 +1000
Message-ID: <876044fcgg.fsf@notabene.neil.brown.name>
In-Reply-To: <1525285308-15347-5-git-send-email-jsimmons@infradead.org>

On Wed, May 02 2018, James Simmons wrote:

> From: Lai Siyao <lai.siyao@intel.com>
>
> Currently we set LU_OBJECT_HEARD_BANSHEE on object when we want
> to remove object from cache, but this may lead to deadlock, because
> when other process lookup such object, it needs to wait for this
> object until release (done at last refcount put), while that process
> maybe already hold an LDLM lock.
>
> Now that current code can handle dying object correctly, we can just
> return such object in lookup, thus the above deadlock can be avoided.

I think one of the reasons that I didn't apply this to mainline myself
is that "Now that" comment.  When is the "now" that it is referring to?
Are we sure that all code in mainline "can handle dying objects
correctly"??
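
To make the concern concrete, here is a rough userspace sketch of what
I would expect every caller to need once lookup can hand back a dying
object.  It is not the Lustre code: struct toy_obj, toy_lookup(),
toy_put() and toy_use() are hypothetical names, the "dying" field only
stands in for LU_OBJECT_HEARD_BANSHEE, and returning -ESTALE is an
arbitrary choice for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lu_object_header; not the Lustre type. */
struct toy_obj {
	unsigned long id;
	int refcount;
	bool dying;	/* models LU_OBJECT_HEARD_BANSHEE being set */
};

/*
 * No-wait lookup: a matching object is returned with a reference taken,
 * whether or not it is dying.  The lookup never sleeps.
 */
static struct toy_obj *toy_lookup(struct toy_obj *tbl, size_t n,
				  unsigned long id)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].id == id) {
			tbl[i].refcount++;
			return &tbl[i];
		}
	}
	return NULL;
}

/* Drop a reference taken by toy_lookup(). */
static void toy_put(struct toy_obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/*
 * A caller that "handles dying objects correctly" in this toy model:
 * it checks the flag itself, drops its reference and reports an error
 * instead of using the object.
 */
static int toy_use(struct toy_obj *tbl, size_t n, unsigned long id)
{
	struct toy_obj *o = toy_lookup(tbl, n, id);

	if (!o)
		return -ENOENT;
	if (o->dying) {
		toy_put(o);
		return -ESTALE;
	}
	/* ... the object would be used here ... */
	toy_put(o);
	return 0;
}
```

The point of the sketch is only that the "is it dying?" check moves
out of the lookup and into each caller, so every caller in mainline
would need an equivalent of the check in toy_use().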


>
> Signed-off-by: Lai Siyao <lai.siyao@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9049
> Reviewed-on: https://review.whamcloud.com/26965
> Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
> Tested-by: Cliff White <cliff.white@intel.com>
> Reviewed-by: Fan Yong <fan.yong@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/include/lu_object.h  |  2 +-
>  drivers/staging/lustre/lustre/obdclass/lu_object.c | 82 +++++++++-------------
>  2 files changed, 36 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
> index f29bbca..232063a 100644
> --- a/drivers/staging/lustre/lustre/include/lu_object.h
> +++ b/drivers/staging/lustre/lustre/include/lu_object.h
> @@ -673,7 +673,7 @@ static inline void lu_object_get(struct lu_object *o)
>  }
>  
>  /**
> - * Return true of object will not be cached after last reference to it is
> + * Return true if object will not be cached after last reference to it is
>   * released.
>   */
>  static inline int lu_object_is_dying(const struct lu_object_header *h)
> diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
> index 8b507f1..9311703 100644
> --- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
> +++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
> @@ -589,19 +589,13 @@ static struct lu_object *htable_lookup(struct lu_site *s,
>  				       const struct lu_fid *f,
>  				       __u64 *version)
>  {
> -	struct cfs_hash		*hs = s->ls_obj_hash;
>  	struct lu_site_bkt_data *bkt;
>  	struct lu_object_header *h;
>  	struct hlist_node	*hnode;
> -	__u64 ver;
> -	wait_queue_entry_t waiter;
> +	u64 ver = cfs_hash_bd_version_get(bd);
>  
> -retry:
> -	ver = cfs_hash_bd_version_get(bd);
> -
> -	if (*version == ver) {
> +	if (*version == ver)
>  		return ERR_PTR(-ENOENT);
> -	}
>  
>  	*version = ver;
>  	bkt = cfs_hash_bd_extra_get(s->ls_obj_hash, bd);
> @@ -615,31 +609,13 @@ static struct lu_object *htable_lookup(struct lu_site *s,
>  	}
>  
>  	h = container_of(hnode, struct lu_object_header, loh_hash);
> -	if (likely(!lu_object_is_dying(h))) {
> -		cfs_hash_get(s->ls_obj_hash, hnode);
> -		lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT);
> -		if (!list_empty(&h->loh_lru)) {
> -			list_del_init(&h->loh_lru);
> -			percpu_counter_dec(&s->ls_lru_len_counter);
> -		}
> -		return lu_object_top(h);
> +	cfs_hash_get(s->ls_obj_hash, hnode);
> +	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT);
> +	if (!list_empty(&h->loh_lru)) {
> +		list_del_init(&h->loh_lru);
> +		percpu_counter_dec(&s->ls_lru_len_counter);
>  	}
> -
> -	/*
> -	 * Lookup found an object being destroyed this object cannot be
> -	 * returned (to assure that references to dying objects are eventually
> -	 * drained), and moreover, lookup has to wait until object is freed.
> -	 */
> -
> -	init_waitqueue_entry(&waiter, current);
> -	add_wait_queue(&bkt->lsb_marche_funebre, &waiter);
> -	set_current_state(TASK_UNINTERRUPTIBLE);
> -	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_DEATH_RACE);
> -	cfs_hash_bd_unlock(hs, bd, 1);
> -	schedule();
> -	remove_wait_queue(&bkt->lsb_marche_funebre, &waiter);
> -	cfs_hash_bd_lock(hs, bd, 1);
> -	goto retry;
> +	return lu_object_top(h);
>  }
>  
>  /**
> @@ -680,6 +656,8 @@ static void lu_object_limit(const struct lu_env *env, struct lu_device *dev)
>  }
>  
>  /**
> + * Core logic of lu_object_find*() functions.
> + *
>   * Much like lu_object_find(), but top level device of object is specifically
>   * \a dev rather than top level device of the site. This interface allows
>   * objects of different "stacking" to be created within the same site.
> @@ -713,36 +691,46 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
>  	 * It is unnecessary to perform lookup-alloc-lookup-insert, instead,
>  	 * just alloc and insert directly.
>  	 *
> +	 * If dying object is found during index search, add @waiter to the
> +	 * site wait-queue and return ERR_PTR(-EAGAIN).

It seems odd to add this comment here, when it seems to describe code
that is being removed.
I can see that this comment is added by the upstream patch
Commit: fa14bdf6b648 ("LU-9049 obdclass: change object lookup to no wait mode")
but I cannot see what it refers to.

Otherwise that patch looks good.

Thanks,
NeilBrown
