From: Oleg Drokin <oleg.drokin@intel.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "<linux-fsdevel@vger.kernel.org>" <linux-fsdevel@vger.kernel.org>,
	Lustre Development List <lustre-devel@lists.lustre.org>,
	Jinshan Xiong <jinshan.xiong@intel.com>
Subject: Re: insanity in ll_dirty_page_discard_warn()
Date: Thu, 28 Jul 2016 15:25:45 -0400
Message-ID: <FBC7930D-6970-45F6-975F-D9B349B5C7EF@intel.com>
In-Reply-To: <20160728182659.GV2356@ZenIV.linux.org.uk>


On Jul 28, 2016, at 2:26 PM, Al Viro wrote:

>        /* this can be called inside spin lock so use GFP_ATOMIC. */
>        buf = (char *)__get_free_page(GFP_ATOMIC);
>        if (buf) {
>                dentry = d_find_alias(page->mapping->host);
> 	...
> 	if (dentry)
> 		dput(dentry);
> 
> If it *can* be called under a spinlock, you have an obvious problem -
> dput() can sleep.  d_find_alias() might've picked a hashed dentry with
> zero refcount that got unhashed by the time of dput().  Or other references
> used to exist, but got dropped by that point...

Ah, the dput()->dentry_kill()->cpu_relax() I guess?

(the final iput cannot catch us here, I think, because we still have pages
in the mapping)
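
To make the quoted scenario concrete, here is a toy userspace model of it. This is only a sketch and not kernel code: every toy_* name is hypothetical, and the real dput() has more cases than the single "last reference on an unhashed dentry" rule assumed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code: all toy_* names are hypothetical. */
struct toy_dentry {
	int  refcount;	/* number of holders */
	bool hashed;	/* still reachable through lookup? */
	bool killed;	/* teardown path was entered */
};

/* Models d_find_alias(): hands back the alias with one extra reference. */
static struct toy_dentry *toy_find_alias(struct toy_dentry *d)
{
	d->refcount++;
	return d;
}

/* Models somebody else unhashing the dentry in the meantime. */
static void toy_unhash(struct toy_dentry *d)
{
	d->hashed = false;
}

/*
 * Models dput(): returns true when this call dropped the last reference
 * to an unhashed dentry and so entered the teardown path, which is the
 * part that must not run under a spinlock.
 */
static bool toy_dput(struct toy_dentry *d)
{
	if (--d->refcount > 0)
		return false;	/* somebody else still holds it */
	if (d->hashed)
		return false;	/* stays cached, nothing to tear down */
	d->killed = true;
	return true;
}

/* The sequence from the quoted text: zero refcount, picked up, unhashed, put. */
static bool toy_race_demo(void)
{
	struct toy_dentry d = { .refcount = 0, .hashed = true, .killed = false };
	struct toy_dentry *alias = toy_find_alias(&d);

	toy_unhash(alias);
	return toy_dput(alias);
}
```

In this model toy_race_demo() reports that the final put entered the teardown path, while the same sequence without toy_unhash() does not.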

Hm... So the original reported path was:
            ll_dirty_page_discard_warn at ffffffffa0a3d252 [lustre]
            vvp_page_completion_common at ffffffffa0a7adfc [lustre]
            vvp_page_completion_write_common at ffffffffa0a7ae6b [lustre]
            vvp_page_completion_write at ffffffffa0a7b83e [lustre]
            cl_page_completion at ffffffffa05eed8f [obdclass]
            osc_completion at ffffffffa0880812 [osc]
            osc_ap_completion at ffffffffa086a544 [osc]
            brw_interpret at ffffffffa0876d69 [osc]

But we don't even have a call to osc_ap_completion from brw_interpret
anymore.

osc_ap_completion() itself has a comment that it is to be called under
cl_loi_list_lock, but then tries to take it itself, so the comment
is definitely stale.
And osc_completion() is called outside of that coverage.
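
The reasoning about the stale comment can be sketched with a toy non-recursive lock. This is a userspace illustration only: the toy_* names are hypothetical and nothing here is taken from the osc code. A function that acquires a lock itself cannot also require its caller to hold that lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock, NOT a kernel spinlock: names are hypothetical. */
struct toy_lock {
	bool held;
};

/* Returns false where a real non-recursive spinlock would spin forever. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Models a completion function that takes the list lock itself.
 * Returns false if the caller already held the lock, i.e. the case
 * the stale comment describes and that would self-deadlock.
 */
static bool toy_ap_completion(struct toy_lock *list_lock)
{
	if (!toy_lock_acquire(list_lock))
		return false;
	/* ... work done under the lock would go here ... */
	toy_lock_release(list_lock);
	return true;
}
```

Calling toy_ap_completion() with the lock free succeeds; calling it with the lock already held fails, which is why a "call under the lock" comment on such a function cannot be current.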

I tend to think the comment is stale now, but I need to investigate further
before I am 100% sure of that.

Thanks for bringing it to our attention.
