From: Jan Kara <jack@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH block/for-next] writeback: add tracepoints for cgroup foreign writebacks
Date: Mon, 2 Sep 2019 10:38:12 +0200
Message-ID: <20190902083812.GA14207@quack2.suse.cz>
In-Reply-To: <20190830170903.GB2263813@devbig004.ftw2.facebook.com>

Hello Tejun,

On Fri 30-08-19 10:09:03, Tejun Heo wrote:
> On Fri, Aug 30, 2019 at 06:42:11PM +0200, Jan Kara wrote:
> > Well, but if you look at __set_page_dirty_nobuffers() it is careful. It
> > does:
> >
> >     struct address_space *mapping = page_mapping(page);
> >
> >     if (!mapping) {
> >         bail
> >     }
> >     ... use mapping
> >
> > Exactly because page->mapping can become NULL under your hands if you don't
> > hold page lock. So I think you either need something similar in your
> > tracepoint or handle this in the caller.
>
> So, account_page_dirtied() is called from two places.
>
> __set_page_dirty() and __set_page_dirty_nobuffers(). The following is
> from the latter.
>
>     lock_page_memcg(page);
>     if (!TestSetPageDirty(page)) {
>         struct address_space *mapping = page_mapping(page);
>         ...
>
>         if (!mapping) {
>             unlock_page_memcg(page);
>             return 1;
>         }
>
>         xa_lock_irqsave(&mapping->i_pages, flags);
>         BUG_ON(page_mapping(page) != mapping);
>         WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
>         account_page_dirtied(page, mapping);
>         ...
>
> If I'm reading it right, it's saying that at this point if mapping
> exists after setting page dirty, it must not change while locking
> i_pages.

Correct, __set_page_dirty_nobuffers() is supposed to be called serialized
with truncation, either through the page lock or by other means. At least
the comment says so and the code looks like that.

>
> __set_page_dirty_nobuffers() is more brief but seems to be making the
> same assumption.

I suppose you mean __set_page_dirty() here.

>     xa_lock_irqsave(&mapping->i_pages, flags);
>     if (page->mapping) { /* Race with truncate? */
>         WARN_ON_ONCE(warn && !PageUptodate(page));
>         account_page_dirtied(page, mapping);
>         __xa_set_mark(&mapping->i_pages, page_index(page),
>                       PAGECACHE_TAG_DIRTY);
>     }
>     xa_unlock_irqrestore(&mapping->i_pages, flags);
>
> Both are clearly assuming that once i_pages is locked, mapping can't
> change. So, inside account_page_dirtied(), mapping clearly can't
> change. The TP in question - track_foreign_dirty - is invoked from
> mem_cgroup_track_foreign_dirty() which is only called from
> account_page_dirty(), so I'm failing to see how mapping would change
> there.

I'm not sure where we depend here on page->mapping not getting cleared. The
point is that even if page->mapping gets cleared while we work on the page,
we have 'mapping' stored locally, so we just account everything against the
original mapping.
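
To make the point about the locally stored 'mapping' concrete, here is a
minimal userspace sketch. It is not kernel code: struct page_model, struct
mapping_model, and both function names are hypothetical stand-ins, and the
truncate_midway flag merely simulates truncation clearing page->mapping
after the snapshot was taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct mapping_model {
    long nr_dirtied;               /* dirty accounting for this mapping */
};

struct page_model {
    struct mapping_model *mapping; /* may be cleared by truncation */
    int dirty;
};

/* Accounts against the mapping passed in; never re-reads page->mapping. */
static void account_dirtied_model(struct page_model *page,
                                  struct mapping_model *mapping)
{
    (void)page;
    assert(mapping != NULL);
    mapping->nr_dirtied++;
}

/*
 * Returns 1 if the page was newly dirtied, 0 if it was already dirty.
 * truncate_midway simulates (assumption, single-threaded) a truncate that
 * clears page->mapping after the local snapshot has been taken.
 */
static int set_page_dirty_model(struct page_model *page, int truncate_midway)
{
    struct mapping_model *mapping;

    if (page->dirty)
        return 0;
    page->dirty = 1;

    mapping = page->mapping;       /* snapshot once, use the copy below */
    if (!mapping)
        return 1;                  /* already truncated, nothing to account */

    if (truncate_midway)
        page->mapping = NULL;      /* simulated truncation */

    account_dirtied_model(page, mapping);
    return 1;
}
```

Even when the simulated truncation clears page->mapping, the accounting
still lands on the mapping that was snapshotted, which is all the sketch is
meant to show. It says nothing about the locking the real code relies on.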

I've researched this a bit more, and commit 2d6d7f982846 "mm: protect
set_page_dirty() from ongoing truncation" introduced the idea that
__set_page_dirty_nobuffers() should only be called synchronized with
truncation. Now I know for a fact that this is not always the case (e.g.
various RDMA drivers calling set_page_dirty() without a lock or any other
protection against truncate), but let's consider this a bug in the caller of
set_page_dirty(). So in the end I agree that you're fine with relying on
page_mapping() not changing under you.

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR