From: Sage Weil <sage@inktank.com>
To: "Yan, Zheng" <ukernel@gmail.com>
Cc: Sha Zhengju <handai.szj@gmail.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	ceph-devel <ceph-devel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	cgroups@vger.kernel.org, mhocko@suse.cz,
	kamezawa.hiroyu@jp.fujitsu.com, glommer@gmail.com,
	Greg Thelen <gthelen@google.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sha Zhengju <handai.szj@taobao.com>
Subject: Re: [PATCH V5 2/8] fs/ceph: vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem
Date: Thu, 1 Aug 2013 11:27:49 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1308011121080.22584@cobra.newdream.net>
In-Reply-To: <CAAM7YAmxmmA6g2WPVtGN1-42rtDBYzLhF-gvNXxcBN6dUveBYQ@mail.gmail.com>

On Thu, 1 Aug 2013, Yan, Zheng wrote:
> On Thu, Aug 1, 2013 at 7:51 PM, Sha Zhengju <handai.szj@gmail.com> wrote:
> > From: Sha Zhengju <handai.szj@taobao.com>
> >
> > Following we will begin to add memcg dirty page accounting around
> > __set_page_dirty_{buffers,nobuffers} in vfs layer, so we'd better use
> > vfs interface to avoid exporting those details to filesystems.
> >
> > Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
> > ---
> >  fs/ceph/addr.c |   13 +------------
> >  1 file changed, 1 insertion(+), 12 deletions(-)
> >
> > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> > index 3e68ac1..1445bf1 100644
> > --- a/fs/ceph/addr.c
> > +++ b/fs/ceph/addr.c
> > @@ -76,7 +76,7 @@ static int ceph_set_page_dirty(struct page *page)
> >         if (unlikely(!mapping))
> >                 return !TestSetPageDirty(page);
> >
> > -       if (TestSetPageDirty(page)) {
> > +       if (!__set_page_dirty_nobuffers(page)) {
> It's too early to set the radix tree tag here. We should set the page's
> snapshot context and increase i_wrbuffer_ref first, because once the tag
> is set, the writeback thread can find the page and start flushing it.

Unfortunately I only remember being frustrated by this code.  :)  Looking
at it now, though, it seems like the minimum fix is to set
page->private before marking the page dirty.  I don't know the locking
rules around that, though.  If that is potentially racy, maybe the safest
thing would be for __set_page_dirty_nobuffers() to take a void * and
store it in page->private atomically while holding the tree_lock.
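
To make the ordering concrete, here is a minimal userspace sketch of that
idea. It is only a model, not kernel code: the model_* types and helpers
are hypothetical, the spinlock stands in for mapping->tree_lock, and
private_data stands in for page->private. It assumes the caller always
passes a non-NULL private pointer. The point is just that the private
pointer is stored before the dirty tag becomes visible, all under one lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types or helpers are kernel APIs. */
struct model_page {
    bool dirty;          /* stands in for PG_dirty */
    bool tagged_dirty;   /* stands in for PAGECACHE_TAG_DIRTY */
    void *private_data;  /* stands in for page->private (snap context) */
};

struct model_mapping {
    atomic_flag tree_lock;  /* stands in for mapping->tree_lock */
};

static void model_lock(struct model_mapping *m)
{
    /* Simple spinlock: loop until we are the one who set the flag. */
    while (atomic_flag_test_and_set_explicit(&m->tree_lock,
                                             memory_order_acquire))
        ;
}

static void model_unlock(struct model_mapping *m)
{
    atomic_flag_clear_explicit(&m->tree_lock, memory_order_release);
}

/*
 * Hypothetical variant of the dirtying helper that also takes the private
 * pointer. Returns 1 if the page was newly dirtied, 0 if it was already
 * dirty (in which case the existing private pointer is left alone).
 */
static int model_set_page_dirty_private(struct model_mapping *m,
                                        struct model_page *p, void *priv)
{
    int newly_dirty = 0;

    model_lock(m);
    if (!p->dirty) {
        p->private_data = priv;  /* store private data first */
        p->dirty = true;
        p->tagged_dirty = true;  /* only now can writeback find the page */
        newly_dirty = 1;
    }
    model_unlock(m);
    return newly_dirty;
}

/*
 * Writeback side of the model: returns the private pointer if the page is
 * tagged dirty, or NULL if the page is not tagged.
 */
static void *model_writeback_lookup(struct model_mapping *m,
                                    struct model_page *p)
{
    void *priv = NULL;

    model_lock(m);
    if (p->tagged_dirty) {
        /* Invariant the ordering above is meant to guarantee. */
        assert(p->private_data != NULL);
        priv = p->private_data;
    }
    model_unlock(m);
    return priv;
}
```

In this model a writeback lookup can never see a tagged page without its
private pointer, because both are set inside the same critical section.
Whether the real tree_lock can be used this way is exactly the locking
question I can't answer offhand.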

sage

> 
> >                 dout("%p set_page_dirty %p idx %lu -- already dirty\n",
> >                      mapping->host, page, page->index);
> >                 return 0;
> > @@ -107,14 +107,7 @@ static int ceph_set_page_dirty(struct page *page)
> >              snapc, snapc->seq, snapc->num_snaps);
> >         spin_unlock(&ci->i_ceph_lock);
> >
> > -       /* now adjust page */
> > -       spin_lock_irq(&mapping->tree_lock);
> >         if (page->mapping) {    /* Race with truncate? */
> > -               WARN_ON_ONCE(!PageUptodate(page));
> > -               account_page_dirtied(page, page->mapping);
> > -               radix_tree_tag_set(&mapping->page_tree,
> > -                               page_index(page), PAGECACHE_TAG_DIRTY);
> > -
> 
> This code was copied from __set_page_dirty_nobuffers(). I think the reason
> Sage did this is to handle the race described in the comment in
> __set_page_dirty_nobuffers(). But I wonder whether "page->mapping ==
> NULL" can still happen here, because truncate_inode_page() unmaps the page
> from processes' address spaces first, then deletes it from the page cache.
> 
> Regards
> Yan, Zheng
> 
> >                 /*
> >                  * Reference snap context in page->private.  Also set
> >                  * PagePrivate so that we get invalidatepage callback.
> > @@ -126,14 +119,10 @@ static int ceph_set_page_dirty(struct page *page)
> >                 undo = 1;
> >         }
> >
> > -       spin_unlock_irq(&mapping->tree_lock);
>
> > -
> >         if (undo)
> >                 /* whoops, we failed to dirty the page */
> >                 ceph_put_wrbuffer_cap_refs(ci, 1, snapc);
> >
> > -       __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> > -
> >         BUG_ON(!PageDirty(page));
> >         return 1;
> >  }
> > --
> > 1.7.9.5