From: zhangliguang <zhangliguang@linux.alibaba.com>
To: tj@kernel.org, akpm@linux-foundation.org
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH] fs/writeback: Attach inode's wb to root if needed
Date: Thu,  9 May 2019 16:03:53 +0800
Message-ID: <1557389033-39649-1-git-send-email-zhangliguang@linux.alibaba.com>

There may be a large number of files queued for writeback whose
writeback cgroup has already died. In that case we try to reassociate
the inode with another writeback cgroup, but this may not be possible
because the writeback associated with the dead cgroup is the only valid
one. A new writeback is then allocated, initialized and associated with
the inode, which causes unnecessarily high system load and latency.

Fix the issue by forcibly moving the inode to the root cgroup when the
cgroup it was previously bound to is dead. With this change, no
unnecessary writebacks are created and populated, and system load
dropped by about 6x in the online service where we encountered the
problem:
    Without the patch: about 30% system load
    With the patch:    about  5% system load

The perf profile also changes significantly with the patch:

========================================================================
We recorded the trace with 'perf record -e cycles:k -g -a'.

The trace without the patch:

+  44.68%	[kernel]  [k] native_queued_spin_lock_slowpath
+   3.38%	[kernel]  [k] memset_erms
+   3.04%	[kernel]  [k] pcpu_alloc_area
+   1.46%	[kernel]  [k] __memmove
+   1.37%	[kernel]  [k] pcpu_alloc

Detailed breakdown of native_queued_spin_lock_slowpath:
44.68%       [kernel]  [k] native_queued_spin_lock_slowpath
 - native_queued_spin_lock_slowpath
    - _raw_spin_lock_irqsave
       - 68.80% pcpu_alloc
          - __alloc_percpu_gfp
             - 85.01% __percpu_counter_init
                - 66.75% wb_init
                     wb_get_create
                     inode_switch_wbs
                     wbc_attach_and_unlock_inode
                     __filemap_fdatawrite_range
                   + filemap_write_and_wait_range
                + 33.25% fprop_local_init_percpu
             + 14.99% percpu_ref_init
       + 30.77% free_percpu

system load (by top)
%Cpu(s): 31.9% sy

With the patch:

+   4.45%       [kernel]  [k] native_queued_spin_lock_slowpath
+   3.32%       [kernel]  [k] put_compound_page
+   2.20%       [kernel]  [k] gup_pte_range
+   1.91%       [kernel]  [k] kstat_irqs

system load (by top)
%Cpu(s): 5.4% sy

Signed-off-by: zhangliguang <zhangliguang@linux.alibaba.com>
---
 fs/fs-writeback.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 36855c1..e7e19d8 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -696,6 +696,13 @@ void wbc_detach_inode(struct writeback_control *wbc)
 	inode->i_wb_frn_avg_time = min(avg_time, (unsigned long)U16_MAX);
 	inode->i_wb_frn_history = history;
 
+	/*
+	 * Switch a clean inode off a dying wb to the root memcg's wb. We expect
+	 * the correct wb (best candidate) to be picked in the next round.
+	 */
+	if (wb == inode->i_wb && wb_dying(wb) && !(inode->i_state & I_DIRTY_ALL))
+		inode_switch_wbs(inode, root_mem_cgroup->css.id);
+
 	wb_put(wbc->wb);
 	wbc->wb = NULL;
 }
-- 
1.8.3.1
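
For readers who want to reason about when the new branch in
wbc_detach_inode() fires, the condition can be sketched in userspace as
below. This is only a model, not kernel code: struct model_wb,
struct model_inode, MODEL_I_DIRTY_ALL and should_switch_to_root() are
hypothetical stand-ins invented for illustration, and the mask value is
an assumption rather than the kernel's actual I_DIRTY_ALL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
#define MODEL_I_DIRTY_ALL 0x7u	/* assumed stand-in for I_DIRTY_ALL */

struct model_wb {
	bool dying;		/* models what wb_dying(wb) would report */
};

struct model_inode {
	struct model_wb *i_wb;	/* wb the inode is currently attached to */
	unsigned int i_state;	/* dirty bits, modelled on inode->i_state */
};

/*
 * Mirrors the condition added by the patch: switch to the root wb only
 * if the wb used for this writeback round is still the inode's wb, that
 * wb is dying, and the inode has no dirty state left.
 */
static bool should_switch_to_root(const struct model_inode *inode,
				  const struct model_wb *wb)
{
	return wb == inode->i_wb && wb->dying &&
	       !(inode->i_state & MODEL_I_DIRTY_ALL);
}
```

In this model a clean inode still attached to a dying wb is the only
case that returns true; a dirty inode, a live wb, or an inode that has
already been switched to a different wb is left alone.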

