Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, jack@suse.cz, hch@infradead.org,
	hannes@cmpxchg.org, linux-fsdevel@vger.kernel.org,
	vgoyal@redhat.com, lizefan@huawei.com, cgroups@vger.kernel.org,
	linux-mm@kvack.org, mhocko@suse.cz, clm@fb.com,
	fengguang.wu@intel.com, david@fromorbit.com, gthelen@google.com,
	Tejun Heo <tj@kernel.org>
Subject: [PATCH 17/19] writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
Date: Mon,  6 Apr 2015 16:04:32 -0400
Message-ID: <1428350674-8303-18-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1428350674-8303-1-git-send-email-tj@kernel.org>

The amount of available memory to a memcg wb_domain can change as
memcg configuration changes.  A domain's ->dirty_limit exists to
smooth out sudden drops in dirty threshold; however, when a domain's
size actually drops significantly, it hinders the dirty throttling
from adjusting to the new configuration leading to unexpected
behaviors including unnecessary OOM kills.

This patch resolves the issue by adding wb_domain_size_changed() which
resets ->dirty_limit[_tstmp] and making memcg call it on configuration
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Greg Thelen <gthelen@google.com>
---
 include/linux/writeback.h | 20 ++++++++++++++++++++
 mm/memcontrol.c           | 12 ++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index a9eb40f..5b86988 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -132,6 +132,26 @@ struct wb_domain {
 	unsigned long dirty_limit;
 };
 
+/**
+ * wb_domain_size_changed - memory available to a wb_domain has changed
+ * @dom: wb_domain of interest
+ *
+ * This function should be called when the amount of memory available to
+ * @dom has changed.  It resets @dom's dirty limit parameters to prevent
+ * the past values which don't match the current configuration from skewing
+ * dirty throttling.  Without this, when memory size of a wb_domain is
+ * greatly reduced, the dirty throttling logic may allow too many pages to
+ * be dirtied leading to consecutive unnecessary OOMs and may get stuck in
+ * that situation.
+ */
+static inline void wb_domain_size_changed(struct wb_domain *dom)
+{
+	spin_lock(&dom->lock);
+	dom->dirty_limit_tstamp = jiffies;
+	dom->dirty_limit = 0;
+	spin_unlock(&dom->lock);
+}
+
 /*
  * fs/fs-writeback.c
  */	
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2a74cf3..108acfc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4096,6 +4096,11 @@ static void memcg_wb_domain_exit(struct mem_cgroup *memcg)
 	wb_domain_exit(&memcg->cgwb_domain);
 }
 
+static void memcg_wb_domain_size_changed(struct mem_cgroup *memcg)
+{
+	wb_domain_size_changed(&memcg->cgwb_domain);
+}
+
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
@@ -4117,6 +4122,10 @@ static void memcg_wb_domain_exit(struct mem_cgroup *memcg)
 {
 }
 
+static void memcg_wb_domain_size_changed(struct mem_cgroup *memcg)
+{
+}
+
 #endif	/* CONFIG_CGROUP_WRITEBACK */
 
 /*
@@ -4715,6 +4724,7 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
 	memcg->low = 0;
 	memcg->high = PAGE_COUNTER_MAX;
 	memcg->soft_limit = PAGE_COUNTER_MAX;
+	memcg_wb_domain_size_changed(memcg);
 }
 
 #ifdef CONFIG_MMU
@@ -5347,6 +5357,7 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
 
 	memcg->high = high;
 
+	memcg_wb_domain_size_changed(memcg);
 	return nbytes;
 }
 
@@ -5379,6 +5390,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 	if (err)
 		return err;
 
+	memcg_wb_domain_size_changed(memcg);
 	return nbytes;
 }
 
-- 
2.1.0

  parent reply index

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-06 20:04 [PATCHSET 2/3 v2 block/for-4.1/core] writeback: cgroup writeback backpressure propagation Tejun Heo
2015-04-06 20:04 ` [PATCH 01/19] memcg: make mem_cgroup_read_{stat|event}() iterate possible cpus instead of online Tejun Heo
2015-04-06 20:04 ` [PATCH 02/19] writeback: clean up wb_dirty_limit() Tejun Heo
2015-04-06 20:04 ` [PATCH 03/19] writeback: reorganize [__]wb_update_bandwidth() Tejun Heo
2015-04-06 20:04 ` [PATCH 04/19] writeback: implement wb_domain Tejun Heo
2015-04-06 20:04 ` [PATCH 05/19] writeback: move global_dirty_limit into wb_domain Tejun Heo
2015-04-06 20:04 ` [PATCH 06/19] writeback: consolidate dirty throttle parameters into dirty_throttle_control Tejun Heo
2015-04-06 20:04 ` [PATCH 07/19] writeback: add dirty_throttle_control->wb_bg_thresh Tejun Heo
2015-04-06 20:04 ` [PATCH 08/19] writeback: make __wb_calc_thresh() take dirty_throttle_control Tejun Heo
2015-04-06 20:04 ` [PATCH 09/19] writeback: add dirty_throttle_control->pos_ratio Tejun Heo
2015-04-06 20:04 ` [PATCH 10/19] writeback: add dirty_throttle_control->wb_completions Tejun Heo
2015-04-06 20:04 ` [PATCH 11/19] writeback: add dirty_throttle_control->dom Tejun Heo
2015-04-06 20:04 ` [PATCH 12/19] writeback: make __wb_writeout_inc() and hard_dirty_limit() take wb_domaas a parameter Tejun Heo
2015-04-06 20:04 ` [PATCH 13/19] writeback: separate out domain_dirty_limits() Tejun Heo
2015-04-06 20:04 ` [PATCH 14/19] writeback: move over_bground_thresh() to mm/page-writeback.c Tejun Heo
2015-04-06 20:04 ` [PATCH 15/19] writeback: update wb_over_bg_thresh() to use wb_domain aware operations Tejun Heo
2015-04-06 20:04 ` [PATCH 16/19] writeback: implement memcg wb_domain Tejun Heo
2015-04-06 20:04 ` Tejun Heo [this message]
2015-04-06 20:04 ` [PATCH 18/19] writeback: implement memcg writeback domain based throttling Tejun Heo
2015-04-06 20:04 ` [PATCH 19/19] mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use Tejun Heo
2015-04-06 20:07 ` [PATCHSET 2/3 v2 block/for-4.1/core] writeback: cgroup writeback backpressure propagation Tejun Heo
2015-05-22 22:23 [PATCHSET 2/3 v3 block/for-4.2/core] " Tejun Heo
2015-05-22 22:23 ` [PATCH 17/19] writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1428350674-8303-18-git-send-email-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=clm@fb.com \
    --cc=david@fromorbit.com \
    --cc=fengguang.wu@intel.com \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan@huawei.com \
    --cc=mhocko@suse.cz \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git