linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion
@ 2021-09-24  6:46 brookxu
  2021-09-24  6:46 ` [PATCH 2/2] mem_cgroup: introduce foreign_writeback_in_process() function brookxu
  2021-09-24  9:34 ` [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion Michal Hocko
  0 siblings, 2 replies; 6+ messages in thread
From: brookxu @ 2021-09-24  6:46 UTC (permalink / raw)
  To: hannes, mhocko, vdavydov.dev, akpm; +Cc: linux-kernel, cgroups

From: Chunguang Xu <brookxu@tencent.com>

In order to track inflight foreign writeback, we init
wb_completion.cnt to 1. For normal writeback, this causes
wb_wait_for_completion() to perform meaningless atomic
operations. Since foreign writebacks rarely occur in most
scenarios, we can init wb_completion.cnt to 0 and set
frn.done.cnt to 1 explicitly. In this way we can avoid
unnecessary atomic operations.

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
---
 fs/fs-writeback.c                | 1 -
 include/linux/backing-dev-defs.h | 2 +-
 mm/memcontrol.c                  | 7 ++++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 81ec192..1ef10f2 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -186,7 +186,6 @@ static void wb_queue_work(struct bdi_writeback *wb,
  */
 void wb_wait_for_completion(struct wb_completion *done)
 {
-	atomic_dec(&done->cnt);		/* put down the initial count */
 	wait_event(*done->waitq, !atomic_read(&done->cnt));
 }
 
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3320700..38bd571 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -71,7 +71,7 @@ struct wb_completion {
 };
 
 #define __WB_COMPLETION_INIT(_waitq)	\
-	(struct wb_completion){ .cnt = ATOMIC_INIT(1), .waitq = (_waitq) }
+	(struct wb_completion){ .cnt = ATOMIC_INIT(0), .waitq = (_waitq) }
 
 /*
  * If one wants to wait for one or more wb_writeback_works, each work's
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b762215..3e1384a6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5168,9 +5168,10 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 #endif
 #ifdef CONFIG_CGROUP_WRITEBACK
 	INIT_LIST_HEAD(&memcg->cgwb_list);
-	for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++)
-		memcg->cgwb_frn[i].done =
-			__WB_COMPLETION_INIT(&memcg_cgwb_frn_waitq);
+	for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++) {
+		atomic_set(&memcg->cgwb_frn[i].done.cnt, 1);
+		memcg->cgwb_frn[i].done.waitq = &memcg_cgwb_frn_waitq;
+	}
 #endif
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
-- 
1.8.3.1
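
[Editor's note] The counting change described above can be modelled in
userspace C11. This is only a sketch under stated assumptions: the wait
queue is omitted, all model_* names and the rmw_ops counter are
hypothetical, and the queue/finish helpers assume that queuing a work item
increments done->cnt and finishing it decrements done->cnt, which the
patch itself does not show.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for struct wb_completion (wait queue omitted). */
struct completion_model {
	atomic_int cnt;
	int rmw_ops;	/* atomic read-modify-write operations performed so far */
};

static void model_init(struct completion_model *d, int initial)
{
	atomic_init(&d->cnt, initial);
	d->rmw_ops = 0;
}

/* Assumption: queuing a work item takes one reference on the completion. */
static void model_queue_work(struct completion_model *d)
{
	atomic_fetch_add(&d->cnt, 1);
	d->rmw_ops++;
}

/* Assumption: finishing a work item drops it; true if cnt reached zero. */
static bool model_finish_work(struct completion_model *d)
{
	d->rmw_ops++;
	return atomic_fetch_sub(&d->cnt, 1) == 1;
}

/* Before the patch: cnt starts at 1 and the waiter drops the initial count. */
static bool model_wait_ready_old(struct completion_model *d)
{
	atomic_fetch_sub(&d->cnt, 1);
	d->rmw_ops++;
	return atomic_load(&d->cnt) == 0;
}

/* After the patch: cnt starts at 0 and the waiter only reads it. */
static bool model_wait_ready_new(struct completion_model *d)
{
	return atomic_load(&d->cnt) == 0;
}
```

With one work item queued and finished, the old scheme (initial count 1)
performs three read-modify-write operations in this model and the new
scheme (initial count 0) performs two. The model does not cover callers
that wait on a foreign slot whose idle value is 1.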



* [PATCH 2/2] mem_cgroup: introduce foreign_writeback_in_process() function
  2021-09-24  6:46 [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion brookxu
@ 2021-09-24  6:46 ` brookxu
  2021-09-24  9:34 ` [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion Michal Hocko
  1 sibling, 0 replies; 6+ messages in thread
From: brookxu @ 2021-09-24  6:46 UTC (permalink / raw)
  To: hannes, mhocko, vdavydov.dev, akpm; +Cc: linux-kernel, cgroups

From: Chunguang Xu <brookxu@tencent.com>

Directly using atomic_read(&frn->done.cnt) == 1 to check whether a
writeback has been issued for frn makes the code a bit obscure.
Replace it with a more understandable helper function.

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
---
 mm/memcontrol.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3e1384a6..464745b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4542,6 +4542,12 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
  * As being wrong occasionally doesn't matter, updates and accesses to the
  * records are lockless and racy.
  */
+
+static inline bool foreign_writeback_in_process(struct memcg_cgwb_frn *frn)
+{
+	return atomic_read(&frn->done.cnt) != 1;
+}
+
 void mem_cgroup_track_foreign_dirty_slowpath(struct page *page,
 					     struct bdi_writeback *wb)
 {
@@ -4565,7 +4571,7 @@ void mem_cgroup_track_foreign_dirty_slowpath(struct page *page,
 		    frn->memcg_id == wb->memcg_css->id)
 			break;
 		if (time_before64(frn->at, oldest_at) &&
-		    atomic_read(&frn->done.cnt) == 1) {
+		    !foreign_writeback_in_process(frn)) {
 			oldest = i;
 			oldest_at = frn->at;
 		}
@@ -4612,7 +4618,7 @@ void mem_cgroup_flush_foreign(struct bdi_writeback *wb)
 		 * already one in flight.
 		 */
 		if (time_after64(frn->at, now - intv) &&
-		    atomic_read(&frn->done.cnt) == 1) {
+		    !foreign_writeback_in_process(frn)) {
 			frn->at = 0;
 			trace_flush_foreign(wb, frn->bdi_id, frn->memcg_id);
 			cgroup_writeback_by_id(frn->bdi_id, frn->memcg_id,
-- 
1.8.3.1
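
[Editor's note] The following userspace sketch illustrates how a predicate
like the one above reads in a slot-selection loop. It is not the kernel
code: struct frn_model, FRN_CNT, frn_in_flight(), before64() and
pick_oldest_idle() are hypothetical names, and the only convention taken
from the patch is that a count of 1 means no foreign writeback is in
flight for that slot.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FRN_CNT 4	/* hypothetical; stands in for MEMCG_CGWB_FRN_CNT */

/* Hypothetical, reduced stand-in for struct memcg_cgwb_frn. */
struct frn_model {
	uint64_t at;		/* time the slot was last updated */
	atomic_int done_cnt;	/* 1 == idle, anything else == flush in flight */
};

/* Same test as the helper in the patch: any count other than 1 is busy. */
static bool frn_in_flight(struct frn_model *frn)
{
	return atomic_load(&frn->done_cnt) != 1;
}

/* Wraparound-safe "a is before b", in the spirit of time_before64(). */
static bool before64(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Return the index of the oldest idle slot, or -1 if every slot is busy. */
static int pick_oldest_idle(struct frn_model *slots, uint64_t now)
{
	uint64_t oldest_at = now;
	int oldest = -1;

	for (int i = 0; i < FRN_CNT; i++) {
		if (before64(slots[i].at, oldest_at) &&
		    !frn_in_flight(&slots[i])) {
			oldest = i;
			oldest_at = slots[i].at;
		}
	}
	return oldest;
}
```

A slot with an older timestamp is skipped while its count differs from 1,
so the loop falls back to the oldest slot that is idle.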



* Re: [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion
  2021-09-24  6:46 [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion brookxu
  2021-09-24  6:46 ` [PATCH 2/2] mem_cgroup: introduce foreign_writeback_in_process() function brookxu
@ 2021-09-24  9:34 ` Michal Hocko
  2021-09-24 13:02   ` brookxu
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2021-09-24  9:34 UTC (permalink / raw)
  To: brookxu; +Cc: hannes, vdavydov.dev, akpm, linux-kernel, cgroups

On Fri 24-09-21 14:46:22, brookxu wrote:
> From: Chunguang Xu <brookxu@tencent.com>
> 
> In order to track inflight foreign writeback, we init
> wb_completion.cnt to 1. For normal writeback, this causes
> wb_wait_for_completion() to perform meaningless atomic
> operations. Since foreign writebacks rarely occur in most
> scenarios, we can init wb_completion.cnt to 0 and set
> frn.done.cnt to 1. In this way we can avoid unnecessary
> atomic operations.

Does this lead to any measurable differences?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion
  2021-09-24  9:34 ` [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion Michal Hocko
@ 2021-09-24 13:02   ` brookxu
  2021-09-24 14:01     ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: brookxu @ 2021-09-24 13:02 UTC (permalink / raw)
  To: Michal Hocko; +Cc: hannes, vdavydov.dev, akpm, linux-kernel, cgroups

Thanks for your time.

Michal Hocko wrote on 2021/9/24 17:34:
> On Fri 24-09-21 14:46:22, brookxu wrote:
>> From: Chunguang Xu <brookxu@tencent.com>
>>
>> In order to track inflight foreign writeback, we init
>> wb_completion.cnt to 1. For normal writeback, this causes
>> wb_wait_for_completion() to perform meaningless atomic
>> operations. Since foreign writebacks rarely occur in most
>> scenarios, we can init wb_completion.cnt to 0 and set
>> frn.done.cnt to 1. In this way we can avoid unnecessary
>> atomic operations.
> 
> Does this lead to any measurable differences?

I created multiple cgroups that performed IO on multiple disks,
then flushed the cache with the sync command, and no measurable
differences have been observed so far.

> 


* Re: [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion
  2021-09-24 13:02   ` brookxu
@ 2021-09-24 14:01     ` Michal Hocko
  2021-09-24 15:21       ` brookxu.cn
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2021-09-24 14:01 UTC (permalink / raw)
  To: brookxu; +Cc: hannes, vdavydov.dev, akpm, linux-kernel, cgroups

On Fri 24-09-21 21:02:52, brookxu wrote:
> Thanks for your time.
> 
> Michal Hocko wrote on 2021/9/24 17:34:
> > On Fri 24-09-21 14:46:22, brookxu wrote:
> >> From: Chunguang Xu <brookxu@tencent.com>
> >>
> >> In order to track inflight foreign writeback, we init
> >> wb_completion.cnt to 1. For normal writeback, this causes
> >> wb_wait_for_completion() to perform meaningless atomic
> >> operations. Since foreign writebacks rarely occur in most
> >> scenarios, we can init wb_completion.cnt to 0 and set
> >> frn.done.cnt to 1. In this way we can avoid unnecessary
> >> atomic operations.
> > 
> > Does this lead to any measurable differences?
> 
> I created multiple cgroups that performed IO on multiple disks, 
> then flushed the cache with the sync command, and no measurable
> differences have been observed so far.

OK, so why do we want to optimize this code?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/2] mem_cgroup: optimize the atomic count of wb_completion
  2021-09-24 14:01     ` Michal Hocko
@ 2021-09-24 15:21       ` brookxu.cn
  0 siblings, 0 replies; 6+ messages in thread
From: brookxu.cn @ 2021-09-24 15:21 UTC (permalink / raw)
  To: Michal Hocko; +Cc: hannes, vdavydov.dev, akpm, linux-kernel, cgroups

Thanks for your time.

On 2021/9/24 10:01 PM, Michal Hocko wrote:
> On Fri 24-09-21 21:02:52, brookxu wrote:
>> Thanks for your time.
>>
>> Michal Hocko wrote on 2021/9/24 17:34:
>>> On Fri 24-09-21 14:46:22, brookxu wrote:
>>>> From: Chunguang Xu <brookxu@tencent.com>
>>>>
>>>> In order to track inflight foreign writeback, we init
>>>> wb_completion.cnt to 1. For normal writeback, this causes
>>>> wb_wait_for_completion() to perform meaningless atomic
>>>> operations. Since foreign writebacks rarely occur in most
>>>> scenarios, we can init wb_completion.cnt to 0 and set
>>>> frn.done.cnt to 1. In this way we can avoid unnecessary
>>>> atomic operations.
>>>
>>> Does this lead to any measurable differences?
>>
>> I created multiple cgroups that performed IO on multiple disks,
>> then flushed the cache with the sync command, and no measurable
>> differences have been observed so far.
> 
> OK, so why do we want to optimize this code?

It is just an optimization point discovered during diagnosis; there is
no behavior change.

> 

