linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kemeng Shi <shikemeng@huaweicloud.com>
To: Tejun Heo <tj@kernel.org>
Cc: willy@infradead.org, akpm@linux-foundation.org,
	hcochran@kernelspring.com, mszeredi@redhat.com, axboe@kernel.dk,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/5] mm: correct calculation of cgroup wb's bg_thresh in wb_over_bg_thresh
Date: Wed, 24 Jan 2024 10:01:47 +0800	[thread overview]
Message-ID: <a747dc7d-f24a-08bd-d969-d3fb35e151b7@huaweicloud.com> (raw)
In-Reply-To: <ZbAk8HfnzHoSSFWC@slm.duckdns.org>



on 1/24/2024 4:43 AM, Tejun Heo wrote:
> On Wed, Jan 24, 2024 at 02:33:29AM +0800, Kemeng Shi wrote:
>> The wb_calc_thresh will calculate wb's share in global wb domain. We need
>> to wb's share in mem_cgroup_wb_domain for mdtc. Call __wb_calc_thresh
>> instead of wb_calc_thresh to fix this.
> 
> That function is calculating the wb's portion of wb portion in the whole
> system so that threshold can be distributed accordingly. So, it has to be
> compared in the global domain. If you look at the comment on top of struct
> wb_domain, it says:
> 
> /*
>  * A wb_domain represents a domain that wb's (bdi_writeback's) belong to
>  * and are measured against each other in.  There always is one global
>  * domain, global_wb_domain, that every wb in the system is a member of.
>  * This allows measuring the relative bandwidth of each wb to distribute
>  * dirtyable memory accordingly.
>  */
> 
Hi Tejun, thanks for reply. For cgroup wb, it will belongs to a global wb
domain and a cgroup domain. I agree the way how we calculate wb's threshold
in global domain as you described above. This patch tries to fix calculation
of wb's threshold in cgroup domain which now is wb_calc_thresh(mdtc->wb,
mdtc->bg_thresh)), means:
(wb bandwidth) / (system bandwidth) * (*cgroup domain threshold*)
The cgroup domain threshold is
(memory of cgroup domain) / (memory of system) * (system threshold).
Then the wb's threshold in cgroup will be smaller than expected.

Consider following domain hierarchy:
                global domain (100G)
                /                 \
        cgroup domain1(50G)     cgroup domain2(50G)
                |                 |
bdi            wb1               wb2
Assume wb1 and wb2 has the same bandwidth.
We have global domain bg_thresh 10G, cgroup domain bg_thresh 5G.
Then we have:
wb's thresh in global domain = 10G * (wb bandwidth) / (system bandwidth)
= 10G * 1/2 = 5G
wb's thresh in cgroup domain = 5G * (wb bandwidth) / (system bandwidth)
= 5G * 1/2 = 2.5G
At last, wb1 and wb2 will be limited at 2.5G, the system will be limited
at 5G which is less than global domain bg_thresh 10G.

After the fix, threshold in cgroup domain will be:
(wb bandwidth) / (cgroup bandwidth) * (cgroup domain threshold)
The wb1 and wb2 will be limited at 5G, the system will be limited at
10G which equals to global domain bg_thresh 10G.

As I didn't take a deep look into memory cgroup, please correct me if
anything is wrong. Thanks!
> Also, how is this tested? Was there a case where the existing code
> misbehaved that's improved by this patch? Or is this just from reading code?
This is jut from reading code. Would the case showed above convince you
a bit. Look forward to your reply, thanks!.
> 
> Thanks.
> 


  reply	other threads:[~2024-01-24  2:02 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-23 18:33 [PATCH 0/5] Fix and cleanups to page-writeback Kemeng Shi
2024-01-23 18:33 ` [PATCH 1/5] mm: enable __wb_calc_thresh to calculate dirty background threshold Kemeng Shi
2024-01-23 18:33 ` [PATCH 2/5] mm: correct calculation of cgroup wb's bg_thresh in wb_over_bg_thresh Kemeng Shi
2024-01-23 20:43   ` Tejun Heo
2024-01-24  2:01     ` Kemeng Shi [this message]
2024-01-29 21:00       ` Tejun Heo
2024-02-08  9:26         ` Kemeng Shi
2024-02-08 19:32           ` Tejun Heo
2024-02-18  2:35             ` Kemeng Shi
2024-02-20 17:34               ` Tejun Heo
2024-01-23 18:33 ` [PATCH 3/5] mm: call __wb_calc_thresh instead of wb_calc_thresh " Kemeng Shi
2024-01-23 18:33 ` [PATCH 4/5] mm: remove redundant check in wb_min_max_ratio Kemeng Shi
2024-01-23 18:33 ` [PATCH 5/5] mm: remove stale comment __folio_mark_dirty Kemeng Shi
2024-01-23 14:07   ` Matthew Wilcox
2024-01-23 20:46 ` [PATCH 0/5] Fix and cleanups to page-writeback Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a747dc7d-f24a-08bd-d969-d3fb35e151b7@huaweicloud.com \
    --to=shikemeng@huaweicloud.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=hcochran@kernelspring.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mszeredi@redhat.com \
    --cc=tj@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).