linux-kernel.vger.kernel.org archive mirror
* Process memory accounting (cgroups) accuracy
@ 2021-07-02  7:50 Krzysztof Kozlowski
  2021-07-02  9:08 ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Krzysztof Kozlowski @ 2021-07-02  7:50 UTC (permalink / raw)
  To: Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	linux-mm, linux-kernel, cgroups

Hi,

For some time I have been trying to fix the Linux Test Project tests
around memory cgroups:
https://lists.linux.it/pipermail/ltp/2021-June/023259.html

The trouble I have, for example with memcg_max_usage_in_bytes_test.sh, is
that on recent kernels (v4.15+) on x86_64 the memory cgroup reports a max
usage higher than the process's anonymous mapping.

The test works like this:
1. Fork a process, signal it to mmap 4 MB (PROT_WRITE | PROT_READ,
MAP_PRIVATE | MAP_ANONYMOUS) and touch the memory.
2. Add the process to the control group.
3. Signal it to munmap the region and immediately mmap the same 4 MB
again (touching the memory).
4. Check the counters and reset them.
5. munmap
6. Check the counters
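
For illustration, the map/touch/unmap part of the steps above can be sketched in Python. This is not the LTP helper itself; the names map_and_touch() and touched_pages() are made up for this sketch, and the cgroup and signalling steps are left out.

```python
import mmap

SIZE = 4 * 1024 * 1024  # 4 MB, as in the test description
PAGE = mmap.PAGESIZE


def map_and_touch(size=SIZE):
    """mmap an anonymous private region and write one byte per page."""
    region = mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    for offset in range(0, size, PAGE):
        region[offset] = 1  # writing to the page makes the kernel back it
    return region


def touched_pages(region):
    """Count pages whose first byte was written by map_and_touch()."""
    return sum(1 for offset in range(0, len(region), PAGE) if region[offset] == 1)


if __name__ == "__main__":
    region = map_and_touch()      # steps 1 and 3: mmap + touch
    print(touched_pages(region))  # number of pages that were written
    region.close()                # step 5: munmap
```

Touching one byte per page is enough to fault each page in, which is what gets charged to the group.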

The mentioned memcg_max_usage_in_bytes_test.sh checks the
memory.memsw.max_usage_in_bytes counter, which reports:
a. early kernels: 4 MB (so only the mmap)
b. v4.15, v5.4 kernel: 4 MB + 32 pages
c. v5.11 kernel: 4 MB + 32 pages + 2 pages

I tweaked the mmap() size to smaller values and then the accounting
differs again. For example, for an mmap of 1 up to 32 pages,
memory.memsw.max_usage_in_bytes is always 131072.

After the final munmap (point 5 above), the test expects
max_usage_in_bytes to be 0, but it is usually 8 or 132 kB. This
suggests that the process is charged for something not directly related
to that memory map.
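
To make the numbers above easier to compare, here is a small conversion of the observed byte counts into pages. It assumes 4 KiB pages (the usual base page size on x86_64); to_pages() is a helper invented for this sketch.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def to_pages(nbytes, page_size=PAGE_SIZE):
    """Convert a byte count into whole pages, rejecting partial pages."""
    pages, remainder = divmod(nbytes, page_size)
    if remainder:
        raise ValueError(f"{nbytes} is not a multiple of {page_size}")
    return pages


for label, nbytes in [
    ("max usage for mmap of 1..32 pages", 131072),
    ("leftover after munmap (8 kB)", 8 * 1024),
    ("leftover after munmap (132 kB)", 132 * 1024),
]:
    print(f"{label}: {to_pages(nbytes)} pages")
```

So the constant 131072 is exactly 32 pages, and the leftovers are 2 and 33 pages.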

The questions: How accurate are the cgroup counters now?
I understood they should charge only pages allocated by the process, so
why does mmap(4 kB) cause max_usage_in_bytes=132 kB?
Why does mmap(4 MB) cause max_usage_in_bytes=4 MB + 34 pages?
What is being accounted there (stack guards?)?

Or maybe the entire LTP test checking so carefully memcg limits is useless?

The v5.4 kernel config is here:
https://kernel.ubuntu.com/~kernel-ppa/config/focal/linux-azure/5.4.0-1039.41/amd64-config.flavour.azure

Best regards,
Krzysztof


* Re: Process memory accounting (cgroups) accuracy
  2021-07-02  7:50 Process memory accounting (cgroups) accuracy Krzysztof Kozlowski
@ 2021-07-02  9:08 ` Michal Hocko
  2021-07-02 10:40   ` Krzysztof Kozlowski
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2021-07-02  9:08 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, linux-mm,
	linux-kernel, cgroups

On Fri 02-07-21 09:50:11, Krzysztof Kozlowski wrote:
[...]
> The questions: How accurate are the cgroup counters now?

The precision depends on the number of CPUs the workload is running on,
as we do per-cpu charge caching to optimize the accounting. This is
MEMCG_CHARGE_BATCH (32) pages currently. You can learn more by checking
the try_charge function (mm/memcontrol.c).
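
As a rough illustration of the batching described above, here is a toy model in Python. It is a deliberately simplified sketch and not the kernel code: ToyMemcg and its fields are invented names, and uncharging and draining of the per-CPU caches are not modelled.

```python
MEMCG_CHARGE_BATCH = 32  # pages, the value quoted in this message


class ToyMemcg:
    """Toy model: one shared counter plus a per-CPU cache of precharged pages."""

    def __init__(self, nr_cpus):
        self.usage = 0               # pages charged to the shared counter
        self.max_usage = 0           # high-water mark of usage
        self.stock = [0] * nr_cpus   # precharged pages cached on each CPU

    def charge(self, cpu, nr_pages=1):
        if self.stock[cpu] >= nr_pages:
            self.stock[cpu] -= nr_pages  # fast path: served from the local cache
            return
        batch = max(MEMCG_CHARGE_BATCH, nr_pages)
        self.usage += batch              # slow path: charge a whole batch
        self.max_usage = max(self.max_usage, self.usage)
        self.stock[cpu] += batch - nr_pages  # keep the surplus for later charges


memcg = ToyMemcg(nr_cpus=2)
memcg.charge(cpu=0)      # a single page charged on CPU 0
print(memcg.max_usage)   # pages the shared counter reports
```

In this model the first single-page charge on a CPU raises the shared counter by a full batch, and the next 31 single-page charges on that CPU do not change it.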

> I understood they should charge only pages allocated by the process, so
> why does mmap(4 kB) cause max_usage_in_bytes=132 kB?

Please note that kernel allocations (marked by __GFP_ACCOUNT) are
accounted as well so this is not only about mmaped memory.

> Why does mmap(4 MB) cause max_usage_in_bytes=4 MB + 34 pages?

The specific number will depend on the execution, e.g. use up all but 3
pages from the CPU0 batch and have 31 pages on another CPU.
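
The arithmetic behind that example, spelled out. The 4 KiB page size is an assumption, and the per-CPU leftovers are just the example numbers from the paragraph above.

```python
PAGE_SIZE = 4096            # assumption: 4 KiB pages
MAPPING = 4 * 1024 * 1024   # the 4 MB mapping from the test

# Unused precharged pages still cached on each CPU, per the example above.
leftover_pages = {"cpu0": 3, "cpu1": 31}

extra_pages = sum(leftover_pages.values())
reported = MAPPING + extra_pages * PAGE_SIZE
print(extra_pages, reported)
```

The two leftovers add up to the 34 extra pages from the question.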

> What is being accounted there (stack guards?)?
> 
> Or maybe the entire LTP test checking so carefully memcg limits is useless?

Well, I haven't really checked details of those tests and their
objective but aiming for an absolute precision is not really something
that is very useful IMHO. We are very likely to do optimizations like
the one mentioned above as the runtime tends to be much more important
than to-the-page precision.
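
One way a test could allow for this is to compare against a range rather than an exact value. The sketch below is only an assumption about how such a check might look: within_tolerance() is a hypothetical helper, the slack of one batch per CPU is a guess derived from the batching described above, and it does not cover accounted kernel allocations.

```python
import os

MEMCG_CHARGE_BATCH = 32  # pages, as quoted earlier in this message
PAGE_SIZE = 4096         # assumption: 4 KiB pages


def within_tolerance(observed, expected, nr_cpus=None,
                     batch=MEMCG_CHARGE_BATCH, page_size=PAGE_SIZE):
    """Hypothetical check: accept usage up to one charge batch per CPU above expected."""
    if nr_cpus is None:
        nr_cpus = os.cpu_count() or 1
    slack = nr_cpus * batch * page_size  # bytes of per-CPU caching allowed
    return expected <= observed <= expected + slack


four_mb = 4 * 1024 * 1024
print(within_tolerance(four_mb + 34 * PAGE_SIZE, four_mb, nr_cpus=2))
```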

Hope this clarifies this a bit.
-- 
Michal Hocko
SUSE Labs


* Re: Process memory accounting (cgroups) accuracy
  2021-07-02  9:08 ` Michal Hocko
@ 2021-07-02 10:40   ` Krzysztof Kozlowski
  0 siblings, 0 replies; 3+ messages in thread
From: Krzysztof Kozlowski @ 2021-07-02 10:40 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, linux-mm,
	linux-kernel, cgroups

On 02/07/2021 11:08, Michal Hocko wrote:
> On Fri 02-07-21 09:50:11, Krzysztof Kozlowski wrote:
> [...]
>> The questions: How accurate are the cgroup counters now?
> 
> The precision depends on the number of CPUs the workload is running on,
> as we do per-cpu charge caching to optimize the accounting. This is
> MEMCG_CHARGE_BATCH (32) pages currently. You can learn more by checking
> the try_charge function (mm/memcontrol.c).

This explains the 32 pages, thanks!

> 
>> I understood they should charge only pages allocated by the process, so
>> why does mmap(4 kB) cause max_usage_in_bytes=132 kB?
> 
> Please note that kernel allocations (marked by __GFP_ACCOUNT) are
> accounted as well so this is not only about mmaped memory.
> 
>> Why does mmap(4 MB) cause max_usage_in_bytes=4 MB + 34 pages?
> 
> The specific number will depend on the execution, e.g. use up all but 3
> pages from the CPU0 batch and have 31 pages on another CPU.
> 
>> What is being accounted there (stack guards?)?
>>
>> Or maybe the entire LTP test checking so carefully memcg limits is useless?
> 
> Well, I haven't really checked details of those tests and their
> objective but aiming for an absolute precision is not really something
> that is very useful IMHO. We are very likely to do optimizations like
> the one mentioned above as the runtime tends to be much more important
> than to-the-page precision.
> 
> Hope this clarifies this a bit.

Yes, thanks!


Best regards,
Krzysztof

