* Access rules for current->memcg
@ 2015-07-16 13:34 Nikolay Borisov
       [not found] ` <55A7B2D0.1030506-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-16 13:34 UTC (permalink / raw)
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

Hello,

I'd like to ask what the locking rules are when using
mem_cgroup_from_task(current). Currently I'm doing this under
rcu_read_lock, which I believe is sufficient. However, I've seen patches
where a reference is obtained via mem_cgroup_from_task and then
css_tryget_online is used on the resulting cgroup.

Looking at the context of css_tryget_online it seems it will only
succeed if the cgroup isn't being terminated (cgroup_destroy_locked
isn't being invoked). Judging from this, if css_tryget_online
succeeds the caller can be sure they are working with a live
cgroup; however, what happens if the process of acquiring a reference on
css is skipped AND the caller is under RCU read lock? They are
guaranteed to succeed, but after the rcu read lock is released the
cgroup might go away?

Essentially my use case is to obtain a reference to a memcg for the
current process and query some counter values IOW - just reading from
the memcg. Do I need to acquire a CSS reference in this case?

Regards,
Nikolay


* Re: Access rules for current->memcg
       [not found] ` <55A7B2D0.1030506-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-16 14:59   ` Michal Hocko
       [not found]     ` <20150716145902.GA10758-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-16 14:59 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
> Hello,
> 
> I'd like to ask what the locking rules are when using
> mem_cgroup_from_task(current). Currently I'm doing this under
> rcu_read_lock, which I believe is sufficient. However, I've seen patches
> where a reference is obtained via mem_cgroup_from_task and then
> css_tryget_online is used on the resulting cgroup.

RCU will guarantee that the memcg will not go away. The rest depends on
what you want to do with it. If you want to use it outside of RCU you
have to take a reference. And then it depends what the memcg is used
for - some operations can be done also on the offline memcg.

Btw. mem_cgroup_from_task is not the proper interface for you. You
really want to do
memcg = get_mem_cgroup_from_mm(current->mm)
[...]
css_put(&memcg->css)
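Put together, the two options described above might look roughly like this
(an illustrative sketch only, not a tested patch; it assumes
get_mem_cgroup_from_mm is visible to the caller, which is discussed
further down the thread):

```c
struct mem_cgroup *memcg;

/* Option 1: read-only access confined to the RCU section.  The memcg
 * cannot be freed while the lock is held, but it may already be
 * offline, and it must not be touched after rcu_read_unlock(). */
rcu_read_lock();
memcg = mem_cgroup_from_task(current);
/* ... read the counters you need here ... */
rcu_read_unlock();

/* Option 2: pin the memcg with a css reference so it can be used
 * outside of RCU.  get_mem_cgroup_from_mm() returns with the
 * reference already taken. */
memcg = get_mem_cgroup_from_mm(current->mm);
/* ... use memcg, possibly sleeping ... */
css_put(&memcg->css);
```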

> Looking at the context of css_tryget_online it seems it will only
> succeed if the cgroup isn't being terminated (cgroup_destroy_locked
> isn't being invoked). Judging from this, if css_tryget_online
> succeeds the caller can be sure they are working with a live
> cgroup; however, what happens if the process of acquiring a reference on
> css is skipped AND the caller is under RCU read lock?

The memcg will not get freed. It still might be offline.

> They are
> guaranteed to succeed, but after the rcu read lock is released the
> cgroup might go away?

Yes it might go away.
 
> Essentially my use case is to obtain a reference to a memcg for the
> current process and query some counter values IOW - just reading from
> the memcg. Do I need to acquire a CSS reference in this case?

I would strongly recommend using get_mem_cgroup_from_mm as shown above.
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]     ` <20150716145902.GA10758-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-16 15:11       ` Nikolay Borisov
       [not found]         ` <55A7C9B4.3010907-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-16 15:11 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA



On 07/16/2015 05:59 PM, Michal Hocko wrote:
> On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
>> Hello,
>>
>> I'd like to ask what the locking rules are when using
>> mem_cgroup_from_task(current). Currently I'm doing this under
>> rcu_read_lock, which I believe is sufficient. However, I've seen patches
>> where a reference is obtained via mem_cgroup_from_task and then
>> css_tryget_online is used on the resulting cgroup.
> 
> RCU will guarantee that the memcg will not go away. The rest depends on
> what you want to do with it. If you want to use it outside of RCU you
> have to take a reference. And then it depends what the memcg is used
> for - some operations can be done also on the offline memcg.
> 
> Btw. mem_cgroup_from_task is not the proper interface for you. You
> really want to do
> memcg = get_mem_cgroup_from_mm(current->mm)
> [...]
> css_put(&memcg->css)

Unfortunately this function is static; do you think there might be any
value in a patch that exposes it upstream?
> 
>> Looking at the context of css_tryget_online it seems it will only
>> succeed if the cgroup isn't being terminated (cgroup_destroy_locked
>> isn't being invoked). Judging from this, if css_tryget_online
>> succeeds the caller can be sure they are working with a live
>> cgroup; however, what happens if the process of acquiring a reference on
>> css is skipped AND the caller is under RCU read lock?
> 
> The memcg will not get freed. It still might be offline.
> 
>> They are
>> guaranteed to succeed, but after the rcu read lock is released the
>> cgroup might go away?
> 
> Yes it might go away.
>  
>> Essentially my use case is to obtain a reference to a memcg for the
>> current process and query some counter values IOW - just reading from
>> the memcg. Do I need to acquire a CSS reference in this case?
> 
> I would strongly recommend using get_mem_cgroup_from_mm as shown above.
> 

Thanks for the info!


* Re: Access rules for current->memcg
       [not found]         ` <55A7C9B4.3010907-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-16 15:22           ` Michal Hocko
       [not found]             ` <20150716152239.GA22529-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-16 15:22 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Thu 16-07-15 18:11:48, Nikolay Borisov wrote:
> 
> 
> On 07/16/2015 05:59 PM, Michal Hocko wrote:
> > On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
> >> Hello,
> >>
> >> I'd like to ask what the locking rules are when using
> >> mem_cgroup_from_task(current). Currently I'm doing this under
> >> rcu_read_lock, which I believe is sufficient. However, I've seen patches
> >> where a reference is obtained via mem_cgroup_from_task and then
> >> css_tryget_online is used on the resulting cgroup.
> > 
> > RCU will guarantee that the memcg will not go away. The rest depends on
> > what you want to do with it. If you want to use it outside of RCU you
> > have to take a reference. And then it depends what the memcg is used
> > for - some operations can be done also on the offline memcg.
> > 
> > Btw. mem_cgroup_from_task is not the proper interface for you. You
> > really want to do
> > memcg = get_mem_cgroup_from_mm(current->mm)
> > [...]
> > css_put(&memcg->css)
> 
> Unfortunately this function is static; do you think there might be any
> value in a patch that exposes it upstream?

Ohh, you are right! I thought I made it visible with my recent changes
but nope. There are no external users currently.

Could you tell us more why it would be useful for you?
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]             ` <20150716152239.GA22529-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-16 21:21               ` Nikolay Borisov
       [not found]                 ` <CAJFSNy6sLX82+3ZW_COr__pDTd9aSGgL1bjryMKKcVPhEN0F9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-16 21:21 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 16, 2015 at 6:22 PM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu 16-07-15 18:11:48, Nikolay Borisov wrote:
>>
>>
>> On 07/16/2015 05:59 PM, Michal Hocko wrote:
>> > On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
>> >> Hello,
>> >>
>> >> I'd like to ask what the locking rules are when using
>> >> mem_cgroup_from_task(current). Currently I'm doing this under
>> >> rcu_read_lock, which I believe is sufficient. However, I've seen patches
>> >> where a reference is obtained via mem_cgroup_from_task and then
>> >> css_tryget_online is used on the resulting cgroup.
>> >
>> > RCU will guarantee that the memcg will not go away. The rest depends on
>> > what you want to do with it. If you want to use it outside of RCU you
>> > have to take a reference. And then it depends what the memcg is used
>> > for - some operations can be done also on the offline memcg.
>> >
>> > Btw. mem_cgroup_from_task is not the proper interface for you. You
>> > really want to do
>> > memcg = get_mem_cgroup_from_mm(current->mm)
>> > [...]
>> > css_put(&memcg->css)
>>
>> Unfortunately this function is static; do you think there might be any
>> value in a patch that exposes it upstream?
>
> Ohh, you are right! I thought I made it visible with my recent changes
> but nope. There are no external users currently.
>
> Could you tell us more why it would be useful for you?

In my particular use case I have to query the memcg's various counters to expose
them to the user in a different way than via the cgroup files
(memory.limit_in_bytes etc).

> --
> Michal Hocko
> SUSE Labs


* Re: Access rules for current->memcg
       [not found]                 ` <CAJFSNy6sLX82+3ZW_COr__pDTd9aSGgL1bjryMKKcVPhEN0F9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-17  7:13                   ` Michal Hocko
       [not found]                     ` <20150717071339.GA24787-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-17  7:13 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
> On Thu, Jul 16, 2015 at 6:22 PM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > On Thu 16-07-15 18:11:48, Nikolay Borisov wrote:
> >>
> >>
> >> On 07/16/2015 05:59 PM, Michal Hocko wrote:
> >> > On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
> >> >> Hello,
> >> >>
> >> >> I'd like to ask what the locking rules are when using
> >> >> mem_cgroup_from_task(current). Currently I'm doing this under
> >> >> rcu_read_lock, which I believe is sufficient. However, I've seen patches
> >> >> where a reference is obtained via mem_cgroup_from_task and then
> >> >> css_tryget_online is used on the resulting cgroup.
> >> >
> >> > RCU will guarantee that the memcg will not go away. The rest depends on
> >> > what you want to do with it. If you want to use it outside of RCU you
> >> > have to take a reference. And then it depends what the memcg is used
> >> > for - some operations can be done also on the offline memcg.
> >> >
> >> > Btw. mem_cgroup_from_task is not the proper interface for you. You
> >> > really want to do
> >> > memcg = get_mem_cgroup_from_mm(current->mm)
> >> > [...]
> >> > css_put(&memcg->css)
> >>
> >> Unfortunately this function is static; do you think there might be any
> >> value in a patch that exposes it upstream?
> >
> > Ohh, you are right! I thought I made it visible with my recent changes
> > but nope. There are no external users currently.
> >
> > Could you tell us more why it would be useful for you?
> 
> In my particular use case I have to query the memcg's various counters to expose
> them to the user in a different way than via the cgroup files
> (memory.limit_in_bytes etc).

Why is the regular interface not sufficient?
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]                     ` <20150717071339.GA24787-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-17  7:16                       ` Nikolay Borisov
       [not found]                         ` <55A8ABC9.7090701-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-17  7:16 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA



On 07/17/2015 10:13 AM, Michal Hocko wrote:
> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
>> On Thu, Jul 16, 2015 at 6:22 PM, Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>> On Thu 16-07-15 18:11:48, Nikolay Borisov wrote:
>>>>
>>>>
>>>> On 07/16/2015 05:59 PM, Michal Hocko wrote:
>>>>> On Thu 16-07-15 16:34:08, Nikolay Borisov wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'd like to ask what the locking rules are when using
>>>>>> mem_cgroup_from_task(current). Currently I'm doing this under
>>>>>> rcu_read_lock, which I believe is sufficient. However, I've seen patches
>>>>>> where a reference is obtained via mem_cgroup_from_task and then
>>>>>> css_tryget_online is used on the resulting cgroup.
>>>>>
>>>>> RCU will guarantee that the memcg will not go away. The rest depends on
>>>>> what you want to do with it. If you want to use it outside of RCU you
>>>>> have to take a reference. And then it depends what the memcg is used
>>>>> for - some operations can be done also on the offline memcg.
>>>>>
>>>>> Btw. mem_cgroup_from_task is not the proper interface for you. You
>>>>> really want to do
>>>>> memcg = get_mem_cgroup_from_mm(current->mm)
>>>>> [...]
>>>>> css_put(&memcg->css)
>>>>
>>>> Unfortunately this function is static; do you think there might be any
>>>> value in a patch that exposes it upstream?
>>>
>>> Ohh, you are right! I thought I made it visible with my recent changes
>>> but nope. There are no external users currently.
>>>
>>> Could you tell us more why it would be useful for you?
>>
>> In my particular use case I have to query the memcg's various counters to expose
>> them to the user in a different way than via the cgroup files
>> (memory.limit_in_bytes etc).
> 
> Why is the regular interface not sufficient?

In my particular case I'm interested in playing with the contents of
/proc/meminfo, so that processes running inside a cgroup only see the
system as defined by the memcg restrictions.


* Re: Access rules for current->memcg
       [not found]                         ` <55A8ABC9.7090701-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-20 11:17                           ` Michal Hocko
       [not found]                             ` <20150720111707.GE1211-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-20 11:17 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
> 
> 
> On 07/17/2015 10:13 AM, Michal Hocko wrote:
> > On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
[...]
> >> In my particular use case I have to query the memcg's various counters to expose
> >> them to the user in a different way than via the cgroup files
> >> (memory.limit_in_bytes etc).
> > 
> > Why is the regular interface not sufficient?
> 
> In my particular case I'm interested in playing with the contents of
> /proc/meminfo, so that processes running inside a cgroup only see the
> the system as defined by the memcg restrictions

I assume that this is an attempt to containerize /proc/meminfo. I am not
sure this is a great idea. There are counters which do not have memcg
specific counterpart or such a counterpart would be misleading (e.g.
slab, swap statistics).

Is this an out-of-tree project, or are you trying to push your changes to
the Linus tree somewhere? I haven't noticed such patches.
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]                             ` <20150720111707.GE1211-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-20 11:22                               ` Nikolay Borisov
       [not found]                                 ` <55ACD9F8.2040802-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-20 11:22 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA



On 07/20/2015 02:17 PM, Michal Hocko wrote:
> On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
>>
>>
>> On 07/17/2015 10:13 AM, Michal Hocko wrote:
>>> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
> [...]
>>>> In my particular use case I have to query the memcg's various counters to expose
>>>> them to the user in a different way than via the cgroup files
>>>> (memory.limit_in_bytes etc).
>>>
>>> Why is the regular interface not sufficient?
>>
>> In my particular case I'm interested in playing with the contents of
>> /proc/meminfo, so that processes running inside a cgroup only see the
>> system as defined by the memcg restrictions.
> 
> I assume that this is an attempt to containerize /proc/meminfo. I am not
> sure this is a great idea. There are counters which do not have memcg
> specific counterpart or such a counterpart would be misleading (e.g.
> slab, swap statistics).

Why would swap be misleading? What about memsw.limit_in_bytes -
memory.limit_in_bytes for the total swap and calculating the swap usage
based on memory.memsw.max_usage_in_bytes - memory.usage_in_bytes?

> 
> Is this an out-of-tree project, or are you trying to push your changes to
> the Linus tree somewhere? I haven't noticed such patches.

It's an out-of-tree project and I'm well aware there are certain numbers
which do not correspond 1:1 but that is fine for my use case.


* Re: Access rules for current->memcg
       [not found]                                 ` <55ACD9F8.2040802-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-21  7:48                                   ` Michal Hocko
       [not found]                                     ` <20150721074834.GF11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-21  7:48 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Mon 20-07-15 14:22:32, Nikolay Borisov wrote:
> 
> 
> On 07/20/2015 02:17 PM, Michal Hocko wrote:
> > On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
> >>
> >>
> >> On 07/17/2015 10:13 AM, Michal Hocko wrote:
> >>> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
> > [...]
> >>>> In my particular use case I have to query the memcg's various counters to expose
> >>>> them to the user in a different way than via the cgroup files
> >>>> (memory.limit_in_bytes etc).
> >>>
> >>> Why is the regular interface not sufficient?
> >>
> >> In my particular case I'm interested in playing with the contents of
> >> /proc/meminfo, so that processes running inside a cgroup only see the
> >> system as defined by the memcg restrictions.
> > 
> > I assume that this is an attempt to containerize /proc/meminfo. I am not
> > sure this is a great idea. There are counters which do not have memcg
> > specific counterpart or such a counterpart would be misleading (e.g.
> > slab, swap statistics).
> 
> Why would swap be misleading?

Because the swap space is inherently a shared resource.

> What about memsw.limit_in_bytes - memory.limit_in_bytes for the total swap

No, this is not how the swap extension works. memsw counter covers
usage+swap. You can have up to memsw.limit_in_bytes swapped out. So
you would have to do min(memsw.limit_in_bytes, TotalSwap) but even then
it wouldn't tell you much because that would be the case for other
memory cgroups as well. It would be quite hard to distribute the
TotalSwap for all the cgroups.

> and calculating the swap usage
> based on memory.memsw.max_usage_in_bytes - memory.usage_in_bytes?

This doesn't make much sense to me. Swap usage per memcg is exported by
memory.stat file. But this is not what you want to export. meminfo
exports SwapFree which would tell you how much memory could be swapped
out before the anonymous memory is not reclaimable anymore. This is
impossible to find out in general - especially when the system is
allowed to overcommit.
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]                                     ` <20150721074834.GF11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-21  9:08                                       ` Nikolay Borisov
       [not found]                                         ` <55AE0C18.5000304-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-21  9:08 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA



On 07/21/2015 10:48 AM, Michal Hocko wrote:
> On Mon 20-07-15 14:22:32, Nikolay Borisov wrote:
>>
>>
>> On 07/20/2015 02:17 PM, Michal Hocko wrote:
>>> On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
>>>>
>>>>
>>>> On 07/17/2015 10:13 AM, Michal Hocko wrote:
>>>>> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
>>> [...]
>>>>>> In my particular use case I have to query the memcg's various counters to expose
>>>>>> them to the user in a different way than via the cgroup files
>>>>>> (memory.limit_in_bytes etc).
>>>>>
>>>>> Why is the regular interface not sufficient?
>>>>
>>>> In my particular case I'm interested in playing with the contents of
>>>> /proc/meminfo, so that processes running inside a cgroup only see the
>>>> system as defined by the memcg restrictions.
>>>
>>> I assume that this is an attempt to containerize /proc/meminfo. I am not
>>> sure this is a great idea. There are counters which do not have memcg
>>> specific counterpart or such a counterpart would be misleading (e.g.
>>> slab, swap statistics).
>>
>> Why would swap be misleading?
> 
> Because the swap space is inherently a shared resource.
> 
>> What about memsw.limit_in_bytes - memory.limit_in_bytes for the total swap
> 
> No, this is not how the swap extension works. memsw counter covers
> usage+swap. You can have up to memsw.limit_in_bytes swapped out. So

I think you are wrong in that assumption. According to the
documentation here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
(the box titled "Setting the memory.memsw.limit_in_bytes and
memory.limit_in_bytes parameters") it seems that the space that could be
swapped is actually memsw.limit_in_bytes  - memory.limit_in_bytes.

The way I understand it, you will only start swapping after the limit in
memory.limit_in_bytes has been exhausted and you will be able to swap
only until memsw.limit_in_bytes is exhausted (but since
memory.usage_in_bytes is already counted in, which would equal
memory.limit_in_bytes in a memory pressure situation) you effectively
have memsw.limit_in_bytes - memory.limit_in_bytes, no?

> you would have to do min(memsw.limit_in_bytes, TotalSwap) but even then
> it wouldn't tell you much because that would be the case for other
> memory cgroups as well. It would be quite hard to distribute the
> TotalSwap for all the cgroups.
> 
>> and calculating the swap usage
>> based on memory.memsw.max_usage_in_bytes - memory.usage_in_bytes?
> 
> This doesn't make much sense to me. Swap usage per memcg is exported by
> memory.stat file. But this is not what you want to export. meminfo
> exports SwapFree which would tell you how much memory could be swapped
> out before the anonymous memory is not reclaimable anymore. This is
> impossible to find out in general - especially when the system is
> allowed to overcommit.

I looked more carefully into the code and saw that the page_counters
(which back memory/memsw.limit/usage_in_bytes) are charged during the
try_charge whereas the per-cpu statistics (which back the info in
memory.stat) are updated after committing the charge. I assume in the
case where charges are not canceled the data in memory.stat and
memory.memsw.max_usage_in_bytes - memory.usage_in_bytes should be identical?



* Re: Access rules for current->memcg
       [not found]                                         ` <55AE0C18.5000304-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-21  9:24                                           ` Johannes Weiner
  2015-07-21  9:48                                           ` Michal Hocko
  1 sibling, 0 replies; 15+ messages in thread
From: Johannes Weiner @ 2015-07-21  9:24 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Michal Hocko, cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue, Jul 21, 2015 at 12:08:40PM +0300, Nikolay Borisov wrote:
> On 07/21/2015 10:48 AM, Michal Hocko wrote:
> > On Mon 20-07-15 14:22:32, Nikolay Borisov wrote:
> >> What about memsw.limit_in_bytes - memory.limit_in_bytes for the total swap
> > 
> > No this is not how the swap extension works. memsw counter covers
> > usage+swap. You can have up to memsw.limit_in_bytes swapped out. So
> 
> I think you are wrong in that assumption. According to the
> documentation here:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
> (the box titled "Setting the memory.memsw.limit_in_bytes and
> memory.limit_in_bytes parameters") it seems that the space that could be
> swapped is actually memsw.limit_in_bytes  - memory.limit_in_bytes.
> 
> The way I understand it, you will only start swapping after the limit in
> memory.limit_in_bytes has been exhausted and you will be able to swap
> only until memsw.limit_in_bytes is exhausted (but since
> memory.usage_in_bytes is already counted in, which would equal
> memory.limit_in_bytes in a memory pressure situation) you effectively
> have memsw.limit_in_bytes - memory.limit_in_bytes, no?

No, the cgroup can also swap due to global/parental memory pressure,
not just due to memory.limit_in_bytes.  In that case, it can get its
pages swapped out until it reaches memsw.limit_in_bytes, with memory
consumption well below memory.limit or even 0.


* Re: Access rules for current->memcg
       [not found]                                         ` <55AE0C18.5000304-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  2015-07-21  9:24                                           ` Johannes Weiner
@ 2015-07-21  9:48                                           ` Michal Hocko
       [not found]                                             ` <20150721094832.GI11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  1 sibling, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2015-07-21  9:48 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue 21-07-15 12:08:40, Nikolay Borisov wrote:
> 
> 
> On 07/21/2015 10:48 AM, Michal Hocko wrote:
> > On Mon 20-07-15 14:22:32, Nikolay Borisov wrote:
> >>
> >>
> >> On 07/20/2015 02:17 PM, Michal Hocko wrote:
> >>> On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
> >>>>
> >>>>
> >>>> On 07/17/2015 10:13 AM, Michal Hocko wrote:
> >>>>> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
> >>> [...]
> >>>>>> In my particular use case I have to query the memcg's various counters to expose
> >>>>>> them to the user in a different way than via the cgroup files
> >>>>>> (memory.limit_in_bytes etc).
> >>>>>
> >>>>> Why is the regular interface not sufficient?
> >>>>
> >>>> In my particular case I'm interested in playing with the contents of
> >>>> /proc/meminfo, so that processes running inside a cgroup only see the
> >>>> system as defined by the memcg restrictions.
> >>>
> >>> I assume that this is an attempt to containerize /proc/meminfo. I am not
> >>> sure this is a great idea. There are counters which do not have memcg
> >>> specific counterpart or such a counterpart would be misleading (e.g.
> >>> slab, swap statistics).
> >>
> >> Why would swap be misleading?
> > 
> > Because the swap space is inherently a shared resource.
> > 
> >> What about memsw.limit_in_bytes - memory.limit_in_bytes for the total swap
> > 
> > No, this is not how the swap extension works. memsw counter covers
> > usage+swap. You can have up to memsw.limit_in_bytes swapped out. So
> 
> I think you are wrong in that assumption. According to the
> documentation here:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
> (the box titled "Setting the memory.memsw.limit_in_bytes and
> memory.limit_in_bytes parameters") it seems that the space that could be
> swapped is actually memsw.limit_in_bytes  - memory.limit_in_bytes.
> 
> The way I understand it, you will only start swapping after the limit in
> memory.limit_in_bytes has been exhausted and you will be able to swap
> only until memsw.limit_in_bytes is exhausted (but since
> memory.usage_in_bytes is already counted in, which would equal
> memory.limit_in_bytes in a memory pressure situation) you effectively
> have memsw.limit_in_bytes - memory.limit_in_bytes, no?

No. Your memory might get swapped out even before you hit the hard
limit if there is external memory pressure. This would be either
global memory pressure or a hard limit triggered higher up in the hierarchy.

Take the most trivial situation, when the memsw and hard limits are equal. By
that formula you would have 0 swap space, which is obviously not the case.

Please have a look at Documentation/cgroups/memory.txt and '2.4 Swap
Extension (CONFIG_MEMCG_SWAP)' for more information.

> > you would have to do min(memsw.limit_in_bytes, TotalSwap) but even then
> > it wouldn't tell you much because that would be the case for other
> > memory cgroups as well. It would be quite hard to distribute the
> > TotalSwap for all the cgroups.
> > 
> >> and calculating the swap usage
> >> based on memory.memsw.max_usage_in_bytes - memory.usage_in_bytes?
> > 
> > This doesn't make much sense to me. Swap usage per memcg is exported by
> > memory.stat file. But this is not what you want to export. meminfo
> > exports SwapFree which would tell you how much memory could be swapped
> > out before the anonymous memory is not reclaimable anymore. This is
> > impossible to find out in general - especially when the system is
> > allowed to overcommit.
> 
> I looked more carefully into the code and saw that the page_counters
> (which back memory/memsw.limit/usage_in_bytes) are charged during the
> try_charge whereas the per-cpu statistics (which back the info in
> memory.stat) are updated after committing the charge. I assume in the
> case where charges are not canceled the data in memory.stat and
> memory.memsw.max_usage_in_bytes - memory.usage_in_bytes should be identical?

I am not sure what you mean here. max_usage_in_bytes is a historical
value which was the maximum charge used at some point in time.
usage_in_bytes is always the _current_ value of the charge counter.
max_usage_in_bytes - usage_in_bytes doesn't really tell you much.
-- 
Michal Hocko
SUSE Labs


* Re: Access rules for current->memcg
       [not found]                                             ` <20150721094832.GI11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2015-07-21 10:32                                               ` Nikolay Borisov
       [not found]                                                 ` <55AE1FB7.8050107-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2015-07-21 10:32 UTC (permalink / raw)
  To: Michal Hocko; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA



On 07/21/2015 12:48 PM, Michal Hocko wrote:
> On Tue 21-07-15 12:08:40, Nikolay Borisov wrote:
>>
>>
>> On 07/21/2015 10:48 AM, Michal Hocko wrote:
>>> On Mon 20-07-15 14:22:32, Nikolay Borisov wrote:
>>>>
>>>>
>>>> On 07/20/2015 02:17 PM, Michal Hocko wrote:
>>>>> On Fri 17-07-15 10:16:25, Nikolay Borisov wrote:
>>>>>>
>>>>>>
>>>>>> On 07/17/2015 10:13 AM, Michal Hocko wrote:
>>>>>>> On Fri 17-07-15 00:21:51, Nikolay Borisov wrote:
>>>>> [...]
>>>>>>>> In my particular use case I have to query the memcg's various counters to expose
>>>>>>>> them to the user in a different way than via the cgroup files
>>>>>>>> (memory.limit_in_bytes etc).
>>>>>>>
>>>>>>> Why is the regular interface not sufficient?
>>>>>>
>>>>>> In my particular case I'm interested in playing with the contents of
>>>>>> /proc/meminfo, so that processes running inside a cgroup only see the
>>>>>> the system as defined by the memcg restrictions
>>>>>
>>>>> I assume that this is an attempt to containerize /proc/meminfo. I am not
>>>>> sure this is a great idea. There are counters which do not have memcg
>>>>> specific counterpart or such a counterpart would be missleading (e.g.
>>>>> slab, swap statistics).
>>>>
>>>> Why would swap be misleading?
>>>
>>> Because the swap space is inherently a shared resource.
>>>
>>>> What about memsw.limit_in_bytes - memory.limit_in_bytes for the total swap
>>>
>>> No this is not how the swap extension works. memsw counter covers
>>> usage+swap. You can have up to memsw.limit_in_bytes swapped out. So
>>
>> I think you are wrong with that assumption. According to the
>> documentation here:
>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html
>> (the box titled "Setting the memory.memsw.limit_in_bytes and
>> memory.limit_in_bytes parameters") it seems that the space that could be
>> swapped is actually memsw.limit_in_bytes - memory.limit_in_bytes.
>>
>> The way I understand it is that you only start swapping after the
>> limit in memory.limit_in_bytes has been exhausted, and you can swap
>> only until memsw.limit_in_bytes is exhausted (but since
>> memory.usage_in_bytes is already counted in, and would equal
>> memory.limit_in_bytes under memory pressure) you effectively
>> have memsw.limit_in_bytes - memory.limit_in_bytes, no?
> 
> No. Your memory might get swapped out even before you hit the hard
> limit if there is an external memory pressure. This would be either
> a global memory pressure or a hard limit triggered up in the hierarchy.
> 
> Take the most trivial situation when memsw and hard limits are equal. You
> would have a 0 swap space which is not the case obviously.
> 
> Please have a look at Documentation/cgroups/memory.txt and '2.4 Swap
> Extension (CONFIG_MEMCG_SWAP)' for more information.
> 
>>> you would have to do min(memsw.limit_in_bytes, TotalSwap) but even then
>>> it wouldn't tell you much because that would be the case for other
>>> memory cgroups as well. It would be quite hard to distribute the
>>> TotalSwap for all the cgroups.
>>>
>>>> and calculating the swap usage
>>>> based on memory.memsw.max_usage_in_bytes - memory.usage_in_bytes ?
>>>
>>> This doesn't make much sense to me. Swap usage per memcg is exported by
>>> memory.stat file. But this is not what you want to export. meminfo
>>> exports SwapFree which would tell you how much memory could be swapped
>>> out before the anonymous memory is not reclaimable anymore. This is
>>> impossible to find out in general - especially when the system is
>>> allowed to overcommit.
>>
>> I looked more carefully into the code and saw that the page_counters
>> (which back memory/memsw.limit/usage_in_bytes) are charged during the
>> try_charge whereas the per-cpu statistics (which back the info in
>> memory.stats) are updated after committing the charge. I assume in the
>> case where charges are not canceled the data in memory.stats and
>> memory.memsw.max_usage_in_bytes - memory.usage_in_bytes should be identical?
> 
> I am not sure what you mean here. max_usage_in_bytes is a historical
> value which was the maximum charge used at some point in time.
> usage_in_bytes is always the _current_ value of the charge counter.
> max_usage_in_bytes-usage_in_bytes doesn't tell you much really.
> 

You have misunderstood me; I never referred to max_usage. What I
meant was that the information the memory.stat file provides is
acquired by reading memcg->stat->count[counter], and those values are
updated when mem_cgroup_commit_charge() is invoked. And
mem_cgroup_commit_charge() is invoked AFTER mem_cgroup_try_charge(),
which updates the charge counters. So my point was that whether I read
the swap value (for example) from the memory.stat file or manually do
the maths with the subtraction shown previously, the two values
should match.

Essentially, the question is whether this information should be queried
from the charge counter or from the mem_cgroup_stat_cpu struct. Does
that make sense?


* Re: Access rules for current->memcg
       [not found]                                                 ` <55AE1FB7.8050107-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
@ 2015-07-21 12:25                                                   ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2015-07-21 12:25 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue 21-07-15 13:32:23, Nikolay Borisov wrote:
> 
> 
> On 07/21/2015 12:48 PM, Michal Hocko wrote:
> > On Tue 21-07-15 12:08:40, Nikolay Borisov wrote:
[...]
> >> I looked more carefully into the code and saw that the page_counters
> >> (which back memory/memsw.limit/usage_in_bytes) are charged during the
> >> try_charge whereas the per-cpu statistics (which back the info in
> >> memory.stats) are updated after committing the charge. I assume in the
> >> case where charges are not canceled the data in memory.stats and
> >> memory.memsw.max_usage_in_bytes - memory.usage_in_bytes should be identical?
> > 
> > I am not sure what you mean here. max_usage_in_bytes is a historical
> > value which was the maximum charge used at some point in time.
> > usage_in_bytes is always the _current_ value of the charge counter.
> > max_usage_in_bytes-usage_in_bytes doesn't tell you much really.
> > 
> 
> You have misunderstood me; I never referred to max_usage. What I
> meant was that the information the memory.stat file provides is
> acquired by reading memcg->stat->count[counter], and those values are
> updated when mem_cgroup_commit_charge() is invoked.

True

> And
> mem_cgroup_commit_charge is invoked AFTER mem_cgroup_try_charge, which
> updates the charge counters.

Still true and you should realize that the commit is called very shortly
after the charge. The race window is not really interesting for anything
practical.
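
As a sketch of that ordering (an illustrative model, not the actual
mm/memcontrol.c code; the names and simplifications are mine): the page
counter is bumped in the try step, the statistics only in the commit
step, so a reader can briefly observe the counter running ahead of the
stats.

```python
class Memcg:
    def __init__(self):
        self.page_counter = 0  # backs memory.usage_in_bytes
        self.stat_rss = 0      # backs the rss field of memory.stat

    def try_charge(self, nbytes):
        self.page_counter += nbytes  # charged up front
        return True                  # assume the charge fits the limit

    def commit_charge(self, nbytes):
        self.stat_rss += nbytes      # stats updated only on commit

    def cancel_charge(self, nbytes):
        self.page_counter -= nbytes  # undo a charge never committed

m = Memcg()
if m.try_charge(4096):
    # between these two calls page_counter > stat_rss (the race window)
    m.commit_charge(4096)
print(m.page_counter, m.stat_rss)  # 4096 4096
```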

> So my point was that whether I read the
> swap value (for example) from the memory.stat file or manually do
> the maths with the subtraction shown previously, the two values
> should match.

Your subtraction simply doesn't work and doesn't tell you how much
memory is swapped out from the memcg, as explained in the other email.
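
A toy model of the memsw semantics shows why (illustrative Python, not
kernel code; class and field names are made up): the memsw counter
charges RAM plus swap together, and external pressure can move charge
to swap well below the hard limit, so limit subtraction tells you
nothing.

```python
class Memcg:
    def __init__(self, limit, memsw_limit):
        self.limit = limit              # memory.limit_in_bytes
        self.memsw_limit = memsw_limit  # memory.memsw.limit_in_bytes
        self.usage = 0                  # RAM charge (usage_in_bytes)
        self.swap = 0                   # swap charge

    @property
    def memsw_usage(self):              # memory.memsw.usage_in_bytes
        return self.usage + self.swap

    def swap_out(self, nbytes):
        # external pressure moves charge from RAM to swap;
        # the combined memsw charge is unchanged
        self.usage -= nbytes
        self.swap += nbytes

cg = Memcg(limit=100, memsw_limit=100)  # memsw equal to the hard limit
cg.usage = 80                           # well below the hard limit
cg.swap_out(30)                         # yet swapping still happens
print(cg.usage, cg.swap, cg.memsw_usage)  # 50 30 80
```

With memsw equal to the hard limit the subtraction would predict zero
swap space, yet 30 bytes of charge ended up in swap.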

> Essentially, the question is whether this information should be queried
> from the charge counter or from the mem_cgroup_stat_cpu struct. Does
> that make sense?

memory.stat will tell you the information you are looking for. You
simply cannot calculate those numbers from the counters.
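
For completeness, a minimal parser for memory.stat-style output (field
names as in Documentation/cgroups/memory.txt; the sample numbers here
are invented):

```python
# Parse cgroup v1 memory.stat output; "swap" is the per-memcg
# swapped-out byte count that no counter subtraction can reproduce.
sample = """\
cache 1048576
rss 2097152
swap 524288
total_swap 524288
"""

stat = {}
for line in sample.splitlines():
    key, value = line.split()
    stat[key] = int(value)

print(stat["swap"])  # 524288
```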
-- 
Michal Hocko
SUSE Labs


end of thread, other threads:[~2015-07-21 12:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-16 13:34 Access rules for current->memcg Nikolay Borisov
     [not found] ` <55A7B2D0.1030506-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-16 14:59   ` Michal Hocko
     [not found]     ` <20150716145902.GA10758-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-16 15:11       ` Nikolay Borisov
     [not found]         ` <55A7C9B4.3010907-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-16 15:22           ` Michal Hocko
     [not found]             ` <20150716152239.GA22529-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-16 21:21               ` Nikolay Borisov
     [not found]                 ` <CAJFSNy6sLX82+3ZW_COr__pDTd9aSGgL1bjryMKKcVPhEN0F9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-17  7:13                   ` Michal Hocko
     [not found]                     ` <20150717071339.GA24787-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-17  7:16                       ` Nikolay Borisov
     [not found]                         ` <55A8ABC9.7090701-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-20 11:17                           ` Michal Hocko
     [not found]                             ` <20150720111707.GE1211-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-20 11:22                               ` Nikolay Borisov
     [not found]                                 ` <55ACD9F8.2040802-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-21  7:48                                   ` Michal Hocko
     [not found]                                     ` <20150721074834.GF11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-21  9:08                                       ` Nikolay Borisov
     [not found]                                         ` <55AE0C18.5000304-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-21  9:24                                           ` Johannes Weiner
2015-07-21  9:48                                           ` Michal Hocko
     [not found]                                             ` <20150721094832.GI11967-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-21 10:32                                               ` Nikolay Borisov
     [not found]                                                 ` <55AE1FB7.8050107-/eCPMmvKun9pLGFMi4vTTA@public.gmane.org>
2015-07-21 12:25                                                   ` Michal Hocko
