linux-kernel.vger.kernel.org archive mirror
From: "Yang Shi" <yang.s@alibaba-inc.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] fs: fsnotify: account fsnotify metadata to kmemcg
Date: Sat, 21 Oct 2017 05:07:17 +0800
Message-ID: <b530521e-5215-f735-444a-13f722d90e40@alibaba-inc.com>
In-Reply-To: <CAOQ4uxhPhXrMLu18TGKDA=ezUVHara95qJQ+BTCio8BHm-u6NA@mail.gmail.com>



On 10/19/17 8:14 PM, Amir Goldstein wrote:
> On Fri, Oct 20, 2017 at 12:20 AM, Yang Shi <yang.s@alibaba-inc.com> wrote:
>> We observed some misbehaved user applications might consume significant
>> amount of fsnotify slabs silently. It'd better to account those slabs in
>> kmemcg so that we can get heads up before misbehaved applications use too
>> much memory silently.
> 
> In what way do they misbehave? create a lot of marks? create a lot of events?
> Not reading events in their queue?

It looks like both: a lot of marks and a lot of events. I'm not sure
whether it is the latter case. If I knew more about the details of the
behavior, I would have elaborated more in the commit log.

> The latter case is more interesting:
> 
> Process A is the one that asked to get the events.
> Process B is the one that is generating the events and queuing them on
> the queue that is owned by process A, who is also to blame if the queue
> is not being read.

I agree it is not fair to account the memory to the generator. But,
afaik, charging a memcg other than the current task's is not how memcg
is designed to work. Please see below for some details.

> 
> So why should process B be held accountable for memory pressure
> caused by, say, an FAN_UNLIMITED_QUEUE that process A created and
> doesn't read from.
> 
> Is it possible to get an explicit reference to the memcg's  events cache
> at fsnotify_group creation time, store it in the group struct and then allocate
> events from the event cache associated with the group (the listener) rather
> than the cache associated with the task generating the event?

I don't think the current memcg design can do this. kmem accounting
happens at allocation time (when kmem_cache_alloc is called) and takes
the associated memcg from the current task, so basically whoever does
the allocation gets it accounted. If the producer is in a different
memcg from the consumer, the memory is simply accounted to the
producer's memcg, although the problem might be caused by the consumer.
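
To make the accounting point concrete, here is a toy userspace model of
the two policies being discussed. It is not kernel code: every toy_*
name and TOY_EVENT_SIZE are made up for illustration, and a real memcg
charge involves far more than bumping a counter. It only shows which
group ends up charged when the generator (process B) queues an event on
a group created by the listener (process A).

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup: a name and a byte counter. */
struct toy_memcg {
	const char *name;
	size_t charged;		/* bytes currently charged to this group */
};

/* Hypothetical stand-in for a task; memcg is the group the task runs in. */
struct toy_task {
	const char *comm;
	struct toy_memcg *memcg;
};

/* Hypothetical stand-in for an fsnotify group (the listener's queue). */
struct toy_group {
	struct toy_memcg *owner_memcg;	/* memcg of the task that created it */
	size_t queued_events;
};

/* Arbitrary per-event size, chosen only for this illustration. */
#define TOY_EVENT_SIZE 64

/*
 * Policy 1: charge the memcg of whoever performs the allocation,
 * i.e. the task that generates the event.
 */
void toy_queue_event_charge_current(struct toy_task *current_task,
				    struct toy_group *group)
{
	current_task->memcg->charged += TOY_EVENT_SIZE;
	group->queued_events++;
}

/*
 * Policy 2: charge the memcg remembered at group creation time,
 * i.e. the listener, regardless of which task generates the event.
 */
void toy_queue_event_charge_owner(struct toy_task *current_task,
				  struct toy_group *group)
{
	(void)current_task;	/* the generator is deliberately ignored */
	group->owner_memcg->charged += TOY_EVENT_SIZE;
	group->queued_events++;
}
```

With the listener and the generator in different toy memcgs, the first
function grows only the generator's counter and the second grows only
the listener's; when both tasks share one memcg the two policies charge
the same group.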

However, afaik, the producer and the consumer are typically in the same
memcg, so this might not be a big issue. But I do admit such unfair
accounting may happen.

Thanks,
Yang

> 
> Amir.
> 
>>
>> Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
>> ---
>>   fs/notify/dnotify/dnotify.c        | 4 ++--
>>   fs/notify/fanotify/fanotify_user.c | 6 +++---
>>   fs/notify/fsnotify.c               | 2 +-
>>   fs/notify/inotify/inotify_user.c   | 2 +-
>>   4 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
>> index cba3283..3ec6233 100644
>> --- a/fs/notify/dnotify/dnotify.c
>> +++ b/fs/notify/dnotify/dnotify.c
>> @@ -379,8 +379,8 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg)
>>
>>   static int __init dnotify_init(void)
>>   {
>> -       dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC);
>> -       dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC);
>> +       dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC|SLAB_ACCOUNT);
>> +       dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops);
>>          if (IS_ERR(dnotify_group))
>> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
>> index 907a481..7d62dee 100644
>> --- a/fs/notify/fanotify/fanotify_user.c
>> +++ b/fs/notify/fanotify/fanotify_user.c
>> @@ -947,11 +947,11 @@ static int fanotify_add_inode_mark(struct fsnotify_group *group,
>>    */
>>   static int __init fanotify_user_setup(void)
>>   {
>> -       fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC);
>> -       fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC);
>> +       fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC|SLAB_ACCOUNT);
>> +       fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC|SLAB_ACCOUNT);
>>   #ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>>          fanotify_perm_event_cachep = KMEM_CACHE(fanotify_perm_event_info,
>> -                                               SLAB_PANIC);
>> +                                               SLAB_PANIC|SLAB_ACCOUNT);
>>   #endif
>>
>>          return 0;
>> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
>> index 0c4583b..82620ac 100644
>> --- a/fs/notify/fsnotify.c
>> +++ b/fs/notify/fsnotify.c
>> @@ -386,7 +386,7 @@ static __init int fsnotify_init(void)
>>                  panic("initializing fsnotify_mark_srcu");
>>
>>          fsnotify_mark_connector_cachep = KMEM_CACHE(fsnotify_mark_connector,
>> -                                                   SLAB_PANIC);
>> +                                                   SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          return 0;
>>   }
>> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
>> index 7cc7d3f..57b32ff 100644
>> --- a/fs/notify/inotify/inotify_user.c
>> +++ b/fs/notify/inotify/inotify_user.c
>> @@ -785,7 +785,7 @@ static int __init inotify_user_setup(void)
>>
>>          BUG_ON(hweight32(ALL_INOTIFY_BITS) != 21);
>>
>> -       inotify_inode_mark_cachep = KMEM_CACHE(inotify_inode_mark, SLAB_PANIC);
>> +       inotify_inode_mark_cachep = KMEM_CACHE(inotify_inode_mark, SLAB_PANIC|SLAB_ACCOUNT);
>>
>>          inotify_max_queued_events = 16384;
>>          init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128;
>> --
>> 1.8.3.1
>>
