From: Tejun Heo <tj@kernel.org>
To: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Jan Kara <jack@suse.cz>,
	Dave Chinner <david@fromorbit.com>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Li Zefan <lizefan@huawei.com>,
	hughd@google.com,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Subject: Re: [RFC] Making memcg track ownership per address_space or anon_vma
Date: Fri, 30 Jan 2015 11:07:22 -0500
Message-ID: <20150130160722.GA26111@htj.dyndns.org>
In-Reply-To: <20150130062737.GB25699@htj.dyndns.org>

Hey, again.

On Fri, Jan 30, 2015 at 01:27:37AM -0500, Tejun Heo wrote:
> The previous behavior was pretty unpredictable in terms of shared file
> ownership too.  I wonder whether the better thing to do here is either
> charging cases like this to the common ancestor or splitting the
> charge equally among the accessors, which might be doable for ro
> files.

I've been thinking more about this.  It's true that doing per-page
association lets us avoid confronting the worst side effects of
inode sharing head-on, but it is a tradeoff with fairly weak
justifications.  The only thing we're gaining is side-stepping the
brunt of the problem in an awkward manner, and the loss of clarity in
taking this compromised position has nasty ramifications when we try
to connect it with the rest of the world.

I could be missing something major, but the more I think about it, the
more it looks to me that the right thing to do here is accounting
per-inode and charging shared inodes to the nearest common ancestor.
The resulting behavior would be way more logical and predictable than
the current one, which would make it straightforward to integrate
memcg with blkcg and writeback.
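
To make "nearest common ancestor" concrete, here is a minimal sketch
on a toy hierarchy.  The Cgroup class and the function name are
hypothetical and only illustrate the idea; this is not the kernel's
data structure or an existing memcg interface.

```python
class Cgroup:
    """Toy cgroup node (hypothetical; not the kernel's struct cgroup)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # Depth of the node in the hierarchy; the root has depth 0.
        self.depth = 0 if parent is None else parent.depth + 1


def nearest_common_ancestor(a, b):
    """Return the deepest cgroup that is an ancestor of (or equal to) both.

    Assumes a and b live in the same single-rooted hierarchy.
    """
    # Lift the deeper node until both are at the same depth.
    while a.depth > b.depth:
        a = a.parent
    while b.depth > a.depth:
        b = b.parent
    # Walk both up in lockstep until they meet.
    while a is not b:
        a, b = a.parent, b.parent
    return a


# Example hierarchy: root -> {A -> {A1, A2}, B}
root = Cgroup("root")
A = Cgroup("A", root)
A1 = Cgroup("A1", A)
A2 = Cgroup("A2", A)
B = Cgroup("B", root)
```

With this hierarchy, an inode shared by A1 and A2 would be charged to
A, while one shared by A1 and B would be charged to root.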

One of the problems that I can think of off the top of my head is that
it'd involve more regular use of charge moving; however, this is an
operation which is per-inode rather than per-page and still gonna be
fairly infrequent.  Another one is that if we move memcg over to this
behavior, it's likely to affect the behavior on the traditional
hierarchies too, as we sure as hell don't want to switch between the
two major behaviors dynamically.  But given that behaviors on inode
sharing aren't very well supported yet, this can be an acceptable
change.
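
As a rough illustration of why charge moving would be per-inode and
infrequent, the sketch below models an inode whose whole charge moves
in one step when a cgroup outside the current owner's subtree first
touches it.  Everything here (Cgroup, Inode, access(), the
charged_pages counter) is a hypothetical toy model; it tracks only the
owning cgroup and ignores the hierarchical propagation real memcg
does.

```python
class Cgroup:
    """Toy cgroup node (hypothetical; not a kernel structure)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.charged_pages = 0  # pages charged directly to this cgroup


def ancestors(cg):
    """Return [cg, parent, ..., root]."""
    chain = []
    while cg is not None:
        chain.append(cg)
        cg = cg.parent
    return chain


def common_ancestor(a, b):
    """Nearest cgroup that is an ancestor of (or equal to) both a and b."""
    seen = set(ancestors(a))
    for cg in ancestors(b):
        if cg in seen:
            return cg
    return None


class Inode:
    """Toy inode whose page cache is charged to a single owner."""

    def __init__(self, nr_pages):
        self.nr_pages = nr_pages
        self.owner = None
        self.moves = 0  # number of charge moves performed so far


def access(inode, cg):
    """Record that cg touches inode, moving the whole charge if needed."""
    if inode.owner is None:
        # First user: charge the inode to it.
        inode.owner = cg
        cg.charged_pages += inode.nr_pages
        return
    new_owner = common_ancestor(inode.owner, cg)
    if new_owner is not inode.owner:
        # One move for the whole inode, regardless of its page count.
        inode.owner.charged_pages -= inode.nr_pages
        new_owner.charged_pages += inode.nr_pages
        inode.owner = new_owner
        inode.moves += 1


root = Cgroup("root")
A = Cgroup("A", root)
A1 = Cgroup("A1", A)
A2 = Cgroup("A2", A)

shared = Inode(nr_pages=1000)
access(shared, A1)  # charged to A1
access(shared, A2)  # whole charge moves to A in one step
access(shared, A1)  # A already covers A1, so nothing moves
```

In this model the move happens only when the set of sharers grows
beyond the current owner's subtree, so repeated accesses from the same
cgroups cost nothing extra.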

Thanks.

-- 
tejun
