From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Rientjes <rientjes@google.com>,
	Glauber Costa <glommer@parallels.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, devel@openvz.org,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Greg Thelen <gthelen@google.com>,
	Suleiman Souhlal <suleiman@google.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure
Date: Wed, 25 Apr 2012 10:56:16 +0900
Message-ID: <4F9759C0.1070805@jp.fujitsu.com>
In-Reply-To: <20120424142232.GC8626@somewhere>

(2012/04/24 23:22), Frederic Weisbecker wrote:

> On Mon, Apr 23, 2012 at 03:25:59PM -0700, David Rientjes wrote:
>> On Sun, 22 Apr 2012, Glauber Costa wrote:
>>
>>> +/*
>>> + * Return the kmem_cache we're supposed to use for a slab allocation.
>>> + * If we are in interrupt context or otherwise have an allocation that
>>> + * can't fail, we return the original cache.
>>> + * Otherwise, we will try to use the current memcg's version of the cache.
>>> + *
>>> + * If the cache does not exist yet, if we are the first user of it,
>>> + * we either create it immediately, if possible, or create it asynchronously
>>> + * in a workqueue.
>>> + * In the latter case, we will let the current allocation go through with
>>> + * the original cache.
>>> + *
>>> + * This function returns with rcu_read_lock() held.
>>> + */
>>> +struct kmem_cache *__mem_cgroup_get_kmem_cache(struct kmem_cache *cachep,
>>> +					     gfp_t gfp)
>>> +{
>>> +	struct mem_cgroup *memcg;
>>> +	int idx;
>>> +
>>> +	gfp |=  cachep->allocflags;
>>> +
>>> +	if ((current->mm == NULL))
>>> +		return cachep;
>>> +
>>> +	if (cachep->memcg_params.memcg)
>>> +		return cachep;
>>> +
>>> +	idx = cachep->memcg_params.id;
>>> +	VM_BUG_ON(idx == -1);
>>> +
>>> +	memcg = mem_cgroup_from_task(current);
>>> +	if (!mem_cgroup_kmem_enabled(memcg))
>>> +		return cachep;
>>> +
>>> +	if (rcu_access_pointer(memcg->slabs[idx]) == NULL) {
>>> +		memcg_create_cache_enqueue(memcg, cachep);
>>> +		return cachep;
>>> +	}
>>> +
>>> +	return rcu_dereference(memcg->slabs[idx]);
>>> +}
>>> +EXPORT_SYMBOL(__mem_cgroup_get_kmem_cache);
>>> +
>>> +void mem_cgroup_remove_child_kmem_cache(struct kmem_cache *cachep, int id)
>>> +{
>>> +	rcu_assign_pointer(cachep->memcg_params.memcg->slabs[id], NULL);
>>> +}
>>> +
>>> +bool __mem_cgroup_charge_kmem(gfp_t gfp, size_t size)
>>> +{
>>> +	struct mem_cgroup *memcg;
>>> +	bool ret = true;
>>> +
>>> +	rcu_read_lock();
>>> +	memcg = mem_cgroup_from_task(current);
>>
>> This seems horribly inconsistent with memcg charging of user memory since 
>> it charges to p->mm->owner and you're charging to p.  So a thread attached 
>> to a memcg can charge user memory to one memcg while charging slab to 
>> another memcg?
> 
> Charging to the thread rather than the process seem to me the right behaviour:
> you can have two threads of a same process attached to different cgroups.
> 
> Perhaps it is the user memory memcg that needs to be fixed?
> 

The problem is OOM-kill.
To free memory by killing a process, the kill has to release its 'mm'.
So the oom-killer just looks for the thread-group leader of a process.

Assume process X consists of threads A and B, and A is the thread-group leader.

Put thread A into cgroup/Gold
    thread B into cgroup/Silver.

If we account per thread, we can't do anything on OOM in cgroup/Silver.
The idea of 'killing thread A in order to kill thread B' breaks isolation.
 
For resources used by a process, I think accounting should be done per process.
Such resources are not tied to a thread.
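
The Gold/Silver case can be sketched as a toy user-space model. This is
not kernel code: struct cgroup_model, struct mm_model, struct task_model
and all three functions are hypothetical names made up for illustration,
and the real oom-killer's victim selection is far more involved. The
sketch only assumes that a useful victim must be a thread-group leader
inside the cgroup that hit its limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; none of these are kernel structures. */
struct task_model;

struct cgroup_model {
	const char *name;
};

struct mm_model {
	struct task_model *owner;	/* models mm->owner (the group leader) */
};

struct task_model {
	const char *name;
	struct cgroup_model *cgroup;	/* cgroup this thread is attached to */
	struct mm_model *mm;		/* address space shared by the threads */
};

/* Per-thread accounting: charge the cgroup of the allocating thread. */
static struct cgroup_model *charge_target_per_thread(const struct task_model *t)
{
	assert(t != NULL);
	return t->cgroup;
}

/* Per-process accounting: charge the cgroup of mm->owner. */
static struct cgroup_model *charge_target_per_process(const struct task_model *t)
{
	assert(t != NULL && t->mm != NULL && t->mm->owner != NULL);
	return t->mm->owner->cgroup;
}

/*
 * Model assumption: killing only helps if it releases the mm, so the
 * victim must be a group leader that lives in the cgroup under OOM.
 * Returns NULL when the cgroup holds no such task.
 */
static struct task_model *pick_oom_victim(const struct cgroup_model *cg,
					  struct task_model **tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct task_model *t = tasks[i];

		if (t->cgroup == cg && t->mm->owner == t)
			return t;
	}
	return NULL;
}
```

With leader A in Gold and thread B in Silver sharing one mm, per-thread
accounting charges B's allocations to Silver, yet pick_oom_victim() finds
nothing to kill in Silver. Charging via mm->owner puts the charge in Gold,
where leader A can be found.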

As for kmem, if we count task_struct, page tables, etc., which can be freed by
the OOM-killer (i.e. they are allocated for a 'process'), the accounting should
be aware of the OOM problem.
Using mm->owner makes sense to me until someone finds a great way to handle
an OOM situation other than killing tasks.

Thanks,
-Kame
