From: Konstantin Khlebnikov <khlebnikov@parallels.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Matt Mackall <mpm@selenic.com>
Subject: Re: [PATCH] mm-slab: allocate kmem_cache with __GFP_REPEAT
Date: Wed, 20 Jul 2011 18:32:37 +0400	[thread overview]
Message-ID: <4E26E705.8050704@parallels.com> (raw)
In-Reply-To: <20110720142018.GL5349@suse.de>

Mel Gorman wrote:
> On Wed, Jul 20, 2011 at 08:54:10AM -0500, Christoph Lameter wrote:
>> On Wed, 20 Jul 2011, Pekka Enberg wrote:
>>
>>> On Wed, 20 Jul 2011, Konstantin Khlebnikov wrote:
>>>>> The changelog isn't that convincing, really. This is kmem_cache_create()
>>>>> so I'm surprised we'd ever get NULL here in practice. Does this fix some
>>>>> problem you're seeing? If this is really an issue, I'd blame the page
>>>>> allocator as GFP_KERNEL should just work.
>>>>
>>>> nf_conntrack creates a separate slab cache for each net namespace.
>>>> This patch of course does not eliminate the chance of failure, but it
>>>> makes failure less likely.
>>>
>>> I'm still surprised you are seeing failures. mm/slab.c hasn't changed
>>> significantly in a long time. Why hasn't anyone reported this before? I'd
>>> still be inclined to shift the blame to the page allocator... Mel, Christoph?
>>
>> There was a lot of recent fiddling with the reclaim logic. Maybe some of
>> those changes caused the problem?
>>
>
> It's more likely that creating new slabs while under memory pressure
> significant enough to fail an order-4 allocation is a situation that is
> rarely tested.
>
> What kernel version did this failure occur on? What was the system doing
> at the time of failure? Can the page allocation failure message be
> posted?
>

I caught this on our rhel6-openvz kernel, and yes, it is very patchy,
but I don't see any reason why this could not be reproduced on a mainline kernel.

There were about ten containers running random workloads; the node was
already doing intensive swapout but was still alive. In this situation,
starting new containers sometimes (about 1 in 1000) fails due to
kmem_cache_create() failures in nf_conntrack. There are no other
messages except:
Unable to create nf_conn slab cache
and some
nf_conntrack: falling back to vmalloc.
(nf_conntrack tries to allocate a huge hash table and falls back to
vmalloc if kmalloc fails.)
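
For reference, here is a minimal C sketch of that kmalloc-then-vmalloc
fallback pattern. The function name is illustrative; this is not the
exact nf_ct_alloc_hashtable() code:

#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *alloc_big_table(size_t size)
{
	void *table;

	/* Try physically contiguous memory first; __GFP_NOWARN
	 * suppresses the allocation-failure backtrace since we
	 * have a fallback. */
	table = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
	if (!table) {
		/* Large tables need high-order pages, which fail
		 * easily under memory pressure; fall back to
		 * virtually contiguous memory. */
		printk(KERN_WARNING "falling back to vmalloc.\n");
		table = vmalloc(size);
	}
	return table;
}

The caller must then free with kfree() or vfree() depending on which
path succeeded, e.g. by checking is_vmalloc_addr().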

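As for the patch itself, a hedged sketch of the kind of change being
discussed, using the mm/slab.c names of that era (the actual hunk may
differ): the struct kmem_cache descriptor is allocated from the
cache_cache slab, and with many possible CPUs and nodes its backing
page allocation can be high-order, so __GFP_REPEAT asks the page
allocator to retry harder before failing:

--- a/mm/slab.c
+++ b/mm/slab.c
@@ kmem_cache_create()
-	cachep = kmem_cache_zalloc(&cache_cache, gfp);
+	/* Retry harder: failing here aborts the whole cache creation,
+	 * e.g. nf_conntrack's per-netns setup. */
+	cachep = kmem_cache_zalloc(&cache_cache, gfp | __GFP_REPEAT);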

Thread overview: 67+ messages
2011-07-20 12:16 [PATCH] mm-slab: allocate kmem_cache with __GFP_REPEAT Konstantin Khlebnikov
2011-07-20 13:14 ` Pekka Enberg
2011-07-20 13:28   ` Konstantin Khlebnikov
2011-07-20 13:42     ` Pekka Enberg
2011-07-20 13:50       ` Konstantin Khlebnikov
2011-07-20 13:53         ` Pekka Enberg
2011-07-20 15:36           ` Christoph Lameter
2011-07-20 13:59         ` Konstantin Khlebnikov
2011-07-20 13:54       ` Christoph Lameter
2011-07-20 14:20         ` Mel Gorman
2011-07-20 14:32           ` Konstantin Khlebnikov [this message]
2011-07-20 14:40             ` Eric Dumazet
2011-07-20 14:47               ` Konstantin Khlebnikov
2011-07-20 13:43   ` Mel Gorman
2011-07-20 13:56     ` Christoph Lameter
2011-07-20 14:08       ` Eric Dumazet
2011-07-20 14:52         ` Christoph Lameter
2011-07-20 15:09           ` Eric Dumazet
2011-07-20 15:34             ` Christoph Lameter
2011-07-20 15:56               ` Eric Dumazet
2011-07-20 16:17                 ` Christoph Lameter
2011-07-20 16:31                   ` Eric Dumazet
2011-07-20 17:04                     ` Eric Dumazet
2011-07-20 17:13                       ` Christoph Lameter
2011-07-20 17:28                         ` Pekka Enberg
2011-07-20 17:37                           ` Christoph Lameter
2011-07-20 17:41                             ` Pekka Enberg
2011-07-20 18:07                               ` Matt Mackall
2011-07-21  7:18                                 ` Konstantin Khlebnikov
2011-07-20 19:09                               ` Mel Gorman
2011-07-31 11:41             ` Konstantin Khlebnikov
2011-07-31 11:44               ` Pekka Enberg
2011-07-21  8:43           ` Eric Dumazet
2011-07-21 15:27             ` Christoph Lameter
