linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christoph Lameter <clameter@sgi.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Russell King <rmk+lkml@arm.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Nicolas Ferre <nicolas.ferre@rfo.atmel.com>,
	ARM Linux Mailing List  <linux-arm-kernel@lists.arm.linux.org.uk>,
	Linux Kernel list <linux-kernel@vger.kernel.org>,
	Marc Pignat <marc.pignat@hevs.ch>,
	Andrew Victor <andrew@sanpeople.com>,
	Pierre Ossman <drzeus@drzeus.cx>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: Oops in a driver while using SLUB as a SLAB allocator
Date: Mon, 25 Jun 2007 07:07:53 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0706250607260.5191@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0706250120560.28286@blonde.wat.veritas.com>

On Mon, 25 Jun 2007, Hugh Dickins wrote:

> And I now rather think that needs to stay, not be replaced by the
> VM_BUG_ON Christoph was proposing for 2.6.23 (which earlier I acked).
> 
> Christoph responded to my page_mapping patch by looking at arch/arm,
> and there finding a kmalloc in dma_alloc_coherent which he didn't
> like; but you're right, it's entirely irrelevant to Nicolas' oops.
> 
> The slub allocation which gives rise to Nicolas' oops in not in
> ARM, but (I'm guessing) in drivers/mmc/core/sd.c: one of those
> 	status = kmalloc(64, GFP_KERNEL);
> where status is passed down for the response from mmc_sd_switch.
>
> And what is wrong with using kmalloc there?
> Why should that be changed to allocate a whole page?
> How many other such cases might there be?

So someone effectively does a flush_dcache_page(virt_to_page(status))?

> In the kmalloc case it's not mapped into userspace: flush_dcache_page
> should detect that and do nothing, as it does with slab; but slub was
> reusing page->mapping for something else, so we oopsed.

If that is the case then what we really want is a flush_dcache_range
not the above. flush_dcache_range does not take a page struct as an 
argument and it will work on memory that has no struct page backing it.

Is flush_dcache_range available in all platforms? I see some drivers
using it:

drivers/net/fec.c
drivers/serial/mpsc.c
drivers/char/agp/uninorth-agp.c


flush_dcache_page is implemented by

sparc64		Uses mapping
sh		Ok. Only uses PG_mapped
arm		Uses mapping in the mmu case
frv		Does a kmap_atomic ?? Otherwise looks ok.
ppc		Clears PG_arch_1
mips		Uses mapping
sh64		No page struct use
parisc		Uses mapping
xtensa		Uses mapping
powerpc		Handles page flags PG_arch_1
ia64		Clears PG_arch_1
sparc		Calculates address based on page struct addr.
blackfin	Does an immediate page_address(page)
m68k		Does an immediate page_address(page)

In many situations the page struct passed to flush_dcache_page is
simply used to calculate the virtual address. So its mostly harmless.
Trouble starts when page attributes like the mapping is used.

So the problem platforms are

sparc64 arm mips parisc xtensa

If we indeed do these weird things then I think the general fix should
be to use flush_dcache_range() but that is too late for 2.6.22. The 
VM_BUG_ON will be useful to detect these scenarios. Maybe we need
to replace that with a WARN_ON or something if the usage is frequent?
There are a large number of platforms on which flush_dcache_range has
no effect or an effect that is negligible.

A kmalloc slab object (even 64 byte) may be crossing a page boundary 
with a ARCH_KMALLOC_MINALIGN of 4 or 8. So I think that 
flush_dcache_range *must* be used rather than flush_dcache_page. 
flush_dcache_page(virt_to_page(object)) takes the starting address of 
the object and flushes the page in which the object started. It may
not be the complete object. This usually works fine with 64 byte objects
because they neatly fit into a slab page. Again if CONFIG_SLAB_DEBUG
f.e. is enabled then the alignment will no longer be to a 64 byte bound 
but only to the alignment guaranteed by ARCH_KMALLOC_MINALIGN. If this 
trick is used on a non kmalloc cache with a non power of size then we
may have a larger chance of trouble occurring.

For 2.6.22 the easiest solution may be to check for PageSlab in the
flush_dcache_pages of the affected platforms and then count on
the users not enabling any slab debugging. Its then simply the same state 
as before.

  parent reply	other threads:[~2007-06-25 14:08 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-21  9:30 Oops in a driver while using SLUB as a SLAB allocator Nicolas Ferre
2007-06-21 14:54 ` Marc Pignat
2007-06-21 14:57 ` Marc Pignat
2007-06-21 15:54   ` Nicolas Ferre
2007-06-22  6:28     ` [PATCH] mmc-atmel : fix kunmap wrong usage Marc Pignat
2007-06-22 12:00       ` Hugh Dickins
2007-06-22 13:34         ` Nicolas Ferre
2007-06-22 13:46           ` Hugh Dickins
2007-06-22 14:21           ` Marc Pignat
2007-06-22 14:58           ` Marc Pignat
2007-06-22 19:00       ` Jens Axboe
2007-06-22  9:09     ` Oops in a driver while using SLUB as a SLAB allocator Nicolas Ferre
2007-06-21 22:27 ` Hugh Dickins
2007-06-22  1:01   ` Christoph Lameter
2007-06-22  4:26     ` Hugh Dickins
2007-06-22  5:13       ` Christoph Lameter
2007-06-22  7:00       ` Russell King
2007-06-22  1:36   ` Christoph Lameter
2007-06-22  4:40     ` Hugh Dickins
2007-06-22  5:10       ` Christoph Lameter
2007-06-22  5:37         ` Hugh Dickins
2007-06-22 16:40       ` Linus Torvalds
2007-06-22 17:26         ` Christoph Lameter
2007-06-22 17:41         ` Christoph Lameter
2007-06-22 18:39           ` Hugh Dickins
2007-06-22 18:51             ` Christoph Lameter
2007-06-22 19:01               ` Hugh Dickins
2007-06-22 19:11                 ` Christoph Lameter
2007-06-22 20:21                   ` Hugh Dickins
2007-06-22 22:54                     ` Christoph Lameter
2007-06-22 20:15                 ` Christoph Lameter
2007-06-23 10:40                   ` Oleg Verych
2007-06-24  8:38             ` Russell King
2007-06-24 10:24               ` Hugh Dickins
2007-06-24 10:51                 ` Russell King
2007-06-25  0:25                   ` Hugh Dickins
2007-06-25 13:55                     ` Nicolas Ferre
2007-06-25 14:07                     ` Christoph Lameter [this message]
2007-06-25 16:42                       ` Hugh Dickins
2007-06-25 17:00                         ` Christoph Lameter
2007-06-25 17:23                           ` Hugh Dickins
2007-06-25 18:23                             ` Christoph Lameter
2007-06-25 18:43                               ` Hugh Dickins
2007-06-25 18:50                                 ` Christoph Lameter
2007-06-25 19:04                                   ` Hugh Dickins
2007-06-26 18:09                                     ` Christoph Lameter
2007-06-22 20:18         ` Russell King
2007-06-22  1:41   ` Christoph Lameter
2007-06-22  4:46     ` Hugh Dickins
2007-06-22  5:31       ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0706250607260.5191@schroedinger.engr.sgi.com \
    --to=clameter@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew@sanpeople.com \
    --cc=drzeus@drzeus.cx \
    --cc=hugh@veritas.com \
    --cc=linux-arm-kernel@lists.arm.linux.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marc.pignat@hevs.ch \
    --cc=nicolas.ferre@rfo.atmel.com \
    --cc=rmk+lkml@arm.linux.org.uk \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).