From: James Bottomley <James.Bottomley@suse.de>
To: Parisc List <linux-parisc@vger.kernel.org>,
	Linux Filesystem Mailing List <linux-fsdevel@vger.kernel.org>,
	linux-arch@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Subject: xfs failure on parisc (and presumably other VI cache systems) caused by I/O to vmalloc/vmap areas
Date: Tue, 08 Sep 2009 13:27:49 -0500
Message-ID: <1252434469.13003.3.camel@mulgrave.site>

This bug was observed on parisc, but I would expect it to affect all
architectures with virtually indexed caches.

The origin of this problem is the changes we made to block and SCSI
to eliminate the special case path for kernel buffers.  This change
forced every I/O to go via the full scatter gather processing.  In this
way we thought we'd removed the restrictions on using vmalloc/vmap
areas for I/O from the kernel.  XFS actually took advantage of this,
hence the problems.

Actually, if you look at the implementation of blk_rq_map_kern(), it
still won't accept vmalloc pages on most architectures because
virt_to_page() assumes an offset mapped page (x86 actually has a BUG_ON
for the vmalloc case if you enable DEBUG_VIRTUAL).  The only reason
xfs gets away with this is because it builds the vmalloc'd bio manually,
essentially open coding blk_rq_map_kern().
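
To make "builds the bio manually" concrete, here is a minimal userspace
sketch of the page-by-page walk such open coding involves.  It is not
the xfs code: struct toy_seg, toy_split_pages() and SEG_PAGE_SIZE are
hypothetical names, and a 4096-byte page is assumed.  In the kernel,
each page-aligned address from a vmalloc area would then be translated
with vmalloc_to_page() rather than virt_to_page() before being added to
the bio.

```c
#include <stddef.h>
#include <stdint.h>

#define SEG_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* One per-page piece of a buffer: page-aligned address, offset, length. */
struct toy_seg {
    uint64_t page_vaddr;
    uint32_t offset;
    uint32_t len;
};

/*
 * Split [vaddr, vaddr + len) at page boundaries.  Writes at most `max`
 * segments into `segs` and returns how many were written.
 */
static size_t toy_split_pages(uint64_t vaddr, size_t len,
                              struct toy_seg *segs, size_t max)
{
    size_t n = 0;

    while (len > 0 && n < max) {
        uint32_t off = (uint32_t)(vaddr % SEG_PAGE_SIZE);
        uint32_t chunk = SEG_PAGE_SIZE - off;   /* room left in this page */

        if (chunk > len)
            chunk = (uint32_t)len;

        segs[n].page_vaddr = vaddr - off;
        segs[n].offset = off;
        segs[n].len = chunk;
        n++;

        vaddr += chunk;
        len -= chunk;
    }
    return n;
}
```

The point of the sketch is that only the page and the offset within it
survive this step; the virtual address the buffer was written through
is not carried along.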

The problem comes because by the time we get to map scatter gather
lists, all we have is the page; we've lost the virtual address.  There's
a macro, sg_virt(), which claims to recover the virtual address, but all
it really does is provide the offset map of the page physical address.
This means that sg_virt() returns a different address from the one the
page was actually accessed through if it's in a vmalloc/vmap area
(because we remapped the page within the kernel virtual address space).
So for virtually indexed caches, we end up flushing the wrong page
alias ... and hence corrupting data because we do DMA with a possibly
dirty cache line set still sitting above the page.
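
The address mismatch can be illustrated with a small userspace model.
Everything here is an assumption made for illustration: the constants
do not describe any particular architecture, and toy_linear_addr(),
toy_cache_colour() and toy_same_alias() are hypothetical helpers, not
kernel functions.  The sketch only shows that an address rebuilt from
the page frame need not fall on the same cache alias as the vmap
address the data was written through.

```c
#include <stdint.h>

/* Toy constants, assumed for illustration only. */
#define TOY_PAGE_SHIFT   12u
#define TOY_PAGE_OFFSET  0x40000000ull   /* base of the offset (linear) map */
#define TOY_ALIAS_SIZE   (4ull << 20)    /* addresses equal modulo this share lines */

/* What an sg_virt()-style lookup can do: rebuild the offset-map address. */
static uint64_t toy_linear_addr(uint64_t pfn, uint64_t offset)
{
    return TOY_PAGE_OFFSET + (pfn << TOY_PAGE_SHIFT) + offset;
}

/* Cache colour: which page-sized slot of the alias window an address hits. */
static uint64_t toy_cache_colour(uint64_t vaddr)
{
    return (vaddr % TOY_ALIAS_SIZE) >> TOY_PAGE_SHIFT;
}

/* Nonzero if a flush via `recovered` reaches the lines dirtied via `used`. */
static int toy_same_alias(uint64_t used, uint64_t recovered)
{
    return toy_cache_colour(used) == toy_cache_colour(recovered);
}
```

With these toy values, a page with frame number 0x1234 that was written
through a vmap address of 0x80005000 has colour 5 there, while the
rebuilt offset-map address 0x41234000 has colour 0x234, so a flush
through the rebuilt address would miss the dirty lines.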

The generic fix is simple:  flush the potentially dirty page along the
correct cache alias before feeding it into the block routines and losing
the alias address information.
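
Why the flush has to go through the right alias can be shown with a toy
model of a virtually indexed, write-back, direct-mapped cache.  This is
a userspace sketch under loud assumptions: the sizes are arbitrary, and
toy_cpu_store(), toy_flush_addr() and toy_dma_read() are hypothetical
names that only model the behaviour described above, not any kernel
interface.

```c
#include <stdint.h>
#include <string.h>

#define LINE   32u     /* toy cache line size in bytes */
#define NLINES 64u     /* toy direct-mapped cache with 64 lines */
#define MEMSZ  4096u   /* toy physical memory size */

struct toy_line {
    int valid, dirty;
    uint32_t vaddr, paddr;   /* line-aligned addresses */
    uint8_t data[LINE];
};

static uint8_t toy_mem[MEMSZ];
static struct toy_line toy_cache[NLINES];

/* Virtually indexed: the slot is chosen from the virtual address. */
static struct toy_line *toy_slot(uint32_t vaddr)
{
    return &toy_cache[(vaddr / LINE) % NLINES];
}

/* Write a dirty line back to memory and invalidate it. */
static void toy_writeback(struct toy_line *l)
{
    if (l->valid && l->dirty)
        memcpy(&toy_mem[l->paddr], l->data, LINE);
    l->valid = l->dirty = 0;
}

/* CPU store of one byte through vaddr, which maps to paddr. */
static void toy_cpu_store(uint32_t vaddr, uint32_t paddr, uint8_t val)
{
    struct toy_line *l = toy_slot(vaddr);
    uint32_t vbase = vaddr & ~(LINE - 1), pbase = paddr & ~(LINE - 1);

    if (!l->valid || l->vaddr != vbase) {
        toy_writeback(l);                        /* evict the old line */
        memcpy(l->data, &toy_mem[pbase], LINE);  /* fill from memory */
        l->valid = 1;
        l->vaddr = vbase;
        l->paddr = pbase;
    }
    l->data[vaddr & (LINE - 1)] = val;
    l->dirty = 1;                                /* memory is now stale */
}

/* Flush the line reached through this particular virtual address. */
static void toy_flush_addr(uint32_t vaddr)
{
    struct toy_line *l = toy_slot(vaddr);

    if (l->valid && l->vaddr == (vaddr & ~(LINE - 1)))
        toy_writeback(l);
}

/* A device doing DMA sees only what has been written back to memory. */
static uint8_t toy_dma_read(uint32_t paddr)
{
    return toy_mem[paddr];
}
```

In this model, storing a byte through one alias (say toy address 0x9200
for physical 0x100) and then flushing through a different alias of the
same physical location (say 0x4100) leaves memory stale, so
toy_dma_read(0x100) still returns the old value; only a flush through
0x9200 makes the store visible to the device.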

The slight problem is that we don't have an API to handle this ...
flush_kernel_dcache_page() would be the correct one except that it only
takes a page as the argument, not the virtual address.  So, I propose as
part of this change to introduce a new API:  flush_kernel_dcache_addr()
which behaves exactly like flush_kernel_dcache_page() except that
it flushes through the provided virtual address (whether offset mapped
or mapped via vmalloc/vmap).

I'll send out the patch series as a reply to this email.

James

Thread overview: 20+ messages
2009-09-08 18:27 James Bottomley [this message]
2009-09-08 19:00 ` xfs failure on parisc (and presumably other VI cache systems) caused by I/O to vmalloc/vmap areas Russell King
2009-09-08 19:11   ` James Bottomley
2009-09-08 20:16     ` Russell King
2009-09-08 20:16       ` Russell King
2009-09-08 20:39       ` James Bottomley
2009-09-08 21:39         ` Russell King
2009-09-09  3:14           ` James Bottomley
2009-09-09  3:17             ` [PATCH 1/5] mm: add coherence API for DMA " James Bottomley
2009-09-09  3:23               ` James Bottomley
2009-09-09  3:35                 ` Paul Mundt
2009-09-09 14:34                   ` James Bottomley
2009-09-10  0:24                     ` Paul Mundt
2009-09-10  0:30                       ` James Bottomley
2009-09-09  3:18             ` [PATCH 2/5] parisc: add mm " James Bottomley
2009-09-09  3:20             ` [PATCH 3/5] arm: " James Bottomley
2009-09-09  3:21             ` [PATCH 4/5] block: permit I/O to vmalloc/vmap kernel pages James Bottomley
2009-09-09  3:21             ` [PATCH 5/5] xfs: fix xfs to work with Virtually Indexed architectures James Bottomley
2009-10-13  1:40 ` xfs failure on parisc (and presumably other VI cache systems) caused by I/O to vmalloc/vmap areas Christoph Hellwig
2009-10-13  4:13   ` James Bottomley
