All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	Guo Ren <guoren@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Nick Hu <nickhu@andestech.com>, Greentime Hu <green.hu@gmail.com>,
	Vincent Chen <deanbo422@gmail.com>, Helge Deller <deller@gmx.de>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Rich Felker <dalias@libc.org>, Geoff Levand <geoff@infradead.org>,
	Paul Cercueil <paul@crapouillou.net>,
	Ulf Hansson <ulf.hansson@linaro.org>, Alex Shi <alexs@kernel.org>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	linux-sh@vger.kernel.org, linux-mmc@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org
Subject: Re: flush_kernel_dcache_page fixes and removal
Date: Tue, 13 Jul 2021 11:11:14 +0200	[thread overview]
Message-ID: <20210713091113.GA23518@lst.de> (raw)
In-Reply-To: <20210713084648.GF22278@shell.armlinux.org.uk>

On Tue, Jul 13, 2021 at 09:46:48AM +0100, Russell King (Oracle) wrote:
> I think you need to be careful - I seem to have a recollection that the
> reason we ended up with flush_kernel_dcache_page() was the need to avoid
> the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
> flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.

Where can arm32 flush_dcache_page block?  __flush_dcache_aliases does
walk mapping->i_mmap, but using a spinlock hidden under
flush_dcache_mmap_lock.  If flush_dcache_page did block plenty of code
already calling it e.g. from interrupt and block submission contexts
in various mmc/sdcard drivers would be rather unhappy.  Even today
calls to flush_kernel_dcache_page page in the block I/O path are
vastly outnumber by calls to flush_dcache_page.

> The second issue I have is that, when we are reading a page into a page
> cache page, how can that page be mapped to userspace? Isn't that a
> violation of semantics? If the page is mapped to userspace but does not
> contain data from the underlying storage device, then the page contains
> stale data (if it's a new page, lets hope that's zeroed and not some
> previous contents - which would be a massive security hole.)

I did not come up with the rules, but these are the existing documented
ones for flush_dcache_page:

        Any time the kernel writes to a page cache page, _OR_
	the kernel is about to read from a page cache page and
	user space shared/writable mappings of this page potentially
	exist, this routine is called.

So writing to (aka reading into) a page cache page is not conditional
on it being mapped yet.  Only the kernel reading from the page cache
page has this condition.

> As I
> understand it, the flush_kernel_dcache_page() calls in the block layer
> are primarily there to cope with drivers that do PIO read to write to a
> page cache page to ensure that later userspace mappings can see the data
> in the page cache page - by ensuring that the page cache pages are in
> the same state as far as caches go as if they had been DMA'd to.

PIO is one big case, but also kernel generated data and all kinds of
bounce buffering schemes.

> We know that the current implementation works fine - you're now
> proposing to radically change it, asserting that it's buggy. I'm
> nervous about this change.

Do we know it works?  There are very few calls to
flush_dcache_kernel_page and very few implementations that differ from
flush_dcache_page.  For arm32 the relevant drivers would mostly be
mmc drivers using the sg_miter interface, are they even used much on
the platforms where the difference exists?

> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
---end quoted text---

WARNING: multiple messages have this Message-ID
From: Christoph Hellwig <hch@lst.de>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	Guo Ren <guoren@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Nick Hu <nickhu@andestech.com>, Greentime Hu <green.hu@gmail.com>,
	Vincent Chen <deanbo422@gmail.com>, Helge Deller <deller@gmx.de>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Rich Felker <dalias@libc.org>, Geoff Levand <geoff@infradead.org>,
	Paul Cercueil <paul@crapouillou.net>,
	Ulf Hansson <ulf.hansson@linaro.org>, Alex Shi <alexs@kernel.org>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	linux-sh@vger.kernel.org, linux-mmc@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org
Subject: Re: flush_kernel_dcache_page fixes and removal
Date: Tue, 13 Jul 2021 11:11:14 +0200	[thread overview]
Message-ID: <20210713091113.GA23518@lst.de> (raw)
In-Reply-To: <20210713084648.GF22278@shell.armlinux.org.uk>

On Tue, Jul 13, 2021 at 09:46:48AM +0100, Russell King (Oracle) wrote:
> I think you need to be careful - I seem to have a recollection that the
> reason we ended up with flush_kernel_dcache_page() was the need to avoid
> the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
> flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.

Where can arm32 flush_dcache_page block?  __flush_dcache_aliases does
walk mapping->i_mmap, but using a spinlock hidden under
flush_dcache_mmap_lock.  If flush_dcache_page did block plenty of code
already calling it e.g. from interrupt and block submission contexts
in various mmc/sdcard drivers would be rather unhappy.  Even today
calls to flush_kernel_dcache_page page in the block I/O path are
vastly outnumber by calls to flush_dcache_page.

> The second issue I have is that, when we are reading a page into a page
> cache page, how can that page be mapped to userspace? Isn't that a
> violation of semantics? If the page is mapped to userspace but does not
> contain data from the underlying storage device, then the page contains
> stale data (if it's a new page, lets hope that's zeroed and not some
> previous contents - which would be a massive security hole.)

I did not come up with the rules, but these are the existing documented
ones for flush_dcache_page:

        Any time the kernel writes to a page cache page, _OR_
	the kernel is about to read from a page cache page and
	user space shared/writable mappings of this page potentially
	exist, this routine is called.

So writing to (aka reading into) a page cache page is not conditional
on it being mapped yet.  Only the kernel reading from the page cache
page has this condition.

> As I
> understand it, the flush_kernel_dcache_page() calls in the block layer
> are primarily there to cope with drivers that do PIO read to write to a
> page cache page to ensure that later userspace mappings can see the data
> in the page cache page - by ensuring that the page cache pages are in
> the same state as far as caches go as if they had been DMA'd to.

PIO is one big case, but also kernel generated data and all kinds of
bounce buffering schemes.

> We know that the current implementation works fine - you're now
> proposing to radically change it, asserting that it's buggy. I'm
> nervous about this change.

Do we know it works?  There are very few calls to
flush_dcache_kernel_page and very few implementations that differ from
flush_dcache_page.  For arm32 the relevant drivers would mostly be
mmc drivers using the sg_miter interface, are they even used much on
the platforms where the difference exists?

> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
---end quoted text---

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-07-13  9:11 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-12  6:09 Christoph Hellwig
2021-07-12  6:09 ` Christoph Hellwig
2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-07-19  8:15   ` Paul Cercueil
2021-07-19  8:15     ` Paul Cercueil
2021-08-04 11:12   ` Ulf Hansson
2021-08-04 11:12     ` Ulf Hansson
2021-08-04 11:12     ` Ulf Hansson
2021-07-12  6:09 ` [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-08-04 11:12   ` Ulf Hansson
2021-08-04 11:12     ` Ulf Hansson
2021-08-04 11:12     ` Ulf Hansson
2021-07-12  6:09 ` [PATCH 3/6] ps3disk: " Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-07-19  3:07   ` Geoff Levand
2021-07-19  3:07     ` Geoff Levand
2021-07-12  6:09 ` [PATCH 4/6] scatterlist: " Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-07-12  6:09 ` [PATCH 5/6] aacraid: remove an unused include Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-07-19  1:50   ` Martin K. Petersen
2021-07-19  1:50     ` Martin K. Petersen
2021-07-12  6:09 ` [PATCH 6/6] mm: remove flush_kernel_dcache_page Christoph Hellwig
2021-07-12  6:09   ` Christoph Hellwig
2021-07-12 23:56   ` Ira Weiny
2021-07-12 23:56     ` Ira Weiny
2021-07-12 19:24 ` flush_kernel_dcache_page fixes and removal Linus Torvalds
2021-07-12 19:24   ` Linus Torvalds
2021-07-12 19:24   ` Linus Torvalds
2021-07-13  5:46   ` Christoph Hellwig
2021-07-13  5:46     ` Christoph Hellwig
2021-07-13  8:46 ` Russell King (Oracle)
2021-07-13  8:46   ` Russell King (Oracle)
2021-07-13  9:11   ` Christoph Hellwig [this message]
2021-07-13  9:11     ` Christoph Hellwig
2021-07-19  5:38   ` Herbert Xu
2021-07-19  5:38     ` Herbert Xu
2021-07-19 15:42     ` Russell King (Oracle)
2021-07-19 15:42       ` Russell King (Oracle)
2021-07-14  3:13 ` Guo Ren
2021-07-14  3:13   ` Guo Ren
2021-07-14  3:13   ` Guo Ren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210713091113.GA23518@lst.de \
    --to=hch@lst.de \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexs@kernel.org \
    --cc=dalias@libc.org \
    --cc=deanbo422@gmail.com \
    --cc=deller@gmx.de \
    --cc=geoff@infradead.org \
    --cc=green.hu@gmail.com \
    --cc=guoren@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=nickhu@andestech.com \
    --cc=paul@crapouillou.net \
    --cc=torvalds@linux-foundation.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=ulf.hansson@linaro.org \
    --cc=ysato@users.sourceforge.jp \
    --subject='Re: flush_kernel_dcache_page fixes and removal' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.