linux-sh.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* flush_kernel_dcache_page fixes and removal
@ 2021-07-12  6:09 Christoph Hellwig
  2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

Hi all,

while looking to convert the block layer away from kmap_atomic towards
kmap_local_page and prefeably the helpers that abstract it away I noticed
that a few block drivers directly or implicitly call
flush_kernel_dcache_page before kunmapping a page that has been written
to.  flush_kernel_dcache_page is documented to to be used in such cases,
but flush_dcache_page is actually required when the page could be in
the page cache and mapped to userspace, which is pretty much always the
case when kmapping an arbitrary page.  Unfortunately the documentation
doesn't exactly make that clear, which lead to this misused.  And it turns
out that only the copy_strings / copy_string_kernel in the exec code
were actually correct users of flush_kernel_dcache_page, which is why
I think we should just remove it and eat the very minor overhead in
exec rather than confusing poor driver writers.

Diffstat:
 Documentation/core-api/cachetlb.rst                    |   86 +++++++----------
 Documentation/translations/zh_CN/core-api/cachetlb.rst |    9 -
 arch/arm/include/asm/cacheflush.h                      |    4 
 arch/arm/mm/flush.c                                    |   33 ------
 arch/arm/mm/nommu.c                                    |    6 -
 arch/csky/abiv1/cacheflush.c                           |   11 --
 arch/csky/abiv1/inc/abi/cacheflush.h                   |    4 
 arch/mips/include/asm/cacheflush.h                     |    8 -
 arch/nds32/include/asm/cacheflush.h                    |    3 
 arch/nds32/mm/cacheflush.c                             |    9 -
 arch/parisc/include/asm/cacheflush.h                   |    8 -
 arch/parisc/kernel/cache.c                             |    3 
 arch/sh/include/asm/cacheflush.h                       |    8 -
 block/blk-map.c                                        |    2 
 drivers/block/ps3disk.c                                |    2 
 drivers/mmc/host/jz4740_mmc.c                          |    4 
 drivers/mmc/host/mmc_spi.c                             |    2 
 drivers/scsi/aacraid/aachba.c                          |    1 
 fs/exec.c                                              |    6 -
 include/linux/highmem.h                                |    5 
 lib/scatterlist.c                                      |    5 
 tools/testing/scatterlist/linux/mm.h                   |    1 
 22 files changed, 55 insertions(+), 165 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-07-19  8:15   ` Paul Cercueil
  2021-08-04 11:12   ` Ulf Hansson
  2021-07-12  6:09 ` [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page Christoph Hellwig
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

MIPS now implements flush_kernel_dcache_page (as an alias to
flush_dcache_page).

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/mmc/host/jz4740_mmc.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/mmc/host/jz4740_mmc.c b/drivers/mmc/host/jz4740_mmc.c
index 0db17bcc9c16..aa2240c83510 100644
--- a/drivers/mmc/host/jz4740_mmc.c
+++ b/drivers/mmc/host/jz4740_mmc.c
@@ -578,10 +578,6 @@ static bool jz4740_mmc_read_data(struct jz4740_mmc_host *host,
 			}
 		}
 		data->bytes_xfered += miter->length;
-
-		/* This can go away once MIPS implements
-		 * flush_kernel_dcache_page */
-		flush_dcache_page(miter->page);
 	}
 	sg_miter_stop(miter);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
  2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-08-04 11:12   ` Ulf Hansson
  2021-07-12  6:09 ` [PATCH 3/6] ps3disk: " Christoph Hellwig
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

Pages passed to block drivers can be mapped page cache pages, so we
must use flush_dcache_page here instead of the more limited
flush_kernel_dcache_page that is intended for highmem pages only.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/mmc/host/mmc_spi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
index 65c65bb5737f..3d28a3d3001b 100644
--- a/drivers/mmc/host/mmc_spi.c
+++ b/drivers/mmc/host/mmc_spi.c
@@ -948,7 +948,7 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,
 
 		/* discard mappings */
 		if (direction == DMA_FROM_DEVICE)
-			flush_kernel_dcache_page(sg_page(sg));
+			flush_dcache_page(sg_page(sg));
 		kunmap(sg_page(sg));
 		if (dma_dev)
 			dma_unmap_page(dma_dev, dma_addr, PAGE_SIZE, dir);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/6] ps3disk: replace flush_kernel_dcache_page with flush_dcache_page
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
  2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
  2021-07-12  6:09 ` [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-07-19  3:07   ` Geoff Levand
  2021-07-12  6:09 ` [PATCH 4/6] scatterlist: " Christoph Hellwig
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

Pages passed to block drivers can be mapped page cache pages, so we
must use flush_dcache_page here instead of the more limited
flush_kernel_dcache_page that is intended for highmem pages only.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/ps3disk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index f374ea2c67ce..32bfb0487bdb 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -100,7 +100,7 @@ static void ps3disk_scatter_gather(struct ps3_storage_device *dev,
 		else
 			memcpy(buf, dev->bounce_buf+offset, size);
 		offset += size;
-		flush_kernel_dcache_page(bvec.bv_page);
+		flush_dcache_page(bvec.bv_page);
 		bvec_kunmap_irq(buf, &flags);
 		i++;
 	}
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/6] scatterlist: replace flush_kernel_dcache_page with flush_dcache_page
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (2 preceding siblings ...)
  2021-07-12  6:09 ` [PATCH 3/6] ps3disk: " Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-07-12  6:09 ` [PATCH 5/6] aacraid: remove an unused include Christoph Hellwig
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

Pages used in scatterlist can be mapped page cache pages (and often are),
so we must use flush_dcache_page here instead of the more limited
flush_kernel_dcache_page that is intended for highmem pages only.

Also remove the PageSlab check given that page_mapping_file as used
by the flush_dcache_page implementations already contains that check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 lib/scatterlist.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 27efa6178153..627aa84f8bbd 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -887,9 +887,8 @@ void sg_miter_stop(struct sg_mapping_iter *miter)
 		miter->__offset += miter->consumed;
 		miter->__remaining -= miter->consumed;
 
-		if ((miter->__flags & SG_MITER_TO_SG) &&
-		    !PageSlab(miter->page))
-			flush_kernel_dcache_page(miter->page);
+		if (miter->__flags & SG_MITER_TO_SG)
+			flush_dcache_page(miter->page);
 
 		if (miter->__flags & SG_MITER_ATOMIC) {
 			WARN_ON_ONCE(preemptible());
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/6] aacraid: remove an unused include
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (3 preceding siblings ...)
  2021-07-12  6:09 ` [PATCH 4/6] scatterlist: " Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-07-19  1:50   ` Martin K. Petersen
  2021-07-12  6:09 ` [PATCH 6/6] mm: remove flush_kernel_dcache_page Christoph Hellwig
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

flush_kernel_dcache_page is not used by aacraid, and this header already
comes in through the scatterlist/block headers anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/aacraid/aachba.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index 46b8dffce2dd..267934d2f14b 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -25,7 +25,6 @@
 #include <linux/completion.h>
 #include <linux/blkdev.h>
 #include <linux/uaccess.h>
-#include <linux/highmem.h> /* For flush_kernel_dcache_page */
 #include <linux/module.h>
 
 #include <asm/unaligned.h>
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 6/6] mm: remove flush_kernel_dcache_page
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (4 preceding siblings ...)
  2021-07-12  6:09 ` [PATCH 5/6] aacraid: remove an unused include Christoph Hellwig
@ 2021-07-12  6:09 ` Christoph Hellwig
  2021-07-12 23:56   ` Ira Weiny
  2021-07-12 19:24 ` flush_kernel_dcache_page fixes and removal Linus Torvalds
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-12  6:09 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

flush_kernel_dcache_page is a rather confusing interface that implements
a subset of flush_dcache_page by not being able to properly handle page
cache mapped pages.

The only callers left are in the exec code as all other previous callers
were incorrect as they could have dealt with page cache pages.  Replace
the calls to flush_kernel_dcache_page with calls to
flush_kernel_dcache_page, which for all architectures does either
exactly the same thing, can contains one or more of the following:

 1) an optimization to defer the cache flush for page cache pages not
    mapped into userspace
 2) additional flushing for mapped page cache pages if cache aliases
    are possible

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/core-api/cachetlb.rst           | 86 ++++++++-----------
 .../translations/zh_CN/core-api/cachetlb.rst  |  9 --
 arch/arm/include/asm/cacheflush.h             |  4 +-
 arch/arm/mm/flush.c                           | 33 -------
 arch/arm/mm/nommu.c                           |  6 --
 arch/csky/abiv1/cacheflush.c                  | 11 ---
 arch/csky/abiv1/inc/abi/cacheflush.h          |  4 +-
 arch/mips/include/asm/cacheflush.h            |  8 +-
 arch/nds32/include/asm/cacheflush.h           |  3 +-
 arch/nds32/mm/cacheflush.c                    |  9 --
 arch/parisc/include/asm/cacheflush.h          |  8 +-
 arch/parisc/kernel/cache.c                    |  3 +-
 arch/sh/include/asm/cacheflush.h              |  8 +-
 block/blk-map.c                               |  2 +-
 fs/exec.c                                     |  6 +-
 include/linux/highmem.h                       |  5 +-
 tools/testing/scatterlist/linux/mm.h          |  1 -
 17 files changed, 51 insertions(+), 155 deletions(-)

diff --git a/Documentation/core-api/cachetlb.rst b/Documentation/core-api/cachetlb.rst
index fe4290e26729..8aed9103e48a 100644
--- a/Documentation/core-api/cachetlb.rst
+++ b/Documentation/core-api/cachetlb.rst
@@ -271,10 +271,15 @@ maps this page at its virtual address.
 
   ``void flush_dcache_page(struct page *page)``
 
-	Any time the kernel writes to a page cache page, _OR_
-	the kernel is about to read from a page cache page and
-	user space shared/writable mappings of this page potentially
-	exist, this routine is called.
+        This routines must be called when:
+
+	  a) the kernel did write to a page that is in the page cache page
+	     and / or in high memory
+	  b) the kernel is about to read from a page cache page and user space
+	     shared/writable mappings of this page potentially exist.  Note
+	     that {get,pin}_user_pages{_fast} already call flush_dcache_page
+	     on any page found in the user address space and thus driver
+	     code rarely needs to take this into account.
 
 	.. note::
 
@@ -284,38 +289,34 @@ maps this page at its virtual address.
 	      handling vfs symlinks in the page cache need not call
 	      this interface at all.
 
-	The phrase "kernel writes to a page cache page" means,
-	specifically, that the kernel executes store instructions
-	that dirty data in that page at the page->virtual mapping
-	of that page.  It is important to flush here to handle
-	D-cache aliasing, to make sure these kernel stores are
-	visible to user space mappings of that page.
-
-	The corollary case is just as important, if there are users
-	which have shared+writable mappings of this file, we must make
-	sure that kernel reads of these pages will see the most recent
-	stores done by the user.
-
-	If D-cache aliasing is not an issue, this routine may
-	simply be defined as a nop on that architecture.
-
-        There is a bit set aside in page->flags (PG_arch_1) as
-	"architecture private".  The kernel guarantees that,
-	for pagecache pages, it will clear this bit when such
-	a page first enters the pagecache.
-
-	This allows these interfaces to be implemented much more
-	efficiently.  It allows one to "defer" (perhaps indefinitely)
-	the actual flush if there are currently no user processes
-	mapping this page.  See sparc64's flush_dcache_page and
-	update_mmu_cache implementations for an example of how to go
-	about doing this.
-
-	The idea is, first at flush_dcache_page() time, if
-	page->mapping->i_mmap is an empty tree, just mark the architecture
-	private page flag bit.  Later, in update_mmu_cache(), a check is
-	made of this flag bit, and if set the flush is done and the flag
-	bit is cleared.
+	The phrase "kernel writes to a page cache page" means, specifically,
+	that the kernel executes store instructions that dirty data in that
+	page at the page->virtual mapping of that page.  It is important to
+	flush here to handle D-cache aliasing, to make sure these kernel stores
+	are visible to user space mappings of that page.
+
+	The corollary case is just as important, if there are users which have
+	shared+writable mappings of this file, we must make sure that kernel
+	reads of these pages will see the most recent stores done by the user.
+
+	If D-cache aliasing is not an issue, this routine may simply be defined
+	as a nop on that architecture.
+
+        There is a bit set aside in page->flags (PG_arch_1) as "architecture
+	private".  The kernel guarantees that, for pagecache pages, it will
+	clear this bit when such a page first enters the pagecache.
+
+	This allows these interfaces to be implemented much more efficiently.
+	It allows one to "defer" (perhaps indefinitely) the actual flush if
+	there are currently no user processes mapping this page.  See sparc64's
+	flush_dcache_page and update_mmu_cache implementations for an example
+	of how to go about doing this.
+
+	The idea is, first at flush_dcache_page() time, if page_file_mapping()
+	returns a mapping, and mapping_mapped on that mapping returns %false,
+	just mark the architecture private page flag bit.  Later, in
+	update_mmu_cache(), a check is made of this flag bit, and if set the
+	flush is done and the flag bit is cleared.
 
 	.. important::
 
@@ -351,19 +352,6 @@ maps this page at its virtual address.
 	architectures).  For incoherent architectures, it should flush
 	the cache of the page at vmaddr.
 
-  ``void flush_kernel_dcache_page(struct page *page)``
-
-	When the kernel needs to modify a user page is has obtained
-	with kmap, it calls this function after all modifications are
-	complete (but before kunmapping it) to bring the underlying
-	page up to date.  It is assumed here that the user has no
-	incoherent cached copies (i.e. the original page was obtained
-	from a mechanism like get_user_pages()).  The default
-	implementation is a nop and should remain so on all coherent
-	architectures.  On incoherent architectures, this should flush
-	the kernel cache for page (using page_address(page)).
-
-
   ``void flush_icache_range(unsigned long start, unsigned long end)``
 
   	When the kernel stores into addresses that it will execute
diff --git a/Documentation/translations/zh_CN/core-api/cachetlb.rst b/Documentation/translations/zh_CN/core-api/cachetlb.rst
index 8376485a534d..55827b8a7c53 100644
--- a/Documentation/translations/zh_CN/core-api/cachetlb.rst
+++ b/Documentation/translations/zh_CN/core-api/cachetlb.rst
@@ -298,15 +298,6 @@ HyperSparc cpu就是这样一个具有这种属性的cpu。
 	用。默认的实现是nop(对于所有相干的架构应该保持这样)。对于不一致性
 	的架构,它应该刷新vmaddr处的页面缓存。
 
-  ``void flush_kernel_dcache_page(struct page *page)``
-
-	当内核需要修改一个用kmap获得的用户页时,它会在所有修改完成后(但在
-	kunmapping之前)调用这个函数,以使底层页面达到最新状态。这里假定用
-	户没有不一致性的缓存副本(即原始页面是从类似get_user_pages()的机制
-	中获得的)。默认的实现是一个nop,在所有相干的架构上都应该如此。在不
-	一致性的架构上,这应该刷新内核缓存中的页面(使用page_address(page))。
-
-
   ``void flush_icache_range(unsigned long start, unsigned long end)``
 
 	当内核存储到它将执行的地址中时(例如在加载模块时),这个函数被调用。
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 2e24e765e6d3..5e56288e343b 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -291,6 +291,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 extern void flush_dcache_page(struct page *);
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	if ((cache_is_vivt() || cache_is_vipt_aliasing()))
@@ -312,9 +313,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
 		__flush_anon_page(vma, page, vmaddr);
 }
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-extern void flush_kernel_dcache_page(struct page *);
-
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
 
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 6d89db7895d1..7ff9feea13a6 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -345,39 +345,6 @@ void flush_dcache_page(struct page *page)
 }
 EXPORT_SYMBOL(flush_dcache_page);
 
-/*
- * Ensure cache coherency for the kernel mapping of this page. We can
- * assume that the page is pinned via kmap.
- *
- * If the page only exists in the page cache and there are no user
- * space mappings, this is a no-op since the page was already marked
- * dirty at creation.  Otherwise, we need to flush the dirty kernel
- * cache lines directly.
- */
-void flush_kernel_dcache_page(struct page *page)
-{
-	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
-		struct address_space *mapping;
-
-		mapping = page_mapping_file(page);
-
-		if (!mapping || mapping_mapped(mapping)) {
-			void *addr;
-
-			addr = page_address(page);
-			/*
-			 * kmap_atomic() doesn't set the page virtual
-			 * address for highmem pages, and
-			 * kunmap_atomic() takes care of cache
-			 * flushing already.
-			 */
-			if (!IS_ENABLED(CONFIG_HIGHMEM) || addr)
-				__cpuc_flush_dcache_area(addr, PAGE_SIZE);
-		}
-	}
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 /*
  * Flush an anonymous page so that users of get_user_pages()
  * can safely access the data.  The expected sequence is:
diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index 8b3d7191e2b8..2658f52903da 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -166,12 +166,6 @@ void flush_dcache_page(struct page *page)
 }
 EXPORT_SYMBOL(flush_dcache_page);
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 		       unsigned long uaddr, void *dst, const void *src,
 		       unsigned long len)
diff --git a/arch/csky/abiv1/cacheflush.c b/arch/csky/abiv1/cacheflush.c
index 07ff17ea33de..fb91b069dc69 100644
--- a/arch/csky/abiv1/cacheflush.c
+++ b/arch/csky/abiv1/cacheflush.c
@@ -56,17 +56,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
 	}
 }
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	struct address_space *mapping;
-
-	mapping = page_mapping_file(page);
-
-	if (!mapping || mapping_mapped(mapping))
-		dcache_wbinv_all();
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
 	unsigned long end)
 {
diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
index 6cab7afae962..ed62e2066ba7 100644
--- a/arch/csky/abiv1/inc/abi/cacheflush.h
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -14,12 +14,10 @@ extern void flush_dcache_page(struct page *);
 #define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
 #define flush_cache_dup_mm(mm)			cache_wbinv_all()
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-extern void flush_kernel_dcache_page(struct page *);
-
 #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	dcache_wbinv_all();
diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index d687b40b9fbb..b3dc9c589442 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -125,13 +125,7 @@ static inline void kunmap_noncoherent(void)
 	kunmap_coherent();
 }
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
-	flush_dcache_page(page);
-}
-
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 /*
  * For now flush_kernel_vmap_range and invalidate_kernel_vmap_range both do a
  * cache writeback and invalidate operation.
diff --git a/arch/nds32/include/asm/cacheflush.h b/arch/nds32/include/asm/cacheflush.h
index 7d6824f7c0e8..c2a222ebfa2a 100644
--- a/arch/nds32/include/asm/cacheflush.h
+++ b/arch/nds32/include/asm/cacheflush.h
@@ -36,8 +36,7 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
 void flush_anon_page(struct vm_area_struct *vma,
 		     struct page *page, unsigned long vaddr);
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-void flush_kernel_dcache_page(struct page *page);
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 void flush_kernel_vmap_range(void *addr, int size);
 void invalidate_kernel_vmap_range(void *addr, int size);
 #define flush_dcache_mmap_lock(mapping)   xa_lock_irq(&(mapping)->i_pages)
diff --git a/arch/nds32/mm/cacheflush.c b/arch/nds32/mm/cacheflush.c
index ad5344ef5d33..07aac65d1cab 100644
--- a/arch/nds32/mm/cacheflush.c
+++ b/arch/nds32/mm/cacheflush.c
@@ -318,15 +318,6 @@ void flush_anon_page(struct vm_area_struct *vma,
 	local_irq_restore(flags);
 }
 
-void flush_kernel_dcache_page(struct page *page)
-{
-	unsigned long flags;
-	local_irq_save(flags);
-	cpu_dcache_wbinval_page((unsigned long)page_address(page));
-	local_irq_restore(flags);
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page);
-
 void flush_kernel_vmap_range(void *addr, int size)
 {
 	unsigned long flags;
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 99663fc1f997..eef0096db5f8 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -36,16 +36,12 @@ void flush_cache_all_local(void);
 void flush_cache_all(void);
 void flush_cache_mm(struct mm_struct *mm);
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
 void flush_kernel_dcache_page_addr(void *addr);
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	flush_kernel_dcache_page_addr(page_address(page));
-}
 
 #define flush_kernel_dcache_range(start,size) \
 	flush_kernel_dcache_range_asm((start), (start)+(size));
 
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 void flush_kernel_vmap_range(void *vaddr, int size);
 void invalidate_kernel_vmap_range(void *vaddr, int size);
 
@@ -59,7 +55,7 @@ extern void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
 
 #define flush_icache_page(vma,page)	do { 		\
-	flush_kernel_dcache_page(page);			\
+	flush_kernel_dcache_page_addr(page_address(page)); \
 	flush_kernel_icache_page(page_address(page)); 	\
 } while (0)
 
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index 86a1a63563fd..39e02227e231 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -334,7 +334,7 @@ void flush_dcache_page(struct page *page)
 		return;
 	}
 
-	flush_kernel_dcache_page(page);
+	flush_kernel_dcache_page_addr(page_address(page));
 
 	if (!mapping)
 		return;
@@ -375,7 +375,6 @@ EXPORT_SYMBOL(flush_dcache_page);
 
 /* Defined in arch/parisc/kernel/pacache.S */
 EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
-EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index 4486a865ff62..372afa82fee6 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -63,6 +63,8 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
 	if (boot_cpu_data.dcache.n_aliases && PageAnon(page))
 		__flush_anon_page(page, vmaddr);
 }
+
+#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
 static inline void flush_kernel_vmap_range(void *addr, int size)
 {
 	__flush_wback_region(addr, size);
@@ -72,12 +74,6 @@ static inline void invalidate_kernel_vmap_range(void *addr, int size)
 	__flush_invalidate_region(addr, size);
 }
 
-#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-	flush_dcache_page(page);
-}
-
 extern void copy_to_user_page(struct vm_area_struct *vma,
 	struct page *page, unsigned long vaddr, void *dst, const void *src,
 	unsigned long len);
diff --git a/block/blk-map.c b/block/blk-map.c
index 3743158ddaeb..4639bc6b5c62 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -309,7 +309,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 
 static void bio_invalidate_vmalloc_pages(struct bio *bio)
 {
-#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
+#ifdef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
 	if (bio->bi_private && !op_is_write(bio_op(bio))) {
 		unsigned long i, len = 0;
 
diff --git a/fs/exec.c b/fs/exec.c
index 38f63451b928..41a888d4edde 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -574,7 +574,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
 				}
 
 				if (kmapped_page) {
-					flush_kernel_dcache_page(kmapped_page);
+					flush_dcache_page(kmapped_page);
 					kunmap(kmapped_page);
 					put_arg_page(kmapped_page);
 				}
@@ -592,7 +592,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
 	ret = 0;
 out:
 	if (kmapped_page) {
-		flush_kernel_dcache_page(kmapped_page);
+		flush_dcache_page(kmapped_page);
 		kunmap(kmapped_page);
 		put_arg_page(kmapped_page);
 	}
@@ -634,7 +634,7 @@ int copy_string_kernel(const char *arg, struct linux_binprm *bprm)
 		kaddr = kmap_atomic(page);
 		flush_arg_page(bprm, pos & PAGE_MASK, page);
 		memcpy(kaddr + offset_in_page(pos), arg, bytes_to_copy);
-		flush_kernel_dcache_page(page);
+		flush_dcache_page(page);
 		kunmap_atomic(kaddr);
 		put_arg_page(page);
 	}
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 8c6e8e996c87..e95551bf99e9 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -130,10 +130,7 @@ static inline void flush_anon_page(struct vm_area_struct *vma, struct page *page
 }
 #endif
 
-#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-}
+#ifndef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
 static inline void flush_kernel_vmap_range(void *vaddr, int size)
 {
 }
diff --git a/tools/testing/scatterlist/linux/mm.h b/tools/testing/scatterlist/linux/mm.h
index f9a12005fcea..16ec895bbe5f 100644
--- a/tools/testing/scatterlist/linux/mm.h
+++ b/tools/testing/scatterlist/linux/mm.h
@@ -127,7 +127,6 @@ kmalloc_array(unsigned int n, unsigned int size, unsigned int flags)
 #define kmemleak_free(a)
 
 #define PageSlab(p) (0)
-#define flush_kernel_dcache_page(p)
 
 #define MAX_ERRNO	4095
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (5 preceding siblings ...)
  2021-07-12  6:09 ` [PATCH 6/6] mm: remove flush_kernel_dcache_page Christoph Hellwig
@ 2021-07-12 19:24 ` Linus Torvalds
  2021-07-13  5:46   ` Christoph Hellwig
  2021-07-13  8:46 ` Russell King (Oracle)
  2021-07-14  3:13 ` Guo Ren
  8 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2021-07-12 19:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, James E.J. Bottomley, Russell King, Guo Ren,
	Thomas Bogendoerfer, Nick Hu, Greentime Hu, Vincent Chen,
	Helge Deller, Yoshinori Sato, Rich Felker, Geoff Levand,
	Paul Cercueil, Ulf Hansson, Alex Shi, Linux Kernel Mailing List,
	Linux ARM, linux-csky, open list:BROADCOM NVRAM DRIVER,
	linux-parisc, Linux-sh list, linux-mmc, linux-scsi, Linux-MM,
	open list:DOCUMENTATION

On Sun, Jul 11, 2021 at 11:09 PM Christoph Hellwig <hch@lst.de> wrote:
>
> I think we should just remove it and eat the very minor overhead in
> exec rather than confusing poor driver writers.

Ack.

I think architectures that have virtual caches might want to think
about this patch a bit more, but on the whole I can't argue against
the "it's badly documented and misused".

No sane architecture will care, since dcache will be coherent (there
are more issues on the I$ side, but that's a different issue)

              Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6] mm: remove flush_kernel_dcache_page
  2021-07-12  6:09 ` [PATCH 6/6] mm: remove flush_kernel_dcache_page Christoph Hellwig
@ 2021-07-12 23:56   ` Ira Weiny
  0 siblings, 0 replies; 20+ messages in thread
From: Ira Weiny @ 2021-07-12 23:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

On Mon, Jul 12, 2021 at 08:09:28AM +0200, Christoph Hellwig wrote:
> flush_kernel_dcache_page is a rather confusing interface that implements
> a subset of flush_dcache_page by not being able to properly handle page
> cache mapped pages.
> 
> The only callers left are in the exec code as all other previous callers
> were incorrect as they could have dealt with page cache pages.  Replace
> the calls to flush_kernel_dcache_page with calls to
> flush_kernel_dcache_page, which for all architectures does either
  ^^^^^^^^^^^^^^^^^^^^^^^^
  flush_dcache_page


Other than that, for the series:

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> exactly the same thing, can contains one or more of the following:
> 
>  1) an optimization to defer the cache flush for page cache pages not
>     mapped into userspace
>  2) additional flushing for mapped page cache pages if cache aliases
>     are possible
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  Documentation/core-api/cachetlb.rst           | 86 ++++++++-----------
>  .../translations/zh_CN/core-api/cachetlb.rst  |  9 --
>  arch/arm/include/asm/cacheflush.h             |  4 +-
>  arch/arm/mm/flush.c                           | 33 -------
>  arch/arm/mm/nommu.c                           |  6 --
>  arch/csky/abiv1/cacheflush.c                  | 11 ---
>  arch/csky/abiv1/inc/abi/cacheflush.h          |  4 +-
>  arch/mips/include/asm/cacheflush.h            |  8 +-
>  arch/nds32/include/asm/cacheflush.h           |  3 +-
>  arch/nds32/mm/cacheflush.c                    |  9 --
>  arch/parisc/include/asm/cacheflush.h          |  8 +-
>  arch/parisc/kernel/cache.c                    |  3 +-
>  arch/sh/include/asm/cacheflush.h              |  8 +-
>  block/blk-map.c                               |  2 +-
>  fs/exec.c                                     |  6 +-
>  include/linux/highmem.h                       |  5 +-
>  tools/testing/scatterlist/linux/mm.h          |  1 -
>  17 files changed, 51 insertions(+), 155 deletions(-)
> 
> diff --git a/Documentation/core-api/cachetlb.rst b/Documentation/core-api/cachetlb.rst
> index fe4290e26729..8aed9103e48a 100644
> --- a/Documentation/core-api/cachetlb.rst
> +++ b/Documentation/core-api/cachetlb.rst
> @@ -271,10 +271,15 @@ maps this page at its virtual address.
>  
>    ``void flush_dcache_page(struct page *page)``
>  
> -	Any time the kernel writes to a page cache page, _OR_
> -	the kernel is about to read from a page cache page and
> -	user space shared/writable mappings of this page potentially
> -	exist, this routine is called.
> +        This routines must be called when:
> +
> +	  a) the kernel did write to a page that is in the page cache page
> +	     and / or in high memory
> +	  b) the kernel is about to read from a page cache page and user space
> +	     shared/writable mappings of this page potentially exist.  Note
> +	     that {get,pin}_user_pages{_fast} already call flush_dcache_page
> +	     on any page found in the user address space and thus driver
> +	     code rarely needs to take this into account.
>  
>  	.. note::
>  
> @@ -284,38 +289,34 @@ maps this page at its virtual address.
>  	      handling vfs symlinks in the page cache need not call
>  	      this interface at all.
>  
> -	The phrase "kernel writes to a page cache page" means,
> -	specifically, that the kernel executes store instructions
> -	that dirty data in that page at the page->virtual mapping
> -	of that page.  It is important to flush here to handle
> -	D-cache aliasing, to make sure these kernel stores are
> -	visible to user space mappings of that page.
> -
> -	The corollary case is just as important, if there are users
> -	which have shared+writable mappings of this file, we must make
> -	sure that kernel reads of these pages will see the most recent
> -	stores done by the user.
> -
> -	If D-cache aliasing is not an issue, this routine may
> -	simply be defined as a nop on that architecture.
> -
> -        There is a bit set aside in page->flags (PG_arch_1) as
> -	"architecture private".  The kernel guarantees that,
> -	for pagecache pages, it will clear this bit when such
> -	a page first enters the pagecache.
> -
> -	This allows these interfaces to be implemented much more
> -	efficiently.  It allows one to "defer" (perhaps indefinitely)
> -	the actual flush if there are currently no user processes
> -	mapping this page.  See sparc64's flush_dcache_page and
> -	update_mmu_cache implementations for an example of how to go
> -	about doing this.
> -
> -	The idea is, first at flush_dcache_page() time, if
> -	page->mapping->i_mmap is an empty tree, just mark the architecture
> -	private page flag bit.  Later, in update_mmu_cache(), a check is
> -	made of this flag bit, and if set the flush is done and the flag
> -	bit is cleared.
> +	The phrase "kernel writes to a page cache page" means, specifically,
> +	that the kernel executes store instructions that dirty data in that
> +	page at the page->virtual mapping of that page.  It is important to
> +	flush here to handle D-cache aliasing, to make sure these kernel stores
> +	are visible to user space mappings of that page.
> +
> +	The corollary case is just as important, if there are users which have
> +	shared+writable mappings of this file, we must make sure that kernel
> +	reads of these pages will see the most recent stores done by the user.
> +
> +	If D-cache aliasing is not an issue, this routine may simply be defined
> +	as a nop on that architecture.
> +
> +        There is a bit set aside in page->flags (PG_arch_1) as "architecture
> +	private".  The kernel guarantees that, for pagecache pages, it will
> +	clear this bit when such a page first enters the pagecache.
> +
> +	This allows these interfaces to be implemented much more efficiently.
> +	It allows one to "defer" (perhaps indefinitely) the actual flush if
> +	there are currently no user processes mapping this page.  See sparc64's
> +	flush_dcache_page and update_mmu_cache implementations for an example
> +	of how to go about doing this.
> +
> +	The idea is, first at flush_dcache_page() time, if page_file_mapping()
> +	returns a mapping, and mapping_mapped on that mapping returns %false,
> +	just mark the architecture private page flag bit.  Later, in
> +	update_mmu_cache(), a check is made of this flag bit, and if set the
> +	flush is done and the flag bit is cleared.
>  
>  	.. important::
>  
> @@ -351,19 +352,6 @@ maps this page at its virtual address.
>  	architectures).  For incoherent architectures, it should flush
>  	the cache of the page at vmaddr.
>  
> -  ``void flush_kernel_dcache_page(struct page *page)``
> -
> -	When the kernel needs to modify a user page is has obtained
> -	with kmap, it calls this function after all modifications are
> -	complete (but before kunmapping it) to bring the underlying
> -	page up to date.  It is assumed here that the user has no
> -	incoherent cached copies (i.e. the original page was obtained
> -	from a mechanism like get_user_pages()).  The default
> -	implementation is a nop and should remain so on all coherent
> -	architectures.  On incoherent architectures, this should flush
> -	the kernel cache for page (using page_address(page)).
> -
> -
>    ``void flush_icache_range(unsigned long start, unsigned long end)``
>  
>    	When the kernel stores into addresses that it will execute
> diff --git a/Documentation/translations/zh_CN/core-api/cachetlb.rst b/Documentation/translations/zh_CN/core-api/cachetlb.rst
> index 8376485a534d..55827b8a7c53 100644
> --- a/Documentation/translations/zh_CN/core-api/cachetlb.rst
> +++ b/Documentation/translations/zh_CN/core-api/cachetlb.rst
> @@ -298,15 +298,6 @@ HyperSparc cpu就是这样一个具有这种属性的cpu。
>  	用。默认的实现是nop(对于所有相干的架构应该保持这样)。对于不一致性
>  	的架构,它应该刷新vmaddr处的页面缓存。
>  
> -  ``void flush_kernel_dcache_page(struct page *page)``
> -
> -	当内核需要修改一个用kmap获得的用户页时,它会在所有修改完成后(但在
> -	kunmapping之前)调用这个函数,以使底层页面达到最新状态。这里假定用
> -	户没有不一致性的缓存副本(即原始页面是从类似get_user_pages()的机制
> -	中获得的)。默认的实现是一个nop,在所有相干的架构上都应该如此。在不
> -	一致性的架构上,这应该刷新内核缓存中的页面(使用page_address(page))。
> -
> -
>    ``void flush_icache_range(unsigned long start, unsigned long end)``
>  
>  	当内核存储到它将执行的地址中时(例如在加载模块时),这个函数被调用。
> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index 2e24e765e6d3..5e56288e343b 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -291,6 +291,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
>  #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
>  extern void flush_dcache_page(struct page *);
>  
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  static inline void flush_kernel_vmap_range(void *addr, int size)
>  {
>  	if ((cache_is_vivt() || cache_is_vipt_aliasing()))
> @@ -312,9 +313,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
>  		__flush_anon_page(vma, page, vmaddr);
>  }
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -extern void flush_kernel_dcache_page(struct page *);
> -
>  #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
>  #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
>  
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index 6d89db7895d1..7ff9feea13a6 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -345,39 +345,6 @@ void flush_dcache_page(struct page *page)
>  }
>  EXPORT_SYMBOL(flush_dcache_page);
>  
> -/*
> - * Ensure cache coherency for the kernel mapping of this page. We can
> - * assume that the page is pinned via kmap.
> - *
> - * If the page only exists in the page cache and there are no user
> - * space mappings, this is a no-op since the page was already marked
> - * dirty at creation.  Otherwise, we need to flush the dirty kernel
> - * cache lines directly.
> - */
> -void flush_kernel_dcache_page(struct page *page)
> -{
> -	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
> -		struct address_space *mapping;
> -
> -		mapping = page_mapping_file(page);
> -
> -		if (!mapping || mapping_mapped(mapping)) {
> -			void *addr;
> -
> -			addr = page_address(page);
> -			/*
> -			 * kmap_atomic() doesn't set the page virtual
> -			 * address for highmem pages, and
> -			 * kunmap_atomic() takes care of cache
> -			 * flushing already.
> -			 */
> -			if (!IS_ENABLED(CONFIG_HIGHMEM) || addr)
> -				__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> -		}
> -	}
> -}
> -EXPORT_SYMBOL(flush_kernel_dcache_page);
> -
>  /*
>   * Flush an anonymous page so that users of get_user_pages()
>   * can safely access the data.  The expected sequence is:
> diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
> index 8b3d7191e2b8..2658f52903da 100644
> --- a/arch/arm/mm/nommu.c
> +++ b/arch/arm/mm/nommu.c
> @@ -166,12 +166,6 @@ void flush_dcache_page(struct page *page)
>  }
>  EXPORT_SYMBOL(flush_dcache_page);
>  
> -void flush_kernel_dcache_page(struct page *page)
> -{
> -	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> -}
> -EXPORT_SYMBOL(flush_kernel_dcache_page);
> -
>  void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
>  		       unsigned long uaddr, void *dst, const void *src,
>  		       unsigned long len)
> diff --git a/arch/csky/abiv1/cacheflush.c b/arch/csky/abiv1/cacheflush.c
> index 07ff17ea33de..fb91b069dc69 100644
> --- a/arch/csky/abiv1/cacheflush.c
> +++ b/arch/csky/abiv1/cacheflush.c
> @@ -56,17 +56,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
>  	}
>  }
>  
> -void flush_kernel_dcache_page(struct page *page)
> -{
> -	struct address_space *mapping;
> -
> -	mapping = page_mapping_file(page);
> -
> -	if (!mapping || mapping_mapped(mapping))
> -		dcache_wbinv_all();
> -}
> -EXPORT_SYMBOL(flush_kernel_dcache_page);
> -
>  void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
>  	unsigned long end)
>  {
> diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
> index 6cab7afae962..ed62e2066ba7 100644
> --- a/arch/csky/abiv1/inc/abi/cacheflush.h
> +++ b/arch/csky/abiv1/inc/abi/cacheflush.h
> @@ -14,12 +14,10 @@ extern void flush_dcache_page(struct page *);
>  #define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
>  #define flush_cache_dup_mm(mm)			cache_wbinv_all()
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -extern void flush_kernel_dcache_page(struct page *);
> -
>  #define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
>  #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
>  
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  static inline void flush_kernel_vmap_range(void *addr, int size)
>  {
>  	dcache_wbinv_all();
> diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
> index d687b40b9fbb..b3dc9c589442 100644
> --- a/arch/mips/include/asm/cacheflush.h
> +++ b/arch/mips/include/asm/cacheflush.h
> @@ -125,13 +125,7 @@ static inline void kunmap_noncoherent(void)
>  	kunmap_coherent();
>  }
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -static inline void flush_kernel_dcache_page(struct page *page)
> -{
> -	BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
> -	flush_dcache_page(page);
> -}
> -
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  /*
>   * For now flush_kernel_vmap_range and invalidate_kernel_vmap_range both do a
>   * cache writeback and invalidate operation.
> diff --git a/arch/nds32/include/asm/cacheflush.h b/arch/nds32/include/asm/cacheflush.h
> index 7d6824f7c0e8..c2a222ebfa2a 100644
> --- a/arch/nds32/include/asm/cacheflush.h
> +++ b/arch/nds32/include/asm/cacheflush.h
> @@ -36,8 +36,7 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
>  void flush_anon_page(struct vm_area_struct *vma,
>  		     struct page *page, unsigned long vaddr);
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -void flush_kernel_dcache_page(struct page *page);
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  void flush_kernel_vmap_range(void *addr, int size);
>  void invalidate_kernel_vmap_range(void *addr, int size);
>  #define flush_dcache_mmap_lock(mapping)   xa_lock_irq(&(mapping)->i_pages)
> diff --git a/arch/nds32/mm/cacheflush.c b/arch/nds32/mm/cacheflush.c
> index ad5344ef5d33..07aac65d1cab 100644
> --- a/arch/nds32/mm/cacheflush.c
> +++ b/arch/nds32/mm/cacheflush.c
> @@ -318,15 +318,6 @@ void flush_anon_page(struct vm_area_struct *vma,
>  	local_irq_restore(flags);
>  }
>  
> -void flush_kernel_dcache_page(struct page *page)
> -{
> -	unsigned long flags;
> -	local_irq_save(flags);
> -	cpu_dcache_wbinval_page((unsigned long)page_address(page));
> -	local_irq_restore(flags);
> -}
> -EXPORT_SYMBOL(flush_kernel_dcache_page);
> -
>  void flush_kernel_vmap_range(void *addr, int size)
>  {
>  	unsigned long flags;
> diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
> index 99663fc1f997..eef0096db5f8 100644
> --- a/arch/parisc/include/asm/cacheflush.h
> +++ b/arch/parisc/include/asm/cacheflush.h
> @@ -36,16 +36,12 @@ void flush_cache_all_local(void);
>  void flush_cache_all(void);
>  void flush_cache_mm(struct mm_struct *mm);
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
>  void flush_kernel_dcache_page_addr(void *addr);
> -static inline void flush_kernel_dcache_page(struct page *page)
> -{
> -	flush_kernel_dcache_page_addr(page_address(page));
> -}
>  
>  #define flush_kernel_dcache_range(start,size) \
>  	flush_kernel_dcache_range_asm((start), (start)+(size));
>  
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  void flush_kernel_vmap_range(void *vaddr, int size);
>  void invalidate_kernel_vmap_range(void *vaddr, int size);
>  
> @@ -59,7 +55,7 @@ extern void flush_dcache_page(struct page *page);
>  #define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
>  
>  #define flush_icache_page(vma,page)	do { 		\
> -	flush_kernel_dcache_page(page);			\
> +	flush_kernel_dcache_page_addr(page_address(page)); \
>  	flush_kernel_icache_page(page_address(page)); 	\
>  } while (0)
>  
> diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
> index 86a1a63563fd..39e02227e231 100644
> --- a/arch/parisc/kernel/cache.c
> +++ b/arch/parisc/kernel/cache.c
> @@ -334,7 +334,7 @@ void flush_dcache_page(struct page *page)
>  		return;
>  	}
>  
> -	flush_kernel_dcache_page(page);
> +	flush_kernel_dcache_page_addr(page_address(page));
>  
>  	if (!mapping)
>  		return;
> @@ -375,7 +375,6 @@ EXPORT_SYMBOL(flush_dcache_page);
>  
>  /* Defined in arch/parisc/kernel/pacache.S */
>  EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
> -EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
>  EXPORT_SYMBOL(flush_data_cache_local);
>  EXPORT_SYMBOL(flush_kernel_icache_range_asm);
>  
> diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
> index 4486a865ff62..372afa82fee6 100644
> --- a/arch/sh/include/asm/cacheflush.h
> +++ b/arch/sh/include/asm/cacheflush.h
> @@ -63,6 +63,8 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
>  	if (boot_cpu_data.dcache.n_aliases && PageAnon(page))
>  		__flush_anon_page(page, vmaddr);
>  }
> +
> +#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
>  static inline void flush_kernel_vmap_range(void *addr, int size)
>  {
>  	__flush_wback_region(addr, size);
> @@ -72,12 +74,6 @@ static inline void invalidate_kernel_vmap_range(void *addr, int size)
>  	__flush_invalidate_region(addr, size);
>  }
>  
> -#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -static inline void flush_kernel_dcache_page(struct page *page)
> -{
> -	flush_dcache_page(page);
> -}
> -
>  extern void copy_to_user_page(struct vm_area_struct *vma,
>  	struct page *page, unsigned long vaddr, void *dst, const void *src,
>  	unsigned long len);
> diff --git a/block/blk-map.c b/block/blk-map.c
> index 3743158ddaeb..4639bc6b5c62 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -309,7 +309,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>  
>  static void bio_invalidate_vmalloc_pages(struct bio *bio)
>  {
> -#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> +#ifdef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
>  	if (bio->bi_private && !op_is_write(bio_op(bio))) {
>  		unsigned long i, len = 0;
>  
> diff --git a/fs/exec.c b/fs/exec.c
> index 38f63451b928..41a888d4edde 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -574,7 +574,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
>  				}
>  
>  				if (kmapped_page) {
> -					flush_kernel_dcache_page(kmapped_page);
> +					flush_dcache_page(kmapped_page);
>  					kunmap(kmapped_page);
>  					put_arg_page(kmapped_page);
>  				}
> @@ -592,7 +592,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
>  	ret = 0;
>  out:
>  	if (kmapped_page) {
> -		flush_kernel_dcache_page(kmapped_page);
> +		flush_dcache_page(kmapped_page);
>  		kunmap(kmapped_page);
>  		put_arg_page(kmapped_page);
>  	}
> @@ -634,7 +634,7 @@ int copy_string_kernel(const char *arg, struct linux_binprm *bprm)
>  		kaddr = kmap_atomic(page);
>  		flush_arg_page(bprm, pos & PAGE_MASK, page);
>  		memcpy(kaddr + offset_in_page(pos), arg, bytes_to_copy);
> -		flush_kernel_dcache_page(page);
> +		flush_dcache_page(page);
>  		kunmap_atomic(kaddr);
>  		put_arg_page(page);
>  	}
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 8c6e8e996c87..e95551bf99e9 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -130,10 +130,7 @@ static inline void flush_anon_page(struct vm_area_struct *vma, struct page *page
>  }
>  #endif
>  
> -#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
> -static inline void flush_kernel_dcache_page(struct page *page)
> -{
> -}
> +#ifndef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
>  static inline void flush_kernel_vmap_range(void *vaddr, int size)
>  {
>  }
> diff --git a/tools/testing/scatterlist/linux/mm.h b/tools/testing/scatterlist/linux/mm.h
> index f9a12005fcea..16ec895bbe5f 100644
> --- a/tools/testing/scatterlist/linux/mm.h
> +++ b/tools/testing/scatterlist/linux/mm.h
> @@ -127,7 +127,6 @@ kmalloc_array(unsigned int n, unsigned int size, unsigned int flags)
>  #define kmemleak_free(a)
>  
>  #define PageSlab(p) (0)
> -#define flush_kernel_dcache_page(p)
>  
>  #define MAX_ERRNO	4095
>  
> -- 
> 2.30.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-12 19:24 ` flush_kernel_dcache_page fixes and removal Linus Torvalds
@ 2021-07-13  5:46   ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-13  5:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	Linux Kernel Mailing List, Linux ARM, linux-csky,
	open list:BROADCOM NVRAM DRIVER, linux-parisc, Linux-sh list,
	linux-mmc, linux-scsi, Linux-MM, open list:DOCUMENTATION

On Mon, Jul 12, 2021 at 12:24:11PM -0700, Linus Torvalds wrote:
> I think architectures that have virtual caches might want to think
> about this patch a bit more, but on the whole I can't argue against
> the "it's badly documented and misused".
> 
> No sane architecture will care, since dcache will be coherent (there
> are more issues on the I$ side, but that's a different issue)

Yeah.  Once the arch maintainers look it it it might be worth to check
if there is optimization potential for pages that are not in highmem
and not in the page cache, as most architectures should be able to
just do nothing in that case.

Either way, I think getting patches 1-4 into 5.14 as bug fixes would
be useful, 6 is a trivial cleanup and 5 is something we can chew on
for a bit.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (6 preceding siblings ...)
  2021-07-12 19:24 ` flush_kernel_dcache_page fixes and removal Linus Torvalds
@ 2021-07-13  8:46 ` Russell King (Oracle)
  2021-07-13  9:11   ` Christoph Hellwig
  2021-07-19  5:38   ` Herbert Xu
  2021-07-14  3:13 ` Guo Ren
  8 siblings, 2 replies; 20+ messages in thread
From: Russell King (Oracle) @ 2021-07-13  8:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley, Guo Ren,
	Thomas Bogendoerfer, Nick Hu, Greentime Hu, Vincent Chen,
	Helge Deller, Yoshinori Sato, Rich Felker, Geoff Levand,
	Paul Cercueil, Ulf Hansson, Alex Shi, linux-kernel,
	linux-arm-kernel, linux-csky, linux-mips, linux-parisc, linux-sh,
	linux-mmc, linux-scsi, linux-mm, linux-doc

On Mon, Jul 12, 2021 at 08:09:22AM +0200, Christoph Hellwig wrote:
> while looking to convert the block layer away from kmap_atomic towards
> kmap_local_page and prefeably the helpers that abstract it away I noticed
> that a few block drivers directly or implicitly call
> flush_kernel_dcache_page before kunmapping a page that has been written
> to.  flush_kernel_dcache_page is documented to to be used in such cases,
> but flush_dcache_page is actually required when the page could be in
> the page cache and mapped to userspace, which is pretty much always the
> case when kmapping an arbitrary page.  Unfortunately the documentation
> doesn't exactly make that clear, which lead to this misused.  And it turns
> out that only the copy_strings / copy_string_kernel in the exec code
> were actually correct users of flush_kernel_dcache_page, which is why
> I think we should just remove it and eat the very minor overhead in
> exec rather than confusing poor driver writers.

I think you need to be careful - I seem to have a recollection that the
reason we ended up with flush_kernel_dcache_page() was the need to avoid
the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.

If you're sure that all these changes you're making do not end up
calling flush_dcache_page() from a path where we are atomic, then fine.

The second issue I have is that, when we are reading a page into a page
cache page, how can that page be mapped to userspace? Isn't that a
violation of semantics? If the page is mapped to userspace but does not
contain data from the underlying storage device, then the page contains
stale data (if it's a new page, lets hope that's zeroed and not some
previous contents - which would be a massive security hole.) As I
understand it, the flush_kernel_dcache_page() calls in the block layer
are primarily there to cope with drivers that do PIO read to write to a
page cache page to ensure that later userspace mappings can see the data
in the page cache page - by ensuring that the page cache pages are in
the same state as far as caches go as if they had been DMA'd to.

We know that the current implementation works fine - you're now
proposing to radically change it, asserting that it's buggy. I'm
nervous about this change.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-13  8:46 ` Russell King (Oracle)
@ 2021-07-13  9:11   ` Christoph Hellwig
  2021-07-19  5:38   ` Herbert Xu
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2021-07-13  9:11 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Christoph Hellwig, Linus Torvalds, Andrew Morton,
	James E.J. Bottomley, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc

On Tue, Jul 13, 2021 at 09:46:48AM +0100, Russell King (Oracle) wrote:
> I think you need to be careful - I seem to have a recollection that the
> reason we ended up with flush_kernel_dcache_page() was the need to avoid
> the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
> flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.

Where can arm32 flush_dcache_page block?  __flush_dcache_aliases does
walk mapping->i_mmap, but using a spinlock hidden under
flush_dcache_mmap_lock.  If flush_dcache_page did block plenty of code
already calling it e.g. from interrupt and block submission contexts
in various mmc/sdcard drivers would be rather unhappy.  Even today
calls to flush_kernel_dcache_page page in the block I/O path are
vastly outnumber by calls to flush_dcache_page.

> The second issue I have is that, when we are reading a page into a page
> cache page, how can that page be mapped to userspace? Isn't that a
> violation of semantics? If the page is mapped to userspace but does not
> contain data from the underlying storage device, then the page contains
> stale data (if it's a new page, lets hope that's zeroed and not some
> previous contents - which would be a massive security hole.)

I did not come up with the rules, but these are the existing documented
ones for flush_dcache_page:

        Any time the kernel writes to a page cache page, _OR_
	the kernel is about to read from a page cache page and
	user space shared/writable mappings of this page potentially
	exist, this routine is called.

So writing to (aka reading into) a page cache page is not conditional
on it being mapped yet.  Only the kernel reading from the page cache
page has this condition.

> As I
> understand it, the flush_kernel_dcache_page() calls in the block layer
> are primarily there to cope with drivers that do PIO read to write to a
> page cache page to ensure that later userspace mappings can see the data
> in the page cache page - by ensuring that the page cache pages are in
> the same state as far as caches go as if they had been DMA'd to.

PIO is one big case, but also kernel generated data and all kinds of
bounce buffering schemes.

> We know that the current implementation works fine - you're now
> proposing to radically change it, asserting that it's buggy. I'm
> nervous about this change.

Do we know it works?  There are very few calls to
flush_dcache_kernel_page and very few implementations that differ from
flush_dcache_page.  For arm32 the relevant drivers would mostly be
mmc drivers using the sg_miter interface, are they even used much on
the platforms where the difference exists?

> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
---end quoted text---

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
                   ` (7 preceding siblings ...)
  2021-07-13  8:46 ` Russell King (Oracle)
@ 2021-07-14  3:13 ` Guo Ren
  8 siblings, 0 replies; 20+ messages in thread
From: Guo Ren @ 2021-07-14  3:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Thomas Bogendoerfer, Nick Hu, Greentime Hu,
	Vincent Chen, Helge Deller, Yoshinori Sato, Rich Felker,
	Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	Linux Kernel Mailing List, Linux ARM, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, Linux-MM,
	linux-doc

Acked-by for csky abiv1 part.(No change to our execution path after
the patch set.)

On Mon, Jul 12, 2021 at 2:10 PM Christoph Hellwig <hch@lst.de> wrote:
>
> Hi all,
>
> while looking to convert the block layer away from kmap_atomic towards
> kmap_local_page and prefeably the helpers that abstract it away I noticed
> that a few block drivers directly or implicitly call
> flush_kernel_dcache_page before kunmapping a page that has been written
> to.  flush_kernel_dcache_page is documented to to be used in such cases,
> but flush_dcache_page is actually required when the page could be in
> the page cache and mapped to userspace, which is pretty much always the
> case when kmapping an arbitrary page.  Unfortunately the documentation
> doesn't exactly make that clear, which lead to this misused.  And it turns
> out that only the copy_strings / copy_string_kernel in the exec code
> were actually correct users of flush_kernel_dcache_page, which is why
> I think we should just remove it and eat the very minor overhead in
> exec rather than confusing poor driver writers.
>
> Diffstat:
>  Documentation/core-api/cachetlb.rst                    |   86 +++++++----------
>  Documentation/translations/zh_CN/core-api/cachetlb.rst |    9 -
>  arch/arm/include/asm/cacheflush.h                      |    4
>  arch/arm/mm/flush.c                                    |   33 ------
>  arch/arm/mm/nommu.c                                    |    6 -
>  arch/csky/abiv1/cacheflush.c                           |   11 --
>  arch/csky/abiv1/inc/abi/cacheflush.h                   |    4
>  arch/mips/include/asm/cacheflush.h                     |    8 -
>  arch/nds32/include/asm/cacheflush.h                    |    3
>  arch/nds32/mm/cacheflush.c                             |    9 -
>  arch/parisc/include/asm/cacheflush.h                   |    8 -
>  arch/parisc/kernel/cache.c                             |    3
>  arch/sh/include/asm/cacheflush.h                       |    8 -
>  block/blk-map.c                                        |    2
>  drivers/block/ps3disk.c                                |    2
>  drivers/mmc/host/jz4740_mmc.c                          |    4
>  drivers/mmc/host/mmc_spi.c                             |    2
>  drivers/scsi/aacraid/aachba.c                          |    1
>  fs/exec.c                                              |    6 -
>  include/linux/highmem.h                                |    5
>  lib/scatterlist.c                                      |    5
>  tools/testing/scatterlist/linux/mm.h                   |    1
>  22 files changed, 55 insertions(+), 165 deletions(-)



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/6] aacraid: remove an unused include
  2021-07-12  6:09 ` [PATCH 5/6] aacraid: remove an unused include Christoph Hellwig
@ 2021-07-19  1:50   ` Martin K. Petersen
  0 siblings, 0 replies; 20+ messages in thread
From: Martin K. Petersen @ 2021-07-19  1:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Ulf Hansson, Alex Shi,
	linux-kernel, linux-arm-kernel, linux-csky, linux-mips,
	linux-parisc, linux-sh, linux-mmc, linux-scsi, linux-mm,
	linux-doc


Christoph,

> flush_kernel_dcache_page is not used by aacraid, and this header
> already comes in through the scatterlist/block headers anyway.

Seems orthogonal to the rest of your series. Applied to
5.15/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] ps3disk: replace flush_kernel_dcache_page with flush_dcache_page
  2021-07-12  6:09 ` [PATCH 3/6] ps3disk: " Christoph Hellwig
@ 2021-07-19  3:07   ` Geoff Levand
  0 siblings, 0 replies; 20+ messages in thread
From: Geoff Levand @ 2021-07-19  3:07 UTC (permalink / raw)
  To: Christoph Hellwig, Linus Torvalds, Andrew Morton, James E.J. Bottomley
  Cc: Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Paul Cercueil, Ulf Hansson, Alex Shi, linux-kernel,
	linux-arm-kernel, linux-csky, linux-mips, linux-parisc, linux-sh,
	linux-mmc, linux-scsi, linux-mm, linux-doc

Hi Christoph,

On 7/11/21 11:09 PM, Christoph Hellwig wrote:
> +++ b/drivers/block/ps3disk.c
>  		offset += size;
> -		flush_kernel_dcache_page(bvec.bv_page);
> +		flush_dcache_page(bvec.bv_page);
>  		bvec_kunmap_irq(buf, &flags);

I tested your series applied to v5.14-rc1 on the PS3
and it seems to be working OK.

Tested-by: Geoff Levand <geoff@infradead.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-13  8:46 ` Russell King (Oracle)
  2021-07-13  9:11   ` Christoph Hellwig
@ 2021-07-19  5:38   ` Herbert Xu
  2021-07-19 15:42     ` Russell King (Oracle)
  1 sibling, 1 reply; 20+ messages in thread
From: Herbert Xu @ 2021-07-19  5:38 UTC (permalink / raw)
  To: Russell King Oracle
  Cc: hch, torvalds, akpm, James.Bottomley, guoren, tsbogend, nickhu,
	green.hu, deanbo422, deller, ysato, dalias, geoff, paul,
	ulf.hansson, alexs, linux-kernel, linux-arm-kernel, linux-csky,
	linux-mips, linux-parisc, linux-sh, linux-mmc, linux-scsi,
	linux-mm, linux-doc

Russell King Oracle <linux@armlinux.org.uk> wrote:
>
> I think you need to be careful - I seem to have a recollection that the
> reason we ended up with flush_kernel_dcache_page() was the need to avoid
> the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
> flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.
> 
> If you're sure that all these changes you're making do not end up
> calling flush_dcache_page() from a path where we are atomic, then fine.

The Crypto API has been calling flush_dcache_page from softirq
context since before the advent of git (see crypto/scatterwalk.c
from the initial import).  So if 32-bit ARM blocks on it then this
has been broken for almost 20 years.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data
  2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
@ 2021-07-19  8:15   ` Paul Cercueil
  2021-08-04 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 20+ messages in thread
From: Paul Cercueil @ 2021-07-19  8:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Ulf Hansson, Alex Shi, linux-kernel,
	linux-arm-kernel, linux-csky, linux-mips, linux-parisc, linux-sh,
	linux-mmc, linux-scsi, linux-mm, linux-doc

Hi Christoph,

Le lun., juil. 12 2021 at 08:09:23 +0200, Christoph Hellwig 
<hch@lst.de> a écrit :
> MIPS now implements flush_kernel_dcache_page (as an alias to
> flush_dcache_page).
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Tested-by: Paul Cercueil <paul@crapouillou.net>

Cheers,
-Paul

> ---
>  drivers/mmc/host/jz4740_mmc.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/mmc/host/jz4740_mmc.c 
> b/drivers/mmc/host/jz4740_mmc.c
> index 0db17bcc9c16..aa2240c83510 100644
> --- a/drivers/mmc/host/jz4740_mmc.c
> +++ b/drivers/mmc/host/jz4740_mmc.c
> @@ -578,10 +578,6 @@ static bool jz4740_mmc_read_data(struct 
> jz4740_mmc_host *host,
>  			}
>  		}
>  		data->bytes_xfered += miter->length;
> -
> -		/* This can go away once MIPS implements
> -		 * flush_kernel_dcache_page */
> -		flush_dcache_page(miter->page);
>  	}
>  	sg_miter_stop(miter);
> 
> --
> 2.30.2
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: flush_kernel_dcache_page fixes and removal
  2021-07-19  5:38   ` Herbert Xu
@ 2021-07-19 15:42     ` Russell King (Oracle)
  0 siblings, 0 replies; 20+ messages in thread
From: Russell King (Oracle) @ 2021-07-19 15:42 UTC (permalink / raw)
  To: Herbert Xu
  Cc: hch, torvalds, akpm, James.Bottomley, guoren, tsbogend, nickhu,
	green.hu, deanbo422, deller, ysato, dalias, geoff, paul,
	ulf.hansson, alexs, linux-kernel, linux-arm-kernel, linux-csky,
	linux-mips, linux-parisc, linux-sh, linux-mmc, linux-scsi,
	linux-mm, linux-doc

On Mon, Jul 19, 2021 at 01:38:51PM +0800, Herbert Xu wrote:
> Russell King Oracle <linux@armlinux.org.uk> wrote:
> >
> > I think you need to be careful - I seem to have a recollection that the
> > reason we ended up with flush_kernel_dcache_page() was the need to avoid
> > the taking of the mmap lock for 32-bit ARM VIVT based CPUs in
> > flush_dcache_page(). 32-bit ARM flush_dcache_page() can block.
> > 
> > If you're sure that all these changes you're making do not end up
> > calling flush_dcache_page() from a path where we are atomic, then fine.
> 
> The Crypto API has been calling flush_dcache_page from softirq
> context since before the advent of git (see crypto/scatterwalk.c
> from the initial import).  So if 32-bit ARM blocks on it then this
> has been broken for almost 20 years.

I think what's confusing me is the naming of flush_dcache_mmap_lock().
The mmap lock is a read-write semaphore (see linux/mmap-lock.h), and
is even called "mmap_lock" in mm_struct, but this has nothing to do
with flush_dcache_mmap_lock().

So no, flush_dcache_mmap_lock() doesn't block as I first thought, and
therefore flush_dcache_page() doesn't block either.

Sorry for the noise.

However, I now seem to remember some discussion in the past when I was
trying to get people to use flush_dcache_page() to solve the coherency
problems when block drivers were doing PIO to page cache pages. I seem
to remember there being objections to it, which is one of the reasons
we ended up with a lighter weight flush_kernel_dcache_page(). But
shrug, dim and distant memories.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data
  2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
  2021-07-19  8:15   ` Paul Cercueil
@ 2021-08-04 11:12   ` Ulf Hansson
  1 sibling, 0 replies; 20+ messages in thread
From: Ulf Hansson @ 2021-08-04 11:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Alex Shi,
	Linux Kernel Mailing List, Linux ARM, linux-csky, linux-mips,
	linux-parisc, Linux-sh list, linux-mmc, linux-scsi, linux-mm,
	Linux Documentation

On Mon, 12 Jul 2021 at 08:10, Christoph Hellwig <hch@lst.de> wrote:
>
> MIPS now implements flush_kernel_dcache_page (as an alias to
> flush_dcache_page).
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Apologies for the delay!

Acked-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe


> ---
>  drivers/mmc/host/jz4740_mmc.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/mmc/host/jz4740_mmc.c b/drivers/mmc/host/jz4740_mmc.c
> index 0db17bcc9c16..aa2240c83510 100644
> --- a/drivers/mmc/host/jz4740_mmc.c
> +++ b/drivers/mmc/host/jz4740_mmc.c
> @@ -578,10 +578,6 @@ static bool jz4740_mmc_read_data(struct jz4740_mmc_host *host,
>                         }
>                 }
>                 data->bytes_xfered += miter->length;
> -
> -               /* This can go away once MIPS implements
> -                * flush_kernel_dcache_page */
> -               flush_dcache_page(miter->page);
>         }
>         sg_miter_stop(miter);
>
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page
  2021-07-12  6:09 ` [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page Christoph Hellwig
@ 2021-08-04 11:12   ` Ulf Hansson
  0 siblings, 0 replies; 20+ messages in thread
From: Ulf Hansson @ 2021-08-04 11:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, James E.J. Bottomley,
	Russell King, Guo Ren, Thomas Bogendoerfer, Nick Hu,
	Greentime Hu, Vincent Chen, Helge Deller, Yoshinori Sato,
	Rich Felker, Geoff Levand, Paul Cercueil, Alex Shi,
	Linux Kernel Mailing List, Linux ARM, linux-csky, linux-mips,
	linux-parisc, Linux-sh list, linux-mmc, linux-scsi, linux-mm,
	Linux Documentation

On Mon, 12 Jul 2021 at 08:11, Christoph Hellwig <hch@lst.de> wrote:
>
> Pages passed to block drivers can be mapped page cache pages, so we
> must use flush_dcache_page here instead of the more limited
> flush_kernel_dcache_page that is intended for highmem pages only.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Apologies for the delay!

Acked-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
>  drivers/mmc/host/mmc_spi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
> index 65c65bb5737f..3d28a3d3001b 100644
> --- a/drivers/mmc/host/mmc_spi.c
> +++ b/drivers/mmc/host/mmc_spi.c
> @@ -948,7 +948,7 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,
>
>                 /* discard mappings */
>                 if (direction == DMA_FROM_DEVICE)
> -                       flush_kernel_dcache_page(sg_page(sg));
> +                       flush_dcache_page(sg_page(sg));
>                 kunmap(sg_page(sg));
>                 if (dma_dev)
>                         dma_unmap_page(dma_dev, dma_addr, PAGE_SIZE, dir);
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-08-04 11:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12  6:09 flush_kernel_dcache_page fixes and removal Christoph Hellwig
2021-07-12  6:09 ` [PATCH 1/6] mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data Christoph Hellwig
2021-07-19  8:15   ` Paul Cercueil
2021-08-04 11:12   ` Ulf Hansson
2021-07-12  6:09 ` [PATCH 2/6] mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page Christoph Hellwig
2021-08-04 11:12   ` Ulf Hansson
2021-07-12  6:09 ` [PATCH 3/6] ps3disk: " Christoph Hellwig
2021-07-19  3:07   ` Geoff Levand
2021-07-12  6:09 ` [PATCH 4/6] scatterlist: " Christoph Hellwig
2021-07-12  6:09 ` [PATCH 5/6] aacraid: remove an unused include Christoph Hellwig
2021-07-19  1:50   ` Martin K. Petersen
2021-07-12  6:09 ` [PATCH 6/6] mm: remove flush_kernel_dcache_page Christoph Hellwig
2021-07-12 23:56   ` Ira Weiny
2021-07-12 19:24 ` flush_kernel_dcache_page fixes and removal Linus Torvalds
2021-07-13  5:46   ` Christoph Hellwig
2021-07-13  8:46 ` Russell King (Oracle)
2021-07-13  9:11   ` Christoph Hellwig
2021-07-19  5:38   ` Herbert Xu
2021-07-19 15:42     ` Russell King (Oracle)
2021-07-14  3:13 ` Guo Ren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).