All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/2] fix and improvement of flush(_kernel)_dcache_page()
@ 2012-10-07 11:29 Simon Baatz
  2012-10-07 11:29 ` [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page() Simon Baatz
  2012-10-07 11:29 ` [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page Simon Baatz
  0 siblings, 2 replies; 35+ messages in thread
From: Simon Baatz @ 2012-10-07 11:29 UTC (permalink / raw)
  To: linux-arm-kernel

The first patch optimizes flush_dcache_page() for non-aliasing VIPT
caches.  On these the flush of the kernel mapping for the anon case is
actually not needed.  It may be needed to ensure I/D coherency, but
this can be handled by __sync_icache_dcache() later and more
effectively (similar to arm64, see [3]).  Thus, we should be lazy as
well here and just clear the PG_dcache_clean bit.

The second patch is another attempt to fix the regression caused by
patch f8b63c1 "ARM: 6382/1" in 2.6.37 (see [2]).  The problem occurs
on VIVT (and probably VIPT aliasing) cache architectures for drivers
that use flush_kernel_dcache_page().  For example, 3.6 (as any
kernel beginning with 2.6.37) on kirkwood using mv_cesa produces data
corruption when using direct I/O:

~# cryptsetup luksOpen /dev/sda2 c_sda2
Enter passphrase for /dev/sda2: 
~# dd if=/dev/mapper/c_sda2 iflag=direct bs=4k count=16 2>/dev/null | sha1sum 
9ae4997ec85ad9b7ab4b10341e42ace80f0ea5d6  -
~# dd if=/dev/mapper/c_sda2 bs=4k count=16 2>/dev/null | sha1sum
ca39c5d4950b3704eff952c48e383bf1db20532e  -

(The mv_cesa driver uses the scatterlist memory iterator, which uses
flush_kernel_dcache_page() before kunmap()).

As described in [1], both flush_dcache_page() and
flush_kernel_dcache_page() need to handle the following cases:
- page cache pages with no user space mapping
- page cache pages with user space mapping(s)
- flush the kernel mapping of anonymous pages

The last one is contradictory to documentation for flush_dcache_page(),
but many uses in the kernel require it (drivers use it to flush
modifications via the kernel mapping).

In the proposed patch the implementation of flush_kernel_dcache_page()
follows the one of flush_dcache_page(), but only flushes the kernel
mapping, not the alias user mapping(s).

I could only test this on kirkwood (ARMv5, VIVT) and raspberry pi
BCM2835 (ARMv6, PIPT / VIPT nonaliasing data cache).  More testing
would probably be a good thing...  Especially, I could not test the
non-aliasing cache case with a real driver using
flush_kernel_dcache_page() (mv_cesa on dove could be a test case for
this).


Changes:

in V2:
- Followed Catalin's suggestion to reverse the order of the patches
- Clarified comment for flush_dcache_page()


- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-August/113908.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-May/101795.html
[3] https://lkml.org/lkml/2012/9/12/121



Simon Baatz (2):
  ARM: remove unnecessary flush of anon pages in flush_dcache_page()
  ARM: Handle user space mapped pages in flush_kernel_dcache_page

 arch/arm/include/asm/cacheflush.h |    4 ++
 arch/arm/mm/flush.c               |   87 +++++++++++++++++++++++++++++--------
 2 files changed, 72 insertions(+), 19 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page()
  2012-10-07 11:29 [PATCH V2 0/2] fix and improvement of flush(_kernel)_dcache_page() Simon Baatz
@ 2012-10-07 11:29 ` Simon Baatz
  2012-10-08 11:36   ` Catalin Marinas
  2012-10-07 11:29 ` [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page Simon Baatz
  1 sibling, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2012-10-07 11:29 UTC (permalink / raw)
  To: linux-arm-kernel

On non-aliasing VIPT D-caches, there is no need to flush the kernel
mapping of anon pages in flush_dcache_page() directly.  If the page is
mapped as executable later, the necessary D/I-cache flush will be done
in __sync_icache_dcache().

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
---

Changes:

in V2:
- Followed Catalin's suggestion to reverse the order of the patches
- Clarified comment for flush_dcache_page()


 arch/arm/mm/flush.c |   45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 40ca11e..5c474a1 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -257,16 +257,20 @@ void __sync_icache_dcache(pte_t pteval)
  * of this page.
  *
  * We have three cases to consider:
- *  - VIPT non-aliasing cache: fully coherent so nothing required.
- *  - VIVT: fully aliasing, so we need to handle every alias in our
- *          current VM view.
- *  - VIPT aliasing: need to handle one alias in our current VM view.
+ *  - VIPT non-aliasing D-cache:
+ *          D-cache: fully coherent so nothing required.
+ *          I-cache: Ensure I/D coherency in case of an already mapped page;
+ *                   __sync_icache_dcache() will handle the other cases.
+ *  - VIPT aliasing D-cache:
+ *          D-cache: need to handle one alias in our current VM view.
+ *          I-cache: same as VIPT non-aliasing cache.
+ *  - VIVT D-cache: fully aliasing, so we need to handle every alias in our
+ *                  current VM view.
  *
- * If we need to handle aliasing:
- *  If the page only exists in the page cache and there are no user
- *  space mappings, we can be lazy and remember that we may have dirty
- *  kernel cache lines for later.  Otherwise, we assume we have
- *  aliasing mappings.
+ * If the page only exists in the page cache and there are no user
+ * space mappings, we can be lazy and remember that we may have dirty
+ * kernel cache lines for later.  Otherwise, we assume we have
+ * aliasing mappings.
  *
  * Note that we disable the lazy flush for SMP configurations where
  * the cache maintenance operations are not automatically broadcasted.
@@ -284,17 +288,20 @@ void flush_dcache_page(struct page *page)
 
 	mapping = page_mapping(page);
 
-	if (!cache_ops_need_broadcast() &&
-	    mapping && !mapping_mapped(mapping))
-		clear_bit(PG_dcache_clean, &page->flags);
-	else {
-		__flush_dcache_page(mapping, page);
-		if (mapping && cache_is_vivt())
-			__flush_dcache_aliases(mapping, page);
-		else if (mapping)
-			__flush_icache_all();
-		set_bit(PG_dcache_clean, &page->flags);
+	if (!cache_ops_need_broadcast()) {
+		if ((mapping && !mapping_mapped(mapping)) ||
+		    (!mapping && cache_is_vipt_nonaliasing())) {
+			clear_bit(PG_dcache_clean, &page->flags);
+			return;
+		}
 	}
+
+	__flush_dcache_page(mapping, page);
+	if (mapping && cache_is_vivt())
+		__flush_dcache_aliases(mapping, page);
+	else if (mapping)
+		__flush_icache_all();
+	set_bit(PG_dcache_clean, &page->flags);
 }
 EXPORT_SYMBOL(flush_dcache_page);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-07 11:29 [PATCH V2 0/2] fix and improvement of flush(_kernel)_dcache_page() Simon Baatz
  2012-10-07 11:29 ` [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page() Simon Baatz
@ 2012-10-07 11:29 ` Simon Baatz
  2012-10-08 17:44   ` Catalin Marinas
  1 sibling, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2012-10-07 11:29 UTC (permalink / raw)
  To: linux-arm-kernel

Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
the pages it needs to handle are kernel mapped only.  However, for
example when doing direct I/O, pages with user space mappings may
occur.

Thus, do lazy flushing like in flush_dcache_page() if there are no user
space mappings.  Otherwise, flush the kernel cache lines directly.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
---

Changes:

in V3:
- Followed Catalin's suggestion to reverse the order of the patches

in V2:
- flush_kernel_dcache_page() follows flush_dcache_page() now, except that it
  does not flush the user mappings

 arch/arm/include/asm/cacheflush.h |    4 ++++
 arch/arm/mm/flush.c               |   42 +++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e4448e1..eca955f 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -307,6 +307,10 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
 #define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
 static inline void flush_kernel_dcache_page(struct page *page)
 {
+	extern void __flush_kernel_dcache_page(struct page *);
+	/* highmem pages are always flushed upon kunmap already */
+	if (!PageHighMem(page))
+		__flush_kernel_dcache_page(page);
 }
 
 #define flush_dcache_mmap_lock(mapping) \
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 5c474a1..59ad4fc 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -192,6 +192,48 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
 				page->index << PAGE_CACHE_SHIFT);
 }
 
+/*
+ * Ensure cache coherency for the kernel mapping of this page.
+ *
+ * If the page only exists in the page cache and there are no user
+ * space mappings, we can be lazy and remember that we may have dirty
+ * kernel cache lines for later.  Otherwise, we need to flush the
+ * dirty kernel cache lines directly.
+ *
+ * Note that we disable the lazy flush for SMP configurations where
+ * the cache maintenance operations are not automatically broadcasted.
+ *
+ * We can assume that the page is no high mem page, see
+ * flush_kernel_dcache_page.
+ */
+void __flush_kernel_dcache_page(struct page *page)
+{
+	struct address_space *mapping;
+
+	/*
+	 * The zero page is never written to, so never has any dirty
+	 * cache lines, and therefore never needs to be flushed.
+	 */
+	if (page == ZERO_PAGE(0))
+		return;
+
+	mapping = page_mapping(page);
+
+	if (!cache_ops_need_broadcast()) {
+		if ((mapping && !mapping_mapped(mapping)) ||
+		    (!mapping && cache_is_vipt_nonaliasing())) {
+			clear_bit(PG_dcache_clean, &page->flags);
+			return;
+		}
+	}
+
+	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
+	if (mapping && !cache_is_vivt())
+		__flush_icache_all();
+	set_bit(PG_dcache_clean, &page->flags);
+}
+EXPORT_SYMBOL(__flush_kernel_dcache_page);
+
 static void __flush_dcache_aliases(struct address_space *mapping, struct page *page)
 {
 	struct mm_struct *mm = current->active_mm;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page()
  2012-10-07 11:29 ` [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page() Simon Baatz
@ 2012-10-08 11:36   ` Catalin Marinas
  2012-10-08 17:38     ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2012-10-08 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 07, 2012 at 12:29:11PM +0100, Simon Baatz wrote:
> On non-aliasing VIPT D-caches, there is no need to flush the kernel
> mapping of anon pages in flush_dcache_page() directly.  If the page is
> mapped as executable later, the necessary D/I-cache flush will be done
> in __sync_icache_dcache().
> 
> Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Russell King <linux@arm.linux.org.uk>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page()
  2012-10-08 11:36   ` Catalin Marinas
@ 2012-10-08 17:38     ` Simon Baatz
  2012-10-08 17:43       ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2012-10-08 17:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Mon, Oct 08, 2012 at 12:36:07PM +0100, Catalin Marinas wrote:
> On Sun, Oct 07, 2012 at 12:29:11PM +0100, Simon Baatz wrote:
> > On non-aliasing VIPT D-caches, there is no need to flush the kernel
> > mapping of anon pages in flush_dcache_page() directly.  If the page is
> > mapped as executable later, the necessary D/I-cache flush will be done
> > in __sync_icache_dcache().
> > 
> > Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Russell King <linux@arm.linux.org.uk>
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> 

Thanks for the ack! However, while this improvement is nice to have,
I care more about fixing that nasty data corruption on VIVT/VIPT
aliasing caches with the second patch.  In case you cannot offer a
Acked-by or Reviewed-by for the patch for technical reasons, could
you please explain your reservations against the fix?


- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page()
  2012-10-08 17:38     ` Simon Baatz
@ 2012-10-08 17:43       ` Catalin Marinas
  0 siblings, 0 replies; 35+ messages in thread
From: Catalin Marinas @ 2012-10-08 17:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 08, 2012 at 06:38:27PM +0100, Simon Baatz wrote:
> Hi Catalin,
> 
> On Mon, Oct 08, 2012 at 12:36:07PM +0100, Catalin Marinas wrote:
> > On Sun, Oct 07, 2012 at 12:29:11PM +0100, Simon Baatz wrote:
> > > On non-aliasing VIPT D-caches, there is no need to flush the kernel
> > > mapping of anon pages in flush_dcache_page() directly.  If the page is
> > > mapped as executable later, the necessary D/I-cache flush will be done
> > > in __sync_icache_dcache().
> > > 
> > > Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Russell King <linux@arm.linux.org.uk>
> > 
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > 
> 
> Thanks for the ack! However, while this improvement is nice to have,
> I care more about fixing that nasty data corruption on VIVT/VIPT
> aliasing caches with the second patch.  In case you cannot offer a
> Acked-by or Reviewed-by for the patch for technical reasons, could
> you please explain your reservations against the fix?

I was busy with other things today and didn't have time to get back to
this thread. I'll post a question though.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-07 11:29 ` [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page Simon Baatz
@ 2012-10-08 17:44   ` Catalin Marinas
  2012-10-08 20:02     ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2012-10-08 17:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> the pages it needs to handle are kernel mapped only.  However, for
> example when doing direct I/O, pages with user space mappings may
> occur.
> 
> Thus, do lazy flushing like in flush_dcache_page() if there are no user
> space mappings.  Otherwise, flush the kernel cache lines directly.

Do you need to fix the VIPT non-aliasing case as well? Does
flush_kernel_dcache_page() need to handle I-cache?

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-08 17:44   ` Catalin Marinas
@ 2012-10-08 20:02     ` Simon Baatz
  2012-10-08 20:20       ` Russell King - ARM Linux
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2012-10-08 20:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Mon, Oct 08, 2012 at 06:44:28PM +0100, Catalin Marinas wrote:
> On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> > Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> > the pages it needs to handle are kernel mapped only.  However, for
> > example when doing direct I/O, pages with user space mappings may
> > occur.
> > 
> > Thus, do lazy flushing like in flush_dcache_page() if there are no user
> > space mappings.  Otherwise, flush the kernel cache lines directly.
> 
> Do you need to fix the VIPT non-aliasing case as well? Does
> flush_kernel_dcache_page() need to handle I-cache?

Good question. My previous version of the patch did not handle it,
but after our discussion on the arm64 case, I came to the conclusion
that we probably need to handle it.

flush_dcache_page() and flush_kernel_dcache_page() are both used when
a page was modified via its kernel mapping.  For example,
crypto/scatterwalk.c uses flush_dcache_page() whereas
lib/scatterlist.c uses flush_kernel_dcache_page().  

Thus, the reasoning is that if flush_dcache_page() needs to handle
I-cache in the case there is no hook later (already user-mapped page
cache page) then flush_kernel_dcache_page() needs to do the same. 

With respect to clearing the flag in the VIPT non-aliasing case with
anon pages: Declaring the pages dirty by default may already be
sufficient in this case, at least that is what the current
implementation assumes.  On the other hand, if we clear the
PG_dcache_clean flag in a function that is not even supposed to
handle anon pages, then why shouldn't we do it in a function that is? 
I wanted to play safe here and apply the same logic in both
functions.


- Simon

 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-08 20:02     ` Simon Baatz
@ 2012-10-08 20:20       ` Russell King - ARM Linux
  2012-10-08 23:07         ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Russell King - ARM Linux @ 2012-10-08 20:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 08, 2012 at 10:02:16PM +0200, Simon Baatz wrote:
> Hi Catalin,
> 
> On Mon, Oct 08, 2012 at 06:44:28PM +0100, Catalin Marinas wrote:
> > On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> > > Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> > > the pages it needs to handle are kernel mapped only.  However, for
> > > example when doing direct I/O, pages with user space mappings may
> > > occur.
> > > 
> > > Thus, do lazy flushing like in flush_dcache_page() if there are no user
> > > space mappings.  Otherwise, flush the kernel cache lines directly.
> > 
> > Do you need to fix the VIPT non-aliasing case as well? Does
> > flush_kernel_dcache_page() need to handle I-cache?
> 
> Good question. My previous version of the patch did not handle it,
> but after our discussion on the arm64 case, I came to the conclusion
> that we probably need to handle it.
> 
> flush_dcache_page() and flush_kernel_dcache_page() are both used when
> a page was modified via its kernel mapping.  For example,
> crypto/scatterwalk.c uses flush_dcache_page() whereas
> lib/scatterlist.c uses flush_kernel_dcache_page().  

It's likely that this stuff is incredibly buggy - and where we suspect
there's problems (like using the wrong flush_dcache_page() vs
flush_kernel_dcache_page() call) that needs to be fixed.

I suspect crypto/scatterwalk.c was created long before
flush_kernel_dcache_page() came into existence, and it probably needs
to be fixed.

> Thus, the reasoning is that if flush_dcache_page() needs to handle
> I-cache in the case there is no hook later (already user-mapped page
> cache page) then flush_kernel_dcache_page() needs to do the same. 

This sounds like we're heading in the direction that flush_kernel_dcache_page()
and flush_dcache_page() end up being virtually identical.  This isn't
supposed to be - they _are_ supposed to be different.  Again, maybe
that's because the users have chosen the wrong function.

Remember that these functions only get used on so-called "minority"
architectures where the caches are a hinderance - to put that a
different way, most people don't care about these calls because x86
just works.

> With respect to clearing the flag in the VIPT non-aliasing case with
> anon pages: Declaring the pages dirty by default may already be
> sufficient in this case, at least that is what the current
> implementation assumes.

PG_arch_1 has no meaning what so ever for anon pages.  Don't even try
to make it have meaning there, it will cause lots of pain if you do.
The reason is that anon pages have no mapping, and so trying to do the
PG_arch_1 trick will result in most places deciding "there is no
userspace mapping we can skip that".

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-08 20:20       ` Russell King - ARM Linux
@ 2012-10-08 23:07         ` Simon Baatz
  2012-11-18 21:10           ` Jason Cooper
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2012-10-08 23:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Mon, Oct 08, 2012 at 09:20:40PM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 08, 2012 at 10:02:16PM +0200, Simon Baatz wrote:
> > Hi Catalin,
> > 
> > On Mon, Oct 08, 2012 at 06:44:28PM +0100, Catalin Marinas wrote:
> > > On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> > > > Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> > > > the pages it needs to handle are kernel mapped only.  However, for
> > > > example when doing direct I/O, pages with user space mappings may
> > > > occur.
> > > > 
> > > > Thus, do lazy flushing like in flush_dcache_page() if there are no user
> > > > space mappings.  Otherwise, flush the kernel cache lines directly.
> > > 
> > > Do you need to fix the VIPT non-aliasing case as well? Does
> > > flush_kernel_dcache_page() need to handle I-cache?
> > 
> > Good question. My previous version of the patch did not handle it,
> > but after our discussion on the arm64 case, I came to the conclusion
> > that we probably need to handle it.
> > 
> > flush_dcache_page() and flush_kernel_dcache_page() are both used when
> > a page was modified via its kernel mapping.  For example,
> > crypto/scatterwalk.c uses flush_dcache_page() whereas
> > lib/scatterlist.c uses flush_kernel_dcache_page().  
> 
> It's likely that this stuff is incredibly buggy - and where we suspect
> there's problems (like using the wrong flush_dcache_page() vs
> flush_kernel_dcache_page() call) that needs to be fixed.
> 
> I suspect crypto/scatterwalk.c was created long before
> flush_kernel_dcache_page() came into existence, and it probably needs
> to be fixed.
>
> > Thus, the reasoning is that if flush_dcache_page() needs to handle
> > I-cache in the case there is no hook later (already user-mapped page
> > cache page) then flush_kernel_dcache_page() needs to do the same. 
> 
> This sounds like we're heading in the direction that flush_kernel_dcache_page()
> and flush_dcache_page() end up being virtually identical.  This isn't
> supposed to be - they _are_ supposed to be different.  Again, maybe
> that's because the users have chosen the wrong function.
> ...

Yes, it is a mess and may be incredibly buggy. 

According to documentation flush_kernel_dcache_page() is supposed to
handle user pages.  It can assume that potential user mappings have
been flushed.  However, the documentation does not mention the case
of page cache pages with no user mappings explicitly.

flush_dcache_page() is supposed to deal with the user and kernel
mappings. However, the documentation does explicitly state that it is
not supposed to handle anon pages (but the arm implementation does
flush the kernel mapping).

Now, what should crypto/scatterwalk.c do? In the write to page path,
it can see page cache pages with and without user mappings and anon
pages.  If there are user mappings, it can assume that they have been
flushed.  I would argue it should use flush_kernel_dcache_page(). 
But if so, our current implementation does not handle two of these
three cases at all.

And the proposed flush_kernel_dcache_page() is different. It only
flushes the kernel mapping in the relevant cases whereas
flush_dcache_page() needs to care about user mappings as well. 
According to documentation we could remove the anon page case in
flush_dcache_page(), but uses like in crypto/scatterwalk.c currently
prevent that.

> > With respect to clearing the flag in the VIPT non-aliasing case with
> > anon pages: Declaring the pages dirty by default may already be
> > sufficient in this case, at least that is what the current
> > implementation assumes.
> 
> PG_arch_1 has no meaning what so ever for anon pages.  Don't even try
> to make it have meaning there, it will cause lots of pain if you do.
> The reason is that anon pages have no mapping, and so trying to do the
> PG_arch_1 trick will result in most places deciding "there is no
> userspace mapping we can skip that".

Hmm, now that you say it, it's quite obvious. There is the following
in __sync_icache_dcache():

        if (cache_is_vipt_aliasing())
                mapping = page_mapping(page);
        else
                mapping = NULL;

        if (!test_and_set_bit(PG_dcache_clean, &page->flags))
                __flush_dcache_page(mapping, page);

        if (pte_exec(pteval))
                __flush_icache_all();

At least this looks strange to me given that PG_dcache_clean has no
meaning for anon pages.  Is this correct?

- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-10-08 23:07         ` Simon Baatz
@ 2012-11-18 21:10           ` Jason Cooper
  2013-04-18 11:16             ` Jason Cooper
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Cooper @ 2012-11-18 21:10 UTC (permalink / raw)
  To: linux-arm-kernel

Russell, Catalin, Simon,

Any progress on this?  At worst, I'd like to see this in fixes for v3.8.
Assuming, of course, we've gotten to the bottom of it.

thx,

Jason.

On Tue, Oct 09, 2012 at 01:07:34AM +0200, Simon Baatz wrote:
> Hi Russell,
> 
> On Mon, Oct 08, 2012 at 09:20:40PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 10:02:16PM +0200, Simon Baatz wrote:
> > > Hi Catalin,
> > > 
> > > On Mon, Oct 08, 2012 at 06:44:28PM +0100, Catalin Marinas wrote:
> > > > On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> > > > > Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> > > > > the pages it needs to handle are kernel mapped only.  However, for
> > > > > example when doing direct I/O, pages with user space mappings may
> > > > > occur.
> > > > > 
> > > > > Thus, do lazy flushing like in flush_dcache_page() if there are no user
> > > > > space mappings.  Otherwise, flush the kernel cache lines directly.
> > > > 
> > > > Do you need to fix the VIPT non-aliasing case as well? Does
> > > > flush_kernel_dcache_page() need to handle I-cache?
> > > 
> > > Good question. My previous version of the patch did not handle it,
> > > but after our discussion on the arm64 case, I came to the conclusion
> > > that we probably need to handle it.
> > > 
> > > flush_dcache_page() and flush_kernel_dcache_page() are both used when
> > > a page was modified via its kernel mapping.  For example,
> > > crypto/scatterwalk.c uses flush_dcache_page() whereas
> > > lib/scatterlist.c uses flush_kernel_dcache_page().  
> > 
> > It's likely that this stuff is incredibly buggy - and where we suspect
> > there's problems (like using the wrong flush_dcache_page() vs
> > flush_kernel_dcache_page() call) that needs to be fixed.
> > 
> > I suspect crypto/scatterwalk.c was created long before
> > flush_kernel_dcache_page() came into existence, and it probably needs
> > to be fixed.
> >
> > > Thus, the reasoning is that if flush_dcache_page() needs to handle
> > > I-cache in the case there is no hook later (already user-mapped page
> > > cache page) then flush_kernel_dcache_page() needs to do the same. 
> > 
> > This sounds like we're heading in the direction that flush_kernel_dcache_page()
> > and flush_dcache_page() end up being virtually identical.  This isn't
> > supposed to be - they _are_ supposed to be different.  Again, maybe
> > that's because the users have chosen the wrong function.
> > ...
> 
> Yes, it is a mess and may be incredibly buggy. 
> 
> According to documentation flush_kernel_dcache_page() is supposed to
> handle user pages.  It can assume that potential user mappings have
> been flushed.  However, the documentation does not mention the case
> of page cache pages with no user mappings explicitly.
> 
> flush_dcache_page() is supposed to deal with the user and kernel
> mappings. However, the documentation does explicitly state that it is
> not supposed to handle anon pages (but the arm implementation does
> flush the kernel mapping).
> 
> Now, what should crypto/scatterwalk.c do? In the write to page path,
> it can see page cache pages with and without user mappings and anon
> pages.  If there are user mappings, it can assume that they have been
> flushed.  I would argue it should use flush_kernel_dcache_page(). 
> But if so, our current implementation does not handle two of these
> three cases at all.
> 
> And the proposed flush_kernel_dcache_page() is different. It only
> flushes the kernel mapping in the relevant cases whereas
> flush_dcache_page() needs to care about user mappings as well. 
> According to documentation we could remove the anon page case in
> flush_dcache_page(), but uses like in crypto/scatterwalk.c currently
> prevent that.
> 
> > > With respect to clearing the flag in the VIPT non-aliasing case with
> > > anon pages: Declaring the pages dirty by default may already be
> > > sufficient in this case, at least that is what the current
> > > implementation assumes.
> > 
> > PG_arch_1 has no meaning what so ever for anon pages.  Don't even try
> > to make it have meaning there, it will cause lots of pain if you do.
> > The reason is that anon pages have no mapping, and so trying to do the
> > PG_arch_1 trick will result in most places deciding "there is no
> > userspace mapping we can skip that".
> 
> Hmm, now that you say it, it's quite obvious. There is the following
> in __sync_icache_dcache():
> 
>         if (cache_is_vipt_aliasing())
>                 mapping = page_mapping(page);
>         else
>                 mapping = NULL;
> 
>         if (!test_and_set_bit(PG_dcache_clean, &page->flags))
>                 __flush_dcache_page(mapping, page);
> 
>         if (pte_exec(pteval))
>                 __flush_icache_all();
> 
> At least this looks strange to me given that PG_dcache_clean has no
> meaning for anon pages.  Is this correct?
> 
> - Simon
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2012-11-18 21:10           ` Jason Cooper
@ 2013-04-18 11:16             ` Jason Cooper
  2013-04-18 11:22               ` Russell King - ARM Linux
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Cooper @ 2013-04-18 11:16 UTC (permalink / raw)
  To: linux-arm-kernel

Russell, Catalin, Simon,

I'm digging through my 'onice' mail folder and found this.  I've
personally experienced this bug in the past (workaround: don't use LVM
on ARM systems :-( ).

Is there any interest in reviving this series?  I can dig up the patches
if needed.

thx,

Jason.


On Sun, Nov 18, 2012 at 04:10:05PM -0500, Jason Cooper wrote:
> Russell, Catalin, Simon,
> 
> Any progress on this?  At worst, I'd like to see this in fixes for v3.8.
> Assuming, of course, we've gotten to the bottom of it.


-- Here's the original thread:

> On Tue, Oct 09, 2012 at 01:07:34AM +0200, Simon Baatz wrote:
> > Hi Russell,
> > 
> > On Mon, Oct 08, 2012 at 09:20:40PM +0100, Russell King - ARM Linux wrote:
> > > On Mon, Oct 08, 2012 at 10:02:16PM +0200, Simon Baatz wrote:
> > > > Hi Catalin,
> > > > 
> > > > On Mon, Oct 08, 2012 at 06:44:28PM +0100, Catalin Marinas wrote:
> > > > > On Sun, Oct 07, 2012 at 12:29:12PM +0100, Simon Baatz wrote:
> > > > > > Commit f8b63c1 made flush_kernel_dcache_page() a no-op assuming that
> > > > > > the pages it needs to handle are kernel mapped only.  However, for
> > > > > > example when doing direct I/O, pages with user space mappings may
> > > > > > occur.
> > > > > > 
> > > > > > Thus, do lazy flushing like in flush_dcache_page() if there are no user
> > > > > > space mappings.  Otherwise, flush the kernel cache lines directly.
> > > > > 
> > > > > Do you need to fix the VIPT non-aliasing case as well? Does
> > > > > flush_kernel_dcache_page() need to handle I-cache?
> > > > 
> > > > Good question. My previous version of the patch did not handle it,
> > > > but after our discussion on the arm64 case, I came to the conclusion
> > > > that we probably need to handle it.
> > > > 
> > > > flush_dcache_page() and flush_kernel_dcache_page() are both used when
> > > > a page was modified via its kernel mapping.  For example,
> > > > crypto/scatterwalk.c uses flush_dcache_page() whereas
> > > > lib/scatterlist.c uses flush_kernel_dcache_page().  
> > > 
> > > It's likely that this stuff is incredibly buggy - and where we suspect
> > > there's problems (like using the wrong flush_dcache_page() vs
> > > flush_kernel_dcache_page() call) that needs to be fixed.
> > > 
> > > I suspect crypto/scatterwalk.c was created long before
> > > flush_kernel_dcache_page() came into existence, and it probably needs
> > > to be fixed.
> > >
> > > > Thus, the reasoning is that if flush_dcache_page() needs to handle
> > > > I-cache in the case there is no hook later (already user-mapped page
> > > > cache page) then flush_kernel_dcache_page() needs to do the same. 
> > > 
> > > This sounds like we're heading in the direction that flush_kernel_dcache_page()
> > > and flush_dcache_page() end up being virtually identical.  This isn't
> > > supposed to be - they _are_ supposed to be different.  Again, maybe
> > > that's because the users have chosen the wrong function.
> > > ...
> > 
> > Yes, it is a mess and may be incredibly buggy. 
> > 
> > According to documentation flush_kernel_dcache_page() is supposed to
> > handle user pages.  It can assume that potential user mappings have
> > been flushed.  However, the documentation does not mention the case
> > of page cache pages with no user mappings explicitly.
> > 
> > flush_dcache_page() is supposed to deal with the user and kernel
> > mappings. However, the documentation does explicitly state that it is
> > not supposed to handle anon pages (but the arm implementation does
> > flush the kernel mapping).
> > 
> > Now, what should crypto/scatterwalk.c do? In the write to page path,
> > it can see page cache pages with and without user mappings and anon
> > pages.  If there are user mappings, it can assume that they have been
> > flushed.  I would argue it should use flush_kernel_dcache_page(). 
> > But if so, our current implementation does not handle two of these
> > three cases at all.
> > 
> > And the proposed flush_kernel_dcache_page() is different. It only
> > flushes the kernel mapping in the relevant cases whereas
> > flush_dcache_page() needs to care about user mappings as well. 
> > According to documentation we could remove the anon page case in
> > flush_dcache_page(), but uses like in crypto/scatterwalk.c currently
> > prevent that.
> > 
> > > > With respect to clearing the flag in the VIPT non-aliasing case with
> > > > anon pages: Declaring the pages dirty by default may already be
> > > > sufficient in this case, at least that is what the current
> > > > implementation assumes.
> > > 
> > > PG_arch_1 has no meaning what so ever for anon pages.  Don't even try
> > > to make it have meaning there, it will cause lots of pain if you do.
> > > The reason is that anon pages have no mapping, and so trying to do the
> > > PG_arch_1 trick will result in most places deciding "there is no
> > > userspace mapping we can skip that".
> > 
> > Hmm, now that you say it, it's quite obvious. There is the following
> > in __sync_icache_dcache():
> > 
> >         if (cache_is_vipt_aliasing())
> >                 mapping = page_mapping(page);
> >         else
> >                 mapping = NULL;
> > 
> >         if (!test_and_set_bit(PG_dcache_clean, &page->flags))
> >                 __flush_dcache_page(mapping, page);
> > 
> >         if (pte_exec(pteval))
> >                 __flush_icache_all();
> > 
> > At least this looks strange to me given that PG_dcache_clean has no
> > meaning for anon pages.  Is this correct?
> > 
> > - Simon
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 11:16             ` Jason Cooper
@ 2013-04-18 11:22               ` Russell King - ARM Linux
  2013-04-18 11:40                 ` Jason Cooper
  2013-04-18 19:00                 ` Simon Baatz
  0 siblings, 2 replies; 35+ messages in thread
From: Russell King - ARM Linux @ 2013-04-18 11:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 18, 2013 at 07:16:08AM -0400, Jason Cooper wrote:
> Russell, Catalin, Simon,
> 
> I'm digging through my 'onice' mail folder and found this.  I've
> personally experienced this bug in the past (workaround: don't use LVM
> on ARM systems :-( ).
> 
> Is there any interest in reviving this series?  I can dig up the patches
> if needed.

None what so ever.  flush_kernel_dcache_page() is not supposed to touch
userspace pages.

  void flush_kernel_dcache_page(struct page *page)
        When the kernel needs to modify a user page is has obtained
        with kmap, it calls this function after all modifications are
        complete (but before kunmapping it) to bring the underlying
        page up to date.  It is assumed here that the user has no
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        incoherent cached copies (i.e. the original page was obtained
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        from a mechanism like get_user_pages()).  The default
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        implementation is a nop and should remain so on all coherent
        architectures.  On incoherent architectures, this should flush
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        the kernel cache for page (using page_address(page)).
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The first underlined sentence means that it's assumed that something _else_
has already dealt with the userspace aliases (in other words, get_user_pages()
via flush_dcache_page() or flush_anon_page()).  The second is absolutely
clear that _only_ the kernel side should be flushed.

flush_kernel_dcache_page() was invented explicitly to separate out the
kernel side flushing from flush_dcache_page(), as a lighter weight version
for block drivers to use.

Also see the commit text, which explicitly states that this is a lighter
weight API, and that this _only_ touches the kernel view of the page:

commit 5a3a5a98b6422d05c39eaa32c8b3f83840c7b768
Author: James Bottomley <James.Bottomley@SteelEye.com>
Date:   Sun Mar 26 01:36:59 2006 -0800

    [PATCH] Add flush_kernel_dcache_page() API

    We have a problem in a lot of emulated storage in that it takes a page from
    get_user_pages() and does something like

    kmap_atomic(page)
    modify page
    kunmap_atomic(page)

    However, nothing has flushed the kernel cache view of the page before the
    kunmap.  We need a lightweight API to do this, so this new API would
    specifically be for flushing the kernel cache view of a user page which the
    kernel has modified.  The driver would need to add
    flush_kernel_dcache_page(page) before the final kunmap.

    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
    Cc: Russell King <rmk@arm.linux.org.uk>
    Cc: "David S. Miller" <davem@davemloft.net>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 11:22               ` Russell King - ARM Linux
@ 2013-04-18 11:40                 ` Jason Cooper
  2013-04-18 13:51                   ` Catalin Marinas
  2013-04-18 19:39                   ` Simon Baatz
  2013-04-18 19:00                 ` Simon Baatz
  1 sibling, 2 replies; 35+ messages in thread
From: Jason Cooper @ 2013-04-18 11:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 18, 2013 at 12:22:01PM +0100, Russell King - ARM Linux wrote:
> On Thu, Apr 18, 2013 at 07:16:08AM -0400, Jason Cooper wrote:
> > Russell, Catalin, Simon,
> > 
> > I'm digging through my 'onice' mail folder and found this.  I've
> > personally experienced this bug in the past (workaround: don't use LVM
> > on ARM systems :-( ).
> > 
> > Is there any interest in reviving this series?  I can dig up the patches
> > if needed.
> 
> None what so ever.  flush_kernel_dcache_page() is not supposed to touch
> userspace pages.
> 
>   void flush_kernel_dcache_page(struct page *page)
>         When the kernel needs to modify a user page is has obtained
>         with kmap, it calls this function after all modifications are
>         complete (but before kunmapping it) to bring the underlying
>         page up to date.  It is assumed here that the user has no
>                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         incoherent cached copies (i.e. the original page was obtained
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         from a mechanism like get_user_pages()).  The default
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         implementation is a nop and should remain so on all coherent
>         architectures.  On incoherent architectures, this should flush
>                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         the kernel cache for page (using page_address(page)).
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> The first underlined sentence means that it's assumed that something _else_
> has already dealt with the userspace aliases (in other words, get_user_pages()
> via flush_dcache_page() or flush_anon_page()).  The second is absolutely
> clear that _only_ the kernel side should be flushed.
> 
> flush_kernel_dcache_page() was invented explicitly to separate out the
> kernel side flushing from flush_dcache_page(), as a lighter weight version
> for block drivers to use.
> 
> Also see the commit text, which explicitly states that this is a lighter
> weight API, and that this _only_ touches the kernel view of the page:
> 
> commit 5a3a5a98b6422d05c39eaa32c8b3f83840c7b768
> Author: James Bottomley <James.Bottomley@SteelEye.com>
> Date:   Sun Mar 26 01:36:59 2006 -0800
> 
>     [PATCH] Add flush_kernel_dcache_page() API
> 
>     We have a problem in a lot of emulated storage in that it takes a page from
>     get_user_pages() and does something like
> 
>     kmap_atomic(page)
>     modify page
>     kunmap_atomic(page)
> 
>     However, nothing has flushed the kernel cache view of the page before the
>     kunmap.  We need a lightweight API to do this, so this new API would
>     specifically be for flushing the kernel cache view of a user page which the
>     kernel has modified.  The driver would need to add
>     flush_kernel_dcache_page(page) before the final kunmap.
> 
>     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
>     Cc: Russell King <rmk@arm.linux.org.uk>
>     Cc: "David S. Miller" <davem@davemloft.net>
>     Signed-off-by: Andrew Morton <akpm@osdl.org>
>     Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Ok, got it.  I should have been more explicit.  LVM doesn't work on ARM.
iirc, Simon had a demo of dm-crypt also faulting on ARM.  This patch was
not the correct approach.  Is there an interest (particularly Simon) in
fixing the problem?

I'm willing to take a look at it, but LVM on ARM is a "nice to have" for
me, not a critical need.  The job would be easier if Simon was willing
to assist and lend his experience with the issue.

thx,

Jason.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 11:40                 ` Jason Cooper
@ 2013-04-18 13:51                   ` Catalin Marinas
  2013-04-21 22:06                     ` Simon Baatz
  2013-04-18 19:39                   ` Simon Baatz
  1 sibling, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2013-04-18 13:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> Ok, got it.  I should have been more explicit.  LVM doesn't work on ARM.
> iirc, Simon had a demo of dm-crypt also faulting on ARM.  This patch was
> not the correct approach.  Is there an interest (particularly Simon) in
> fixing the problem?

I think fixing this for ARM is useful but I don't have any time to
allocate. I think I acked the first patch in the series but I don't
fully remember the details behind the second one.

As Russell said, flush_kernel_dcache_page() is not the right API.
flush_dcache_page() is not supposed to be used on anonymous pages. What
we have for such pages is flush_anon_page() which is a no-op for VIPT
non-aliasing pages. I can see that __get_user_pages() calls both
flush_anon_page() flush_dcache_page().

Is the problem that you have related to I-D cache coherency? Is
flush_anon_page() the right place for this?

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 11:22               ` Russell King - ARM Linux
  2013-04-18 11:40                 ` Jason Cooper
@ 2013-04-18 19:00                 ` Simon Baatz
  1 sibling, 0 replies; 35+ messages in thread
From: Simon Baatz @ 2013-04-18 19:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russel,

On Thu, Apr 18, 2013 at 12:22:01PM +0100, Russell King - ARM Linux wrote:
> On Thu, Apr 18, 2013 at 07:16:08AM -0400, Jason Cooper wrote:
> > Russell, Catalin, Simon,
> > 
> > I'm digging through my 'onice' mail folder and found this.  I've
> > personally experienced this bug in the past (workaround: don't use LVM
> > on ARM systems :-( ).
> > 
> > Is there any interest in reviving this series?  I can dig up the patches
> > if needed.
> 
> None what so ever.  flush_kernel_dcache_page() is not supposed to touch
> userspace pages.

I don't know if we have a wording issue here. The header of the patch
says "user space mapped pages".  Thus, pages for which a user space
mapping exists.  As the documentation you cite below states, these
pages ("user pages") should be handled by flush_kernel_dcache_page(). 
Of course, flush_kernel_dcache_page() has only to care about the
dirty kernel mapping, but this is what the proposed patches do.  I
never proposed to touch the user space mappings.
 
>   void flush_kernel_dcache_page(struct page *page)
>         When the kernel needs to modify a user page is has obtained
>         with kmap, it calls this function after all modifications are
>         complete (but before kunmapping it) to bring the underlying
>         page up to date.  It is assumed here that the user has no
>                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         incoherent cached copies (i.e. the original page was obtained
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         from a mechanism like get_user_pages()).  The default
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         implementation is a nop and should remain so on all coherent
>         architectures.  On incoherent architectures, this should flush
>                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         the kernel cache for page (using page_address(page)).
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> The first underlined sentence means that it's assumed that something _else_
> has already dealt with the userspace aliases (in other words, get_user_pages()
> via flush_dcache_page() or flush_anon_page()).  The second is absolutely
> clear that _only_ the kernel side should be flushed.
> 
> flush_kernel_dcache_page() was invented explicitly to separate out the
> kernel side flushing from flush_dcache_page(), as a lighter weight version
> for block drivers to use.

Yes, exactly. Apart from one thing: flush_kernel_dcache_page() is
supposed to handle all user pages. Thus, it is a more lightweight
version of flush_dcache_page() and flush_anon_page().

But the current implementation only handles page cache pages without a
user mapping (it's a nop because we can do lazy flushing here). It
does not handle page cache pages with a user space mapping and anon
pages.

My first patch (see [1]) was the "minimalistic" approach to do this. Just
flush the _kernel_ mapping if we have user space mappings. Continue to
be lazy otherwise.

But then the question is, whether this is enough. Let's take commit
65b6ecc:

+++ b/kernel/events/uprobes.c
@@ -1199,6 +1199,11 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe, unsigned long slot
        vaddr = kmap_atomic(area->page);
        memcpy(vaddr + offset, uprobe->arch.insn, MAX_UINSN_BYTES);
        kunmap_atomic(vaddr);
+       /*
+        * We probably need flush_icache_user_range() but it needs
vma.
+        * This should work on supported architectures too.
+        */
+       flush_dcache_page(area->page);

Wouldn't that be a use for flush_kernel_dcache_page() (before
kunmap())?  But then, we need to care about I/D-cache coherency, i.e. 
flush_kernel_dcache_page() very much looks like flush_dcache_page()
but without the "flush user mappings" part.  That's the idea behind
the second version (see [2]).  If that's not a relevant case, I am
more than happy to go back to [1].

The third version combined this with a slight optimization, but this
had problems as you pointed out at the time and we can forget about
that one for the time being.  Still, I consider the older versions
[1] or [2] to be relevant.

- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-May/101794.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-September/122874.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 11:40                 ` Jason Cooper
  2013-04-18 13:51                   ` Catalin Marinas
@ 2013-04-18 19:39                   ` Simon Baatz
  1 sibling, 0 replies; 35+ messages in thread
From: Simon Baatz @ 2013-04-18 19:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jason,

On Thu, Apr 18, 2013 at 07:40:16AM -0400, Jason Cooper wrote:
> On Thu, Apr 18, 2013 at 12:22:01PM +0100, Russell King - ARM Linux wrote:
> > On Thu, Apr 18, 2013 at 07:16:08AM -0400, Jason Cooper wrote:
> > > Russell, Catalin, Simon,
> > > 
> > > I'm digging through my 'onice' mail folder and found this.  I've
> > > personally experienced this bug in the past (workaround: don't use LVM
> > > on ARM systems :-( ).
> > > 
> > > Is there any interest in reviving this series?  I can dig up the patches
> > > if needed.
...
> 
> Ok, got it.  I should have been more explicit.  LVM doesn't work on ARM.
> iirc, Simon had a demo of dm-crypt also faulting on ARM.  This patch was
> not the correct approach.  Is there an interest (particularly Simon) in
> fixing the problem?
> 
> I'm willing to take a look at it, but LVM on ARM is a "nice to have" for
> me, not a critical need.  The job would be easier if Simon was willing
> to assist and lend his experience with the issue.

Just to be precise here, the problem is much more specific: Drivers
which use flush_kernel_dcache_page() on VIVT or aliasing VIPT ARM
architectures may have problems with direct I/O.

In general, LVM should work perfectly fine on ARM. But if you have
this particular combination (in my case Kirkwood ARMv5 with VIVT
D-Cache and the mv_cesa driver), you can't put a physical volume on
such a device.  LVM user space tools will just read garbage when
trying to read from the device using direct I/O.

- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-18 13:51                   ` Catalin Marinas
@ 2013-04-21 22:06                     ` Simon Baatz
  2013-04-30 11:22                       ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-04-21 22:06 UTC (permalink / raw)
  To: linux-arm-kernel


Hi Catalin,

On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > Ok, got it.  I should have been more explicit.  LVM doesn't work on ARM.
> > iirc, Simon had a demo of dm-crypt also faulting on ARM.  This patch was
> > not the correct approach.  Is there an interest (particularly Simon) in
> > fixing the problem?
> 
> I think fixing this for ARM is useful but I don't have any time to
> allocate. I think I acked the first patch in the series but I don't
> fully remember the details behind the second one.
> 
> As Russell said, flush_kernel_dcache_page() is not the right API.

It is not the driver itself which is using the API, it is the
generic scatterlist memory iterator. And I don't think that this is
wrong, as I have tried to explain in [1].


> flush_dcache_page() is not supposed to be used on anonymous pages. What
> we have for such pages is flush_anon_page() which is a no-op for VIPT
> non-aliasing pages. I can see that __get_user_pages() calls both
> flush_anon_page() flush_dcache_page().

Yes. But I think that is not relevant here. Although the naming
suggest that the function is just a light-weight version of
flush_dcache_page(), both the documentation and the commit text say
that flush_kernel_dcache_page() has to handle pages that were
obtained through get_user_pages, which includes anon pages.


Btw. many drivers which modify pages via the kernel mapping currently
use flush_dcache_page() instead.  This is not supposed to work in
e.g.  the direct I/O case where a driver sees anon pages.  This works
just because flush_dcache_page() flushes the kernel mapping also for
anon pages.  Something is is clearly not supposed to do according to
documentation.

> Is the problem that you have related to I-D cache coherency? Is
> flush_anon_page() the right place for this?

This particular problem is not related to I/D cache coherency. 
However, as far as I know, flush_dcache_page() is expected to ensure
also I/D cache coherency.  Also, we flush the I-cache in certain
situations in this function.  As said in my response to Russel, I
don't know whether this is expected from flush_kernel_dcache_page()
as well.  I would say yes, but I also have posted a version that only
flushes the D-cache kernel mapping.

- Simon

[1] http://permalink.gmane.org/gmane.linux.ports.arm.kernel/181703

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-21 22:06                     ` Simon Baatz
@ 2013-04-30 11:22                       ` Catalin Marinas
  2013-04-30 21:04                         ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2013-04-30 11:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > > Ok, got it.  I should have been more explicit.  LVM doesn't work on ARM.
> > > iirc, Simon had a demo of dm-crypt also faulting on ARM.  This patch was
> > > not the correct approach.  Is there an interest (particularly Simon) in
> > > fixing the problem?
> > 
> > I think fixing this for ARM is useful but I don't have any time to
> > allocate. I think I acked the first patch in the series but I don't
> > fully remember the details behind the second one.
> > 
> > As Russell said, flush_kernel_dcache_page() is not the right API.
> 
> It is not the driver itself which is using the API, it is the
> generic scatterlist memory iterator. And I don't think that this is
> wrong, as I have tried to explain in [1].

Trying to remember what we've discussed over the past months on this
topic. It looks like sg_miter_stop() does the right thing in calling
flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
superfluous flush_kernel_dcache_page()) removed this function entirely.
The code previously had this comment - /* highmem pages are always
flushed upon kunmap already */ which I think it wasn't fully correct
either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
I suspect we only get the flushing if SG_MITER_ATOMIC.

So it looks to me like flush_kernel_dcache_page() should be implemented
even for highmem pages (with VIVT or aliasing VIPT, at least for non
kmap_atomic addresses by checking for FIXADDR_START). If highmem is
disabled, I suspect we still need this function since the calling code
doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-30 11:22                       ` Catalin Marinas
@ 2013-04-30 21:04                         ` Simon Baatz
  2013-05-01 14:22                           ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-04-30 21:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
...
> > 
> > It is not the driver itself which is using the API, it is the
> > generic scatterlist memory iterator. And I don't think that this is
> > wrong, as I have tried to explain in [1].
> 
> Trying to remember what we've discussed over the past months on this
> topic. It looks like sg_miter_stop() does the right thing in calling
> flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> superfluous flush_kernel_dcache_page()) removed this function entirely.
> The code previously had this comment - /* highmem pages are always
> flushed upon kunmap already */ which I think it wasn't fully correct
> either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> I suspect we only get the flushing if SG_MITER_ATOMIC.
> 
> So it looks to me like flush_kernel_dcache_page() should be implemented
> even for highmem pages (with VIVT or aliasing VIPT, at least for non
> kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> disabled, I suspect we still need this function since the calling code
> doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?

My first version ([1]) had:

	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
		__flush_kernel_dcache_page(page);

If I understand this correctly, you are proposing to remove the
highmem exclusion.

And then in __flush_kernel_dcache_page():

	mapping = page_mapping(page);

	if (!mapping || mapping_mapped(mapping))
		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);

I still prefer to have this condition here to avoid the flush when
there is no user mapping at all.  This is handled by lazy flushing
and is probably the most common case anyway (given how many people
seem to be affected by this problem).

Additionally, although we can assume that the page is kmapped,
page_address(page) can still be NULL for a highmem page, right?

- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-May/101794.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-04-30 21:04                         ` Simon Baatz
@ 2013-05-01 14:22                           ` Catalin Marinas
  2013-05-01 19:04                             ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2013-05-01 14:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> ...
> > > 
> > > It is not the driver itself which is using the API, it is the
> > > generic scatterlist memory iterator. And I don't think that this is
> > > wrong, as I have tried to explain in [1].
> > 
> > Trying to remember what we've discussed over the past months on this
> > topic. It looks like sg_miter_stop() does the right thing in calling
> > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > The code previously had this comment - /* highmem pages are always
> > flushed upon kunmap already */ which I think it wasn't fully correct
> > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > 
> > So it looks to me like flush_kernel_dcache_page() should be implemented
> > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > disabled, I suspect we still need this function since the calling code
> > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> 
> My first version ([1]) had:
> 
> 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> 		__flush_kernel_dcache_page(page);
> 
> If I understand this correctly, you are proposing to remove the
> highmem exclusion.

The highmem exclusion may have been there originally because of a
comment suggesting that kunmap() does the flushing. This is the case
only for kunmap_atomic() AFAICT (and maybe we could remove that as well
and rely on flush_kernel_dcache_page() being called).

> And then in __flush_kernel_dcache_page():
> 
> 	mapping = page_mapping(page);
> 
> 	if (!mapping || mapping_mapped(mapping))
> 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> 
> I still prefer to have this condition here to avoid the flush when
> there is no user mapping at all.  This is handled by lazy flushing
> and is probably the most common case anyway (given how many people
> seem to be affected by this problem).

Looking at the old thread, you said there is a case when this condition
is not true (O_DIRECT case). If that's for a page cache page, then we
can handle it lazily (for anonymous pages I don't think we can rely on
lazy flushing since the kernel does not guarantee the clearing of the
PG_arch_1 bit).

> Additionally, although we can assume that the page is kmapped,
> page_address(page) can still be NULL for a highmem page, right?

It looks like kmap() always sets page_address(page) but I'm not sure
about kmap_atomic(), it doesn't seem to.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-01 14:22                           ` Catalin Marinas
@ 2013-05-01 19:04                             ` Simon Baatz
  2013-05-02  9:54                               ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-05-01 19:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > ...
> > > > 
> > > > It is not the driver itself which is using the API, it is the
> > > > generic scatterlist memory iterator. And I don't think that this is
> > > > wrong, as I have tried to explain in [1].
> > > 
> > > Trying to remember what we've discussed over the past months on this
> > > topic. It looks like sg_miter_stop() does the right thing in calling
> > > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > > The code previously had this comment - /* highmem pages are always
> > > flushed upon kunmap already */ which I think it wasn't fully correct
> > > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > > 
> > > So it looks to me like flush_kernel_dcache_page() should be implemented
> > > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > > disabled, I suspect we still need this function since the calling code
> > > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> > 
> > My first version ([1]) had:
> > 
> > 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> > 		__flush_kernel_dcache_page(page);
> > 
> > If I understand this correctly, you are proposing to remove the
> > highmem exclusion.
> 
> The highmem exclusion may have been there originally because of a
> comment suggesting that kunmap() does the flushing. This is the case
> only for kunmap_atomic() AFAICT (and maybe we could remove that as well
> and rely on flush_kernel_dcache_page() being called).
> 
> > And then in __flush_kernel_dcache_page():
> > 
> > 	mapping = page_mapping(page);
> > 
> > 	if (!mapping || mapping_mapped(mapping))
> > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > 
> > I still prefer to have this condition here to avoid the flush when
> > there is no user mapping at all.  This is handled by lazy flushing
> > and is probably the most common case anyway (given how many people
> > seem to be affected by this problem).
> 
> Looking at the old thread, you said there is a case when this condition
> is not true (O_DIRECT case). If that's for a page cache page, then we
> can handle it lazily (for anonymous pages I don't think we can rely on
> lazy flushing since the kernel does not guarantee the clearing of the
> PG_arch_1 bit).

As Russel pointed out in a comment to a later version of the patch,
PG_arch_1 makes only sense for page cache pages. The condition above
is ok from my point of view (it is based on what flush_dcache_page()
uses):

- Page cache page without user space mapping:
mapping != NULL and mapping_mapped() == 0

-> no flush here; lazy flush based on PG_arch_1 later if needed (we
   rely on the proper initialization of the page to "dirty" here.)


- Page cache page with user space mapping:
mapping != NULL and mapping_mapped() != 0

-> kernel mapping flushed here (user mapping can be assumed to be clean)


- Anonymous page:
mapping == NULL

-> kernel mapping flushed here (user mapping can be assumed to be clean)

 
> > Additionally, although we can assume that the page is kmapped,
> > page_address(page) can still be NULL for a highmem page, right?
> 
> It looks like kmap() always sets page_address(page) but I'm not sure
> about kmap_atomic(), it doesn't seem to.

Hmm, in __flush_dcache_page() we have the following code to flush the
kernel mapping:

void __flush_dcache_page(struct address_space *mapping, struct page *page)
{
	/*
	 * Writeback any data associated with the kernel mapping of this
	 * page.  This ensures that data in the physical page is mutually
	 * coherent with the kernels mapping.
	 */
	if (!PageHighMem(page)) {
		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
	} else {
		void *addr = kmap_high_get(page);
		if (addr) {
			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_high(page);
		} else if (cache_is_vipt()) {
			/* unmapped pages might still be cached */
			addr = kmap_atomic(page);
			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
			kunmap_atomic(addr);
		}
	}
...


We probably should reuse it in flush_kernel_dcache_page() to flush
the kernel mapping. (The last else clause looks strange though)


- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-01 19:04                             ` Simon Baatz
@ 2013-05-02  9:54                               ` Catalin Marinas
  2013-05-02 19:38                                 ` Simon Baatz
  0 siblings, 1 reply; 35+ messages in thread
From: Catalin Marinas @ 2013-05-02  9:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 01, 2013 at 08:04:41PM +0100, Simon Baatz wrote:
> On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> > On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > > ...
> > > > > 
> > > > > It is not the driver itself which is using the API, it is the
> > > > > generic scatterlist memory iterator. And I don't think that this is
> > > > > wrong, as I have tried to explain in [1].
> > > > 
> > > > Trying to remember what we've discussed over the past months on this
> > > > topic. It looks like sg_miter_stop() does the right thing in calling
> > > > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > > > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > > > The code previously had this comment - /* highmem pages are always
> > > > flushed upon kunmap already */ which I think it wasn't fully correct
> > > > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > > > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > > > 
> > > > So it looks to me like flush_kernel_dcache_page() should be implemented
> > > > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > > > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > > > disabled, I suspect we still need this function since the calling code
> > > > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > > > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> > > 
> > > My first version ([1]) had:
> > > 
> > > 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> > > 		__flush_kernel_dcache_page(page);
> > > 
> > > If I understand this correctly, you are proposing to remove the
> > > highmem exclusion.
> > 
> > The highmem exclusion may have been there originally because of a
> > comment suggesting that kunmap() does the flushing. This is the case
> > only for kunmap_atomic() AFAICT (and maybe we could remove that as well
> > and rely on flush_kernel_dcache_page() being called).
> > 
> > > And then in __flush_kernel_dcache_page():
> > > 
> > > 	mapping = page_mapping(page);
> > > 
> > > 	if (!mapping || mapping_mapped(mapping))
> > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > 
> > > I still prefer to have this condition here to avoid the flush when
> > > there is no user mapping at all.  This is handled by lazy flushing
> > > and is probably the most common case anyway (given how many people
> > > seem to be affected by this problem).
> > 
> > Looking at the old thread, you said there is a case when this condition
> > is not true (O_DIRECT case). If that's for a page cache page, then we
> > can handle it lazily (for anonymous pages I don't think we can rely on
> > lazy flushing since the kernel does not guarantee the clearing of the
> > PG_arch_1 bit).
> 
> As Russel pointed out in a comment to a later version of the patch,
> PG_arch_1 makes only sense for page cache pages. The condition above
> is ok from my point of view (it is based on what flush_dcache_page()
> uses):
> 
> - Page cache page without user space mapping:
> mapping != NULL and mapping_mapped() == 0
> 
> -> no flush here; lazy flush based on PG_arch_1 later if needed (we
>    rely on the proper initialization of the page to "dirty" here.)

Indeed.

> - Page cache page with user space mapping:
> mapping != NULL and mapping_mapped() != 0
> 
> -> kernel mapping flushed here (user mapping can be assumed to be clean)

We had similar thoughts for AArch64 here and decided it's not needed, it
can just clear the PG_arch_1 bit and do it lazily (patches not pushed
yet, need more testing). This assumes that even if mapping_mapped(), the
page is not actually mapped in user space and we eventually get a
set_pte_at() call. That's what powerpc is doing.

> - Anonymous page:
> mapping == NULL
> 
> -> kernel mapping flushed here (user mapping can be assumed to be clean)

I don't think it should care about anonymous pages at all. I put a
WARN_ON(!mapping) in flush_dcache_page() and it hasn't triggered yet,
though not intensive testing.

> > > Additionally, although we can assume that the page is kmapped,
> > > page_address(page) can still be NULL for a highmem page, right?
> > 
> > It looks like kmap() always sets page_address(page) but I'm not sure
> > about kmap_atomic(), it doesn't seem to.
> 
> Hmm, in __flush_dcache_page() we have the following code to flush the
> kernel mapping:
> 
> void __flush_dcache_page(struct address_space *mapping, struct page *page)
> {
> 	/*
> 	 * Writeback any data associated with the kernel mapping of this
> 	 * page.  This ensures that data in the physical page is mutually
> 	 * coherent with the kernels mapping.
> 	 */
> 	if (!PageHighMem(page)) {
> 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> 	} else {
> 		void *addr = kmap_high_get(page);
> 		if (addr) {
> 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> 			kunmap_high(page);
> 		} else if (cache_is_vipt()) {
> 			/* unmapped pages might still be cached */
> 			addr = kmap_atomic(page);
> 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> 			kunmap_atomic(addr);
> 		}
> 	}
> ...
> 
> 
> We probably should reuse it in flush_kernel_dcache_page() to flush
> the kernel mapping. (The last else clause looks strange though)

I think it makes sense to reuse this logic in
flush_kernel_dcache_page(). If the page is already mapped with kmap,
then kmap_high_get() should return the actual address. If it was mapped
with kmap_atomic, kmap_high_get() would return NULL, hence the 'else'
clause and the additional kmap_atomic(). The cache_is_vipt() check is
useful because kunmap_atomic() would flush VIVT caches anyway.

As for __flush_dcache_page() called from other places like
flush_dcache_page(), because of this 'else if' clause it looks like it
misses flushing unmapped highmem pages on VIVT cache.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-02  9:54                               ` Catalin Marinas
@ 2013-05-02 19:38                                 ` Simon Baatz
  2013-05-03 10:02                                   ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-05-02 19:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On Thu, May 02, 2013 at 10:54:31AM +0100, Catalin Marinas wrote:
> On Wed, May 01, 2013 at 08:04:41PM +0100, Simon Baatz wrote:
> > On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> > > On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > > > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > > > ...
> > > > > > 
> > > > > > It is not the driver itself which is using the API, it is the
> > > > > > generic scatterlist memory iterator. And I don't think that this is
> > > > > > wrong, as I have tried to explain in [1].
> > > > > 
> > > > > Trying to remember what we've discussed over the past months on this
> > > > > topic. It looks like sg_miter_stop() does the right thing in calling
> > > > > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > > > > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > > > > The code previously had this comment - /* highmem pages are always
> > > > > flushed upon kunmap already */ which I think it wasn't fully correct
> > > > > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > > > > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > > > > 
> > > > > So it looks to me like flush_kernel_dcache_page() should be implemented
> > > > > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > > > > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > > > > disabled, I suspect we still need this function since the calling code
> > > > > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > > > > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> > > > 
> > > > My first version ([1]) had:
> > > > 
> > > > 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> > > > 		__flush_kernel_dcache_page(page);
> > > > 
> > > > If I understand this correctly, you are proposing to remove the
> > > > highmem exclusion.
> > > 
> > > The highmem exclusion may have been there originally because of a
> > > comment suggesting that kunmap() does the flushing. This is the case
> > > only for kunmap_atomic() AFAICT (and maybe we could remove that as well
> > > and rely on flush_kernel_dcache_page() being called).
> > > 
> > > > And then in __flush_kernel_dcache_page():
> > > > 
> > > > 	mapping = page_mapping(page);
> > > > 
> > > > 	if (!mapping || mapping_mapped(mapping))
> > > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > > 
> > > > I still prefer to have this condition here to avoid the flush when
> > > > there is no user mapping at all.  This is handled by lazy flushing
> > > > and is probably the most common case anyway (given how many people
> > > > seem to be affected by this problem).
> > > 
> > > Looking at the old thread, you said there is a case when this condition
> > > is not true (O_DIRECT case). If that's for a page cache page, then we
> > > can handle it lazily (for anonymous pages I don't think we can rely on
> > > lazy flushing since the kernel does not guarantee the clearing of the
> > > PG_arch_1 bit).
> > 
> > As Russel pointed out in a comment to a later version of the patch,
> > PG_arch_1 makes only sense for page cache pages. The condition above
> > is ok from my point of view (it is based on what flush_dcache_page()
> > uses):
> > 
> > - Page cache page without user space mapping:
> > mapping != NULL and mapping_mapped() == 0
> > 
> > -> no flush here; lazy flush based on PG_arch_1 later if needed (we
> >    rely on the proper initialization of the page to "dirty" here.)
> 
> Indeed.
> 
> > - Page cache page with user space mapping:
> > mapping != NULL and mapping_mapped() != 0
> > 
> > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> 
> We had similar thoughts for AArch64 here and decided it's not needed, it
> can just clear the PG_arch_1 bit and do it lazily (patches not pushed
> yet, need more testing). This assumes that even if mapping_mapped(), the
> page is not actually mapped in user space and we eventually get a
> set_pte_at() call. That's what powerpc is doing.

For arm64 only I-/D-cache coherency is relevant, right? In this case,
that may be right (but you may want to have a look at what
xol_get_insn_slot() in kernel/events/uprobes.c does.  I don't know
what kind of pages it handles, but this looks suspicious. And I think
uprobes are also available for powerpc).

However, if you can have aliases in D-cache this is needed.  See
below to trigger this on pages that are already mapped
to user space (direct I/O into a memory mapped file).
 
> > - Anonymous page:
> > mapping == NULL
> > 
> > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> 
> I don't think it should care about anonymous pages at all. I put a
> WARN_ON(!mapping) in flush_dcache_page() and it hasn't triggered yet,
> though not intensive testing.

I assume that you inhibited the call to flush_dcache_page() in
__get_user_pages() for anon pages.  Otherwise, you will be flooded
with warnings.

DIY steps to produce both cases:
(We will need software encryption, thus I need to remove the hardware
encryption module mv_cesa on my hardware first)

# rmmod mv_cesa     
# wget -O mapping_tests.c http://ubuntuone.com/0jF9fNsguN8BvI1Z8fsBVI
# gcc -o mapping_tests mapping_tests.c
# dd if=/dev/urandom of=map.out bs=4k count=10
# dd if=/dev/urandom of=cdevice.img bs=4k count=10
# losetup /dev/loop0 cdevice.img 
# cryptsetup -c aes create c_sda2 /dev/loop0
<enter any passphrase>
# ./mapping_tests

cryptsetup will already trigger the warning and mapping_tests should
flood your console.
 
Since flush_kernel_dcache_page() is supposed to handle user space
pages, it must flush the kernel mapping in these cases.  Running the
above on a device driver using an unpatched
flush_kernel_dcache_page() yields:

# ./mapping_tests 
Anonymous page: differs!
User space mapped page: differs!

It is even needed in flush_dcache_page() as long as everybody
continues to use flush_dcache_page() instead of
flush_kernel_dcache_page() when appropriate...
(This is probably the main reason why the problem I reported is so
uncommon: Everybody seems to use flush_dcache_page() and since it
flushes the kernel mapping in these cases, everything is fine.)

> > > > Additionally, although we can assume that the page is kmapped,
> > > > page_address(page) can still be NULL for a highmem page, right?
> > > 
> > > It looks like kmap() always sets page_address(page) but I'm not sure
> > > about kmap_atomic(), it doesn't seem to.
> > 
> > Hmm, in __flush_dcache_page() we have the following code to flush the
> > kernel mapping:
> > 
> > void __flush_dcache_page(struct address_space *mapping, struct page *page)
> > {
> > 	/*
> > 	 * Writeback any data associated with the kernel mapping of this
> > 	 * page.  This ensures that data in the physical page is mutually
> > 	 * coherent with the kernels mapping.
> > 	 */
> > 	if (!PageHighMem(page)) {
> > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > 	} else {
> > 		void *addr = kmap_high_get(page);
> > 		if (addr) {
> > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > 			kunmap_high(page);
> > 		} else if (cache_is_vipt()) {
> > 			/* unmapped pages might still be cached */
> > 			addr = kmap_atomic(page);
> > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > 			kunmap_atomic(addr);
> > 		}
> > 	}
> > ...
> > 
> > 
> > We probably should reuse it in flush_kernel_dcache_page() to flush
> > the kernel mapping. (The last else clause looks strange though)
> 
> I think it makes sense to reuse this logic in
> flush_kernel_dcache_page(). If the page is already mapped with kmap,
> then kmap_high_get() should return the actual address. If it was mapped
> with kmap_atomic, kmap_high_get() would return NULL, hence the 'else'
> clause and the additional kmap_atomic(). The cache_is_vipt() check is
> useful because kunmap_atomic() would flush VIVT caches anyway.

Yes, but why don't we flush for VIPT _aliasing_ already in
kunmap_atomic? Why do we need to do anything for non-aliasing VIPT
here?
 
> As for __flush_dcache_page() called from other places like
> flush_dcache_page(), because of this 'else if' clause it looks like it
> misses flushing unmapped highmem pages on VIVT cache.

How many users do we have with VIVT D-caches using highmem? (This is
not a rhetorical questions, I have no idea.  However, I would assume
that this use case is almost non-existent.)

- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-02 19:38                                 ` Simon Baatz
@ 2013-05-03 10:02                                   ` Catalin Marinas
  2013-05-04  8:21                                     ` Ming Lei
  2013-05-05 22:26                                     ` Simon Baatz
  0 siblings, 2 replies; 35+ messages in thread
From: Catalin Marinas @ 2013-05-03 10:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 02, 2013 at 08:38:36PM +0100, Simon Baatz wrote:
> On Thu, May 02, 2013 at 10:54:31AM +0100, Catalin Marinas wrote:
> > On Wed, May 01, 2013 at 08:04:41PM +0100, Simon Baatz wrote:
> > > On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> > > > On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > > > > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > > > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
> > > > > ...
> > > > > > > 
> > > > > > > It is not the driver itself which is using the API, it is the
> > > > > > > generic scatterlist memory iterator. And I don't think that this is
> > > > > > > wrong, as I have tried to explain in [1].
> > > > > > 
> > > > > > Trying to remember what we've discussed over the past months on this
> > > > > > topic. It looks like sg_miter_stop() does the right thing in calling
> > > > > > flush_kernel_dcache_page(). Commit f8b63c1 (ARM: 6382/1: Remove
> > > > > > superfluous flush_kernel_dcache_page()) removed this function entirely.
> > > > > > The code previously had this comment - /* highmem pages are always
> > > > > > flushed upon kunmap already */ which I think it wasn't fully correct
> > > > > > either. The kunmap_atomic() flushes the caches but kunmap() doesn't, so
> > > > > > I suspect we only get the flushing if SG_MITER_ATOMIC.
> > > > > > 
> > > > > > So it looks to me like flush_kernel_dcache_page() should be implemented
> > > > > > even for highmem pages (with VIVT or aliasing VIPT, at least for non
> > > > > > kmap_atomic addresses by checking for FIXADDR_START). If highmem is
> > > > > > disabled, I suspect we still need this function since the calling code
> > > > > > doesn't care whether kmap/kunmap was a no-op. But can we keep it as a
> > > > > > simple call to __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE)?
> > > > > 
> > > > > My first version ([1]) had:
> > > > > 
> > > > > 	if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
> > > > > 		__flush_kernel_dcache_page(page);
> > > > > 
> > > > > If I understand this correctly, you are proposing to remove the
> > > > > highmem exclusion.
> > > > 
> > > > The highmem exclusion may have been there originally because of a
> > > > comment suggesting that kunmap() does the flushing. This is the case
> > > > only for kunmap_atomic() AFAICT (and maybe we could remove that as well
> > > > and rely on flush_kernel_dcache_page() being called).
> > > > 
> > > > > And then in __flush_kernel_dcache_page():
> > > > > 
> > > > > 	mapping = page_mapping(page);
> > > > > 
> > > > > 	if (!mapping || mapping_mapped(mapping))
> > > > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > > > 
> > > > > I still prefer to have this condition here to avoid the flush when
> > > > > there is no user mapping at all.  This is handled by lazy flushing
> > > > > and is probably the most common case anyway (given how many people
> > > > > seem to be affected by this problem).
> > > > 
> > > > Looking at the old thread, you said there is a case when this condition
> > > > is not true (O_DIRECT case). If that's for a page cache page, then we
> > > > can handle it lazily (for anonymous pages I don't think we can rely on
> > > > lazy flushing since the kernel does not guarantee the clearing of the
> > > > PG_arch_1 bit).
> > > 
> > > As Russel pointed out in a comment to a later version of the patch,
> > > PG_arch_1 makes only sense for page cache pages. The condition above
> > > is ok from my point of view (it is based on what flush_dcache_page()
> > > uses):
> > > 
> > > - Page cache page without user space mapping:
> > > mapping != NULL and mapping_mapped() == 0
> > > 
> > > -> no flush here; lazy flush based on PG_arch_1 later if needed (we
> > >    rely on the proper initialization of the page to "dirty" here.)
> > 
> > Indeed.
> > 
> > > - Page cache page with user space mapping:
> > > mapping != NULL and mapping_mapped() != 0
> > > 
> > > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> > 
> > We had similar thoughts for AArch64 here and decided it's not needed, it
> > can just clear the PG_arch_1 bit and do it lazily (patches not pushed
> > yet, need more testing). This assumes that even if mapping_mapped(), the
> > page is not actually mapped in user space and we eventually get a
> > set_pte_at() call. That's what powerpc is doing.
> 
> For arm64 only I-/D-cache coherency is relevant, right?

Yes. Same for 32-bit ARMv7 or non-aliasing D-cache on ARMv6.

> In this case,
> that may be right (but you may want to have a look at what
> xol_get_insn_slot() in kernel/events/uprobes.c does.  I don't know
> what kind of pages it handles, but this looks suspicious. And I think
> uprobes are also available for powerpc).

It is suspicious indeed and even the author is unsure about using the
right API. Should it rather use copy_to_user_page? I'm not familiar with
uprobes.

> However, if you can have aliases in D-cache this is needed.  See
> below to trigger this on pages that are already mapped
> to user space (direct I/O into a memory mapped file).

You are probably right, for direct I/O and aliases in the D-cache it
needs to flush. The documentation for flush_dcache_page() states that it
can also be called when the kernel is about to read from a page cache
page (potentially mapped into user space).

> > > - Anonymous page:
> > > mapping == NULL
> > > 
> > > -> kernel mapping flushed here (user mapping can be assumed to be clean)
> > 
> > I don't think it should care about anonymous pages at all. I put a
> > WARN_ON(!mapping) in flush_dcache_page() and it hasn't triggered yet,
> > though not intensive testing.
> 
> I assume that you inhibited the call to flush_dcache_page() in
> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> with warnings.

I haven't done any stress testing so I don't think I hit this code path,
so no warning. But yes, it should have triggered. Anyway, in this case
flush_dcache_page() should have just ignored (clearing PG_arch_1 is
harmless anyway if we also ignore this bit in __sync_icache_dcache for
non-aliasing caches).

> DIY steps to produce both cases:
> (We will need software encryption, thus I need to remove the hardware
> encryption module mv_cesa on my hardware first)
> 
> # rmmod mv_cesa     
> # wget -O mapping_tests.c http://ubuntuone.com/0jF9fNsguN8BvI1Z8fsBVI
> # gcc -o mapping_tests mapping_tests.c
> # dd if=/dev/urandom of=map.out bs=4k count=10
> # dd if=/dev/urandom of=cdevice.img bs=4k count=10
> # losetup /dev/loop0 cdevice.img 
> # cryptsetup -c aes create c_sda2 /dev/loop0
> <enter any passphrase>
> # ./mapping_tests
> 
> cryptsetup will already trigger the warning and mapping_tests should
> flood your console.
>  
> Since flush_kernel_dcache_page() is supposed to handle user space
> pages, it must flush the kernel mapping in these cases.  Running the
> above on a device driver using an unpatched
> flush_kernel_dcache_page() yields:
> 
> # ./mapping_tests 
> Anonymous page: differs!
> User space mapped page: differs!

I haven't run the tests but I can see  why it fails without
flush_kernel_dcache_page(). So I think this function needs to be
implemented for aliasing VIPT or VIVT caches.

> It is even needed in flush_dcache_page() as long as everybody
> continues to use flush_dcache_page() instead of
> flush_kernel_dcache_page() when appropriate...
> (This is probably the main reason why the problem I reported is so
> uncommon: Everybody seems to use flush_dcache_page() and since it
> flushes the kernel mapping in these cases, everything is fine.)

That's why for non-aliasing VIPT we could make it just a clear_bit() and
let the callers fix their API usage.

> > > > > Additionally, although we can assume that the page is kmapped,
> > > > > page_address(page) can still be NULL for a highmem page, right?
> > > > 
> > > > It looks like kmap() always sets page_address(page) but I'm not sure
> > > > about kmap_atomic(), it doesn't seem to.
> > > 
> > > Hmm, in __flush_dcache_page() we have the following code to flush the
> > > kernel mapping:
> > > 
> > > void __flush_dcache_page(struct address_space *mapping, struct page *page)
> > > {
> > > 	/*
> > > 	 * Writeback any data associated with the kernel mapping of this
> > > 	 * page.  This ensures that data in the physical page is mutually
> > > 	 * coherent with the kernels mapping.
> > > 	 */
> > > 	if (!PageHighMem(page)) {
> > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > 	} else {
> > > 		void *addr = kmap_high_get(page);
> > > 		if (addr) {
> > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > 			kunmap_high(page);
> > > 		} else if (cache_is_vipt()) {
> > > 			/* unmapped pages might still be cached */
> > > 			addr = kmap_atomic(page);
> > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > 			kunmap_atomic(addr);
> > > 		}
> > > 	}
> > > ...
> > > 
> > > 
> > > We probably should reuse it in flush_kernel_dcache_page() to flush
> > > the kernel mapping. (The last else clause looks strange though)
> > 
> > I think it makes sense to reuse this logic in
> > flush_kernel_dcache_page(). If the page is already mapped with kmap,
> > then kmap_high_get() should return the actual address. If it was mapped
> > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else'
> > clause and the additional kmap_atomic(). The cache_is_vipt() check is
> > useful because kunmap_atomic() would flush VIVT caches anyway.
> 
> Yes, but why don't we flush for VIPT _aliasing_ already in
> kunmap_atomic?

Some comment from Nico on the list, it seems that highmem does not work
with aliasing VIPT caches:

http://article.gmane.org/gmane.linux.ports.arm.kernel/85114

> Why do we need to do anything for non-aliasing VIPT here?

Do you mean __flush_dcache_page()? I don't think we need.

> > As for __flush_dcache_page() called from other places like
> > flush_dcache_page(), because of this 'else if' clause it looks like it
> > misses flushing unmapped highmem pages on VIVT cache.
> 
> How many users do we have with VIVT D-caches using highmem? (This is
> not a rhetorical questions, I have no idea.  However, I would assume
> that this use case is almost non-existent.)

I don't really know. Maybe Nico has more info.

So my proposal:

1. Non-aliasing VIPT - defer any D-cache flushing until
   __sync_icache_dcache() (and fix callers like uprobes)
2. Aliasing VIPT or VIVT - flush the aliases as before in
   flush_dcache_page()
3. Aliasing VIPT or VIVT - implement flush_kernel_dcache_page() for all
   pages (don't check for PageHighmem)
4. VIVT - remove cache flushing from kunmap_atomic and rely on
   flush_kernel_dcache_page()

I'm not entirely sure about 4 but at the moment kunmap() doesn't flush
caches, only kunmap_atomic() so it looks like an inconsistency.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-03 10:02                                   ` Catalin Marinas
@ 2013-05-04  8:21                                     ` Ming Lei
  2013-05-08 15:08                                       ` Catalin Marinas
  2013-05-11  6:27                                       ` Simon Baatz
  2013-05-05 22:26                                     ` Simon Baatz
  1 sibling, 2 replies; 35+ messages in thread
From: Ming Lei @ 2013-05-04  8:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>
>> I assume that you inhibited the call to flush_dcache_page() in
>> __get_user_pages() for anon pages.  Otherwise, you will be flooded
>> with warnings.
>
> I haven't done any stress testing so I don't think I hit this code path,
> so no warning. But yes, it should have triggered. Anyway, in this case
> flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> harmless anyway if we also ignore this bit in __sync_icache_dcache for
> non-aliasing caches).

Yes, maybe we can do a little optimization for O_DIRECT since no
dcache alias and I/D coherency problem in this case on ARMv7, how
about below change?

diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 1c8f7f5..962a657 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
 	    mapping && !mapping_mapped(mapping))
 		clear_bit(PG_dcache_clean, &page->flags);
 	else {
+		if (!mapping && cache_is_vipt_nonaliasing())
+			return;
 		__flush_dcache_page(mapping, page);
 		if (mapping && cache_is_vivt())
 			__flush_dcache_aliases(mapping, page);

Thanks,
-- 
Ming Lei

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-03 10:02                                   ` Catalin Marinas
  2013-05-04  8:21                                     ` Ming Lei
@ 2013-05-05 22:26                                     ` Simon Baatz
  2013-05-08 15:28                                       ` Catalin Marinas
  1 sibling, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-05-05 22:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 03, 2013 at 11:02:42AM +0100, Catalin Marinas wrote:
> On Thu, May 02, 2013 at 08:38:36PM +0100, Simon Baatz wrote:
> > On Thu, May 02, 2013 at 10:54:31AM +0100, Catalin Marinas wrote:
> > > On Wed, May 01, 2013 at 08:04:41PM +0100, Simon Baatz wrote:
> > > > On Wed, May 01, 2013 at 03:22:06PM +0100, Catalin Marinas wrote:
> > > > > On Tue, Apr 30, 2013 at 10:04:03PM +0100, Simon Baatz wrote:
> > > > > > On Tue, Apr 30, 2013 at 12:22:25PM +0100, Catalin Marinas wrote:
> > > > > > > On Sun, Apr 21, 2013 at 11:06:30PM +0100, Simon Baatz wrote:
> > > > > > > > On Thu, Apr 18, 2013 at 02:51:04PM +0100, Catalin Marinas wrote:
> > > > > > > > > On Thu, Apr 18, 2013 at 12:40:16PM +0100, Jason Cooper wrote:
...
> 
> I haven't run the tests but I can see  why it fails without
> flush_kernel_dcache_page(). So I think this function needs to be
> implemented for aliasing VIPT or VIVT caches.
> 
> > It is even needed in flush_dcache_page() as long as everybody
> > continues to use flush_dcache_page() instead of
> > flush_kernel_dcache_page() when appropriate...
> > (This is probably the main reason why the problem I reported is so
> > uncommon: Everybody seems to use flush_dcache_page() and since it
> > flushes the kernel mapping in these cases, everything is fine.)
> 
> That's why for non-aliasing VIPT we could make it just a clear_bit() and
> let the callers fix their API usage.

Do you mean the unnecessary flush for anon pages or also the D/I
flush for user space mapped page cache pages?

For anon pages, that was the content of the patch you acked ([1])
once.  However, Russel did not like the idea to use the
PG_dcache_clean bit also for anon pages.

I have a newer version that does the following: Do nothing for
mapping == NULL on non-aliasing VIPT in flush_dcache_page().  In
__sync_icache_dcache() flush if mapping == NULL (always on aliasing
VIPT, only if pte_exec() on non-aliasing VIPT).
 
> > > > > > Additionally, although we can assume that the page is kmapped,
> > > > > > page_address(page) can still be NULL for a highmem page, right?
> > > > > 
> > > > > It looks like kmap() always sets page_address(page) but I'm not sure
> > > > > about kmap_atomic(), it doesn't seem to.
> > > > 
> > > > Hmm, in __flush_dcache_page() we have the following code to flush the
> > > > kernel mapping:
> > > > 
> > > > void __flush_dcache_page(struct address_space *mapping, struct page *page)
> > > > {
> > > > 	/*
> > > > 	 * Writeback any data associated with the kernel mapping of this
> > > > 	 * page.  This ensures that data in the physical page is mutually
> > > > 	 * coherent with the kernels mapping.
> > > > 	 */
> > > > 	if (!PageHighMem(page)) {
> > > > 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
> > > > 	} else {
> > > > 		void *addr = kmap_high_get(page);
> > > > 		if (addr) {
> > > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > > 			kunmap_high(page);
> > > > 		} else if (cache_is_vipt()) {
> > > > 			/* unmapped pages might still be cached */
> > > > 			addr = kmap_atomic(page);
> > > > 			__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > > 			kunmap_atomic(addr);
> > > > 		}
> > > > 	}
> > > > ...
> > > > 
> > > > 
> > > > We probably should reuse it in flush_kernel_dcache_page() to flush
> > > > the kernel mapping. (The last else clause looks strange though)
> > > 
> > > I think it makes sense to reuse this logic in
> > > flush_kernel_dcache_page(). If the page is already mapped with kmap,
> > > then kmap_high_get() should return the actual address. If it was mapped
> > > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else'
> > > clause and the additional kmap_atomic(). The cache_is_vipt() check is
> > > useful because kunmap_atomic() would flush VIVT caches anyway.
> > 
> > Yes, but why don't we flush for VIPT _aliasing_ already in
> > kunmap_atomic?
> 
> Some comment from Nico on the list, it seems that highmem does not work
> with aliasing VIPT caches:
> 
> http://article.gmane.org/gmane.linux.ports.arm.kernel/85114
> 
> > Why do we need to do anything for non-aliasing VIPT here?
> 
> Do you mean __flush_dcache_page()? I don't think we need.
> 
> > > As for __flush_dcache_page() called from other places like
> > > flush_dcache_page(), because of this 'else if' clause it looks like it
> > > misses flushing unmapped highmem pages on VIVT cache.
> > 
> > How many users do we have with VIVT D-caches using highmem? (This is
> > not a rhetorical questions, I have no idea.  However, I would assume
> > that this use case is almost non-existent.)
> 
> I don't really know. Maybe Nico has more info.
> 
> So my proposal:
> 
> 1. Non-aliasing VIPT - defer any D-cache flushing until
>    __sync_icache_dcache() (and fix callers like uprobes)

See above, not sure whether we can also get rid of the I-cache flush.

> 2. Aliasing VIPT or VIVT - flush the aliases as before in
>    flush_dcache_page()

Yes.

> 3. Aliasing VIPT or VIVT - implement flush_kernel_dcache_page() for all
>    pages (don't check for PageHighmem)

Ok, I will update the first version of my patch.

> 4. VIVT - remove cache flushing from kunmap_atomic and rely on
>    flush_kernel_dcache_page()
> 
> I'm not entirely sure about 4 but at the moment kunmap() doesn't flush
> caches, only kunmap_atomic() so it looks like an inconsistency.

Not sure either. 

- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-October/124291.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-04  8:21                                     ` Ming Lei
@ 2013-05-08 15:08                                       ` Catalin Marinas
  2013-05-08 18:43                                         ` Russell King - ARM Linux
  2013-05-09  1:52                                         ` Ming Lei
  2013-05-11  6:27                                       ` Simon Baatz
  1 sibling, 2 replies; 35+ messages in thread
From: Catalin Marinas @ 2013-05-08 15:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 04, 2013 at 09:21:27AM +0100, Ming Lei wrote:
> On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>
> >> I assume that you inhibited the call to flush_dcache_page() in
> >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> >> with warnings.
> >
> > I haven't done any stress testing so I don't think I hit this code path,
> > so no warning. But yes, it should have triggered. Anyway, in this case
> > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> > harmless anyway if we also ignore this bit in __sync_icache_dcache for
> > non-aliasing caches).
> 
> Yes, maybe we can do a little optimization for O_DIRECT since no
> dcache alias and I/D coherency problem in this case on ARMv7, how
> about below change?
> 
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index 1c8f7f5..962a657 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
>  	    mapping && !mapping_mapped(mapping))
>  		clear_bit(PG_dcache_clean, &page->flags);
>  	else {
> +		if (!mapping && cache_is_vipt_nonaliasing())
> +			return;
>  		__flush_dcache_page(mapping, page);
>  		if (mapping && cache_is_vivt())
>  			__flush_dcache_aliases(mapping, page);

I wonder whether we could move the:

	if (!mapping)
		return

at the top of this function, IOW don't touch any anonymous pages. Would
anything be broken (apart from wrong API use)?

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-05 22:26                                     ` Simon Baatz
@ 2013-05-08 15:28                                       ` Catalin Marinas
  0 siblings, 0 replies; 35+ messages in thread
From: Catalin Marinas @ 2013-05-08 15:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, May 05, 2013 at 11:26:47PM +0100, Simon Baatz wrote:
> On Fri, May 03, 2013 at 11:02:42AM +0100, Catalin Marinas wrote:
> > On Thu, May 02, 2013 at 08:38:36PM +0100, Simon Baatz wrote:
> ...
> > I haven't run the tests but I can see  why it fails without
> > flush_kernel_dcache_page(). So I think this function needs to be
> > implemented for aliasing VIPT or VIVT caches.
> > 
> > > It is even needed in flush_dcache_page() as long as everybody
> > > continues to use flush_dcache_page() instead of
> > > flush_kernel_dcache_page() when appropriate...
> > > (This is probably the main reason why the problem I reported is so
> > > uncommon: Everybody seems to use flush_dcache_page() and since it
> > > flushes the kernel mapping in these cases, everything is fine.)
> > 
> > That's why for non-aliasing VIPT we could make it just a clear_bit() and
> > let the callers fix their API usage.
> 
> Do you mean the unnecessary flush for anon pages or also the D/I
> flush for user space mapped page cache pages?

flush_dcache_page() should just ignore anonymous pages. The D/I
coherency would be handled in __sync_icache_dcache() later. Do we have
cases where a page is already mapped in user space (pte valid) and the
kernel writes any code to it? I don't think we have. PowerPC have a
simplified flush_dcache_page() and they haven't seen any issues.

> For anon pages, that was the content of the patch you acked ([1])
> once.  However, Russel did not like the idea to use the
> PG_dcache_clean bit also for anon pages.

I think Russell was right, the kernel does not have any guarantees about
the PG_arch_1 bit on anonymous pages. Do we could remove the !mapping
code path in flush_dcache_page().

> I have a newer version that does the following: Do nothing for
> mapping == NULL on non-aliasing VIPT in flush_dcache_page().  

Can this mapping == NULL be generalised for aliasing caches?

> In __sync_icache_dcache() flush if mapping == NULL (always on aliasing
> VIPT, only if pte_exec() on non-aliasing VIPT).

If the page is anonymous, do we expect user code to execute from a
clear page? When the page is first allocated, clear_user_highpage should
take care of the aliases already.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-08 15:08                                       ` Catalin Marinas
@ 2013-05-08 18:43                                         ` Russell King - ARM Linux
  2013-05-08 19:31                                           ` Simon Baatz
  2013-05-09  1:52                                         ` Ming Lei
  1 sibling, 1 reply; 35+ messages in thread
From: Russell King - ARM Linux @ 2013-05-08 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 08, 2013 at 04:08:00PM +0100, Catalin Marinas wrote:
> On Sat, May 04, 2013 at 09:21:27AM +0100, Ming Lei wrote:
> > On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > >>
> > >> I assume that you inhibited the call to flush_dcache_page() in
> > >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> > >> with warnings.
> > >
> > > I haven't done any stress testing so I don't think I hit this code path,
> > > so no warning. But yes, it should have triggered. Anyway, in this case
> > > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> > > harmless anyway if we also ignore this bit in __sync_icache_dcache for
> > > non-aliasing caches).
> > 
> > Yes, maybe we can do a little optimization for O_DIRECT since no
> > dcache alias and I/D coherency problem in this case on ARMv7, how
> > about below change?
> > 
> > diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> > index 1c8f7f5..962a657 100644
> > --- a/arch/arm/mm/flush.c
> > +++ b/arch/arm/mm/flush.c
> > @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
> >  	    mapping && !mapping_mapped(mapping))
> >  		clear_bit(PG_dcache_clean, &page->flags);
> >  	else {
> > +		if (!mapping && cache_is_vipt_nonaliasing())
> > +			return;
> >  		__flush_dcache_page(mapping, page);
> >  		if (mapping && cache_is_vivt())
> >  			__flush_dcache_aliases(mapping, page);
> 
> I wonder whether we could move the:
> 
> 	if (!mapping)
> 		return
> 
> at the top of this function, IOW don't touch any anonymous pages. Would
> anything be broken (apart from wrong API use)?

I suspect you may be missing something - and I suggest reading the Sparc
version, which is where this PG_arch_1 stuff was first dreamed up (by
DaveM).

Are you positive that this path can be eliminated?  Totally sure?  Have
you checked the arg/env pages that fs/exec.c and the binfmt's insert into
the new process?

The comments against the Sparc version suggest that the "else" clause
is necessary ensure that case is adequately covered.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-08 18:43                                         ` Russell King - ARM Linux
@ 2013-05-08 19:31                                           ` Simon Baatz
  2013-05-08 21:13                                             ` Catalin Marinas
  0 siblings, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-05-08 19:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russel,

On Wed, May 08, 2013 at 07:43:54PM +0100, Russell King - ARM Linux wrote:
> On Wed, May 08, 2013 at 04:08:00PM +0100, Catalin Marinas wrote:
> > On Sat, May 04, 2013 at 09:21:27AM +0100, Ming Lei wrote:
> > > On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > >>
> > > >> I assume that you inhibited the call to flush_dcache_page() in
> > > >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> > > >> with warnings.
> > > >
> > > > I haven't done any stress testing so I don't think I hit this code path,
> > > > so no warning. But yes, it should have triggered. Anyway, in this case
> > > > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> > > > harmless anyway if we also ignore this bit in __sync_icache_dcache for
> > > > non-aliasing caches).
> > > 
> > > Yes, maybe we can do a little optimization for O_DIRECT since no
> > > dcache alias and I/D coherency problem in this case on ARMv7, how
> > > about below change?
> > > 
> > > diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> > > index 1c8f7f5..962a657 100644
> > > --- a/arch/arm/mm/flush.c
> > > +++ b/arch/arm/mm/flush.c
> > > @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
> > >  	    mapping && !mapping_mapped(mapping))
> > >  		clear_bit(PG_dcache_clean, &page->flags);
> > >  	else {
> > > +		if (!mapping && cache_is_vipt_nonaliasing())
> > > +			return;
> > >  		__flush_dcache_page(mapping, page);
> > >  		if (mapping && cache_is_vivt())
> > >  			__flush_dcache_aliases(mapping, page);
> > 
> > I wonder whether we could move the:
> > 
> > 	if (!mapping)
> > 		return
> > 
> > at the top of this function, IOW don't touch any anonymous pages. Would
> > anything be broken (apart from wrong API use)?
> 
> I suspect you may be missing something - and I suggest reading the Sparc
> version, which is where this PG_arch_1 stuff was first dreamed up (by
> DaveM).
> 
> Are you positive that this path can be eliminated?  Totally sure?  Have
> you checked the arg/env pages that fs/exec.c and the binfmt's insert into
> the new process?
> 
> The comments against the Sparc version suggest that the "else" clause
> is necessary ensure that case is adequately covered.

At least the fs/exec.c and the binfmt stuff use
flush_kernel_dcache_page() for 5 years or so now (see commit b6a2fe). 
Which makes sense, since flush_dcache_page() is not supposed to
handle anon pages anyway.

The problem are the tons of flush_dcache_page() uses that wrongly
assume that no anon pages need to be handled (e.g. 
drivers/block/loop.c and crypto/scatterwalk.c just to name two).

- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-08 19:31                                           ` Simon Baatz
@ 2013-05-08 21:13                                             ` Catalin Marinas
  0 siblings, 0 replies; 35+ messages in thread
From: Catalin Marinas @ 2013-05-08 21:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 08, 2013 at 08:31:30PM +0100, Simon Baatz wrote:
> On Wed, May 08, 2013 at 07:43:54PM +0100, Russell King - ARM Linux wrote:
> > On Wed, May 08, 2013 at 04:08:00PM +0100, Catalin Marinas wrote:
> > > On Sat, May 04, 2013 at 09:21:27AM +0100, Ming Lei wrote:
> > > > On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > >>
> > > > >> I assume that you inhibited the call to flush_dcache_page() in
> > > > >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> > > > >> with warnings.
> > > > >
> > > > > I haven't done any stress testing so I don't think I hit this code path,
> > > > > so no warning. But yes, it should have triggered. Anyway, in this case
> > > > > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> > > > > harmless anyway if we also ignore this bit in __sync_icache_dcache for
> > > > > non-aliasing caches).
> > > > 
> > > > Yes, maybe we can do a little optimization for O_DIRECT since no
> > > > dcache alias and I/D coherency problem in this case on ARMv7, how
> > > > about below change?
> > > > 
> > > > diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> > > > index 1c8f7f5..962a657 100644
> > > > --- a/arch/arm/mm/flush.c
> > > > +++ b/arch/arm/mm/flush.c
> > > > @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
> > > >  	    mapping && !mapping_mapped(mapping))
> > > >  		clear_bit(PG_dcache_clean, &page->flags);
> > > >  	else {
> > > > +		if (!mapping && cache_is_vipt_nonaliasing())
> > > > +			return;
> > > >  		__flush_dcache_page(mapping, page);
> > > >  		if (mapping && cache_is_vivt())
> > > >  			__flush_dcache_aliases(mapping, page);
> > > 
> > > I wonder whether we could move the:
> > > 
> > > 	if (!mapping)
> > > 		return
> > > 
> > > at the top of this function, IOW don't touch any anonymous pages. Would
> > > anything be broken (apart from wrong API use)?
> > 
> > I suspect you may be missing something - and I suggest reading the Sparc
> > version, which is where this PG_arch_1 stuff was first dreamed up (by
> > DaveM).
> > 
> > Are you positive that this path can be eliminated?  Totally sure?  Have
> > you checked the arg/env pages that fs/exec.c and the binfmt's insert into
> > the new process?
> > 
> > The comments against the Sparc version suggest that the "else" clause
> > is necessary ensure that case is adequately covered.
> 
> At least the fs/exec.c and the binfmt stuff use
> flush_kernel_dcache_page() for 5 years or so now (see commit b6a2fe). 
> Which makes sense, since flush_dcache_page() is not supposed to
> handle anon pages anyway.

Indeed, the sparc comment on arg pages no longer applies. It was also
more of an optimisation which I'm not convinced it would have saved
anything on ARM.

-- 
Catalin

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-08 15:08                                       ` Catalin Marinas
  2013-05-08 18:43                                         ` Russell King - ARM Linux
@ 2013-05-09  1:52                                         ` Ming Lei
  1 sibling, 0 replies; 35+ messages in thread
From: Ming Lei @ 2013-05-09  1:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 8, 2013 at 11:08 PM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Sat, May 04, 2013 at 09:21:27AM +0100, Ming Lei wrote:
>> On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >>
>> >> I assume that you inhibited the call to flush_dcache_page() in
>> >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
>> >> with warnings.
>> >
>> > I haven't done any stress testing so I don't think I hit this code path,
>> > so no warning. But yes, it should have triggered. Anyway, in this case
>> > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
>> > harmless anyway if we also ignore this bit in __sync_icache_dcache for
>> > non-aliasing caches).
>>
>> Yes, maybe we can do a little optimization for O_DIRECT since no
>> dcache alias and I/D coherency problem in this case on ARMv7, how
>> about below change?
>>
>> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
>> index 1c8f7f5..962a657 100644
>> --- a/arch/arm/mm/flush.c
>> +++ b/arch/arm/mm/flush.c
>> @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
>>           mapping && !mapping_mapped(mapping))
>>               clear_bit(PG_dcache_clean, &page->flags);
>>       else {
>> +             if (!mapping && cache_is_vipt_nonaliasing())
>> +                     return;
>>               __flush_dcache_page(mapping, page);
>>               if (mapping && cache_is_vivt())
>>                       __flush_dcache_aliases(mapping, page);
>
> I wonder whether we could move the:
>
>         if (!mapping)
>                 return
>
> at the top of this function, IOW don't touch any anonymous pages. Would
> anything be broken (apart from wrong API use)?

In case of O_DIRECT, the anonymous page may be shared, so looks it
should be flushed for vipt_aliasing cache.

Catalin, btw, would you mind give comments on the patch below(it might
be a 'fix' on one commit you submitted)?

      http://marc.info/?l=linux-arm-kernel&m=136757247311694&w=2

Thanks
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-04  8:21                                     ` Ming Lei
  2013-05-08 15:08                                       ` Catalin Marinas
@ 2013-05-11  6:27                                       ` Simon Baatz
  2013-05-13  3:12                                         ` Ming Lei
  1 sibling, 1 reply; 35+ messages in thread
From: Simon Baatz @ 2013-05-11  6:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 04, 2013 at 04:21:27PM +0800, Ming Lei wrote:
> On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>
> >> I assume that you inhibited the call to flush_dcache_page() in
> >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
> >> with warnings.
> >
> > I haven't done any stress testing so I don't think I hit this code path,
> > so no warning. But yes, it should have triggered. Anyway, in this case
> > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
> > harmless anyway if we also ignore this bit in __sync_icache_dcache for
> > non-aliasing caches).
> 
> Yes, maybe we can do a little optimization for O_DIRECT since no
> dcache alias and I/D coherency problem in this case on ARMv7, how
> about below change?
> 
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index 1c8f7f5..962a657 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
>  	    mapping && !mapping_mapped(mapping))
>  		clear_bit(PG_dcache_clean, &page->flags);
>  	else {
> +		if (!mapping && cache_is_vipt_nonaliasing())
> +			return;
>  		__flush_dcache_page(mapping, page);
>  		if (mapping && cache_is_vivt())
>  			__flush_dcache_aliases(mapping, page);
> 

Yes, this should not be needed. However, I seem to get funny pixel
errors on an Exynos 5 based Chromebook when not flushing in this
case.  Strange.  This is not on a mainline kernel, I will need to
have a look what happens here.


- Simon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page
  2013-05-11  6:27                                       ` Simon Baatz
@ 2013-05-13  3:12                                         ` Ming Lei
  0 siblings, 0 replies; 35+ messages in thread
From: Ming Lei @ 2013-05-13  3:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 11, 2013 at 2:27 PM, Simon Baatz <gmbnomis@gmail.com> wrote:
> On Sat, May 04, 2013 at 04:21:27PM +0800, Ming Lei wrote:
>> On Fri, May 3, 2013 at 6:02 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >>
>> >> I assume that you inhibited the call to flush_dcache_page() in
>> >> __get_user_pages() for anon pages.  Otherwise, you will be flooded
>> >> with warnings.
>> >
>> > I haven't done any stress testing so I don't think I hit this code path,
>> > so no warning. But yes, it should have triggered. Anyway, in this case
>> > flush_dcache_page() should have just ignored (clearing PG_arch_1 is
>> > harmless anyway if we also ignore this bit in __sync_icache_dcache for
>> > non-aliasing caches).
>>
>> Yes, maybe we can do a little optimization for O_DIRECT since no
>> dcache alias and I/D coherency problem in this case on ARMv7, how
>> about below change?
>>
>> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
>> index 1c8f7f5..962a657 100644
>> --- a/arch/arm/mm/flush.c
>> +++ b/arch/arm/mm/flush.c
>> @@ -287,6 +287,8 @@ void flush_dcache_page(struct page *page)
>>           mapping && !mapping_mapped(mapping))
>>               clear_bit(PG_dcache_clean, &page->flags);
>>       else {
>> +             if (!mapping && cache_is_vipt_nonaliasing())
>> +                     return;
>>               __flush_dcache_page(mapping, page);
>>               if (mapping && cache_is_vivt())
>>                       __flush_dcache_aliases(mapping, page);
>>
>
> Yes, this should not be needed. However, I seem to get funny pixel
> errors on an Exynos 5 based Chromebook when not flushing in this
> case.  Strange.  This is not on a mainline kernel, I will need to
> have a look what happens here.

Yes, it is a bit strange, not sure how flush_dcache_page() is used
in memory mapped by mmap.(suppose the display buffer is mmaped)

I test O_DIRECT writing to USB mass storage with the patch, looks
the data written is correct.

[tom]$dd if=/mnt/nfs/test/usb/qieteing-512M.rmvb
of=/mnt/usb/tests/qieteing-512M.rmvb oflag=direct,sync bs=512M count=1
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 111.631 s, 4.8 MB/s

#echo 3 > /proc/sys/vm/drop_caches

[tom]$cd /mnt/usb/tests/
[tom]$md5sum -c /mnt/nfs/test/usb/qieteing-512M.rmvb.md5
qieteing-512M.rmvb: OK


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2013-05-13  3:12 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-07 11:29 [PATCH V2 0/2] fix and improvement of flush(_kernel)_dcache_page() Simon Baatz
2012-10-07 11:29 ` [PATCH V2 1/2] ARM: remove unnecessary flush of anon pages in flush_dcache_page() Simon Baatz
2012-10-08 11:36   ` Catalin Marinas
2012-10-08 17:38     ` Simon Baatz
2012-10-08 17:43       ` Catalin Marinas
2012-10-07 11:29 ` [PATCH V3 2/2] ARM: Handle user space mapped pages in flush_kernel_dcache_page Simon Baatz
2012-10-08 17:44   ` Catalin Marinas
2012-10-08 20:02     ` Simon Baatz
2012-10-08 20:20       ` Russell King - ARM Linux
2012-10-08 23:07         ` Simon Baatz
2012-11-18 21:10           ` Jason Cooper
2013-04-18 11:16             ` Jason Cooper
2013-04-18 11:22               ` Russell King - ARM Linux
2013-04-18 11:40                 ` Jason Cooper
2013-04-18 13:51                   ` Catalin Marinas
2013-04-21 22:06                     ` Simon Baatz
2013-04-30 11:22                       ` Catalin Marinas
2013-04-30 21:04                         ` Simon Baatz
2013-05-01 14:22                           ` Catalin Marinas
2013-05-01 19:04                             ` Simon Baatz
2013-05-02  9:54                               ` Catalin Marinas
2013-05-02 19:38                                 ` Simon Baatz
2013-05-03 10:02                                   ` Catalin Marinas
2013-05-04  8:21                                     ` Ming Lei
2013-05-08 15:08                                       ` Catalin Marinas
2013-05-08 18:43                                         ` Russell King - ARM Linux
2013-05-08 19:31                                           ` Simon Baatz
2013-05-08 21:13                                             ` Catalin Marinas
2013-05-09  1:52                                         ` Ming Lei
2013-05-11  6:27                                       ` Simon Baatz
2013-05-13  3:12                                         ` Ming Lei
2013-05-05 22:26                                     ` Simon Baatz
2013-05-08 15:28                                       ` Catalin Marinas
2013-04-18 19:39                   ` Simon Baatz
2013-04-18 19:00                 ` Simon Baatz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.