* [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-04 14:32 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-04 14:32 UTC (permalink / raw) To: linux-mm; +Cc: akpm, dhillf, mhocko, linux-kernel, Will Deacon When allocating and returning clear huge pages to userspace as a response to a fault, we may zero and return a mapping to a previously dirtied physical region (for example, it may have been written by a private mapping which was freed as a result of an ftruncate on the backing file). On architectures with Harvard caches, this can lead to I/D inconsistency since the zeroed view may not be visible to the instruction stream. This patch solves the problem by flushing the region after allocating and clearing a new huge page. Note that PowerPC avoids this issue by performing the flushing in their clear_user_page implementation to keep the loader happy, however this is closely tied to the semantics of the PG_arch_1 page flag which is architecture-specific. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- mm/hugetlb.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e198831..b83d026 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2646,6 +2646,7 @@ retry: goto out; } clear_huge_page(page, address, pages_per_huge_page(h)); + flush_dcache_page(page); __SetPageUptodate(page); if (vma->vm_flags & VM_MAYSHARE) { -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-04 14:32 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-04 14:32 UTC (permalink / raw) To: linux-mm; +Cc: akpm, dhillf, mhocko, linux-kernel, Will Deacon When allocating and returning clear huge pages to userspace as a response to a fault, we may zero and return a mapping to a previously dirtied physical region (for example, it may have been written by a private mapping which was freed as a result of an ftruncate on the backing file). On architectures with Harvard caches, this can lead to I/D inconsistency since the zeroed view may not be visible to the instruction stream. This patch solves the problem by flushing the region after allocating and clearing a new huge page. Note that PowerPC avoids this issue by performing the flushing in their clear_user_page implementation to keep the loader happy, however this is closely tied to the semantics of the PG_arch_1 page flag which is architecture-specific. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- mm/hugetlb.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e198831..b83d026 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2646,6 +2646,7 @@ retry: goto out; } clear_huge_page(page, address, pages_per_huge_page(h)); + flush_dcache_page(page); __SetPageUptodate(page); if (vma->vm_flags & VM_MAYSHARE) { -- 1.7.4.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-04 14:32 ` Will Deacon @ 2012-07-05 12:37 ` Hillf Danton -1 siblings, 0 replies; 48+ messages in thread From: Hillf Danton @ 2012-07-05 12:37 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, mhocko, linux-kernel On Wed, Jul 4, 2012 at 10:32 PM, Will Deacon <will.deacon@arm.com> wrote: > When allocating and returning clear huge pages to userspace as a > response to a fault, we may zero and return a mapping to a previously > dirtied physical region (for example, it may have been written by > a private mapping which was freed as a result of an ftruncate on the > backing file). On architectures with Harvard caches, this can lead to > I/D inconsistency since the zeroed view may not be visible to the > instruction stream. > > This patch solves the problem by flushing the region after allocating > and clearing a new huge page. Note that PowerPC avoids this issue by > performing the flushing in their clear_user_page implementation to keep > the loader happy, however this is closely tied to the semantics of the > PG_arch_1 page flag which is architecture-specific. > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- Thanks:) Acked-by: Hillf Danton <dhillf@gmail.com> > mm/hugetlb.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..b83d026 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2646,6 +2646,7 @@ retry: > goto out; > } > clear_huge_page(page, address, pages_per_huge_page(h)); > + flush_dcache_page(page); > __SetPageUptodate(page); > > if (vma->vm_flags & VM_MAYSHARE) { > -- > 1.7.4.1 > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-05 12:37 ` Hillf Danton 0 siblings, 0 replies; 48+ messages in thread From: Hillf Danton @ 2012-07-05 12:37 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, mhocko, linux-kernel On Wed, Jul 4, 2012 at 10:32 PM, Will Deacon <will.deacon@arm.com> wrote: > When allocating and returning clear huge pages to userspace as a > response to a fault, we may zero and return a mapping to a previously > dirtied physical region (for example, it may have been written by > a private mapping which was freed as a result of an ftruncate on the > backing file). On architectures with Harvard caches, this can lead to > I/D inconsistency since the zeroed view may not be visible to the > instruction stream. > > This patch solves the problem by flushing the region after allocating > and clearing a new huge page. Note that PowerPC avoids this issue by > performing the flushing in their clear_user_page implementation to keep > the loader happy, however this is closely tied to the semantics of the > PG_arch_1 page flag which is architecture-specific. > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- Thanks:) Acked-by: Hillf Danton <dhillf@gmail.com> > mm/hugetlb.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..b83d026 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2646,6 +2646,7 @@ retry: > goto out; > } > clear_huge_page(page, address, pages_per_huge_page(h)); > + flush_dcache_page(page); > __SetPageUptodate(page); > > if (vma->vm_flags & VM_MAYSHARE) { > -- > 1.7.4.1 > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-05 12:37 ` Hillf Danton @ 2012-07-05 14:17 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-05 14:17 UTC (permalink / raw) To: Hillf Danton; +Cc: linux-mm, akpm, mhocko, linux-kernel On Thu, Jul 05, 2012 at 01:37:46PM +0100, Hillf Danton wrote: > On Wed, Jul 4, 2012 at 10:32 PM, Will Deacon <will.deacon@arm.com> wrote: > > When allocating and returning clear huge pages to userspace as a > > response to a fault, we may zero and return a mapping to a previously > > dirtied physical region (for example, it may have been written by > > a private mapping which was freed as a result of an ftruncate on the > > backing file). On architectures with Harvard caches, this can lead to > > I/D inconsistency since the zeroed view may not be visible to the > > instruction stream. > > > > This patch solves the problem by flushing the region after allocating > > and clearing a new huge page. Note that PowerPC avoids this issue by > > performing the flushing in their clear_user_page implementation to keep > > the loader happy, however this is closely tied to the semantics of the > > PG_arch_1 page flag which is architecture-specific. > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > Thanks:) > > Acked-by: Hillf Danton <dhillf@gmail.com> Thanks Hillf. Which tree does this stuff usually go through? Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-05 14:17 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-05 14:17 UTC (permalink / raw) To: Hillf Danton; +Cc: linux-mm, akpm, mhocko, linux-kernel On Thu, Jul 05, 2012 at 01:37:46PM +0100, Hillf Danton wrote: > On Wed, Jul 4, 2012 at 10:32 PM, Will Deacon <will.deacon@arm.com> wrote: > > When allocating and returning clear huge pages to userspace as a > > response to a fault, we may zero and return a mapping to a previously > > dirtied physical region (for example, it may have been written by > > a private mapping which was freed as a result of an ftruncate on the > > backing file). On architectures with Harvard caches, this can lead to > > I/D inconsistency since the zeroed view may not be visible to the > > instruction stream. > > > > This patch solves the problem by flushing the region after allocating > > and clearing a new huge page. Note that PowerPC avoids this issue by > > performing the flushing in their clear_user_page implementation to keep > > the loader happy, however this is closely tied to the semantics of the > > PG_arch_1 page flag which is architecture-specific. > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > Thanks:) > > Acked-by: Hillf Danton <dhillf@gmail.com> Thanks Hillf. Which tree does this stuff usually go through? Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-05 14:17 ` Will Deacon @ 2012-07-06 13:15 ` Hillf Danton -1 siblings, 0 replies; 48+ messages in thread From: Hillf Danton @ 2012-07-06 13:15 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, mhocko, linux-kernel On Thu, Jul 5, 2012 at 10:17 PM, Will Deacon <will.deacon@arm.com> wrote: > > Which tree does this stuff usually go through? mm --> next --> linus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-06 13:15 ` Hillf Danton 0 siblings, 0 replies; 48+ messages in thread From: Hillf Danton @ 2012-07-06 13:15 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, mhocko, linux-kernel On Thu, Jul 5, 2012 at 10:17 PM, Will Deacon <will.deacon@arm.com> wrote: > > Which tree does this stuff usually go through? mm --> next --> linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-04 14:32 ` Will Deacon @ 2012-07-09 12:25 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-09 12:25 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, dhillf, linux-kernel On Wed 04-07-12 15:32:56, Will Deacon wrote: > When allocating and returning clear huge pages to userspace as a > response to a fault, we may zero and return a mapping to a previously > dirtied physical region (for example, it may have been written by > a private mapping which was freed as a result of an ftruncate on the > backing file). On architectures with Harvard caches, this can lead to > I/D inconsistency since the zeroed view may not be visible to the > instruction stream. > > This patch solves the problem by flushing the region after allocating > and clearing a new huge page. Note that PowerPC avoids this issue by > performing the flushing in their clear_user_page implementation to keep > the loader happy, however this is closely tied to the semantics of the > PG_arch_1 page flag which is architecture-specific. > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- > mm/hugetlb.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..b83d026 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2646,6 +2646,7 @@ retry: > goto out; > } > clear_huge_page(page, address, pages_per_huge_page(h)); > + flush_dcache_page(page); > __SetPageUptodate(page); Does this have to be explicit in the arch independent code? It seems that ia64 uses flush_dcache_page already in the clear_user_page > > if (vma->vm_flags & VM_MAYSHARE) { > -- > 1.7.4.1 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-09 12:25 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-09 12:25 UTC (permalink / raw) To: Will Deacon; +Cc: linux-mm, akpm, dhillf, linux-kernel On Wed 04-07-12 15:32:56, Will Deacon wrote: > When allocating and returning clear huge pages to userspace as a > response to a fault, we may zero and return a mapping to a previously > dirtied physical region (for example, it may have been written by > a private mapping which was freed as a result of an ftruncate on the > backing file). On architectures with Harvard caches, this can lead to > I/D inconsistency since the zeroed view may not be visible to the > instruction stream. > > This patch solves the problem by flushing the region after allocating > and clearing a new huge page. Note that PowerPC avoids this issue by > performing the flushing in their clear_user_page implementation to keep > the loader happy, however this is closely tied to the semantics of the > PG_arch_1 page flag which is architecture-specific. > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- > mm/hugetlb.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..b83d026 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2646,6 +2646,7 @@ retry: > goto out; > } > clear_huge_page(page, address, pages_per_huge_page(h)); > + flush_dcache_page(page); > __SetPageUptodate(page); Does this have to be explicit in the arch independent code? It seems that ia64 uses flush_dcache_page already in the clear_user_page > > if (vma->vm_flags & VM_MAYSHARE) { > -- > 1.7.4.1 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-09 12:25 ` Michal Hocko @ 2012-07-09 14:13 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-09 14:13 UTC (permalink / raw) To: Michal Hocko; +Cc: linux-mm, akpm, dhillf, linux-kernel On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > When allocating and returning clear huge pages to userspace as a > > response to a fault, we may zero and return a mapping to a previously > > dirtied physical region (for example, it may have been written by > > a private mapping which was freed as a result of an ftruncate on the > > backing file). On architectures with Harvard caches, this can lead to > > I/D inconsistency since the zeroed view may not be visible to the > > instruction stream. > > > > This patch solves the problem by flushing the region after allocating > > and clearing a new huge page. Note that PowerPC avoids this issue by > > performing the flushing in their clear_user_page implementation to keep > > the loader happy, however this is closely tied to the semantics of the > > PG_arch_1 page flag which is architecture-specific. > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > mm/hugetlb.c | 1 + > > 1 files changed, 1 insertions(+), 0 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..b83d026 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -2646,6 +2646,7 @@ retry: > > goto out; > > } > > clear_huge_page(page, address, pages_per_huge_page(h)); > > + flush_dcache_page(page); > > __SetPageUptodate(page); > > Does this have to be explicit in the arch independent code? > It seems that ia64 uses flush_dcache_page already in the clear_user_page It would match what is done in similar situations by cow_user_page (mm/memory.c) and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page flushing (DMA bounce, ksm) so I think this is the right place for it. Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-09 14:13 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-09 14:13 UTC (permalink / raw) To: Michal Hocko; +Cc: linux-mm, akpm, dhillf, linux-kernel On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > When allocating and returning clear huge pages to userspace as a > > response to a fault, we may zero and return a mapping to a previously > > dirtied physical region (for example, it may have been written by > > a private mapping which was freed as a result of an ftruncate on the > > backing file). On architectures with Harvard caches, this can lead to > > I/D inconsistency since the zeroed view may not be visible to the > > instruction stream. > > > > This patch solves the problem by flushing the region after allocating > > and clearing a new huge page. Note that PowerPC avoids this issue by > > performing the flushing in their clear_user_page implementation to keep > > the loader happy, however this is closely tied to the semantics of the > > PG_arch_1 page flag which is architecture-specific. > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > mm/hugetlb.c | 1 + > > 1 files changed, 1 insertions(+), 0 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..b83d026 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -2646,6 +2646,7 @@ retry: > > goto out; > > } > > clear_huge_page(page, address, pages_per_huge_page(h)); > > + flush_dcache_page(page); > > __SetPageUptodate(page); > > Does this have to be explicit in the arch independent code? > It seems that ia64 uses flush_dcache_page already in the clear_user_page It would match what is done in similar situations by cow_user_page (mm/memory.c) and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page flushing (DMA bounce, ksm) so I think this is the right place for it. Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-09 14:13 ` Will Deacon @ 2012-07-09 23:57 ` Hugh Dickins -1 siblings, 0 replies; 48+ messages in thread From: Hugh Dickins @ 2012-07-09 23:57 UTC (permalink / raw) To: Will Deacon Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Mon, 9 Jul 2012, Will Deacon wrote: > On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > > When allocating and returning clear huge pages to userspace as a > > > response to a fault, we may zero and return a mapping to a previously > > > dirtied physical region (for example, it may have been written by > > > a private mapping which was freed as a result of an ftruncate on the > > > backing file). On architectures with Harvard caches, this can lead to > > > I/D inconsistency since the zeroed view may not be visible to the > > > instruction stream. > > > > > > This patch solves the problem by flushing the region after allocating > > > and clearing a new huge page. Note that PowerPC avoids this issue by > > > performing the flushing in their clear_user_page implementation to keep > > > the loader happy, however this is closely tied to the semantics of the > > > PG_arch_1 page flag which is architecture-specific. > > > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > > --- > > > mm/hugetlb.c | 1 + > > > 1 files changed, 1 insertions(+), 0 deletions(-) > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index e198831..b83d026 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -2646,6 +2646,7 @@ retry: > > > goto out; > > > } > > > clear_huge_page(page, address, pages_per_huge_page(h)); > > > + flush_dcache_page(page); > > > __SetPageUptodate(page); > > > > Does this have to be explicit in the arch independent code? > > It seems that ia64 uses flush_dcache_page already in the clear_user_page > > It would match what is done in similar situations by cow_user_page (mm/memory.c) > and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page > flushing (DMA bounce, ksm) so I think this is the right place for it. I am not at all sure if you are right or not: please let's consult linux-arch about this - now Cc'ed. If this hugetlb_no_page() were solely mapping the hugepage into that userspace, I would say you are wrong. It's the job of clear_huge_page() to take the mapped address into account, and pass it down to the architecture-specific implementation, to do whatever flushing is needed - you should be providing that in your architecture. In particular, notice how clear_huge_page() goes round a loop of clear_user_highpage()s: in your patch, you're expecting the implementation of flush_dcache_page() to notice whether or not this is a hugepage, and flush the appropriate size. Perhaps yours is the only architecture to need this on huge, and your flush_dcache_page() implements it correctly; but it does seem surprising. If I start to grep the architectures for non-empty flush_dcache_page(), I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where the naming suggests that I'm right, it's the architecture's responsibility to arrange whatever flushing is needed in its copy and clear page functions. But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts the new page into page cache, making it accessible by other processes: that may indeed be reason for flush_dcache_page() there - or a loop of flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case you would be duplicating expensive flushes: perhaps they should be restricted to the VM_MAYSHARE block. Hugh ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-09 23:57 ` Hugh Dickins 0 siblings, 0 replies; 48+ messages in thread From: Hugh Dickins @ 2012-07-09 23:57 UTC (permalink / raw) To: Will Deacon Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Mon, 9 Jul 2012, Will Deacon wrote: > On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > > When allocating and returning clear huge pages to userspace as a > > > response to a fault, we may zero and return a mapping to a previously > > > dirtied physical region (for example, it may have been written by > > > a private mapping which was freed as a result of an ftruncate on the > > > backing file). On architectures with Harvard caches, this can lead to > > > I/D inconsistency since the zeroed view may not be visible to the > > > instruction stream. > > > > > > This patch solves the problem by flushing the region after allocating > > > and clearing a new huge page. Note that PowerPC avoids this issue by > > > performing the flushing in their clear_user_page implementation to keep > > > the loader happy, however this is closely tied to the semantics of the > > > PG_arch_1 page flag which is architecture-specific. > > > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > > --- > > > mm/hugetlb.c | 1 + > > > 1 files changed, 1 insertions(+), 0 deletions(-) > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index e198831..b83d026 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -2646,6 +2646,7 @@ retry: > > > goto out; > > > } > > > clear_huge_page(page, address, pages_per_huge_page(h)); > > > + flush_dcache_page(page); > > > __SetPageUptodate(page); > > > > Does this have to be explicit in the arch independent code? > > It seems that ia64 uses flush_dcache_page already in the clear_user_page > > It would match what is done in similar situations by cow_user_page (mm/memory.c) > and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page > flushing (DMA bounce, ksm) so I think this is the right place for it. I am not at all sure if you are right or not: please let's consult linux-arch about this - now Cc'ed. If this hugetlb_no_page() were solely mapping the hugepage into that userspace, I would say you are wrong. It's the job of clear_huge_page() to take the mapped address into account, and pass it down to the architecture-specific implementation, to do whatever flushing is needed - you should be providing that in your architecture. In particular, notice how clear_huge_page() goes round a loop of clear_user_highpage()s: in your patch, you're expecting the implementation of flush_dcache_page() to notice whether or not this is a hugepage, and flush the appropriate size. Perhaps yours is the only architecture to need this on huge, and your flush_dcache_page() implements it correctly; but it does seem surprising. If I start to grep the architectures for non-empty flush_dcache_page(), I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where the naming suggests that I'm right, it's the architecture's responsibility to arrange whatever flushing is needed in its copy and clear page functions. But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts the new page into page cache, making it accessible by other processes: that may indeed be reason for flush_dcache_page() there - or a loop of flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case you would be duplicating expensive flushes: perhaps they should be restricted to the VM_MAYSHARE block. Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-09 23:57 ` Hugh Dickins @ 2012-07-10 9:45 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-10 9:45 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm Hi Hugh, Cheers for looking at this. On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > On Mon, 9 Jul 2012, Will Deacon wrote: > > On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > > > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > > index e198831..b83d026 100644 > > > > --- a/mm/hugetlb.c > > > > +++ b/mm/hugetlb.c > > > > @@ -2646,6 +2646,7 @@ retry: > > > > goto out; > > > > } > > > > clear_huge_page(page, address, pages_per_huge_page(h)); > > > > + flush_dcache_page(page); > > > > __SetPageUptodate(page); > > > > > > Does this have to be explicit in the arch independent code? > > > It seems that ia64 uses flush_dcache_page already in the clear_user_page > > > > It would match what is done in similar situations by cow_user_page (mm/memory.c) > > and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page > > flushing (DMA bounce, ksm) so I think this is the right place for it. > > I am not at all sure if you are right or not: > please let's consult linux-arch about this - now Cc'ed. I assume that by `linux-arch' you mean BenH :) > If this hugetlb_no_page() were solely mapping the hugepage into that > userspace, I would say you are wrong. It's the job of clear_huge_page() > to take the mapped address into account, and pass it down to the > architecture-specific implementation, to do whatever flushing is > needed - you should be providing that in your architecture. > > In particular, notice how clear_huge_page() goes round a loop of > clear_user_highpage()s: in your patch, you're expecting the implementation > of flush_dcache_page() to notice whether or not this is a hugepage, and > flush the appropriate size. > > Perhaps yours is the only architecture to need this on huge, and your > flush_dcache_page() implements it correctly; but it does seem surprising. ARM doesn't yet have hugetlb support in mainline so we can do whatever people prefer. I think it makes sense for flush_dcache_page to be hugepage aware so that we can sync the caches when installing huge ptes using the same code as normal ptes. Of course, that may not be the same across all architectures, so you certainly have a valid point. > If I start to grep the architectures for non-empty flush_dcache_page(), > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > the naming suggests that I'm right, it's the architecture's responsibility > to arrange whatever flushing is needed in its copy and clear page functions. On ARM the flushing is there to deal with dcache aliasing and highmem, so the clear/copy functions won't actually do explicit flushing on modern (ARMv7) cores. Instead we flush the page when writing the pte and noticing that PG_arch_1 (PG_dcache_clean) is clear... ...so the real question is why this wasn't being triggered for huge pages. I'll go and take another look since I would expect PG_arch_1 to be cleared for pages coming back from alloc_huge_page. > But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts > the new page into page cache, making it accessible by other processes: > that may indeed be reason for flush_dcache_page() there - or a loop of > flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case > you would be duplicating expensive flushes: perhaps they should be > restricted to the VM_MAYSHARE block. If the PG_arch_1 flag does its job, duplicate flushes shouldn't be a problem. Thanks, Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-10 9:45 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-10 9:45 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm Hi Hugh, Cheers for looking at this. On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > On Mon, 9 Jul 2012, Will Deacon wrote: > > On Mon, Jul 09, 2012 at 01:25:23PM +0100, Michal Hocko wrote: > > > On Wed 04-07-12 15:32:56, Will Deacon wrote: > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > > index e198831..b83d026 100644 > > > > --- a/mm/hugetlb.c > > > > +++ b/mm/hugetlb.c > > > > @@ -2646,6 +2646,7 @@ retry: > > > > goto out; > > > > } > > > > clear_huge_page(page, address, pages_per_huge_page(h)); > > > > + flush_dcache_page(page); > > > > __SetPageUptodate(page); > > > > > > Does this have to be explicit in the arch independent code? > > > It seems that ia64 uses flush_dcache_page already in the clear_user_page > > > > It would match what is done in similar situations by cow_user_page (mm/memory.c) > > and shmem_writepage (mm/shmem.c). Other subsystems also have explicit page > > flushing (DMA bounce, ksm) so I think this is the right place for it. > > I am not at all sure if you are right or not: > please let's consult linux-arch about this - now Cc'ed. I assume that by `linux-arch' you mean BenH :) > If this hugetlb_no_page() were solely mapping the hugepage into that > userspace, I would say you are wrong. It's the job of clear_huge_page() > to take the mapped address into account, and pass it down to the > architecture-specific implementation, to do whatever flushing is > needed - you should be providing that in your architecture. > > In particular, notice how clear_huge_page() goes round a loop of > clear_user_highpage()s: in your patch, you're expecting the implementation > of flush_dcache_page() to notice whether or not this is a hugepage, and > flush the appropriate size. > > Perhaps yours is the only architecture to need this on huge, and your > flush_dcache_page() implements it correctly; but it does seem surprising. ARM doesn't yet have hugetlb support in mainline so we can do whatever people prefer. I think it makes sense for flush_dcache_page to be hugepage aware so that we can sync the caches when installing huge ptes using the same code as normal ptes. Of course, that may not be the same across all architectures, so you certainly have a valid point. > If I start to grep the architectures for non-empty flush_dcache_page(), > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > the naming suggests that I'm right, it's the architecture's responsibility > to arrange whatever flushing is needed in its copy and clear page functions. On ARM the flushing is there to deal with dcache aliasing and highmem, so the clear/copy functions won't actually do explicit flushing on modern (ARMv7) cores. Instead we flush the page when writing the pte and noticing that PG_arch_1 (PG_dcache_clean) is clear... ...so the real question is why this wasn't being triggered for huge pages. I'll go and take another look since I would expect PG_arch_1 to be cleared for pages coming back from alloc_huge_page. > But... this hugetlb_no_page() has a VM_MAYSHARE case below, which puts > the new page into page cache, making it accessible by other processes: > that may indeed be reason for flush_dcache_page() there - or a loop of > flush_dcache_page()s. But I worry then that in the !VM_MAYSHARE case > you would be duplicating expensive flushes: perhaps they should be > restricted to the VM_MAYSHARE block. If the PG_arch_1 flag does its job, duplicate flushes shouldn't be a problem. Thanks, Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-10 9:45 ` Will Deacon @ 2012-07-10 10:42 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-10 10:42 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > If I start to grep the architectures for non-empty flush_dcache_page(), > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > the naming suggests that I'm right, it's the architecture's responsibility > > to arrange whatever flushing is needed in its copy and clear page functions. > > On ARM the flushing is there to deal with dcache aliasing and highmem, so the > clear/copy functions won't actually do explicit flushing on modern (ARMv7) > cores. Instead we flush the page when writing the pte and noticing that > PG_arch_1 (PG_dcache_clean) is clear... > > ...so the real question is why this wasn't being triggered for huge pages. > I'll go and take another look since I would expect PG_arch_1 to be cleared > for pages coming back from alloc_huge_page. Ok, so this is exactly the problem. The hugetlb allocator uses its own pool of huge pages, so free_huge_page followed by a later alloc_huge_page will give you something where the page flags of the compound head do not guarantee that PG_arch_1 is clear. I tried hacking arch_release_hugepage to clear the bit, but that's only called when actually releasing the hugepages via __free_pages which is precisely the case that works correctly anyway! Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-10 10:42 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-10 10:42 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > If I start to grep the architectures for non-empty flush_dcache_page(), > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > the naming suggests that I'm right, it's the architecture's responsibility > > to arrange whatever flushing is needed in its copy and clear page functions. > > On ARM the flushing is there to deal with dcache aliasing and highmem, so the > clear/copy functions won't actually do explicit flushing on modern (ARMv7) > cores. Instead we flush the page when writing the pte and noticing that > PG_arch_1 (PG_dcache_clean) is clear... > > ...so the real question is why this wasn't being triggered for huge pages. > I'll go and take another look since I would expect PG_arch_1 to be cleared > for pages coming back from alloc_huge_page. Ok, so this is exactly the problem. The hugetlb allocator uses its own pool of huge pages, so free_huge_page followed by a later alloc_huge_page will give you something where the page flags of the compound head do not guarantee that PG_arch_1 is clear. I tried hacking arch_release_hugepage to clear the bit, but that's only called when actually releasing the hugepages via __free_pages which is precisely the case that works correctly anyway! Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-10 10:42 ` Will Deacon @ 2012-07-11 17:48 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-11 17:48 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > the naming suggests that I'm right, it's the architecture's responsibility > > > to arrange whatever flushing is needed in its copy and clear page functions. [...] > Ok, so this is exactly the problem. The hugetlb allocator uses its own > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > will give you something where the page flags of the compound head do not > guarantee that PG_arch_1 is clear. Just to confirm, the following quick hack at least results in the correct flushing for me (on ARM): diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e198831..7a7c9d3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, } set_page_private(page, (unsigned long)spool); + clear_bit(PG_arch_1, &page->flags); vma_commit_reservation(h, vma, addr); The question is whether we should tidy that up for the core code or get architectures to clear the bit in arch_make_huge_pte (which also seems to work). Will ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-11 17:48 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-11 17:48 UTC (permalink / raw) To: Hugh Dickins Cc: Michal Hocko, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > the naming suggests that I'm right, it's the architecture's responsibility > > > to arrange whatever flushing is needed in its copy and clear page functions. [...] > Ok, so this is exactly the problem. The hugetlb allocator uses its own > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > will give you something where the page flags of the compound head do not > guarantee that PG_arch_1 is clear. Just to confirm, the following quick hack at least results in the correct flushing for me (on ARM): diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e198831..7a7c9d3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, } set_page_private(page, (unsigned long)spool); + clear_bit(PG_arch_1, &page->flags); vma_commit_reservation(h, vma, addr); The question is whether we should tidy that up for the core code or get architectures to clear the bit in arch_make_huge_pte (which also seems to work). Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-11 17:48 ` Will Deacon @ 2012-07-12 11:16 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-12 11:16 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Wed 11-07-12 18:48:02, Will Deacon wrote: > On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > > the naming suggests that I'm right, it's the architecture's responsibility > > > > to arrange whatever flushing is needed in its copy and clear page functions. > > [...] > > > Ok, so this is exactly the problem. The hugetlb allocator uses its own > > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > > will give you something where the page flags of the compound head do not > > guarantee that PG_arch_1 is clear. > > Just to confirm, the following quick hack at least results in the correct > flushing for me (on ARM): > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..7a7c9d3 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > } > > set_page_private(page, (unsigned long)spool); > + clear_bit(PG_arch_1, &page->flags); > > vma_commit_reservation(h, vma, addr); > > > > The question is whether we should tidy that up for the core code or get > architectures to clear the bit in arch_make_huge_pte (which also seems to > work). This should go into arch specific code IMO. Even the page flag name suggests this shouldn't be in the base code. > > Will -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-12 11:16 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-12 11:16 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Wed 11-07-12 18:48:02, Will Deacon wrote: > On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > > the naming suggests that I'm right, it's the architecture's responsibility > > > > to arrange whatever flushing is needed in its copy and clear page functions. > > [...] > > > Ok, so this is exactly the problem. The hugetlb allocator uses its own > > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > > will give you something where the page flags of the compound head do not > > guarantee that PG_arch_1 is clear. > > Just to confirm, the following quick hack at least results in the correct > flushing for me (on ARM): > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index e198831..7a7c9d3 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > } > > set_page_private(page, (unsigned long)spool); > + clear_bit(PG_arch_1, &page->flags); > > vma_commit_reservation(h, vma, addr); > > > > The question is whether we should tidy that up for the core code or get > architectures to clear the bit in arch_make_huge_pte (which also seems to > work). This should go into arch specific code IMO. Even the page flag name suggests this shouldn't be in the base code. > > Will -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-12 11:16 ` Michal Hocko @ 2012-07-12 11:26 ` James Bottomley -1 siblings, 0 replies; 48+ messages in thread From: James Bottomley @ 2012-07-12 11:26 UTC (permalink / raw) To: Michal Hocko Cc: Will Deacon, Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu, 2012-07-12 at 13:16 +0200, Michal Hocko wrote: > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > > > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > > > the naming suggests that I'm right, it's the architecture's responsibility > > > > > to arrange whatever flushing is needed in its copy and clear page functions. > > > > [...] > > > > > Ok, so this is exactly the problem. The hugetlb allocator uses its own > > > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > > > will give you something where the page flags of the compound head do not > > > guarantee that PG_arch_1 is clear. > > > > Just to confirm, the following quick hack at least results in the correct > > flushing for me (on ARM): > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..7a7c9d3 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > } > > > > set_page_private(page, (unsigned long)spool); > > + clear_bit(PG_arch_1, &page->flags); > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > The question is whether we should tidy that up for the core code or get > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > work). > > This should go into arch specific code IMO. Even the page flag name > suggests this shouldn't be in the base code. Agree completely. PG_Arch_1 is mostly used for flushing implementations, but it's meaning isn't unified. For instance Arm and Parisc have opposite meanings for this flag. Touching it in generic code can therefore *never* be the right thing to do. What you're looking for here is the correct flushing (or rather flush notification) interface. James ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-12 11:26 ` James Bottomley 0 siblings, 0 replies; 48+ messages in thread From: James Bottomley @ 2012-07-12 11:26 UTC (permalink / raw) To: Michal Hocko Cc: Will Deacon, Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu, 2012-07-12 at 13:16 +0200, Michal Hocko wrote: > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > On Tue, Jul 10, 2012 at 11:42:34AM +0100, Will Deacon wrote: > > > On Tue, Jul 10, 2012 at 10:45:13AM +0100, Will Deacon wrote: > > > > On Tue, Jul 10, 2012 at 12:57:14AM +0100, Hugh Dickins wrote: > > > > > If I start to grep the architectures for non-empty flush_dcache_page(), > > > > > I soon find things in arch/arm such as v4_mc_copy_user_highpage() doing > > > > > if (!test_and_set_bit(PG_dcache_clean,)) __flush_dcache_page() - where > > > > > the naming suggests that I'm right, it's the architecture's responsibility > > > > > to arrange whatever flushing is needed in its copy and clear page functions. > > > > [...] > > > > > Ok, so this is exactly the problem. The hugetlb allocator uses its own > > > pool of huge pages, so free_huge_page followed by a later alloc_huge_page > > > will give you something where the page flags of the compound head do not > > > guarantee that PG_arch_1 is clear. > > > > Just to confirm, the following quick hack at least results in the correct > > flushing for me (on ARM): > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..7a7c9d3 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > } > > > > set_page_private(page, (unsigned long)spool); > > + clear_bit(PG_arch_1, &page->flags); > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > The question is whether we should tidy that up for the core code or get > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > work). > > This should go into arch specific code IMO. Even the page flag name > suggests this shouldn't be in the base code. Agree completely. PG_Arch_1 is mostly used for flushing implementations, but it's meaning isn't unified. For instance Arm and Parisc have opposite meanings for this flag. Touching it in generic code can therefore *never* be the right thing to do. What you're looking for here is the correct flushing (or rather flush notification) interface. James -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-12 11:16 ` Michal Hocko @ 2012-07-12 11:26 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-12 11:26 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu, Jul 12, 2012 at 12:16:59PM +0100, Michal Hocko wrote: > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > Just to confirm, the following quick hack at least results in the correct > > flushing for me (on ARM): > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..7a7c9d3 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > } > > > > set_page_private(page, (unsigned long)spool); > > + clear_bit(PG_arch_1, &page->flags); > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > The question is whether we should tidy that up for the core code or get > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > work). > > This should go into arch specific code IMO. Even the page flag name > suggests this shouldn't be in the base code. Well, the comment in linux/page-flags.h does state that: * PG_arch_1 is an architecture specific page state bit. The generic code * guarantees that this bit is cleared for a page when it first is entered into * the page cache. so it's not completely clear cut that the architecture should be responsible for clearing this bit when allocating pages from the hugepage pool. Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-12 11:26 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-07-12 11:26 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu, Jul 12, 2012 at 12:16:59PM +0100, Michal Hocko wrote: > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > Just to confirm, the following quick hack at least results in the correct > > flushing for me (on ARM): > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index e198831..7a7c9d3 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > } > > > > set_page_private(page, (unsigned long)spool); > > + clear_bit(PG_arch_1, &page->flags); > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > The question is whether we should tidy that up for the core code or get > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > work). > > This should go into arch specific code IMO. Even the page flag name > suggests this shouldn't be in the base code. Well, the comment in linux/page-flags.h does state that: * PG_arch_1 is an architecture specific page state bit. The generic code * guarantees that this bit is cleared for a page when it first is entered into * the page cache. so it's not completely clear cut that the architecture should be responsible for clearing this bit when allocating pages from the hugepage pool. Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-12 11:26 ` Will Deacon @ 2012-07-12 11:57 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-12 11:57 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu 12-07-12 12:26:45, Will Deacon wrote: > On Thu, Jul 12, 2012 at 12:16:59PM +0100, Michal Hocko wrote: > > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > > Just to confirm, the following quick hack at least results in the correct > > > flushing for me (on ARM): > > > > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index e198831..7a7c9d3 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > > } > > > > > > set_page_private(page, (unsigned long)spool); > > > + clear_bit(PG_arch_1, &page->flags); > > > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > > > > > The question is whether we should tidy that up for the core code or get > > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > > work). > > > > This should go into arch specific code IMO. Even the page flag name > > suggests this shouldn't be in the base code. > > Well, the comment in linux/page-flags.h does state that: > > * PG_arch_1 is an architecture specific page state bit. The generic code > * guarantees that this bit is cleared for a page when it first is entered into > * the page cache. > > so it's not completely clear cut that the architecture should be responsible > for clearing this bit when allocating pages from the hugepage pool. I think the wording is quite clear. It guarantees it gets cleared not clears it. So it is up to an arch specific functions called from the generic code to do that. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-07-12 11:57 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-07-12 11:57 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, linux-arch, linux-kernel, linux-mm On Thu 12-07-12 12:26:45, Will Deacon wrote: > On Thu, Jul 12, 2012 at 12:16:59PM +0100, Michal Hocko wrote: > > On Wed 11-07-12 18:48:02, Will Deacon wrote: > > > Just to confirm, the following quick hack at least results in the correct > > > flushing for me (on ARM): > > > > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index e198831..7a7c9d3 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -1141,6 +1141,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, > > > } > > > > > > set_page_private(page, (unsigned long)spool); > > > + clear_bit(PG_arch_1, &page->flags); > > > > > > vma_commit_reservation(h, vma, addr); > > > > > > > > > > > > The question is whether we should tidy that up for the core code or get > > > architectures to clear the bit in arch_make_huge_pte (which also seems to > > > work). > > > > This should go into arch specific code IMO. Even the page flag name > > suggests this shouldn't be in the base code. > > Well, the comment in linux/page-flags.h does state that: > > * PG_arch_1 is an architecture specific page state bit. The generic code > * guarantees that this bit is cleared for a page when it first is entered into > * the page cache. > > so it's not completely clear cut that the architecture should be responsible > for clearing this bit when allocating pages from the hugepage pool. I think the wording is quite clear. It guarantees it gets cleared not clears it. So it is up to an arch specific functions called from the generic code to do that. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-07-12 11:57 ` Michal Hocko @ 2012-08-07 16:03 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-07 16:03 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm <resurrecting this thread> On Thu, Jul 12, 2012 at 12:57:08PM +0100, Michal Hocko wrote: > On Thu 12-07-12 12:26:45, Will Deacon wrote: > > Well, the comment in linux/page-flags.h does state that: > > > > * PG_arch_1 is an architecture specific page state bit. The generic code > > * guarantees that this bit is cleared for a page when it first is entered into > > * the page cache. > > > > so it's not completely clear cut that the architecture should be responsible > > for clearing this bit when allocating pages from the hugepage pool. > > I think the wording is quite clear. It guarantees it gets cleared not > clears it. So it is up to an arch specific functions called from the > generic code to do that. If we have to do this in arch-specific functions then: 1. Where should we do it? 2. Why don't we also do this for normal (non-huge) pages? I looked at what happens for non-huge pages and the page flags are cleared by *generic* code in free_pages_check: /* * Flags checked when a page is prepped for return by the page allocator. * Pages being prepped should not have any flags set. It they are set, * there has been a kernel bug or struct page corruption. */ #define PAGE_FLAGS_CHECK_AT_PREP ((1 << NR_PAGEFLAGS) - 1) static inline int free_pages_check(struct page *page) { [...] if (page->flags & PAGE_FLAGS_CHECK_AT_PREP) page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; return 0; } Which guarantees to the arch-code that any new pages coming back from the allocator will have PG_arch_1 clear. Why can't we just do something similar for the hugepage pool? Sorry to nag, but the discrepancy seems both unnecessary and unsolvable at the arch level with the current hooks. Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-07 16:03 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-07 16:03 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm <resurrecting this thread> On Thu, Jul 12, 2012 at 12:57:08PM +0100, Michal Hocko wrote: > On Thu 12-07-12 12:26:45, Will Deacon wrote: > > Well, the comment in linux/page-flags.h does state that: > > > > * PG_arch_1 is an architecture specific page state bit. The generic code > > * guarantees that this bit is cleared for a page when it first is entered into > > * the page cache. > > > > so it's not completely clear cut that the architecture should be responsible > > for clearing this bit when allocating pages from the hugepage pool. > > I think the wording is quite clear. It guarantees it gets cleared not > clears it. So it is up to an arch specific functions called from the > generic code to do that. If we have to do this in arch-specific functions then: 1. Where should we do it? 2. Why don't we also do this for normal (non-huge) pages? I looked at what happens for non-huge pages and the page flags are cleared by *generic* code in free_pages_check: /* * Flags checked when a page is prepped for return by the page allocator. * Pages being prepped should not have any flags set. It they are set, * there has been a kernel bug or struct page corruption. */ #define PAGE_FLAGS_CHECK_AT_PREP ((1 << NR_PAGEFLAGS) - 1) static inline int free_pages_check(struct page *page) { [...] if (page->flags & PAGE_FLAGS_CHECK_AT_PREP) page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; return 0; } Which guarantees to the arch-code that any new pages coming back from the allocator will have PG_arch_1 clear. Why can't we just do something similar for the hugepage pool? Sorry to nag, but the discrepancy seems both unnecessary and unsolvable at the arch level with the current hooks. Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-07 16:03 ` Will Deacon @ 2012-08-08 16:26 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-08 16:26 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Tue 07-08-12 17:03:37, Will Deacon wrote: > <resurrecting this thread> > > On Thu, Jul 12, 2012 at 12:57:08PM +0100, Michal Hocko wrote: > > On Thu 12-07-12 12:26:45, Will Deacon wrote: > > > Well, the comment in linux/page-flags.h does state that: > > > > > > * PG_arch_1 is an architecture specific page state bit. The generic code > > > * guarantees that this bit is cleared for a page when it first is entered into > > > * the page cache. > > > > > > so it's not completely clear cut that the architecture should be responsible > > > for clearing this bit when allocating pages from the hugepage pool. > > > > I think the wording is quite clear. It guarantees it gets cleared not > > clears it. So it is up to an arch specific functions called from the > > generic code to do that. > > If we have to do this in arch-specific functions then: > > 1. Where should we do it? > 2. Why don't we also do this for normal (non-huge) pages? As you describe below, it is done during the page freeing but hugetlb tries to be clever and it doesn't do the reinitialization when the page is just returned back to the pool. We have arch_release_hugepage and arch_prepare_hugepage defined only for s390 and they are not called on the way in resp. out of the pool because we do not want to pointlessly go over ptep free&alloc cycle. I guess the cleanest way is to hook into dequeue_huge_page_node and add something like arch_clear_hugepage_flags. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-08 16:26 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-08 16:26 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Tue 07-08-12 17:03:37, Will Deacon wrote: > <resurrecting this thread> > > On Thu, Jul 12, 2012 at 12:57:08PM +0100, Michal Hocko wrote: > > On Thu 12-07-12 12:26:45, Will Deacon wrote: > > > Well, the comment in linux/page-flags.h does state that: > > > > > > * PG_arch_1 is an architecture specific page state bit. The generic code > > > * guarantees that this bit is cleared for a page when it first is entered into > > > * the page cache. > > > > > > so it's not completely clear cut that the architecture should be responsible > > > for clearing this bit when allocating pages from the hugepage pool. > > > > I think the wording is quite clear. It guarantees it gets cleared not > > clears it. So it is up to an arch specific functions called from the > > generic code to do that. > > If we have to do this in arch-specific functions then: > > 1. Where should we do it? > 2. Why don't we also do this for normal (non-huge) pages? As you describe below, it is done during the page freeing but hugetlb tries to be clever and it doesn't do the reinitialization when the page is just returned back to the pool. We have arch_release_hugepage and arch_prepare_hugepage defined only for s390 and they are not called on the way in resp. out of the pool because we do not want to pointlessly go over ptep free&alloc cycle. I guess the cleanest way is to hook into dequeue_huge_page_node and add something like arch_clear_hugepage_flags. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-08 16:26 ` Michal Hocko (?) (?) @ 2012-08-16 16:09 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 16:09 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > I guess the cleanest way is to hook into dequeue_huge_page_node and add > something like arch_clear_hugepage_flags. I hooked into enqueue_huge_page instead, but how about something like this?: Will --- >8 >From 92f785a58bbc378c194800a2b12360528bdf4c6f Mon Sep 17 00:00:00 2001 From: Will Deacon <will.deacon@arm.com> Date: Mon, 13 Aug 2012 18:08:13 +0100 Subject: [PATCH] mm: hugetlb: add arch hook for clearing page flags before entering pool The core page allocator ensures that page flags are zeroed when freeing pages via free_pages_check. A number of architectures (ARM, PPC, MIPS) rely on this property to treat new pages as dirty with respect to the data cache and perform the appropriate flushing before mapping the pages into userspace. This can lead to cache synchronisation problems when using hugepages, since the allocator keeps its own pool of pages above the usual page allocator and does not reset the page flags when freeing a page into the pool. This patch adds a new architecture hook, arch_clear_hugepage_flags, so that architectures which rely on the page flags being in a particular state for fresh allocations can adjust the flags accordingly when a page is freed into the pool. Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/ia64/include/asm/hugetlb.h | 6 ++++++ arch/mips/include/asm/hugetlb.h | 4 ++++ arch/powerpc/include/asm/hugetlb.h | 6 ++++++ arch/s390/include/asm/hugetlb.h | 1 + arch/sh/include/asm/hugetlb.h | 6 ++++++ arch/sparc/include/asm/hugetlb.h | 4 ++++ arch/tile/include/asm/hugetlb.h | 4 ++++ arch/x86/include/asm/hugetlb.h | 4 ++++ mm/hugetlb.c | 1 + 9 files changed, 36 insertions(+), 0 deletions(-) diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h index da55c63..2adaa60 100644 --- a/arch/ia64/include/asm/hugetlb.h +++ b/arch/ia64/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_IA64_HUGETLB_H #define _ASM_IA64_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -77,4 +78,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #endif /* _ASM_IA64_HUGETLB_H */ diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h index 58d3688..bd94946 100644 --- a/arch/mips/include/asm/hugetlb.h +++ b/arch/mips/include/asm/hugetlb.h @@ -112,4 +112,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* __ASM_HUGETLB_H */ diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index dfdb95b..52696e6 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -2,6 +2,7 @@ #define _ASM_POWERPC_HUGETLB_H #ifdef CONFIG_HUGETLB_PAGE +#include <asm/cacheflush.h> #include <asm/page.h> extern struct kmem_cache *hugepte_cache; @@ -151,6 +152,11 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #else /* ! CONFIG_HUGETLB_PAGE */ static inline void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h index 799ed0f..faa7655 100644 --- a/arch/s390/include/asm/hugetlb.h +++ b/arch/s390/include/asm/hugetlb.h @@ -33,6 +33,7 @@ static inline int prepare_hugepage_range(struct file *file, } #define hugetlb_prefault_arch_hook(mm) do { } while (0) +#define arch_clear_hugepage_flags(page) do { } while (0) int arch_prepare_hugepage(struct page *page); void arch_release_hugepage(struct page *page); diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h index 967068f..b3808c7 100644 --- a/arch/sh/include/asm/hugetlb.h +++ b/arch/sh/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_SH_HUGETLB_H #define _ASM_SH_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -89,4 +90,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + clear_bit(PG_dcache_clean, &page->flags); +} + #endif /* _ASM_SH_HUGETLB_H */ diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h index 1770610..e7927c9 100644 --- a/arch/sparc/include/asm/hugetlb.h +++ b/arch/sparc/include/asm/hugetlb.h @@ -82,4 +82,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_SPARC64_HUGETLB_H */ diff --git a/arch/tile/include/asm/hugetlb.h b/arch/tile/include/asm/hugetlb.h index b204238..0f885af 100644 --- a/arch/tile/include/asm/hugetlb.h +++ b/arch/tile/include/asm/hugetlb.h @@ -106,6 +106,10 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #ifdef CONFIG_HUGETLB_SUPER_PAGES static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, struct page *page, int writable) diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h index 439a9ac..bdd35db 100644 --- a/arch/x86/include/asm/hugetlb.h +++ b/arch/x86/include/asm/hugetlb.h @@ -90,4 +90,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_X86_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bc72712..78fafd5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -509,6 +509,7 @@ void copy_huge_page(struct page *dst, struct page *src) static void enqueue_huge_page(struct hstate *h, struct page *page) { int nid = page_to_nid(page); + arch_clear_hugepage_flags(page); list_move(&page->lru, &h->hugepage_freelists[nid]); h->free_huge_pages++; h->free_huge_pages_node[nid]++; -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 16:09 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 16:09 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > I guess the cleanest way is to hook into dequeue_huge_page_node and add > something like arch_clear_hugepage_flags. I hooked into enqueue_huge_page instead, but how about something like this?: Will --- >8 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 16:09 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 16:09 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > I guess the cleanest way is to hook into dequeue_huge_page_node and add > something like arch_clear_hugepage_flags. I hooked into enqueue_huge_page instead, but how about something like this?: Will --- >8 From 92f785a58bbc378c194800a2b12360528bdf4c6f Mon Sep 17 00:00:00 2001 From: Will Deacon <will.deacon@arm.com> Date: Mon, 13 Aug 2012 18:08:13 +0100 Subject: [PATCH] mm: hugetlb: add arch hook for clearing page flags before entering pool The core page allocator ensures that page flags are zeroed when freeing pages via free_pages_check. A number of architectures (ARM, PPC, MIPS) rely on this property to treat new pages as dirty with respect to the data cache and perform the appropriate flushing before mapping the pages into userspace. This can lead to cache synchronisation problems when using hugepages, since the allocator keeps its own pool of pages above the usual page allocator and does not reset the page flags when freeing a page into the pool. This patch adds a new architecture hook, arch_clear_hugepage_flags, so that architectures which rely on the page flags being in a particular state for fresh allocations can adjust the flags accordingly when a page is freed into the pool. Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/ia64/include/asm/hugetlb.h | 6 ++++++ arch/mips/include/asm/hugetlb.h | 4 ++++ arch/powerpc/include/asm/hugetlb.h | 6 ++++++ arch/s390/include/asm/hugetlb.h | 1 + arch/sh/include/asm/hugetlb.h | 6 ++++++ arch/sparc/include/asm/hugetlb.h | 4 ++++ arch/tile/include/asm/hugetlb.h | 4 ++++ arch/x86/include/asm/hugetlb.h | 4 ++++ mm/hugetlb.c | 1 + 9 files changed, 36 insertions(+), 0 deletions(-) diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h index da55c63..2adaa60 100644 --- a/arch/ia64/include/asm/hugetlb.h +++ b/arch/ia64/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_IA64_HUGETLB_H #define _ASM_IA64_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -77,4 +78,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #endif /* _ASM_IA64_HUGETLB_H */ diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h index 58d3688..bd94946 100644 --- a/arch/mips/include/asm/hugetlb.h +++ b/arch/mips/include/asm/hugetlb.h @@ -112,4 +112,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* __ASM_HUGETLB_H */ diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index dfdb95b..52696e6 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -2,6 +2,7 @@ #define _ASM_POWERPC_HUGETLB_H #ifdef CONFIG_HUGETLB_PAGE +#include <asm/cacheflush.h> #include <asm/page.h> extern struct kmem_cache *hugepte_cache; @@ -151,6 +152,11 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #else /* ! CONFIG_HUGETLB_PAGE */ static inline void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h index 799ed0f..faa7655 100644 --- a/arch/s390/include/asm/hugetlb.h +++ b/arch/s390/include/asm/hugetlb.h @@ -33,6 +33,7 @@ static inline int prepare_hugepage_range(struct file *file, } #define hugetlb_prefault_arch_hook(mm) do { } while (0) +#define arch_clear_hugepage_flags(page) do { } while (0) int arch_prepare_hugepage(struct page *page); void arch_release_hugepage(struct page *page); diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h index 967068f..b3808c7 100644 --- a/arch/sh/include/asm/hugetlb.h +++ b/arch/sh/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_SH_HUGETLB_H #define _ASM_SH_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -89,4 +90,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + clear_bit(PG_dcache_clean, &page->flags); +} + #endif /* _ASM_SH_HUGETLB_H */ diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h index 1770610..e7927c9 100644 --- a/arch/sparc/include/asm/hugetlb.h +++ b/arch/sparc/include/asm/hugetlb.h @@ -82,4 +82,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_SPARC64_HUGETLB_H */ diff --git a/arch/tile/include/asm/hugetlb.h b/arch/tile/include/asm/hugetlb.h index b204238..0f885af 100644 --- a/arch/tile/include/asm/hugetlb.h +++ b/arch/tile/include/asm/hugetlb.h @@ -106,6 +106,10 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #ifdef CONFIG_HUGETLB_SUPER_PAGES static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, struct page *page, int writable) diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h index 439a9ac..bdd35db 100644 --- a/arch/x86/include/asm/hugetlb.h +++ b/arch/x86/include/asm/hugetlb.h @@ -90,4 +90,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_X86_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bc72712..78fafd5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -509,6 +509,7 @@ void copy_huge_page(struct page *dst, struct page *src) static void enqueue_huge_page(struct hstate *h, struct page *page) { int nid = page_to_nid(page); + arch_clear_hugepage_flags(page); list_move(&page->lru, &h->hugepage_freelists[nid]); h->free_huge_pages++; h->free_huge_pages_node[nid]++; -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 16:09 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 16:09 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > I guess the cleanest way is to hook into dequeue_huge_page_node and add > something like arch_clear_hugepage_flags. I hooked into enqueue_huge_page instead, but how about something like this?: Will --- >8 From 92f785a58bbc378c194800a2b12360528bdf4c6f Mon Sep 17 00:00:00 2001 From: Will Deacon <will.deacon@arm.com> Date: Mon, 13 Aug 2012 18:08:13 +0100 Subject: [PATCH] mm: hugetlb: add arch hook for clearing page flags before entering pool The core page allocator ensures that page flags are zeroed when freeing pages via free_pages_check. A number of architectures (ARM, PPC, MIPS) rely on this property to treat new pages as dirty with respect to the data cache and perform the appropriate flushing before mapping the pages into userspace. This can lead to cache synchronisation problems when using hugepages, since the allocator keeps its own pool of pages above the usual page allocator and does not reset the page flags when freeing a page into the pool. This patch adds a new architecture hook, arch_clear_hugepage_flags, so that architectures which rely on the page flags being in a particular state for fresh allocations can adjust the flags accordingly when a page is freed into the pool. Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/ia64/include/asm/hugetlb.h | 6 ++++++ arch/mips/include/asm/hugetlb.h | 4 ++++ arch/powerpc/include/asm/hugetlb.h | 6 ++++++ arch/s390/include/asm/hugetlb.h | 1 + arch/sh/include/asm/hugetlb.h | 6 ++++++ arch/sparc/include/asm/hugetlb.h | 4 ++++ arch/tile/include/asm/hugetlb.h | 4 ++++ arch/x86/include/asm/hugetlb.h | 4 ++++ mm/hugetlb.c | 1 + 9 files changed, 36 insertions(+), 0 deletions(-) diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h index da55c63..2adaa60 100644 --- a/arch/ia64/include/asm/hugetlb.h +++ b/arch/ia64/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_IA64_HUGETLB_H #define _ASM_IA64_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -77,4 +78,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #endif /* _ASM_IA64_HUGETLB_H */ diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h index 58d3688..bd94946 100644 --- a/arch/mips/include/asm/hugetlb.h +++ b/arch/mips/include/asm/hugetlb.h @@ -112,4 +112,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* __ASM_HUGETLB_H */ diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index dfdb95b..52696e6 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -2,6 +2,7 @@ #define _ASM_POWERPC_HUGETLB_H #ifdef CONFIG_HUGETLB_PAGE +#include <asm/cacheflush.h> #include <asm/page.h> extern struct kmem_cache *hugepte_cache; @@ -151,6 +152,11 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + flush_dcache_page(page); +} + #else /* ! CONFIG_HUGETLB_PAGE */ static inline void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h index 799ed0f..faa7655 100644 --- a/arch/s390/include/asm/hugetlb.h +++ b/arch/s390/include/asm/hugetlb.h @@ -33,6 +33,7 @@ static inline int prepare_hugepage_range(struct file *file, } #define hugetlb_prefault_arch_hook(mm) do { } while (0) +#define arch_clear_hugepage_flags(page) do { } while (0) int arch_prepare_hugepage(struct page *page); void arch_release_hugepage(struct page *page); diff --git a/arch/sh/include/asm/hugetlb.h b/arch/sh/include/asm/hugetlb.h index 967068f..b3808c7 100644 --- a/arch/sh/include/asm/hugetlb.h +++ b/arch/sh/include/asm/hugetlb.h @@ -1,6 +1,7 @@ #ifndef _ASM_SH_HUGETLB_H #define _ASM_SH_HUGETLB_H +#include <asm/cacheflush.h> #include <asm/page.h> @@ -89,4 +90,9 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ + clear_bit(PG_dcache_clean, &page->flags); +} + #endif /* _ASM_SH_HUGETLB_H */ diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h index 1770610..e7927c9 100644 --- a/arch/sparc/include/asm/hugetlb.h +++ b/arch/sparc/include/asm/hugetlb.h @@ -82,4 +82,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_SPARC64_HUGETLB_H */ diff --git a/arch/tile/include/asm/hugetlb.h b/arch/tile/include/asm/hugetlb.h index b204238..0f885af 100644 --- a/arch/tile/include/asm/hugetlb.h +++ b/arch/tile/include/asm/hugetlb.h @@ -106,6 +106,10 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #ifdef CONFIG_HUGETLB_SUPER_PAGES static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma, struct page *page, int writable) diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h index 439a9ac..bdd35db 100644 --- a/arch/x86/include/asm/hugetlb.h +++ b/arch/x86/include/asm/hugetlb.h @@ -90,4 +90,8 @@ static inline void arch_release_hugepage(struct page *page) { } +static inline void arch_clear_hugepage_flags(struct page *page) +{ +} + #endif /* _ASM_X86_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bc72712..78fafd5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -509,6 +509,7 @@ void copy_huge_page(struct page *dst, struct page *src) static void enqueue_huge_page(struct hstate *h, struct page *page) { int nid = page_to_nid(page); + arch_clear_hugepage_flags(page); list_move(&page->lru, &h->hugepage_freelists[nid]); h->free_huge_pages++; h->free_huge_pages_node[nid]++; -- 1.7.4.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 16:09 ` Will Deacon @ 2012-08-16 17:25 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 17:25 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 17:09:54, Will Deacon wrote: > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > something like arch_clear_hugepage_flags. > > I hooked into enqueue_huge_page instead, but how about something like this?: Do you have any specific reason for that? enqueue_huge_page is called on pages which potentially never get used so isn't that wasting a bit? Not that it would be wrong I was just thinking why shouldn't we do it when the page is actualy going to be used for sure. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 17:25 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 17:25 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 17:09:54, Will Deacon wrote: > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > something like arch_clear_hugepage_flags. > > I hooked into enqueue_huge_page instead, but how about something like this?: Do you have any specific reason for that? enqueue_huge_page is called on pages which potentially never get used so isn't that wasting a bit? Not that it would be wrong I was just thinking why shouldn't we do it when the page is actualy going to be used for sure. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 17:25 ` Michal Hocko @ 2012-08-16 17:34 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 17:34 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 06:25:27PM +0100, Michal Hocko wrote: > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > > something like arch_clear_hugepage_flags. > > > > I hooked into enqueue_huge_page instead, but how about something like this?: > > Do you have any specific reason for that? enqueue_huge_page is called on > pages which potentially never get used so isn't that wasting a bit? > Not that it would be wrong I was just thinking why shouldn't we do it > when the page is actualy going to be used for sure. I just did it that way to match the flag clearing for normal pages. I can move it into dequeue if you think it's worthwhile but in the worst case it just adds a clear_bit call, so I doubt it's measurable. Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 17:34 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 17:34 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 06:25:27PM +0100, Michal Hocko wrote: > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > > something like arch_clear_hugepage_flags. > > > > I hooked into enqueue_huge_page instead, but how about something like this?: > > Do you have any specific reason for that? enqueue_huge_page is called on > pages which potentially never get used so isn't that wasting a bit? > Not that it would be wrong I was just thinking why shouldn't we do it > when the page is actualy going to be used for sure. I just did it that way to match the flag clearing for normal pages. I can move it into dequeue if you think it's worthwhile but in the worst case it just adds a clear_bit call, so I doubt it's measurable. Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 17:34 ` Will Deacon @ 2012-08-16 18:06 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 18:06 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 18:34:59, Will Deacon wrote: > On Thu, Aug 16, 2012 at 06:25:27PM +0100, Michal Hocko wrote: > > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > > > something like arch_clear_hugepage_flags. > > > > > > I hooked into enqueue_huge_page instead, but how about something like this?: > > > > Do you have any specific reason for that? enqueue_huge_page is called on > > pages which potentially never get used so isn't that wasting a bit? > > Not that it would be wrong I was just thinking why shouldn't we do it > > when the page is actualy going to be used for sure. > > I just did it that way to match the flag clearing for normal pages. I can > move it into dequeue if you think it's worthwhile but in the worst case it > just adds a clear_bit call, so I doubt it's measurable. I do not have a strong opinion on that but the flags come in cleared when they are freshly allocated (gather_surplus_pages) so then it would be more appropriate in free_huge_page when enqueue_huge_page is called. But this is just a nit. > > Will -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 18:06 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 18:06 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 18:34:59, Will Deacon wrote: > On Thu, Aug 16, 2012 at 06:25:27PM +0100, Michal Hocko wrote: > > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: > > > > I guess the cleanest way is to hook into dequeue_huge_page_node and add > > > > something like arch_clear_hugepage_flags. > > > > > > I hooked into enqueue_huge_page instead, but how about something like this?: > > > > Do you have any specific reason for that? enqueue_huge_page is called on > > pages which potentially never get used so isn't that wasting a bit? > > Not that it would be wrong I was just thinking why shouldn't we do it > > when the page is actualy going to be used for sure. > > I just did it that way to match the flag clearing for normal pages. I can > move it into dequeue if you think it's worthwhile but in the worst case it > just adds a clear_bit call, so I doubt it's measurable. I do not have a strong opinion on that but the flags come in cleared when they are freshly allocated (gather_surplus_pages) so then it would be more appropriate in free_huge_page when enqueue_huge_page is called. But this is just a nit. > > Will -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 18:06 ` Michal Hocko @ 2012-08-16 18:19 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 18:19 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 07:06:14PM +0100, Michal Hocko wrote: > On Thu 16-08-12 18:34:59, Will Deacon wrote: > > I just did it that way to match the flag clearing for normal pages. I can > > move it into dequeue if you think it's worthwhile but in the worst case it > > just adds a clear_bit call, so I doubt it's measurable. > > I do not have a strong opinion on that but the flags come in cleared when > they are freshly allocated (gather_surplus_pages) so then it would be > more appropriate in free_huge_page when enqueue_huge_page is called. Makes sense, I'll move the call into free_huge_page. Cheers, Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 18:19 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 18:19 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 07:06:14PM +0100, Michal Hocko wrote: > On Thu 16-08-12 18:34:59, Will Deacon wrote: > > I just did it that way to match the flag clearing for normal pages. I can > > move it into dequeue if you think it's worthwhile but in the worst case it > > just adds a clear_bit call, so I doubt it's measurable. > > I do not have a strong opinion on that but the flags come in cleared when > they are freshly allocated (gather_surplus_pages) so then it would be > more appropriate in free_huge_page when enqueue_huge_page is called. Makes sense, I'll move the call into free_huge_page. Cheers, Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 16:09 ` Will Deacon @ 2012-08-16 18:20 ` Michal Hocko -1 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 18:20 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 17:09:54, Will Deacon wrote: > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: [...] > diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h > index da55c63..2adaa60 100644 > --- a/arch/ia64/include/asm/hugetlb.h > +++ b/arch/ia64/include/asm/hugetlb.h > @@ -1,6 +1,7 @@ > #ifndef _ASM_IA64_HUGETLB_H > #define _ASM_IA64_HUGETLB_H > > +#include <asm/cacheflush.h> > #include <asm/page.h> > > > @@ -77,4 +78,9 @@ static inline void arch_release_hugepage(struct page *page) > { > } > > +static inline void arch_clear_hugepage_flags(struct page *page) > +{ > + flush_dcache_page(page); > +} > + Why do we need the hook for ia64? hugetlb_no_page calls clear_huge_page and that one calls flush_dcache_page (via clear_user_page), right? The same applies to copy_huge_page for COW. > diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h > index dfdb95b..52696e6 100644 > --- a/arch/powerpc/include/asm/hugetlb.h > +++ b/arch/powerpc/include/asm/hugetlb.h > @@ -2,6 +2,7 @@ > #define _ASM_POWERPC_HUGETLB_H > > #ifdef CONFIG_HUGETLB_PAGE > +#include <asm/cacheflush.h> > #include <asm/page.h> > > extern struct kmem_cache *hugepte_cache; > @@ -151,6 +152,11 @@ static inline void arch_release_hugepage(struct page *page) > { > } > > +static inline void arch_clear_hugepage_flags(struct page *page) > +{ > + flush_dcache_page(page); > +} > + Same here -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 18:20 ` Michal Hocko 0 siblings, 0 replies; 48+ messages in thread From: Michal Hocko @ 2012-08-16 18:20 UTC (permalink / raw) To: Will Deacon Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu 16-08-12 17:09:54, Will Deacon wrote: > On Wed, Aug 08, 2012 at 05:26:07PM +0100, Michal Hocko wrote: [...] > diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h > index da55c63..2adaa60 100644 > --- a/arch/ia64/include/asm/hugetlb.h > +++ b/arch/ia64/include/asm/hugetlb.h > @@ -1,6 +1,7 @@ > #ifndef _ASM_IA64_HUGETLB_H > #define _ASM_IA64_HUGETLB_H > > +#include <asm/cacheflush.h> > #include <asm/page.h> > > > @@ -77,4 +78,9 @@ static inline void arch_release_hugepage(struct page *page) > { > } > > +static inline void arch_clear_hugepage_flags(struct page *page) > +{ > + flush_dcache_page(page); > +} > + Why do we need the hook for ia64? hugetlb_no_page calls clear_huge_page and that one calls flush_dcache_page (via clear_user_page), right? The same applies to copy_huge_page for COW. > diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h > index dfdb95b..52696e6 100644 > --- a/arch/powerpc/include/asm/hugetlb.h > +++ b/arch/powerpc/include/asm/hugetlb.h > @@ -2,6 +2,7 @@ > #define _ASM_POWERPC_HUGETLB_H > > #ifdef CONFIG_HUGETLB_PAGE > +#include <asm/cacheflush.h> > #include <asm/page.h> > > extern struct kmem_cache *hugepte_cache; > @@ -151,6 +152,11 @@ static inline void arch_release_hugepage(struct page *page) > { > } > > +static inline void arch_clear_hugepage_flags(struct page *page) > +{ > + flush_dcache_page(page); > +} > + Same here -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace 2012-08-16 18:20 ` Michal Hocko @ 2012-08-16 18:32 ` Will Deacon -1 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 18:32 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 07:20:15PM +0100, Michal Hocko wrote: > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > +static inline void arch_clear_hugepage_flags(struct page *page) > > +{ > > + flush_dcache_page(page); > > +} > > + > > Why do we need the hook for ia64? hugetlb_no_page calls clear_huge_page > and that one calls flush_dcache_page (via clear_user_page), right? > The same applies to copy_huge_page for COW. You're right, these are redundant for ppc and ia64 (although ppc does have a comment moaning about the flush). Looks like it's just sh and ARM that need to do anything. Cheers, Will ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace @ 2012-08-16 18:32 ` Will Deacon 0 siblings, 0 replies; 48+ messages in thread From: Will Deacon @ 2012-08-16 18:32 UTC (permalink / raw) To: Michal Hocko Cc: Hugh Dickins, Andrew Morton, Hillf Danton, Russell King, linux-arch, linux-kernel, linux-mm On Thu, Aug 16, 2012 at 07:20:15PM +0100, Michal Hocko wrote: > On Thu 16-08-12 17:09:54, Will Deacon wrote: > > +static inline void arch_clear_hugepage_flags(struct page *page) > > +{ > > + flush_dcache_page(page); > > +} > > + > > Why do we need the hook for ia64? hugetlb_no_page calls clear_huge_page > and that one calls flush_dcache_page (via clear_user_page), right? > The same applies to copy_huge_page for COW. You're right, these are redundant for ppc and ia64 (although ppc does have a comment moaning about the flush). Looks like it's just sh and ARM that need to do anything. Cheers, Will -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2012-08-16 18:33 UTC | newest] Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-07-04 14:32 [PATCH] mm: hugetlb: flush dcache before returning zeroed huge page to userspace Will Deacon 2012-07-04 14:32 ` Will Deacon 2012-07-05 12:37 ` Hillf Danton 2012-07-05 12:37 ` Hillf Danton 2012-07-05 14:17 ` Will Deacon 2012-07-05 14:17 ` Will Deacon 2012-07-06 13:15 ` Hillf Danton 2012-07-06 13:15 ` Hillf Danton 2012-07-09 12:25 ` Michal Hocko 2012-07-09 12:25 ` Michal Hocko 2012-07-09 14:13 ` Will Deacon 2012-07-09 14:13 ` Will Deacon 2012-07-09 23:57 ` Hugh Dickins 2012-07-09 23:57 ` Hugh Dickins 2012-07-10 9:45 ` Will Deacon 2012-07-10 9:45 ` Will Deacon 2012-07-10 10:42 ` Will Deacon 2012-07-10 10:42 ` Will Deacon 2012-07-11 17:48 ` Will Deacon 2012-07-11 17:48 ` Will Deacon 2012-07-12 11:16 ` Michal Hocko 2012-07-12 11:16 ` Michal Hocko 2012-07-12 11:26 ` James Bottomley 2012-07-12 11:26 ` James Bottomley 2012-07-12 11:26 ` Will Deacon 2012-07-12 11:26 ` Will Deacon 2012-07-12 11:57 ` Michal Hocko 2012-07-12 11:57 ` Michal Hocko 2012-08-07 16:03 ` Will Deacon 2012-08-07 16:03 ` Will Deacon 2012-08-08 16:26 ` Michal Hocko 2012-08-08 16:26 ` Michal Hocko 2012-08-16 16:09 ` Will Deacon 2012-08-16 16:09 ` Will Deacon 2012-08-16 16:09 ` Will Deacon 2012-08-16 16:09 ` Will Deacon 2012-08-16 17:25 ` Michal Hocko 2012-08-16 17:25 ` Michal Hocko 2012-08-16 17:34 ` Will Deacon 2012-08-16 17:34 ` Will Deacon 2012-08-16 18:06 ` Michal Hocko 2012-08-16 18:06 ` Michal Hocko 2012-08-16 18:19 ` Will Deacon 2012-08-16 18:19 ` Will Deacon 2012-08-16 18:20 ` Michal Hocko 2012-08-16 18:20 ` Michal Hocko 2012-08-16 18:32 ` Will Deacon 2012-08-16 18:32 ` Will Deacon
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.