* [PATCH 0/2] Enable s390/arc/sparc to use generic thp deposit/withdraw
@ 2016-02-11  9:28 Vineet Gupta
  2016-02-11  9:28 ` [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage Vineet Gupta
  2016-02-11  9:28 ` [PATCH 2/2] ARC: mm: THP: use generic THP deposit/withdraw Vineet Gupta
  0 siblings, 2 replies; 7+ messages in thread
From: Vineet Gupta @ 2016-02-11  9:28 UTC (permalink / raw)
  To: Andrew Morton, Kirill A. Shutemov
  Cc: Aneesh Kumar K.V, David S. Miller, Alex Thorlton,
	Gerald Schaefer, Martin Schwidefsky, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Vineet Gupta

Hi,

This came out of my debugging THP on ARC. The generic deposit/withdraw routines
can be easily adapted to work with pgtable_t != struct page *.

Build/Run tested on ARC only.

Thx,
-Vineet

Vineet Gupta (2):
  mm,thp: refactor generic deposit/withdraw routines for wider usage
  ARC: mm: THP: use generic THP deposit/withdraw

 arch/arc/include/asm/hugepage.h |  8 --------
 arch/arc/mm/tlb.c               | 37 -------------------------------------
 mm/pgtable-generic.c            | 27 +++++++++++++++++----------
 3 files changed, 17 insertions(+), 55 deletions(-)

-- 
2.5.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage
  2016-02-11  9:28 [PATCH 0/2] Enable s390/arc/sparc to use generic thp deposit/withdraw Vineet Gupta
@ 2016-02-11  9:28 ` Vineet Gupta
  2016-02-11 10:22   ` Martin Schwidefsky
  2016-02-11  9:28 ` [PATCH 2/2] ARC: mm: THP: use generic THP deposit/withdraw Vineet Gupta
  1 sibling, 1 reply; 7+ messages in thread
From: Vineet Gupta @ 2016-02-11  9:28 UTC (permalink / raw)
  To: Andrew Morton, Kirill A. Shutemov
  Cc: Aneesh Kumar K.V, David S. Miller, Alex Thorlton,
	Gerald Schaefer, Martin Schwidefsky, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Vineet Gupta,
	Andrea Arcangeli

Generic pgtable_trans_huge_deposit()/pgtable_trans_huge_withdraw()
assume pgtable_t to be struct page * which is not true for all arches.
Thus arc, s390, sparc end up with their own copies despite no special
hardware requirements (unlike powerpc).

It seems massaging the code a bit can make it reusable.

 - Use explicit casts to (struct page *). For existing users this
   should be a semantic no-op.

 - The only addition is zeroing out page->lru on withdraw. On arc the
   list linkage otherwise leaves a stray entry in the pgtable_t, which
   causes mm spew when such a pgtable is freed.

  | huge_memory: BUG: failure at
  | ../mm/huge_memory.c:1858/__split_huge_page_map()!
  | CPU: 0 PID: 901 Comm: bw_mem Not tainted 4.4.0-00015-g0569c1459cfa-dirty
  |
  | Stack Trace:
  |  arc_unwind_core.constprop.1+0x94/0x104
  |  split_huge_page_to_list+0x5c0/0x920
  |  __split_huge_page_pmd+0xc8/0x1b4
  |  vma_adjust_trans_huge+0x104/0x1c8
  |  vma_adjust+0xf8/0x6d8
  |  __split_vma.isra.40+0xf8/0x174
  |  do_munmap+0x360/0x428
  |  SyS_munmap+0x28/0x44
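
For readers following along outside the kernel tree, the deposit/withdraw
bookkeeping described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions: struct fake_page,
struct fake_pmd_slot and the list helpers are hypothetical stand-ins for
struct page, the pmd_huge_pte() slot and the kernel list API, and locking
is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's circular, doubly linked list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static int list_is_empty(const struct list_head *h) { return h->next == h; }

/* Insert 'new' right after 'head', the way the kernel's list_add() does. */
static void list_add_after(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

static void list_unlink(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Hypothetical stand-in for struct page: only the lru linkage matters. */
struct fake_page { int id; struct list_head lru; };

/* Hypothetical stand-in for the per-PMD slot that pmd_huge_pte() names. */
struct fake_pmd_slot { struct fake_page *huge_pte; };

static void deposit(struct fake_pmd_slot *slot, struct fake_page *new)
{
	struct fake_page *head = slot->huge_pte;

	if (!head)
		init_list_head(&new->lru);
	else
		list_add_after(&new->lru, &head->lru);
	slot->huge_pte = new;
}

static struct fake_page *withdraw(struct fake_pmd_slot *slot)
{
	struct fake_page *page = slot->huge_pte;

	assert(page != NULL);	/* caller must have deposited something */
	if (list_is_empty(&page->lru)) {
		slot->huge_pte = NULL;
	} else {
		/* Open-coded list_entry(): step back from lru to its container. */
		slot->huge_pte = (struct fake_page *)
			((char *)page->lru.next - offsetof(struct fake_page, lru));
		list_unlink(&page->lru);
	}
	/* Mirrors the patch: leave no stale linkage in the returned object. */
	memset(&page->lru, 0, sizeof(page->lru));
	return page;
}
```

In this sketch the linkage lives in a separate descriptor, so zeroing it
on withdraw is harmless. Whether that also holds when the linkage
overlays the page-table memory itself depends on the architecture.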

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-arch@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 mm/pgtable-generic.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 75664ed7e3ab..c9f2f6f8c7bb 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -155,13 +155,17 @@ void pmdp_splitting_flush(struct vm_area_struct *vma, unsigned long address,
 void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 				pgtable_t pgtable)
 {
+	struct page *new = (struct page *)pgtable;
+	struct page *head;
+
 	assert_spin_locked(pmd_lockptr(mm, pmdp));
 
 	/* FIFO */
-	if (!pmd_huge_pte(mm, pmdp))
-		INIT_LIST_HEAD(&pgtable->lru);
+	head = (struct page *)pmd_huge_pte(mm, pmdp);
+	if (!head)
+		INIT_LIST_HEAD(&new->lru);
 	else
-		list_add(&pgtable->lru, &pmd_huge_pte(mm, pmdp)->lru);
+		list_add(&new->lru, &head->lru);
 	pmd_huge_pte(mm, pmdp) = pgtable;
 }
 #endif
@@ -170,20 +174,23 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 /* no "address" argument so destroys page coloring of some arch */
 pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 {
-	pgtable_t pgtable;
+	struct page *page;
 
 	assert_spin_locked(pmd_lockptr(mm, pmdp));
 
+	page = (struct page *)pmd_huge_pte(mm, pmdp);
+
 	/* FIFO */
-	pgtable = pmd_huge_pte(mm, pmdp);
-	if (list_empty(&pgtable->lru))
+	if (list_empty(&page->lru))
 		pmd_huge_pte(mm, pmdp) = NULL;
 	else {
-		pmd_huge_pte(mm, pmdp) = list_entry(pgtable->lru.next,
-					      struct page, lru);
-		list_del(&pgtable->lru);
+		pmd_huge_pte(mm, pmdp) = (pgtable_t) list_entry(page->lru.next,
+							struct page, lru);
+		list_del(&page->lru);
 	}
-	return pgtable;
+
+	memset(&page->lru, 0, sizeof(page->lru));
+	return (pgtable_t)page;
 }
 #endif
 
-- 
2.5.0


* [PATCH 2/2] ARC: mm: THP: use generic THP deposit/withdraw
  2016-02-11  9:28 [PATCH 0/2] Enable s390/arc/sparc to use generic thp deposit/withdraw Vineet Gupta
  2016-02-11  9:28 ` [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage Vineet Gupta
@ 2016-02-11  9:28 ` Vineet Gupta
  1 sibling, 0 replies; 7+ messages in thread
From: Vineet Gupta @ 2016-02-11  9:28 UTC (permalink / raw)
  To: Andrew Morton, Kirill A. Shutemov
  Cc: Aneesh Kumar K.V, David S. Miller, Alex Thorlton,
	Gerald Schaefer, Martin Schwidefsky, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Vineet Gupta

Generic code can now cope with pgtable_t != struct page *

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/include/asm/hugepage.h |  8 --------
 arch/arc/mm/tlb.c               | 37 -------------------------------------
 2 files changed, 45 deletions(-)

diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index c5094de86403..8653ed2f2ec5 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -66,14 +66,6 @@ extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 #define has_transparent_hugepage() 1
 
-/* Generic variants assume pgtable_t is struct page *, hence need for these */
-#define __HAVE_ARCH_PGTABLE_DEPOSIT
-extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
-				       pgtable_t pgtable);
-
-#define __HAVE_ARCH_PGTABLE_WITHDRAW
-extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
-
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
 extern void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 				unsigned long end);
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index 2e731c87011e..b300479b8ad3 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -663,43 +663,6 @@ void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 	update_mmu_cache(vma, addr, &pte);
 }
 
-void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
-				pgtable_t pgtable)
-{
-	struct list_head *lh = (struct list_head *) pgtable;
-
-	assert_spin_locked(&mm->page_table_lock);
-
-	/* FIFO */
-	if (!pmd_huge_pte(mm, pmdp))
-		INIT_LIST_HEAD(lh);
-	else
-		list_add(lh, (struct list_head *) pmd_huge_pte(mm, pmdp));
-	pmd_huge_pte(mm, pmdp) = pgtable;
-}
-
-pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
-{
-	struct list_head *lh;
-	pgtable_t pgtable;
-
-	assert_spin_locked(&mm->page_table_lock);
-
-	pgtable = pmd_huge_pte(mm, pmdp);
-	lh = (struct list_head *) pgtable;
-	if (list_empty(lh))
-		pmd_huge_pte(mm, pmdp) = NULL;
-	else {
-		pmd_huge_pte(mm, pmdp) = (pgtable_t) lh->next;
-		list_del(lh);
-	}
-
-	pte_val(pgtable[0]) = 0;
-	pte_val(pgtable[1]) = 0;
-
-	return pgtable;
-}
-
 void local_flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 			       unsigned long end)
 {
-- 
2.5.0


* Re: [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage
  2016-02-11  9:28 ` [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage Vineet Gupta
@ 2016-02-11 10:22   ` Martin Schwidefsky
  2016-02-11 10:53     ` Vineet Gupta
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Schwidefsky @ 2016-02-11 10:22 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Andrew Morton, Kirill A. Shutemov, Aneesh Kumar K.V,
	David S. Miller, Alex Thorlton, Gerald Schaefer, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Andrea Arcangeli

On Thu, 11 Feb 2016 14:58:26 +0530
Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:

> Generic pgtable_trans_huge_deposit()/pgtable_trans_huge_withdraw()
> assume pgtable_t to be struct page * which is not true for all arches.
> Thus arc, s390, sparc end up with their own copies despite no special
> hardware requirements (unlike powerpc).

s390 does have a special hardware requirement. pgtable_t is an address
for a 2K block of memory. It is *not* equivalent to a struct page *
which refers to a 4K block of memory. That has been the whole point
to introduce pgtable_t.

> It seems massaging the code a bit can make it reusable.

Imho the new code for asm-generic looks fine, as long as the override
with __HAVE_ARCH_PGTABLE_DEPOSIT/__HAVE_ARCH_PGTABLE_WITHDRAW continues
to work I do not mind.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


* Re: [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage
  2016-02-11 10:22   ` Martin Schwidefsky
@ 2016-02-11 10:53     ` Vineet Gupta
  2016-02-11 11:20       ` Martin Schwidefsky
  0 siblings, 1 reply; 7+ messages in thread
From: Vineet Gupta @ 2016-02-11 10:53 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Andrew Morton, Kirill A. Shutemov, Aneesh Kumar K.V,
	David S. Miller, Alex Thorlton, Gerald Schaefer, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Andrea Arcangeli

On Thursday 11 February 2016 03:52 PM, Martin Schwidefsky wrote:
> On Thu, 11 Feb 2016 14:58:26 +0530
> Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> 
>> Generic pgtable_trans_huge_deposit()/pgtable_trans_huge_withdraw()
>> assume pgtable_t to be struct page * which is not true for all arches.
>> Thus arc, s390, sparc end up with their own copies despite no special
>> hardware requirements (unlike powerpc).
> 
> s390 does have a special hardware requirement. pgtable_t is an address
> for a 2K block of memory. It is *not* equivalent to a struct page *
> which refers to a 4K block of memory. That has been the whole point
> to introduce pgtable_t.

Actually, by hardware requirement I meant something more like powerpc, which
has to save a hash value somewhere, etc.

Now pgtable_t need not be struct page * even if the actual sizes are the same - e.g.
in the ARC port I kept pgtable_t as pte_t * simply to avoid a few page_address() calls
in mm code (you could argue that it was a micro-optimization, anyways..)

So, given that I know nothing about s390 MMU internals, I still think you can switch to
the updated generic version despite 2K vs. 4K. Agree?

>> It seems massaging the code a bit can make it reusable.
> 
> Imho the new code for asm-generic looks fine, as long as the override
> with __HAVE_ARCH_PGTABLE_DEPOSIT/__HAVE_ARCH_PGTABLE_WITHDRAW continues
> to work I do not mind.



* Re: [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage
  2016-02-11 10:53     ` Vineet Gupta
@ 2016-02-11 11:20       ` Martin Schwidefsky
  2016-02-11 12:29         ` Vineet Gupta
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Schwidefsky @ 2016-02-11 11:20 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Andrew Morton, Kirill A. Shutemov, Aneesh Kumar K.V,
	David S. Miller, Alex Thorlton, Gerald Schaefer, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Andrea Arcangeli

On Thu, 11 Feb 2016 16:23:33 +0530
Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:

> On Thursday 11 February 2016 03:52 PM, Martin Schwidefsky wrote:
> > On Thu, 11 Feb 2016 14:58:26 +0530
> > Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> > 
> >> Generic pgtable_trans_huge_deposit()/pgtable_trans_huge_withdraw()
> >> assume pgtable_t to be struct page * which is not true for all arches.
> >> Thus arc, s390, sparc end up with their own copies despite no special
> >> hardware requirements (unlike powerpc).
> > 
> > s390 does have a special hardware requirement. pgtable_t is an address
> > for a 2K block of memory. It is *not* equivalent to a struct page *
> > which refers to a 4K block of memory. That has been the whole point
> > to introduce pgtable_t.
> 
> Actually, by hardware requirement I meant something more like powerpc, which
> has to save a hash value somewhere, etc.
> 
> Now pgtable_t need not be struct page * even if the actual sizes are the same - e.g.
> in the ARC port I kept pgtable_t as pte_t * simply to avoid a few page_address() calls
> in mm code (you could argue that it was a micro-optimization, anyways..)
> 
> So, given that I know nothing about s390 MMU internals, I still think you can switch to
> the updated generic version despite 2K vs. 4K. Agree?

No, we can not. For s390 a page table is aligned on a 2K boundary and is
only half the size of a page (except for KVM but that is another story).
For s390 a pgtable_t is a pointer to the memory location with the 256 ptes
and not a struct page *.

The cast "struct page *new = (struct page*)pgtable;" in your first patch
is already broken, "new" points to the memory of the page table and
the list_head operations will clobber that memory. You try to fix it up
with the memset to zero in pgtable_trans_huge_withdraw but that does not
correct the pte entries for s390 as an invalid page-table entry is *not*
all zeros.
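
The point about clobbering can be illustrated with a small userspace C
sketch. Everything here is an assumption made for illustration:
fake_pte_t, FAKE_PTE_INVALID (an arbitrary non-zero placeholder, not
claimed to be the real s390 encoding) and the helper names are
hypothetical. Only the 256-entry table size is taken from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTRS_PER_TABLE   256	/* 256 ptes per table, as stated above */
#define FAKE_PTE_INVALID ((uintptr_t)0x400)	/* hypothetical non-zero "invalid" marker */

typedef uintptr_t fake_pte_t;

struct list_head { struct list_head *next, *prev; };

/* Number of table entries that a list_head laid over the table covers. */
#define LINK_SLOTS (sizeof(struct list_head) / sizeof(fake_pte_t))

/* A fresh table: every entry carries the "invalid" marker. */
static void table_init(fake_pte_t *table)
{
	for (int i = 0; i < PTRS_PER_TABLE; i++)
		table[i] = FAKE_PTE_INVALID;
}

/* Store list linkage in the table memory itself, as the cast would. */
static void overlay_linkage(fake_pte_t *table)
{
	struct list_head lh;

	/* Self-referencing linkage, as INIT_LIST_HEAD() would write it. */
	lh.next = (struct list_head *)table;
	lh.prev = (struct list_head *)table;
	memcpy(table, &lh, sizeof(lh));
}

/* The generic fix-up: zero the bytes that held the linkage. */
static void clear_linkage_with_zero(fake_pte_t *table)
{
	memset(table, 0, sizeof(struct list_head));
}

/* An arch-aware fix-up: put the "invalid" marker back instead. */
static void restore_invalid(fake_pte_t *table)
{
	for (size_t i = 0; i < LINK_SLOTS; i++)
		table[i] = FAKE_PTE_INVALID;
}

/* Count entries that no longer look invalid to this fake "hardware". */
static size_t count_not_invalid(const fake_pte_t *table)
{
	size_t n = 0;

	for (int i = 0; i < PTRS_PER_TABLE; i++)
		if (table[i] != FAKE_PTE_INVALID)
			n++;
	return n;
}
```

In this model, zeroing after the overlay still leaves the covered entries
different from the invalid marker. Only restore_invalid() brings the
table back to its initial state.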

In short, please let s390 keep its own copy of deposit/withdraw.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


* Re: [PATCH 1/2] mm,thp: refactor generic deposit/withdraw routines for wider usage
  2016-02-11 11:20       ` Martin Schwidefsky
@ 2016-02-11 12:29         ` Vineet Gupta
  0 siblings, 0 replies; 7+ messages in thread
From: Vineet Gupta @ 2016-02-11 12:29 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Andrew Morton, Kirill A. Shutemov, Aneesh Kumar K.V,
	David S. Miller, Alex Thorlton, Gerald Schaefer, linux-snps-arc,
	linux-kernel, linux-mm, linux-arch, Andrea Arcangeli

On Thursday 11 February 2016 04:50 PM, Martin Schwidefsky wrote:
> On Thu, 11 Feb 2016 16:23:33 +0530
> Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> 
>> On Thursday 11 February 2016 03:52 PM, Martin Schwidefsky wrote:
>>> On Thu, 11 Feb 2016 14:58:26 +0530
>>> Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>>>
>>>> Generic pgtable_trans_huge_deposit()/pgtable_trans_huge_withdraw()
>>>> assume pgtable_t to be struct page * which is not true for all arches.
>>>> Thus arc, s390, sparc end up with their own copies despite no special
>>>> hardware requirements (unlike powerpc).
>>>
>>> s390 does have a special hardware requirement. pgtable_t is an address
>>> for a 2K block of memory. It is *not* equivalent to a struct page *
>>> which refers to a 4K block of memory. That has been the whole point
>>> to introduce pgtable_t.
>>
>> Actually, by hardware requirement I meant something more like powerpc, which
>> has to save a hash value somewhere, etc.
>>
>> Now pgtable_t need not be struct page * even if the actual sizes are the same - e.g.
>> in the ARC port I kept pgtable_t as pte_t * simply to avoid a few page_address() calls
>> in mm code (you could argue that it was a micro-optimization, anyways..)
>>
>> So, given that I know nothing about s390 MMU internals, I still think you can switch to
>> the updated generic version despite 2K vs. 4K. Agree?
> 
> No, we can not. For s390 a page table is aligned on a 2K boundary and is
> only half the size of a page (except for KVM but that is another story).
> For s390 a pgtable_t is a pointer to the memory location with the 256 ptes
> and not a struct page *.
> 
> The cast "struct page *new = (struct page*)pgtable;" in your first patch
> is already broken, "new" points to the memory of the page table and
> the list_head operations will clobber that memory.

The current s390 code does something similar using a different struct cast. It is
still writing in pgtable_t - although at a different location.

> You try to fix it up
> with the memset to zero in pgtable_trans_huge_withdraw but that does not
> correct the pte entries for s390 as an invalid page-table entry is *not*
> all zeros.

Right so that is the problem - just trying to understand.

> In short, please let s390 keep its own copy of deposit/withdraw.

You got it - I'm out of the way :-)

Thx,
-Vineet

