* [PATCH 0/2] khugepaged: collapse pmd for pte-mapped THP
@ 2019-07-29  5:43 Song Liu
  2019-07-29  5:43 ` [PATCH 1/2] khugepaged: enable " Song Liu
  2019-07-29  5:43 ` [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes Song Liu
  0 siblings, 2 replies; 10+ messages in thread
From: Song Liu @ 2019-07-29  5:43 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: matthew.wilcox, kirill.shutemov, oleg, kernel-team,
	william.kucharski, srikar, Song Liu

This set is a newer version of patches 5/6 and 6/6 of [1]. v9 of
patches 1-4 of that work [2] was recently picked up by Andrew.

Patch 1 enables khugepaged to handle pte-mapped THPs. These THPs are
left in this state when khugepaged fails to get an exclusive lock on
mmap_sem.

Patch 2 leverages the work in patch 1 for uprobes on THP. After [2],
uprobe only splits the PMD. When the uprobe is disabled, we are left
with a pte-mapped THP. After this set, such pte-mapped THPs will be
collapsed back to pmd-mapped.

[1] https://lkml.org/lkml/2019/6/23/23
[2] https://www.spinics.net/lists/linux-mm/msg185889.html

Song Liu (2):
  khugepaged: enable collapse pmd for pte-mapped THP
  uprobe: collapse THP pmd after removing all uprobes

 include/linux/khugepaged.h |  15 ++++
 kernel/events/uprobes.c    |   9 +++
 mm/khugepaged.c            | 136 +++++++++++++++++++++++++++++++++++++
 3 files changed, 160 insertions(+)

--
2.17.1


* [PATCH 1/2] khugepaged: enable collapse pmd for pte-mapped THP
  2019-07-29  5:43 [PATCH 0/2] khugepaged: collapse pmd for pte-mapped THP Song Liu
@ 2019-07-29  5:43 ` Song Liu
  2019-07-30 14:59   ` Kirill A. Shutemov
  2019-07-29  5:43 ` [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes Song Liu
  1 sibling, 1 reply; 10+ messages in thread
From: Song Liu @ 2019-07-29  5:43 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: matthew.wilcox, kirill.shutemov, oleg, kernel-team,
	william.kucharski, srikar, Song Liu

khugepaged needs exclusive mmap_sem to access the page table. When it
fails to lock mmap_sem, the page will fault in as a pte-mapped THP. As
the page is already a THP, khugepaged will not handle this pmd again.

This patch enables khugepaged to retry collapsing the page table.

struct mm_slot (in khugepaged.c) is extended with an array containing
the addresses of pte-mapped THPs. We use an array here for simplicity;
it can easily be replaced with a more advanced data structure when
needed. This array is protected by khugepaged_mm_lock.

In khugepaged_scan_mm_slot(), if the mm contains pte-mapped THPs, we try
to collapse their page tables.

Since the collapse may happen at a later time, some pages may have
already faulted in. collapse_pte_mapped_thp() is added to properly
handle these pages. It also double-checks whether all PTEs in this pmd
map to the same THP. This is necessary because a subpage of the THP may
have been replaced, for example by a uprobe. In such cases, it is not
possible to collapse the pmd.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/khugepaged.h |  15 ++++
 mm/khugepaged.c            | 136 +++++++++++++++++++++++++++++++++++++
 2 files changed, 151 insertions(+)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 082d1d2a5216..2d700830fe0e 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -15,6 +15,16 @@ extern int __khugepaged_enter(struct mm_struct *mm);
 extern void __khugepaged_exit(struct mm_struct *mm);
 extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 				      unsigned long vm_flags);
+#ifdef CONFIG_SHMEM
+extern int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
+					 unsigned long addr);
+#else
+static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
+						unsigned long addr)
+{
+	return 0;
+}
+#endif
 
 #define khugepaged_enabled()					       \
 	(transparent_hugepage_flags &				       \
@@ -73,6 +83,11 @@ static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 {
 	return 0;
 }
+static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
+						unsigned long addr)
+{
+	return 0;
+}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 #endif /* _LINUX_KHUGEPAGED_H */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eaaa21b23215..247c25aeb096 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -76,6 +76,7 @@ static __read_mostly DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
 
 static struct kmem_cache *mm_slot_cache __read_mostly;
 
+#define MAX_PTE_MAPPED_THP 8
 /**
  * struct mm_slot - hash lookup from mm to mm_slot
  * @hash: hash collision list
@@ -86,6 +87,10 @@ struct mm_slot {
 	struct hlist_node hash;
 	struct list_head mm_node;
 	struct mm_struct *mm;
+
+	/* pte-mapped THP in this mm */
+	int nr_pte_mapped_thp;
+	unsigned long pte_mapped_thp[MAX_PTE_MAPPED_THP];
 };
 
 /**
@@ -1281,11 +1286,141 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 			up_write(&vma->vm_mm->mmap_sem);
 			mm_dec_nr_ptes(vma->vm_mm);
 			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+		} else if (down_read_trylock(&vma->vm_mm->mmap_sem)) {
+			/* need down_read for khugepaged_test_exit() */
+			khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+			up_read(&vma->vm_mm->mmap_sem);
 		}
 	}
 	i_mmap_unlock_write(mapping);
 }
 
+/*
+ * Notify khugepaged that given addr of the mm is pte-mapped THP. Then
+ * khugepaged should try to collapse the page table.
+ */
+int khugepaged_add_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
+{
+	struct mm_slot *mm_slot;
+	int ret = 0;
+
+	/* hold mmap_sem for khugepaged_test_exit() */
+	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
+
+	if (unlikely(khugepaged_test_exit(mm)))
+		return 0;
+
+	if (!test_bit(MMF_VM_HUGEPAGE, &mm->flags) &&
+	    !test_bit(MMF_DISABLE_THP, &mm->flags)) {
+		ret = __khugepaged_enter(mm);
+		if (ret)
+			return ret;
+	}
+
+	spin_lock(&khugepaged_mm_lock);
+	mm_slot = get_mm_slot(mm);
+	if (likely(mm_slot && mm_slot->nr_pte_mapped_thp < MAX_PTE_MAPPED_THP))
+		mm_slot->pte_mapped_thp[mm_slot->nr_pte_mapped_thp++] = addr;
+
+	spin_unlock(&khugepaged_mm_lock);
+	return 0;
+}
+
+/**
+ * Try to collapse a pte-mapped THP for mm at address haddr.
+ *
+ * This function checks whether all the PTEs in the PMD are pointing to the
+ * right THP. If so, retract the page table so the THP can refault in
+ * as pmd-mapped.
+ */
+static void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long haddr)
+{
+	struct vm_area_struct *vma = find_vma(mm, haddr);
+	pmd_t *pmd = mm_find_pmd(mm, haddr);
+	struct page *hpage = NULL;
+	unsigned long addr;
+	spinlock_t *ptl;
+	int count = 0;
+	pmd_t _pmd;
+	int i;
+
+	if (!vma || !pmd || pmd_trans_huge(*pmd))
+		return;
+
+	/* step 1: check all mapped PTEs are to the right huge page */
+	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
+		pte_t *pte = pte_offset_map(pmd, addr);
+		struct page *page;
+
+		if (pte_none(*pte))
+			continue;
+
+		page = vm_normal_page(vma, addr, *pte);
+
+		if (!PageCompound(page))
+			return;
+
+		if (!hpage) {
+			hpage = compound_head(page);
+			if (hpage->mapping != vma->vm_file->f_mapping)
+				return;
+		}
+
+		if (hpage + i != page)
+			return;
+		count++;
+	}
+
+	/* step 2: adjust rmap */
+	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
+		pte_t *pte = pte_offset_map(pmd, addr);
+		struct page *page;
+
+		if (pte_none(*pte))
+			continue;
+		page = vm_normal_page(vma, addr, *pte);
+		page_remove_rmap(page, false);
+	}
+
+	/* step 3: set proper refcount and mm_counters. */
+	if (hpage) {
+		page_ref_sub(hpage, count);
+		add_mm_counter(vma->vm_mm, mm_counter_file(hpage), -count);
+	}
+
+	/* step 4: collapse pmd */
+	ptl = pmd_lock(vma->vm_mm, pmd);
+	_pmd = pmdp_collapse_flush(vma, addr, pmd);
+	spin_unlock(ptl);
+	mm_dec_nr_ptes(mm);
+	pte_free(mm, pmd_pgtable(_pmd));
+}
+
+static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
+{
+	struct mm_struct *mm = mm_slot->mm;
+	int i;
+
+	lockdep_assert_held(&khugepaged_mm_lock);
+
+	if (likely(mm_slot->nr_pte_mapped_thp == 0))
+		return 0;
+
+	if (!down_write_trylock(&mm->mmap_sem))
+		return -EBUSY;
+
+	if (unlikely(khugepaged_test_exit(mm)))
+		goto out;
+
+	for (i = 0; i < mm_slot->nr_pte_mapped_thp; i++)
+		collapse_pte_mapped_thp(mm, mm_slot->pte_mapped_thp[i]);
+
+out:
+	mm_slot->nr_pte_mapped_thp = 0;
+	up_write(&mm->mmap_sem);
+	return 0;
+}
+
 /**
  * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
  *
@@ -1667,6 +1802,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		khugepaged_scan.address = 0;
 		khugepaged_scan.mm_slot = mm_slot;
 	}
+	khugepaged_collapse_pte_mapped_thps(mm_slot);
 	spin_unlock(&khugepaged_mm_lock);
 
 	mm = mm_slot->mm;
-- 
2.17.1



* [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes
  2019-07-29  5:43 [PATCH 0/2] khugepaged: collapse pmd for pte-mapped THP Song Liu
  2019-07-29  5:43 ` [PATCH 1/2] khugepaged: enable " Song Liu
@ 2019-07-29  5:43 ` Song Liu
  2019-07-30 15:01   ` Kirill A. Shutemov
  2019-07-31 16:16   ` Oleg Nesterov
  1 sibling, 2 replies; 10+ messages in thread
From: Song Liu @ 2019-07-29  5:43 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: matthew.wilcox, kirill.shutemov, oleg, kernel-team,
	william.kucharski, srikar, Song Liu

After all uprobes are removed from the huge page (which is pte-mapped
at this point), it is possible to collapse the pmd and benefit from THP
again. This patch does the collapse by calling
khugepaged_add_pte_mapped_thp().

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 kernel/events/uprobes.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 58ab7fc7272a..cc53789fefc6 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -26,6 +26,7 @@
 #include <linux/percpu-rwsem.h>
 #include <linux/task_work.h>
 #include <linux/shmem_fs.h>
+#include <linux/khugepaged.h>
 
 #include <linux/uprobes.h>
 
@@ -470,6 +471,7 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	struct page *old_page, *new_page;
 	struct vm_area_struct *vma;
 	int ret, is_register, ref_ctr_updated = 0;
+	bool orig_page_huge = false;
 
 	is_register = is_swbp_insn(&opcode);
 	uprobe = container_of(auprobe, struct uprobe, arch);
@@ -525,6 +527,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 
 				/* dec_mm_counter for old_page */
 				dec_mm_counter(mm, MM_ANONPAGES);
+
+				if (PageCompound(orig_page))
+					orig_page_huge = true;
 			}
 			put_page(orig_page);
 		}
@@ -543,6 +548,10 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	if (ret && is_register && ref_ctr_updated)
 		update_ref_ctr(uprobe, mm, -1);
 
+	/* try collapse pmd for compound page */
+	if (!ret && orig_page_huge)
+		khugepaged_add_pte_mapped_thp(mm, vaddr & HPAGE_PMD_MASK);
+
 	return ret;
 }
 
-- 
2.17.1



* Re: [PATCH 1/2] khugepaged: enable collapse pmd for pte-mapped THP
  2019-07-29  5:43 ` [PATCH 1/2] khugepaged: enable " Song Liu
@ 2019-07-30 14:59   ` Kirill A. Shutemov
  2019-07-30 17:28     ` Song Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Kirill A. Shutemov @ 2019-07-30 14:59 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	oleg, kernel-team, william.kucharski, srikar

On Sun, Jul 28, 2019 at 10:43:34PM -0700, Song Liu wrote:
> khugepaged needs exclusive mmap_sem to access the page table. When it
> fails to lock mmap_sem, the page will fault in as a pte-mapped THP. As
> the page is already a THP, khugepaged will not handle this pmd again.
> 
> This patch enables khugepaged to retry collapsing the page table.
> 
> struct mm_slot (in khugepaged.c) is extended with an array containing
> the addresses of pte-mapped THPs. We use an array here for simplicity;
> it can easily be replaced with a more advanced data structure when
> needed. This array is protected by khugepaged_mm_lock.
> 
> In khugepaged_scan_mm_slot(), if the mm contains pte-mapped THPs, we try
> to collapse their page tables.
> 
> Since the collapse may happen at a later time, some pages may have
> already faulted in. collapse_pte_mapped_thp() is added to properly
> handle these pages. It also double-checks whether all PTEs in this pmd
> map to the same THP. This is necessary because a subpage of the THP may
> have been replaced, for example by a uprobe. In such cases, it is not
> possible to collapse the pmd.
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  include/linux/khugepaged.h |  15 ++++
>  mm/khugepaged.c            | 136 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 151 insertions(+)
> 
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index 082d1d2a5216..2d700830fe0e 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -15,6 +15,16 @@ extern int __khugepaged_enter(struct mm_struct *mm);
>  extern void __khugepaged_exit(struct mm_struct *mm);
>  extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>  				      unsigned long vm_flags);
> +#ifdef CONFIG_SHMEM
> +extern int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
> +					 unsigned long addr);
> +#else
> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
> +						unsigned long addr)
> +{
> +	return 0;
> +}
> +#endif
>  
>  #define khugepaged_enabled()					       \
>  	(transparent_hugepage_flags &				       \
> @@ -73,6 +83,11 @@ static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>  {
>  	return 0;
>  }
> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
> +						unsigned long addr)
> +{
> +	return 0;
> +}
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
>  #endif /* _LINUX_KHUGEPAGED_H */
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index eaaa21b23215..247c25aeb096 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -76,6 +76,7 @@ static __read_mostly DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
>  
>  static struct kmem_cache *mm_slot_cache __read_mostly;
>  
> +#define MAX_PTE_MAPPED_THP 8

Is MAX_PTE_MAPPED_THP value random or do you have any justification for
it?

Please add empty line after it.

>  /**
>   * struct mm_slot - hash lookup from mm to mm_slot
>   * @hash: hash collision list
> @@ -86,6 +87,10 @@ struct mm_slot {
>  	struct hlist_node hash;
>  	struct list_head mm_node;
>  	struct mm_struct *mm;
> +
> +	/* pte-mapped THP in this mm */
> +	int nr_pte_mapped_thp;
> +	unsigned long pte_mapped_thp[MAX_PTE_MAPPED_THP];
>  };
>  
>  /**
> @@ -1281,11 +1286,141 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>  			up_write(&vma->vm_mm->mmap_sem);
>  			mm_dec_nr_ptes(vma->vm_mm);
>  			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
> +		} else if (down_read_trylock(&vma->vm_mm->mmap_sem)) {
> +			/* need down_read for khugepaged_test_exit() */
> +			khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
> +			up_read(&vma->vm_mm->mmap_sem);
>  		}
>  	}
>  	i_mmap_unlock_write(mapping);
>  }
>  
> +/*
> + * Notify khugepaged that given addr of the mm is pte-mapped THP. Then
> + * khugepaged should try to collapse the page table.
> + */
> +int khugepaged_add_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)

What is the contract about addr alignment? Do we expect it to be
PAGE_SIZE aligned or PMD_SIZE aligned? Do we want to enforce it?

> +{
> +	struct mm_slot *mm_slot;
> +	int ret = 0;
> +
> +	/* hold mmap_sem for khugepaged_test_exit() */
> +	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
> +
> +	if (unlikely(khugepaged_test_exit(mm)))
> +		return 0;
> +
> +	if (!test_bit(MMF_VM_HUGEPAGE, &mm->flags) &&
> +	    !test_bit(MMF_DISABLE_THP, &mm->flags)) {
> +		ret = __khugepaged_enter(mm);
> +		if (ret)
> +			return ret;
> +	}

Any reason not to call khugepaged_enter() here?

> +
> +	spin_lock(&khugepaged_mm_lock);
> +	mm_slot = get_mm_slot(mm);
> +	if (likely(mm_slot && mm_slot->nr_pte_mapped_thp < MAX_PTE_MAPPED_THP))
> +		mm_slot->pte_mapped_thp[mm_slot->nr_pte_mapped_thp++] = addr;

It's probably good enough for a start, but I'm not sure how useful it
will be for real applications, considering the limitation.

> +

Useless empty line?

> +	spin_unlock(&khugepaged_mm_lock);
> +	return 0;
> +}
> +
> +/**
> + * Try to collapse a pte-mapped THP for mm at address haddr.
> + *
> + * This function checks whether all the PTEs in the PMD are pointing to the
> + * right THP. If so, retract the page table so the THP can refault in
> + * as pmd-mapped.
> + */
> +static void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long haddr)
> +{
> +	struct vm_area_struct *vma = find_vma(mm, haddr);
> +	pmd_t *pmd = mm_find_pmd(mm, haddr);
> +	struct page *hpage = NULL;
> +	unsigned long addr;
> +	spinlock_t *ptl;
> +	int count = 0;
> +	pmd_t _pmd;
> +	int i;
> +
> +	if (!vma || !pmd || pmd_trans_huge(*pmd))
> +		return;
> +
> +	/* step 1: check all mapped PTEs are to the right huge page */
> +	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
> +		pte_t *pte = pte_offset_map(pmd, addr);
> +		struct page *page;
> +
> +		if (pte_none(*pte))
> +			continue;
> +
> +		page = vm_normal_page(vma, addr, *pte);
> +
> +		if (!PageCompound(page))
> +			return;

I think khugepaged_scan_shmem() and collapse_shmem() should be changed to
not stop on PageTransCompound() to make this useful for more cases.

Ideally, collapse_shmem() and this routine should be the same thing.
Or do you think it's not doable for some reason?

> +
> +		if (!hpage) {
> +			hpage = compound_head(page);
> +			if (hpage->mapping != vma->vm_file->f_mapping)
> +				return;
> +		}
> +
> +		if (hpage + i != page)
> +			return;
> +		count++;
> +	}
> +
> +	/* step 2: adjust rmap */
> +	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
> +		pte_t *pte = pte_offset_map(pmd, addr);
> +		struct page *page;
> +
> +		if (pte_none(*pte))
> +			continue;
> +		page = vm_normal_page(vma, addr, *pte);
> +		page_remove_rmap(page, false);
> +	}
> +
> +	/* step 3: set proper refcount and mm_counters. */
> +	if (hpage) {
> +		page_ref_sub(hpage, count);
> +		add_mm_counter(vma->vm_mm, mm_counter_file(hpage), -count);
> +	}
> +
> +	/* step 4: collapse pmd */
> +	ptl = pmd_lock(vma->vm_mm, pmd);
> +	_pmd = pmdp_collapse_flush(vma, addr, pmd);
> +	spin_unlock(ptl);
> +	mm_dec_nr_ptes(mm);
> +	pte_free(mm, pmd_pgtable(_pmd));
> +}
> +
> +static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
> +{
> +	struct mm_struct *mm = mm_slot->mm;
> +	int i;
> +
> +	lockdep_assert_held(&khugepaged_mm_lock);
> +
> +	if (likely(mm_slot->nr_pte_mapped_thp == 0))
> +		return 0;
> +
> +	if (!down_write_trylock(&mm->mmap_sem))
> +		return -EBUSY;
> +
> +	if (unlikely(khugepaged_test_exit(mm)))
> +		goto out;
> +
> +	for (i = 0; i < mm_slot->nr_pte_mapped_thp; i++)
> +		collapse_pte_mapped_thp(mm, mm_slot->pte_mapped_thp[i]);
> +
> +out:
> +	mm_slot->nr_pte_mapped_thp = 0;
> +	up_write(&mm->mmap_sem);
> +	return 0;
> +}
> +
>  /**
>   * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
>   *
> @@ -1667,6 +1802,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>  		khugepaged_scan.address = 0;
>  		khugepaged_scan.mm_slot = mm_slot;
>  	}
> +	khugepaged_collapse_pte_mapped_thps(mm_slot);
>  	spin_unlock(&khugepaged_mm_lock);
>  
>  	mm = mm_slot->mm;
> -- 
> 2.17.1
> 

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes
  2019-07-29  5:43 ` [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes Song Liu
@ 2019-07-30 15:01   ` Kirill A. Shutemov
  2019-07-30 17:02     ` Song Liu
  2019-07-31 16:16   ` Oleg Nesterov
  1 sibling, 1 reply; 10+ messages in thread
From: Kirill A. Shutemov @ 2019-07-30 15:01 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	oleg, kernel-team, william.kucharski, srikar

On Sun, Jul 28, 2019 at 10:43:35PM -0700, Song Liu wrote:
> After all uprobes are removed from the huge page (which is pte-mapped
> at this point), it is possible to collapse the pmd and benefit from THP
> again. This patch does the collapse by calling
> khugepaged_add_pte_mapped_thp().
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  kernel/events/uprobes.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 58ab7fc7272a..cc53789fefc6 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -26,6 +26,7 @@
>  #include <linux/percpu-rwsem.h>
>  #include <linux/task_work.h>
>  #include <linux/shmem_fs.h>
> +#include <linux/khugepaged.h>
>  
>  #include <linux/uprobes.h>
>  
> @@ -470,6 +471,7 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	struct page *old_page, *new_page;
>  	struct vm_area_struct *vma;
>  	int ret, is_register, ref_ctr_updated = 0;
> +	bool orig_page_huge = false;
>  
>  	is_register = is_swbp_insn(&opcode);
>  	uprobe = container_of(auprobe, struct uprobe, arch);
> @@ -525,6 +527,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  
>  				/* dec_mm_counter for old_page */
>  				dec_mm_counter(mm, MM_ANONPAGES);
> +
> +				if (PageCompound(orig_page))
> +					orig_page_huge = true;
>  			}
>  			put_page(orig_page);
>  		}
> @@ -543,6 +548,10 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	if (ret && is_register && ref_ctr_updated)
>  		update_ref_ctr(uprobe, mm, -1);
>  
> +	/* try collapse pmd for compound page */
> +	if (!ret && orig_page_huge)
> +		khugepaged_add_pte_mapped_thp(mm, vaddr & HPAGE_PMD_MASK);
> +

IIUC, here you have all locks taken, so you should be able to call
collapse_pte_mapped_thp() directly, shouldn't you?

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes
  2019-07-30 15:01   ` Kirill A. Shutemov
@ 2019-07-30 17:02     ` Song Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-30 17:02 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: lkml, Linux-MM, Andrew Morton, Matthew Wilcox,
	Kirill A. Shutemov, Oleg Nesterov, Kernel Team,
	William Kucharski, srikar



> On Jul 30, 2019, at 8:01 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Sun, Jul 28, 2019 at 10:43:35PM -0700, Song Liu wrote:
>> After all uprobes are removed from the huge page (which is pte-mapped
>> at this point), it is possible to collapse the pmd and benefit from THP
>> again. This patch does the collapse by calling
>> khugepaged_add_pte_mapped_thp().
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> kernel/events/uprobes.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>> 
>> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
>> index 58ab7fc7272a..cc53789fefc6 100644
>> --- a/kernel/events/uprobes.c
>> +++ b/kernel/events/uprobes.c
>> @@ -26,6 +26,7 @@
>> #include <linux/percpu-rwsem.h>
>> #include <linux/task_work.h>
>> #include <linux/shmem_fs.h>
>> +#include <linux/khugepaged.h>
>> 
>> #include <linux/uprobes.h>
>> 
>> @@ -470,6 +471,7 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>> 	struct page *old_page, *new_page;
>> 	struct vm_area_struct *vma;
>> 	int ret, is_register, ref_ctr_updated = 0;
>> +	bool orig_page_huge = false;
>> 
>> 	is_register = is_swbp_insn(&opcode);
>> 	uprobe = container_of(auprobe, struct uprobe, arch);
>> @@ -525,6 +527,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>> 
>> 				/* dec_mm_counter for old_page */
>> 				dec_mm_counter(mm, MM_ANONPAGES);
>> +
>> +				if (PageCompound(orig_page))
>> +					orig_page_huge = true;
>> 			}
>> 			put_page(orig_page);
>> 		}
>> @@ -543,6 +548,10 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>> 	if (ret && is_register && ref_ctr_updated)
>> 		update_ref_ctr(uprobe, mm, -1);
>> 
>> +	/* try collapse pmd for compound page */
>> +	if (!ret && orig_page_huge)
>> +		khugepaged_add_pte_mapped_thp(mm, vaddr & HPAGE_PMD_MASK);
>> +
> 
> IIUC, here you have all locks taken, so you should be able to call
> collapse_pte_mapped_thp() directly, shouldn't you?
> 

Yes, we can call it directly. I had it that way in a very early 
version. 

Let me do that in the next version. 
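
For illustration, the change would presumably end up looking something
like the sketch below, assuming collapse_pte_mapped_thp() becomes
non-static and gets a declaration in khugepaged.h (the exact form is to
be decided in the next version):

	/* sketch only: do the collapse directly instead of deferring it */
	if (!ret && orig_page_huge)
		collapse_pte_mapped_thp(mm, vaddr & HPAGE_PMD_MASK);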

Thanks,
Song



* Re: [PATCH 1/2] khugepaged: enable collapse pmd for pte-mapped THP
  2019-07-30 14:59   ` Kirill A. Shutemov
@ 2019-07-30 17:28     ` Song Liu
  2019-07-30 18:39       ` Song Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Song Liu @ 2019-07-30 17:28 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: lkml, Linux-MM, akpm, matthew.wilcox, kirill.shutemov, oleg,
	Kernel Team, william.kucharski, srikar



> On Jul 30, 2019, at 7:59 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Sun, Jul 28, 2019 at 10:43:34PM -0700, Song Liu wrote:
>> khugepaged needs exclusive mmap_sem to access the page table. When it
>> fails to lock mmap_sem, the page will fault in as a pte-mapped THP. As
>> the page is already a THP, khugepaged will not handle this pmd again.
>> 
>> This patch enables khugepaged to retry collapsing the page table.
>> 
>> struct mm_slot (in khugepaged.c) is extended with an array containing
>> the addresses of pte-mapped THPs. We use an array here for simplicity;
>> it can easily be replaced with a more advanced data structure when
>> needed. This array is protected by khugepaged_mm_lock.
>> 
>> In khugepaged_scan_mm_slot(), if the mm contains pte-mapped THPs, we try
>> to collapse their page tables.
>> 
>> Since the collapse may happen at a later time, some pages may have
>> already faulted in. collapse_pte_mapped_thp() is added to properly
>> handle these pages. It also double-checks whether all PTEs in this pmd
>> map to the same THP. This is necessary because a subpage of the THP may
>> have been replaced, for example by a uprobe. In such cases, it is not
>> possible to collapse the pmd.
>> 
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> include/linux/khugepaged.h |  15 ++++
>> mm/khugepaged.c            | 136 +++++++++++++++++++++++++++++++++++++
>> 2 files changed, 151 insertions(+)
>> 
>> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
>> index 082d1d2a5216..2d700830fe0e 100644
>> --- a/include/linux/khugepaged.h
>> +++ b/include/linux/khugepaged.h
>> @@ -15,6 +15,16 @@ extern int __khugepaged_enter(struct mm_struct *mm);
>> extern void __khugepaged_exit(struct mm_struct *mm);
>> extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>> 				      unsigned long vm_flags);
>> +#ifdef CONFIG_SHMEM
>> +extern int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>> +					 unsigned long addr);
>> +#else
>> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>> +						unsigned long addr)
>> +{
>> +	return 0;
>> +}
>> +#endif
>> 
>> #define khugepaged_enabled()					       \
>> 	(transparent_hugepage_flags &				       \
>> @@ -73,6 +83,11 @@ static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>> {
>> 	return 0;
>> }
>> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>> +						unsigned long addr)
>> +{
>> +	return 0;
>> +}
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>> 
>> #endif /* _LINUX_KHUGEPAGED_H */
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index eaaa21b23215..247c25aeb096 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -76,6 +76,7 @@ static __read_mostly DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
>> 
>> static struct kmem_cache *mm_slot_cache __read_mostly;
>> 
>> +#define MAX_PTE_MAPPED_THP 8
> 
> Is MAX_PTE_MAPPED_THP value random or do you have any justification for
> it?

In our use cases, we only have a small number (< 10) of huge pages for
the text section, so 8 should be enough to cover the worst case.

If this is not sufficient, we can make it a list.
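
Roughly along the lines of the sketch below; this is only an
illustration, and the struct and field names here are made up rather
than part of this patch:

	/* hypothetical list-based replacement for the fixed-size array */
	struct pte_mapped_thp_entry {
		struct list_head node;
		unsigned long addr;	/* PMD_SIZE-aligned address */
	};

	/*
	 * struct mm_slot would then carry a list head instead of the
	 * array, still protected by khugepaged_mm_lock:
	 */
	struct list_head pte_mapped_thp_list;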

> 
> Please add empty line after it.
> 
>> /**
>>  * struct mm_slot - hash lookup from mm to mm_slot
>>  * @hash: hash collision list
>> @@ -86,6 +87,10 @@ struct mm_slot {
>> 	struct hlist_node hash;
>> 	struct list_head mm_node;
>> 	struct mm_struct *mm;
>> +
>> +	/* pte-mapped THP in this mm */
>> +	int nr_pte_mapped_thp;
>> +	unsigned long pte_mapped_thp[MAX_PTE_MAPPED_THP];
>> };
>> 
>> /**
>> @@ -1281,11 +1286,141 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>> 			up_write(&vma->vm_mm->mmap_sem);
>> 			mm_dec_nr_ptes(vma->vm_mm);
>> 			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
>> +		} else if (down_read_trylock(&vma->vm_mm->mmap_sem)) {
>> +			/* need down_read for khugepaged_test_exit() */
>> +			khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
>> +			up_read(&vma->vm_mm->mmap_sem);
>> 		}
>> 	}
>> 	i_mmap_unlock_write(mapping);
>> }
>> 
>> +/*
>> + * Notify khugepaged that given addr of the mm is pte-mapped THP. Then
>> + * khugepaged should try to collapse the page table.
>> + */
>> +int khugepaged_add_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
> 
> What is the contract about addr alignment? Do we expect it to be
> PAGE_SIZE aligned or PMD_SIZE aligned? Do we want to enforce it?

It is PMD_SIZE aligned. Let me add VM_BUG_ON() for it. 
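
Something along these lines (an illustrative sketch only; the exact
check may differ in the next version):

	/* addr is expected to be PMD_SIZE (huge page) aligned */
	VM_BUG_ON(addr & ~HPAGE_PMD_MASK);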

> 
>> +{
>> +	struct mm_slot *mm_slot;
>> +	int ret = 0;
>> +
>> +	/* hold mmap_sem for khugepaged_test_exit() */
>> +	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
>> +
>> +	if (unlikely(khugepaged_test_exit(mm)))
>> +		return 0;
>> +
>> +	if (!test_bit(MMF_VM_HUGEPAGE, &mm->flags) &&
>> +	    !test_bit(MMF_DISABLE_THP, &mm->flags)) {
>> +		ret = __khugepaged_enter(mm);
>> +		if (ret)
>> +			return ret;
>> +	}
> 
> Any reason not to call khugepaged_enter() here?

No specific reasons... Let me try it. 

> 
>> +
>> +	spin_lock(&khugepaged_mm_lock);
>> +	mm_slot = get_mm_slot(mm);
>> +	if (likely(mm_slot && mm_slot->nr_pte_mapped_thp < MAX_PTE_MAPPED_THP))
>> +		mm_slot->pte_mapped_thp[mm_slot->nr_pte_mapped_thp++] = addr;
> 
> It's probably good enough for a start, but I'm not sure how useful it
> will be for real applications, considering the limitation.

For the limitation, do you mean MAX_PTE_MAPPED_THP? I think this is good
enough for our use cases, and we can certainly improve it without too
much work.

Thanks,
Song 

> 
>> +
> 
> Useless empty line?
> 
>> +	spin_unlock(&khugepaged_mm_lock);
>> +	return 0;
>> +}
>> +
>> +/**
>> + * Try to collapse a pte-mapped THP for mm at address haddr.
>> + *
>> + * This function checks whether all the PTEs in the PMD are pointing to the
>> + * right THP. If so, retract the page table so the THP can refault in
>> + * as pmd-mapped.
>> + */
>> +static void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long haddr)
>> +{
>> +	struct vm_area_struct *vma = find_vma(mm, haddr);
>> +	pmd_t *pmd = mm_find_pmd(mm, haddr);
>> +	struct page *hpage = NULL;
>> +	unsigned long addr;
>> +	spinlock_t *ptl;
>> +	int count = 0;
>> +	pmd_t _pmd;
>> +	int i;
>> +
>> +	if (!vma || !pmd || pmd_trans_huge(*pmd))
>> +		return;
>> +
>> +	/* step 1: check all mapped PTEs are to the right huge page */
>> +	for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
>> +		pte_t *pte = pte_offset_map(pmd, addr);
>> +		struct page *page;
>> +
>> +		if (pte_none(*pte))
>> +			continue;
>> +
>> +		page = vm_normal_page(vma, addr, *pte);
>> +
>> +		if (!PageCompound(page))
>> +			return;
> 
> I think khugepaged_scan_shmem() and collapse_shmem() should be changed to
> not stop on PageTransCompound() to make this useful for more cases.

With locking sorted out, we could call collapse_pte_mapped_thp() for 
PageTransCompound() cases. 

> 
> Ideally, collapse_shmem() and this routine should be the same thing.
> Or do you think it's not doable for some reason?

This routine is part of retract_page_tables(). collapse_shmem() does more 
work. We can still go into collapse_shmem() and bypass the first half of
it. 

On the other hand, I would like to keep it separate for now to minimize
conflicts with my other set, which is waiting for a fixed version of
5fd4ca2d84b249f0858ce28cf637cf25b61a398f.

How about we start with the current design (with small things fixed) and
improve it once both sets get in?

Thanks,
Song




* Re: [PATCH 1/2] khugepaged: enable collapse pmd for pte-mapped THP
  2019-07-30 17:28     ` Song Liu
@ 2019-07-30 18:39       ` Song Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-30 18:39 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: lkml, Linux-MM, akpm, matthew.wilcox, kirill.shutemov, oleg,
	Kernel Team, william.kucharski, srikar



> On Jul 30, 2019, at 10:28 AM, Song Liu <songliubraving@fb.com> wrote:
> 
> 
> 
>> On Jul 30, 2019, at 7:59 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>> 
>> On Sun, Jul 28, 2019 at 10:43:34PM -0700, Song Liu wrote:
>>> khugepaged needs exclusive mmap_sem to access the page table. When it
>>> fails to lock mmap_sem, the page will fault in as a pte-mapped THP. As
>>> the page is already a THP, khugepaged will not handle this pmd again.
>>> 
>>> This patch enables khugepaged to retry collapsing the page table.
>>> 
>>> struct mm_slot (in khugepaged.c) is extended with an array containing
>>> the addresses of pte-mapped THPs. We use an array here for simplicity;
>>> it can easily be replaced with a more advanced data structure when
>>> needed. This array is protected by khugepaged_mm_lock.
>>> 
>>> In khugepaged_scan_mm_slot(), if the mm contains pte-mapped THPs, we try
>>> to collapse their page tables.
>>> 
>>> Since the collapse may happen at a later time, some pages may have
>>> already faulted in. collapse_pte_mapped_thp() is added to properly
>>> handle these pages. It also double-checks whether all PTEs in this pmd
>>> map to the same THP. This is necessary because a subpage of the THP may
>>> have been replaced, for example by a uprobe. In such cases, it is not
>>> possible to collapse the pmd.
>>> 
>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>>> ---
>>> include/linux/khugepaged.h |  15 ++++
>>> mm/khugepaged.c            | 136 +++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 151 insertions(+)
>>> 
>>> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
>>> index 082d1d2a5216..2d700830fe0e 100644
>>> --- a/include/linux/khugepaged.h
>>> +++ b/include/linux/khugepaged.h
>>> @@ -15,6 +15,16 @@ extern int __khugepaged_enter(struct mm_struct *mm);
>>> extern void __khugepaged_exit(struct mm_struct *mm);
>>> extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>>> 				      unsigned long vm_flags);
>>> +#ifdef CONFIG_SHMEM
>>> +extern int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>>> +					 unsigned long addr);
>>> +#else
>>> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>>> +						unsigned long addr)
>>> +{
>>> +	return 0;
>>> +}
>>> +#endif
>>> 
>>> #define khugepaged_enabled()					       \
>>> 	(transparent_hugepage_flags &				       \
>>> @@ -73,6 +83,11 @@ static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>>> {
>>> 	return 0;
>>> }
>>> +static inline int khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
>>> +						unsigned long addr)
>>> +{
>>> +	return 0;
>>> +}
>>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>> 
>>> #endif /* _LINUX_KHUGEPAGED_H */
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index eaaa21b23215..247c25aeb096 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -76,6 +76,7 @@ static __read_mostly DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
>>> 
>>> static struct kmem_cache *mm_slot_cache __read_mostly;
>>> 
>>> +#define MAX_PTE_MAPPED_THP 8
>> 
>> Is MAX_PTE_MAPPED_THP value random or do you have any justification for
>> it?
> 
> In our use cases, we only have a small number (< 10) of huge pages for
> the text section, so 8 should be enough to cover the worst case.
> 
> If this is not sufficient, we can make it a list.
> 
>> 
>> Please add empty line after it.
>> 
>>> /**
>>> * struct mm_slot - hash lookup from mm to mm_slot
>>> * @hash: hash collision list
>>> @@ -86,6 +87,10 @@ struct mm_slot {
>>> 	struct hlist_node hash;
>>> 	struct list_head mm_node;
>>> 	struct mm_struct *mm;
>>> +
>>> +	/* pte-mapped THP in this mm */
>>> +	int nr_pte_mapped_thp;
>>> +	unsigned long pte_mapped_thp[MAX_PTE_MAPPED_THP];
>>> };
>>> 
>>> /**
>>> @@ -1281,11 +1286,141 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>>> 			up_write(&vma->vm_mm->mmap_sem);
>>> 			mm_dec_nr_ptes(vma->vm_mm);
>>> 			pte_free(vma->vm_mm, pmd_pgtable(_pmd));
>>> +		} else if (down_read_trylock(&vma->vm_mm->mmap_sem)) {
>>> +			/* need down_read for khugepaged_test_exit() */
>>> +			khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
>>> +			up_read(&vma->vm_mm->mmap_sem);
>>> 		}
>>> 	}
>>> 	i_mmap_unlock_write(mapping);
>>> }
>>> 
>>> +/*
>>> + * Notify khugepaged that given addr of the mm is pte-mapped THP. Then
>>> + * khugepaged should try to collapse the page table.
>>> + */
>>> +int khugepaged_add_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
>> 
>> What is the contract about addr alignment? Do we expect it to be
>> PAGE_SIZE aligned or PMD_SIZE aligned? Do we want to enforce it?
> 
> It is PMD_SIZE aligned. Let me add VM_BUG_ON() for it. 
> 
>> 
>>> +{
>>> +	struct mm_slot *mm_slot;
>>> +	int ret = 0;
>>> +
>>> +	/* hold mmap_sem for khugepaged_test_exit() */
>>> +	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
>>> +
>>> +	if (unlikely(khugepaged_test_exit(mm)))
>>> +		return 0;
>>> +
>>> +	if (!test_bit(MMF_VM_HUGEPAGE, &mm->flags) &&
>>> +	    !test_bit(MMF_DISABLE_THP, &mm->flags)) {
>>> +		ret = __khugepaged_enter(mm);
>>> +		if (ret)
>>> +			return ret;
>>> +	}
>> 
>> Any reason not to call khugepaged_enter() here?
> 
> No specific reasons... Let me try it. 

Actually, khugepaged_enter() takes a vma and vm_flags, while here we only
have the mm. I guess we should just use __khugepaged_enter(). Once we
remove all the checks on vm_flags, khugepaged_enter() is about the same
as the logic above.
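
In other words, the mm-only check already in
khugepaged_add_pte_mapped_thp() boils down to the following (copied
from the hunk quoted above, just to make the comparison concrete):

	if (!test_bit(MMF_VM_HUGEPAGE, &mm->flags) &&
	    !test_bit(MMF_DISABLE_THP, &mm->flags)) {
		ret = __khugepaged_enter(mm);
		if (ret)
			return ret;
	}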

Thanks,
Song

[...]


* Re: [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes
  2019-07-29  5:43 ` [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes Song Liu
  2019-07-30 15:01   ` Kirill A. Shutemov
@ 2019-07-31 16:16   ` Oleg Nesterov
  2019-07-31 16:36     ` Song Liu
  1 sibling, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-31 16:16 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	kernel-team, william.kucharski, srikar

On 07/28, Song Liu wrote:
>
> @@ -525,6 +527,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  
>  				/* dec_mm_counter for old_page */
>  				dec_mm_counter(mm, MM_ANONPAGES);
> +
> +				if (PageCompound(orig_page))
> +					orig_page_huge = true;

I am wondering how find_get_page() can return a PageCompound() page...

IIUC, this is only possible if shmem_file(), right?

Oleg.



* Re: [PATCH 2/2] uprobe: collapse THP pmd after removing all uprobes
  2019-07-31 16:16   ` Oleg Nesterov
@ 2019-07-31 16:36     ` Song Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-31 16:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, Linux-MM, Andrew Morton, Matthew Wilcox,
	Kirill A. Shutemov, Kernel Team, William Kucharski, srikar



> On Jul 31, 2019, at 9:16 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 07/28, Song Liu wrote:
>> 
>> @@ -525,6 +527,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>> 
>> 				/* dec_mm_counter for old_page */
>> 				dec_mm_counter(mm, MM_ANONPAGES);
>> +
>> +				if (PageCompound(orig_page))
>> +					orig_page_huge = true;
> 
> I am wondering how find_get_page() can return a PageCompound() page...
> 
> IIUC, this is only possible if shmem_file(), right?

Yes, this is the case at the moment. We will be able to do it for other
file systems when this set gets in: 

	https://lkml.org/lkml/2019/6/24/1531

Thanks,
Song

