linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 0/3] Try to release mmap_lock temporarily in smaps_rollup
@ 2020-08-15  6:20 Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 1/3] mmap locking API: add mmap_lock_is_contended() Chinwen Chang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Chinwen Chang @ 2020-08-15  6:20 UTC (permalink / raw)
  To: Matthias Brugger, Michel Lespinasse, Andrew Morton,
	Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson,
	Huang Ying, Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

Recently, we have observed jank caused by unpleasantly long contention
on mmap_lock, which is held by smaps_rollup when probing large
processes. To address the problem, we let smaps_rollup detect whether
anyone is waiting to acquire mmap_lock for write. If so, it releases
the lock temporarily to ease the contention.

smaps_rollup is a procfs interface which allows users to summarize a
process's memory usage without the overhead of seq_* calls. Android
uses it to sample the memory usage of various processes in order to
balance its memory pool sizes. If no one is waiting to take the lock
for write, smaps_rollup with this patch behaves like the original.
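
For reference, a sampler only needs ordinary file I/O to read the
interface. The helper below is purely illustrative and not part of this
series (the function name is made up):

  #include <stdio.h>
  #include <sys/types.h>

  /* Illustrative only: dump /proc/<pid>/smaps_rollup to stdout. */
  static int dump_smaps_rollup(pid_t pid)
  {
  	char path[64], line[256];
  	FILE *f;

  	snprintf(path, sizeof(path), "/proc/%d/smaps_rollup", (int)pid);
  	f = fopen(path, "r");
  	if (!f)
  		return -1;
  	while (fgets(line, sizeof(line), f))
  		fputs(line, stdout);	/* e.g. "Rss:  123456 kB" */
  	fclose(f);
  	return 0;
  }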

Although there are ongoing mmap_lock optimizations such as range-based
locks, the lock applied to smaps_rollup would still be the
coarse-grained one, which cannot avoid the aforementioned issues.
Detecting write attempts on mmap_lock and temporarily releasing the
lock in smaps_rollup is therefore still necessary.

Change since v1:
- If the current VMA is freed after dropping the lock, an incomplete
  result will be returned. To fix this issue, refine the code flow as
  suggested by Steve. [1]

Change since v2:
- When getting back the mmap lock, the address where the walk stopped
  last time could now be in the middle of a VMA. Add one more check to
  handle this case, as suggested by Michel. [2]

[1] https://lore.kernel.org/lkml/bf40676e-b14b-44cd-75ce-419c70194783@arm.com/
[2] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/


Chinwen Chang (3):
  mmap locking API: add mmap_lock_is_contended()
  mm: smaps*: extend smap_gather_stats to support specified beginning
  mm: proc: smaps_rollup: do not stall write attempts on mmap_lock

 fs/proc/task_mmu.c        | 101 ++++++++++++++++++++++++++++++++++----
 include/linux/mmap_lock.h |   5 ++
 2 files changed, 96 insertions(+), 10 deletions(-)

* [PATCH v3 1/3] mmap locking API: add mmap_lock_is_contended()
  2020-08-15  6:20 [PATCH v3 0/3] Try to release mmap_lock temporarily in smaps_rollup Chinwen Chang
@ 2020-08-15  6:20 ` Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 2/3] mm: smaps*: extend smap_gather_stats to support specified beginning Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock Chinwen Chang
  2 siblings, 0 replies; 7+ messages in thread
From: Chinwen Chang @ 2020-08-15  6:20 UTC (permalink / raw)
  To: Matthias Brugger, Michel Lespinasse, Andrew Morton,
	Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson,
	Huang Ying, Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

Add a new API to query whether someone is waiting to acquire mmap_lock
for write.

Using this instead of rwsem_is_contended makes the callers more
tolerant of future changes to the lock type.
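
For illustration, a caller holding the lock for a long read-side walk
could back off like this (hypothetical snippet, not part of this
patch):

  	if (mmap_lock_is_contended(mm)) {
  		/* A writer is waiting: drop the lock and retake it. */
  		mmap_read_unlock(mm);
  		if (mmap_read_lock_killable(mm))
  			return -EINTR;
  		/* Anything derived under the old lock must be revalidated. */
  	}

Patch 3 uses this pattern in show_smaps_rollup().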

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Michel Lespinasse <walken@google.com>
---
 include/linux/mmap_lock.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 0707671..18e7eae 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -87,4 +87,9 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
 	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
 }
 
+static inline int mmap_lock_is_contended(struct mm_struct *mm)
+{
+	return rwsem_is_contended(&mm->mmap_lock);
+}
+
 #endif /* _LINUX_MMAP_LOCK_H */
-- 
1.9.1

* [PATCH v3 2/3] mm: smaps*: extend smap_gather_stats to support specified beginning
  2020-08-15  6:20 [PATCH v3 0/3] Try to release mmap_lock temporarily in smaps_rollup Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 1/3] mmap locking API: add mmap_lock_is_contended() Chinwen Chang
@ 2020-08-15  6:20 ` Chinwen Chang
  2020-08-17  8:38   ` Steven Price
  2020-08-15  6:20 ` [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock Chinwen Chang
  2 siblings, 1 reply; 7+ messages in thread
From: Chinwen Chang @ 2020-08-15  6:20 UTC (permalink / raw)
  To: Matthias Brugger, Michel Lespinasse, Andrew Morton,
	Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson,
	Huang Ying, Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

Extend smap_gather_stats to support a specified beginning address at
which it should start gathering. To achieve this, add a new parameter
@start assigned by the caller and refactor the function for simplicity.

If @start is 0, the range of @vma is used for gathering.
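
For example, the two call forms look like this (the resume address is
illustrative):

  	smap_gather_stats(vma, &mss, 0);	/* whole VMA, as before */
  	smap_gather_stats(vma, &mss, addr);	/* resume at addr within the VMA */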

Change since v2:
- This is a new change to make the retry behavior of smaps_rollup
  more complete, as suggested by Michel. [1]

[1] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
CC: Michel Lespinasse <walken@google.com>
CC: Steven Price <steven.price@arm.com>
---
 fs/proc/task_mmu.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dbda449..76e623a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -723,9 +723,21 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 	.pte_hole		= smaps_pte_hole,
 };
 
+/*
+ * Gather mem stats from @vma with the indicated beginning
+ * address @start, and keep them in @mss.
+ *
+ * Use vm_start of @vma as the beginning address if @start is 0.
+ */
 static void smap_gather_stats(struct vm_area_struct *vma,
-			     struct mem_size_stats *mss)
+		struct mem_size_stats *mss, unsigned long start)
 {
+	const struct mm_walk_ops *ops = &smaps_walk_ops;
+
+	/* Invalid start */
+	if (start >= vma->vm_end)
+		return;
+
 #ifdef CONFIG_SHMEM
 	/* In case of smaps_rollup, reset the value from previous vma */
 	mss->check_shmem_swap = false;
@@ -742,18 +754,20 @@ static void smap_gather_stats(struct vm_area_struct *vma,
 		 */
 		unsigned long shmem_swapped = shmem_swap_usage(vma);
 
-		if (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
-					!(vma->vm_flags & VM_WRITE)) {
+		if (!start && (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
+					!(vma->vm_flags & VM_WRITE))) {
 			mss->swap += shmem_swapped;
 		} else {
 			mss->check_shmem_swap = true;
-			walk_page_vma(vma, &smaps_shmem_walk_ops, mss);
-			return;
+			ops = &smaps_shmem_walk_ops;
 		}
 	}
 #endif
 	/* mmap_lock is held in m_start */
-	walk_page_vma(vma, &smaps_walk_ops, mss);
+	if (!start)
+		walk_page_vma(vma, ops, mss);
+	else
+		walk_page_range(vma->vm_mm, start, vma->vm_end, ops, mss);
 }
 
 #define SEQ_PUT_DEC(str, val) \
@@ -805,7 +819,7 @@ static int show_smap(struct seq_file *m, void *v)
 
 	memset(&mss, 0, sizeof(mss));
 
-	smap_gather_stats(vma, &mss);
+	smap_gather_stats(vma, &mss, 0);
 
 	show_map_vma(m, vma);
 
@@ -854,7 +868,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 	hold_task_mempolicy(priv);
 
 	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
-		smap_gather_stats(vma, &mss);
+		smap_gather_stats(vma, &mss, 0);
 		last_vma_end = vma->vm_end;
 	}
 
-- 
1.9.1

* [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
  2020-08-15  6:20 [PATCH v3 0/3] Try to release mmap_lock temporarily in smaps_rollup Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 1/3] mmap locking API: add mmap_lock_is_contended() Chinwen Chang
  2020-08-15  6:20 ` [PATCH v3 2/3] mm: smaps*: extend smap_gather_stats to support specified beginning Chinwen Chang
@ 2020-08-15  6:20 ` Chinwen Chang
  2020-08-17  8:38   ` Steven Price
  2 siblings, 1 reply; 7+ messages in thread
From: Chinwen Chang @ 2020-08-15  6:20 UTC (permalink / raw)
  To: Matthias Brugger, Michel Lespinasse, Andrew Morton,
	Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson,
	Huang Ying, Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

smaps_rollup grabs mmap_lock and walks the whole VMA list until it
finishes iterating. For large processes, mmap_lock is held for a long
time, which may block other write requests, such as mmap and munmap,
from progressing smoothly.

There are upcoming mmap_lock optimizations like range-based locks, but
the lock applied to smaps_rollup would be the coarse-grained type,
which does not avoid the unpleasant contention.

To solve the aforementioned issue, add a check which detects whether
anyone is waiting to grab mmap_lock for write. If so, release the lock
temporarily so the writer can make progress.

Change since v1:
- If the current VMA is freed after dropping the lock, an incomplete
  result will be returned. To fix this issue, refine the code flow as
  suggested by Steve. [1]

Change since v2:
- When getting back the mmap lock, the address where the walk stopped
  last time could now be in the middle of a VMA. Add one more check to
  handle this case, as suggested by Michel. [2]

[1] https://lore.kernel.org/lkml/bf40676e-b14b-44cd-75ce-419c70194783@arm.com/
[2] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
CC: Steven Price <steven.price@arm.com>
CC: Michel Lespinasse <walken@google.com>
---
 fs/proc/task_mmu.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 76e623a..945904e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -846,7 +846,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 	struct mem_size_stats mss;
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
-	unsigned long last_vma_end = 0;
+	unsigned long last_vma_end = 0, last_stopped = 0;
 	int ret = 0;
 
 	priv->task = get_proc_task(priv->inode);
@@ -867,9 +867,76 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 
 	hold_task_mempolicy(priv);
 
-	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
-		smap_gather_stats(vma, &mss, 0);
+	for (vma = priv->mm->mmap; vma;) {
+		smap_gather_stats(vma, &mss, last_stopped);
+		last_stopped = 0;
 		last_vma_end = vma->vm_end;
+
+		/*
+		 * Release mmap_lock temporarily if someone wants to
+		 * access it for write request.
+		 */
+		if (mmap_lock_is_contended(mm)) {
+			mmap_read_unlock(mm);
+			ret = mmap_read_lock_killable(mm);
+			if (ret) {
+				release_task_mempolicy(priv);
+				goto out_put_mm;
+			}
+
+			/*
+			 * After dropping the lock, there are four cases to
+			 * consider. See the following example for explanation.
+			 *
+			 *   +------+------+-----------+
+			 *   | VMA1 | VMA2 | VMA3      |
+			 *   +------+------+-----------+
+			 *   |      |      |           |
+			 *  4k     8k     16k         400k
+			 *
+			 * Suppose we drop the lock after reading VMA2 due to
+			 * contention, then we get:
+			 *
+			 *	last_vma_end = 16k
+			 *
+			 * 1) VMA2 is freed, but VMA3 exists:
+			 *
+			 *    find_vma(mm, 16k - 1) will return VMA3.
+			 *    In this case, just continue from VMA3.
+			 *
+			 * 2) VMA2 still exists:
+			 *
+			 *    find_vma(mm, 16k - 1) will return VMA2.
+			 *    Iterate the loop like the original one.
+			 *
+			 * 3) No more VMAs can be found:
+			 *
+			 *    find_vma(mm, 16k - 1) will return NULL.
+			 *    No more things to do, just break.
+			 *
+			 * 4) (last_vma_end - 1) is the middle of a vma (VMA'):
+			 *
+			 *    find_vma(mm, 16k - 1) will return VMA' whose range
+			 *    contains last_vma_end.
+			 *    Iterate VMA' from last_vma_end.
+			 */
+			vma = find_vma(mm, last_vma_end - 1);
+			/* Case 3 above */
+			if (!vma)
+				break;
+
+			/* Case 1 above */
+			if (vma->vm_start >= last_vma_end)
+				continue;
+
+			/* Case 4 above */
+			if (vma->vm_end > last_vma_end) {
+				last_stopped = last_vma_end;
+				continue;
+			}
+		}
+		/* Case 2 above */
+		vma = vma->vm_next;
 	}
 
 	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
-- 
1.9.1

* Re: [PATCH v3 2/3] mm: smaps*: extend smap_gather_stats to support specified beginning
  2020-08-15  6:20 ` [PATCH v3 2/3] mm: smaps*: extend smap_gather_stats to support specified beginning Chinwen Chang
@ 2020-08-17  8:38   ` Steven Price
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Price @ 2020-08-17  8:38 UTC (permalink / raw)
  To: Chinwen Chang, Matthias Brugger, Michel Lespinasse,
	Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Song Liu, Jimmy Assarsson, Huang Ying,
	Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

On 15/08/2020 07:20, Chinwen Chang wrote:
> Extend smap_gather_stats to support indicated beginning address at
> which it should start gathering. To achieve the goal, we add a new
> parameter @start assigned by the caller and try to refactor it for
> simplicity.
> 
> If @start is 0, it will use the range of @vma for gathering.
> 
> Change since v2:
> - This is a new change to make the retry behavior of smaps_rollup
> - more complete as suggested by Michel [1]
> 
> [1] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/
> 
> Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> CC: Michel Lespinasse <walken@google.com>
> CC: Steven Price <steven.price@arm.com>

LGTM

Reviewed-by: Steven Price <steven.price@arm.com>

Steve

> ---
>   fs/proc/task_mmu.c | 30 ++++++++++++++++++++++--------
>   1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index dbda449..76e623a 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -723,9 +723,21 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
>   	.pte_hole		= smaps_pte_hole,
>   };
>   
> +/*
> + * Gather mem stats from @vma with the indicated beginning
> + * address @start, and keep them in @mss.
> + *
> + * Use vm_start of @vma as the beginning address if @start is 0.
> + */
>   static void smap_gather_stats(struct vm_area_struct *vma,
> -			     struct mem_size_stats *mss)
> +		struct mem_size_stats *mss, unsigned long start)
>   {
> +	const struct mm_walk_ops *ops = &smaps_walk_ops;
> +
> +	/* Invalid start */
> +	if (start >= vma->vm_end)
> +		return;
> +
>   #ifdef CONFIG_SHMEM
>   	/* In case of smaps_rollup, reset the value from previous vma */
>   	mss->check_shmem_swap = false;
> @@ -742,18 +754,20 @@ static void smap_gather_stats(struct vm_area_struct *vma,
>   		 */
>   		unsigned long shmem_swapped = shmem_swap_usage(vma);
>   
> -		if (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
> -					!(vma->vm_flags & VM_WRITE)) {
> +		if (!start && (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
> +					!(vma->vm_flags & VM_WRITE))) {
>   			mss->swap += shmem_swapped;
>   		} else {
>   			mss->check_shmem_swap = true;
> -			walk_page_vma(vma, &smaps_shmem_walk_ops, mss);
> -			return;
> +			ops = &smaps_shmem_walk_ops;
>   		}
>   	}
>   #endif
>   	/* mmap_lock is held in m_start */
> -	walk_page_vma(vma, &smaps_walk_ops, mss);
> +	if (!start)
> +		walk_page_vma(vma, ops, mss);
> +	else
> +		walk_page_range(vma->vm_mm, start, vma->vm_end, ops, mss);
>   }
>   
>   #define SEQ_PUT_DEC(str, val) \
> @@ -805,7 +819,7 @@ static int show_smap(struct seq_file *m, void *v)
>   
>   	memset(&mss, 0, sizeof(mss));
>   
> -	smap_gather_stats(vma, &mss);
> +	smap_gather_stats(vma, &mss, 0);
>   
>   	show_map_vma(m, vma);
>   
> @@ -854,7 +868,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
>   	hold_task_mempolicy(priv);
>   
>   	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> -		smap_gather_stats(vma, &mss);
> +		smap_gather_stats(vma, &mss, 0);
>   		last_vma_end = vma->vm_end;
>   	}
>   
> 



* Re: [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
  2020-08-15  6:20 ` [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock Chinwen Chang
@ 2020-08-17  8:38   ` Steven Price
  2020-08-17  9:15     ` Chinwen Chang
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Price @ 2020-08-17  8:38 UTC (permalink / raw)
  To: Chinwen Chang, Matthias Brugger, Michel Lespinasse,
	Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso,
	Alexey Dobriyan, Matthew Wilcox (Oracle),
	Jason Gunthorpe, Song Liu, Jimmy Assarsson, Huang Ying,
	Daniel Kiss, Laurent Dufour
  Cc: linux-fsdevel, linux-mediatek, linux-kernel, linux-arm-kernel,
	wsd_upstream

On 15/08/2020 07:20, Chinwen Chang wrote:
> smaps_rollup will try to grab mmap_lock and go through the whole vma
> list until it finishes the iterating. When encountering large processes,
> the mmap_lock will be held for a longer time, which may block other
> write requests like mmap and munmap from progressing smoothly.
> 
> There are upcoming mmap_lock optimizations like range-based locks, but
> the lock applied to smaps_rollup would be the coarse type, which doesn't
> avoid the occurrence of unpleasant contention.
> 
> To solve aforementioned issue, we add a check which detects whether
> anyone wants to grab mmap_lock for write attempts.
> 
> Change since v1:
> - If current VMA is freed after dropping the lock, it will return
> - incomplete result. To fix this issue, refine the code flow as
> - suggested by Steve. [1]
> 
> Change since v2:
> - When getting back the mmap lock, the address where you stopped last
> - time could now be in the middle of a vma. Add one more check to handle
> - this case as suggested by Michel. [2]
> 
> [1] https://lore.kernel.org/lkml/bf40676e-b14b-44cd-75ce-419c70194783@arm.com/
> [2] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/
> 
> Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> CC: Steven Price <steven.price@arm.com>
> CC: Michel Lespinasse <walken@google.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>   fs/proc/task_mmu.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 70 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 76e623a..945904e 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -846,7 +846,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
>   	struct mem_size_stats mss;
>   	struct mm_struct *mm;
>   	struct vm_area_struct *vma;
> -	unsigned long last_vma_end = 0;
> +	unsigned long last_vma_end = 0, last_stopped = 0;
>   	int ret = 0;
>   
>   	priv->task = get_proc_task(priv->inode);
> @@ -867,9 +867,76 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
>   
>   	hold_task_mempolicy(priv);
>   
> -	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> -		smap_gather_stats(vma, &mss, 0);
> +	for (vma = priv->mm->mmap; vma;) {
> +		smap_gather_stats(vma, &mss, last_stopped);
> +		last_stopped = 0;
>   		last_vma_end = vma->vm_end;
> +
> +		/*
> +		 * Release mmap_lock temporarily if someone wants to
> +		 * access it for write request.
> +		 */
> +		if (mmap_lock_is_contended(mm)) {
> +			mmap_read_unlock(mm);
> +			ret = mmap_read_lock_killable(mm);
> +			if (ret) {
> +				release_task_mempolicy(priv);
> +				goto out_put_mm;
> +			}
> +
> +			/*
> +			 * After dropping the lock, there are four cases to
> +			 * consider. See the following example for explanation.
> +			 *
> +			 *   +------+------+-----------+
> +			 *   | VMA1 | VMA2 | VMA3      |
> +			 *   +------+------+-----------+
> +			 *   |      |      |           |
> +			 *  4k     8k     16k         400k
> +			 *
> +			 * Suppose we drop the lock after reading VMA2 due to
> +			 * contention, then we get:
> +			 *
> +			 *	last_vma_end = 16k
> +			 *
> +			 * 1) VMA2 is freed, but VMA3 exists:
> +			 *
> +			 *    find_vma(mm, 16k - 1) will return VMA3.
> +			 *    In this case, just continue from VMA3.
> +			 *
> +			 * 2) VMA2 still exists:
> +			 *
> +			 *    find_vma(mm, 16k - 1) will return VMA2.
> +			 *    Iterate the loop like the original one.
> +			 *
> +			 * 3) No more VMAs can be found:
> +			 *
> +			 *    find_vma(mm, 16k - 1) will return NULL.
> +			 *    No more things to do, just break.
> +			 *
> +			 * 4) (last_vma_end - 1) is the middle of a vma (VMA'):
> +			 *
> +			 *    find_vma(mm, 16k - 1) will return VMA' whose range
> +			 *    contains last_vma_end.
> +			 *    Iterate VMA' from last_vma_end.
> +			 */
> +			vma = find_vma(mm, last_vma_end - 1);
> +			/* Case 3 above */
> +			if (!vma)
> +				break;
> +
> +			/* Case 1 above */
> +			if (vma->vm_start >= last_vma_end)
> +				continue;
> +
> +			/* Case 4 above */
> +			if (vma->vm_end > last_vma_end) {
> +				last_stopped = last_vma_end;
> +				continue;

Note that instead of having last_stopped, you could replace the above 
with a direct call:

   smap_gather_stats(vma, &mss, last_vma_end);

I'm not sure which is cleaner though. last_stopped is a bit messy (it's 
easily confused with last_vma_end), but having just the one call site 
for smap_gather_stats() is nice too.
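
Roughly, the case 4 branch would then become (untested sketch; the call
at the top of the loop goes back to passing 0 as @start):

   			/* Case 4 above: gather only the remaining part of
   			 * VMA', then fall through to "vma = vma->vm_next"
   			 * as in case 2.
   			 */
   			if (vma->vm_end > last_vma_end)
   				smap_gather_stats(vma, &mss, last_vma_end);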

Steve

> +			}
> +		}
> +		/* Case 2 above */
> +		vma = vma->vm_next;
>   	}
>   
>   	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
> 



* Re: [PATCH v3 3/3] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
  2020-08-17  8:38   ` Steven Price
@ 2020-08-17  9:15     ` Chinwen Chang
  0 siblings, 0 replies; 7+ messages in thread
From: Chinwen Chang @ 2020-08-17  9:15 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-arm-kernel, Song Liu, Laurent Dufour, wsd_upstream,
	Davidlohr Bueso, linux-kernel, Matthew Wilcox (Oracle),
	Daniel Jordan, Jason Gunthorpe, linux-mediatek, Jimmy Assarsson,
	Huang Ying, Matthias Brugger, linux-fsdevel, Andrew Morton,
	Michel Lespinasse, Alexey Dobriyan, Vlastimil Babka, Daniel Kiss

On Mon, 2020-08-17 at 09:38 +0100, Steven Price wrote:
> On 15/08/2020 07:20, Chinwen Chang wrote:
> > smaps_rollup will try to grab mmap_lock and go through the whole vma
> > list until it finishes the iterating. When encountering large processes,
> > the mmap_lock will be held for a longer time, which may block other
> > write requests like mmap and munmap from progressing smoothly.
> > 
> > There are upcoming mmap_lock optimizations like range-based locks, but
> > the lock applied to smaps_rollup would be the coarse type, which doesn't
> > avoid the occurrence of unpleasant contention.
> > 
> > To solve aforementioned issue, we add a check which detects whether
> > anyone wants to grab mmap_lock for write attempts.
> > 
> > Change since v1:
> > - If current VMA is freed after dropping the lock, it will return
> > - incomplete result. To fix this issue, refine the code flow as
> > - suggested by Steve. [1]
> > 
> > Change since v2:
> > - When getting back the mmap lock, the address where you stopped last
> > - time could now be in the middle of a vma. Add one more check to handle
> > - this case as suggested by Michel. [2]
> > 
> > [1] https://lore.kernel.org/lkml/bf40676e-b14b-44cd-75ce-419c70194783@arm.com/
> > [2] https://lore.kernel.org/lkml/CANN689FtCsC71cjAjs0GPspOhgo_HRj+diWsoU1wr98YPktgWg@mail.gmail.com/
> > 
> > Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> > CC: Steven Price <steven.price@arm.com>
> > CC: Michel Lespinasse <walken@google.com>
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
> > ---
> >   fs/proc/task_mmu.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++---
> >   1 file changed, 70 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 76e623a..945904e 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -846,7 +846,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
> >   	struct mem_size_stats mss;
> >   	struct mm_struct *mm;
> >   	struct vm_area_struct *vma;
> > -	unsigned long last_vma_end = 0;
> > +	unsigned long last_vma_end = 0, last_stopped = 0;
> >   	int ret = 0;
> >   
> >   	priv->task = get_proc_task(priv->inode);
> > @@ -867,9 +867,76 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
> >   
> >   	hold_task_mempolicy(priv);
> >   
> > -	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> > -		smap_gather_stats(vma, &mss, 0);
> > +	for (vma = priv->mm->mmap; vma;) {
> > +		smap_gather_stats(vma, &mss, last_stopped);
> > +		last_stopped = 0;
> >   		last_vma_end = vma->vm_end;
> > +
> > +		/*
> > +		 * Release mmap_lock temporarily if someone wants to
> > +		 * access it for write request.
> > +		 */
> > +		if (mmap_lock_is_contended(mm)) {
> > +			mmap_read_unlock(mm);
> > +			ret = mmap_read_lock_killable(mm);
> > +			if (ret) {
> > +				release_task_mempolicy(priv);
> > +				goto out_put_mm;
> > +			}
> > +
> > +			/*
> > +			 * After dropping the lock, there are four cases to
> > +			 * consider. See the following example for explanation.
> > +			 *
> > +			 *   +------+------+-----------+
> > +			 *   | VMA1 | VMA2 | VMA3      |
> > +			 *   +------+------+-----------+
> > +			 *   |      |      |           |
> > +			 *  4k     8k     16k         400k
> > +			 *
> > +			 * Suppose we drop the lock after reading VMA2 due to
> > +			 * contention, then we get:
> > +			 *
> > +			 *	last_vma_end = 16k
> > +			 *
> > +			 * 1) VMA2 is freed, but VMA3 exists:
> > +			 *
> > +			 *    find_vma(mm, 16k - 1) will return VMA3.
> > +			 *    In this case, just continue from VMA3.
> > +			 *
> > +			 * 2) VMA2 still exists:
> > +			 *
> > +			 *    find_vma(mm, 16k - 1) will return VMA2.
> > +			 *    Iterate the loop like the original one.
> > +			 *
> > +			 * 3) No more VMAs can be found:
> > +			 *
> > +			 *    find_vma(mm, 16k - 1) will return NULL.
> > +			 *    No more things to do, just break.
> > +			 *
> > +			 * 4) (last_vma_end - 1) is the middle of a vma (VMA'):
> > +			 *
> > +			 *    find_vma(mm, 16k - 1) will return VMA' whose range
> > +			 *    contains last_vma_end.
> > +			 *    Iterate VMA' from last_vma_end.
> > +			 */
> > +			vma = find_vma(mm, last_vma_end - 1);
> > +			/* Case 3 above */
> > +			if (!vma)
> > +				break;
> > +
> > +			/* Case 1 above */
> > +			if (vma->vm_start >= last_vma_end)
> > +				continue;
> > +
> > +			/* Case 4 above */
> > +			if (vma->vm_end > last_vma_end) {
> > +				last_stopped = last_vma_end;
> > +				continue;
> 
> Note that instead of having last_stopped, you could replace the above 
> with a direct call:
> 
>    smap_gather_stats(vma, &mss, last_vma_end);
> 
> I'm not sure which is cleaner though. last_stopped is a bit messy (it's 
> easily confused with last_vma_end), but having just the one call site 
> for smap_gather_stats() is nice too.
> 
> Steve
> 

Hi Steve,

I think your idea is better. Let me try that refactoring and send it
out for further review.
Thanks for your kind suggestion :)

Chinwen
> > +			}
> > +		}
> > +		/* Case 2 above */
> > +		vma = vma->vm_next;
> >   	}
> >   
> >   	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
> > 
> 

