* [PATCH 0/2] Try to release mmap_lock temporarily in smaps_rollup
From: Chinwen Chang @ 2020-08-11 4:42 UTC
To: Matthias Brugger, Michel Lespinasse, Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang, Alexey Dobriyan, Matthew Wilcox (Oracle), Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson, Huang Ying
Cc: linux-kernel, linux-arm-kernel, linux-mediatek, linux-fsdevel, wsd_upstream

Recently, we have observed jank caused by long contention on mmap_lock,
which smaps_rollup holds while probing large processes. To address the
problem, we let smaps_rollup detect whether anyone is waiting to acquire
mmap_lock for writing. If so, it releases the lock temporarily to ease
the contention.

smaps_rollup is a procfs interface which allows users to summarize a
process's memory usage without the overhead of seq_* calls. Android uses
it to sample the memory usage of various processes to balance its memory
pool sizes. If no one wants to take the lock for writing, smaps_rollup
with this patch behaves like the original.

Although there are ongoing mmap_lock optimizations such as range-based
locks, the lock taken by smaps_rollup would be the coarse one, which
cannot avoid the issues described above. So detecting write attempts on
mmap_lock in smaps_rollup and temporarily releasing the lock is still
necessary.

Chinwen Chang (2):
  mmap locking API: add mmap_lock_is_contended()
  mm: proc: smaps_rollup: do not stall write attempts on mmap_lock

 fs/proc/task_mmu.c        | 21 +++++++++++++++++++++
 include/linux/mmap_lock.h |  5 +++++
 2 files changed, 26 insertions(+)

* [PATCH 1/2] mmap locking API: add mmap_lock_is_contended()
From: Chinwen Chang @ 2020-08-11 4:42 UTC
To: Matthias Brugger, Michel Lespinasse, Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang, Alexey Dobriyan, Matthew Wilcox (Oracle), Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson, Huang Ying
Cc: linux-kernel, linux-arm-kernel, linux-mediatek, linux-fsdevel, wsd_upstream

Add a new API to query whether someone wants to acquire mmap_lock for
writing. Using this instead of rwsem_is_contended() makes callers more
tolerant of future changes to the lock type.

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
---
 include/linux/mmap_lock.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 0707671..18e7eae 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -87,4 +87,9 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
 	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
 }
 
+static inline int mmap_lock_is_contended(struct mm_struct *mm)
+{
+	return rwsem_is_contended(&mm->mmap_lock);
+}
+
 #endif /* _LINUX_MMAP_LOCK_H */
-- 
1.9.1

* [PATCH 2/2] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
From: Chinwen Chang @ 2020-08-11 4:42 UTC
To: Matthias Brugger, Michel Lespinasse, Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Chinwen Chang, Alexey Dobriyan, Matthew Wilcox (Oracle), Jason Gunthorpe, Steven Price, Song Liu, Jimmy Assarsson, Huang Ying
Cc: linux-kernel, linux-arm-kernel, linux-mediatek, linux-fsdevel, wsd_upstream

smaps_rollup will try to grab mmap_lock and go through the whole vma
list until it finishes the iterating. When encountering large processes,
the mmap_lock will be held for a longer time, which may block other
write requests like mmap and munmap from progressing smoothly.

There are upcoming mmap_lock optimizations like range-based locks, but
the lock applied to smaps_rollup would be the coarse type, which doesn't
avoid the occurrence of unpleasant contention.

To solve aforementioned issue, we add a check which detects whether
anyone wants to grab mmap_lock for write attempts.

Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
---
 fs/proc/task_mmu.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index dbda449..4b51f25 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -856,6 +856,27 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
 		smap_gather_stats(vma, &mss);
 		last_vma_end = vma->vm_end;
+
+		/*
+		 * Release mmap_lock temporarily if someone wants to
+		 * access it for write request.
+		 */
+		if (mmap_lock_is_contended(mm)) {
+			mmap_read_unlock(mm);
+			ret = mmap_read_lock_killable(mm);
+			if (ret) {
+				release_task_mempolicy(priv);
+				goto out_put_mm;
+			}
+
+			/* Check whether current vma is available */
+			vma = find_vma(mm, last_vma_end - 1);
+			if (vma && vma->vm_start < last_vma_end)
+				continue;
+
+			/* Current vma is not available, just break */
+			break;
+		}
 	}
 
 	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
-- 
1.9.1

* Re: [PATCH 2/2] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
From: Steven Price @ 2020-08-12 8:39 UTC
To: Chinwen Chang, Matthias Brugger, Michel Lespinasse, Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Alexey Dobriyan, Matthew Wilcox (Oracle), Jason Gunthorpe, Song Liu, Jimmy Assarsson, Huang Ying
Cc: linux-kernel, linux-arm-kernel, linux-mediatek, linux-fsdevel, wsd_upstream

On 11/08/2020 05:42, Chinwen Chang wrote:
> smaps_rollup will try to grab mmap_lock and go through the whole vma
> list until it finishes the iterating. When encountering large processes,
> the mmap_lock will be held for a longer time, which may block other
> write requests like mmap and munmap from progressing smoothly.
>
> There are upcoming mmap_lock optimizations like range-based locks, but
> the lock applied to smaps_rollup would be the coarse type, which doesn't
> avoid the occurrence of unpleasant contention.
>
> To solve aforementioned issue, we add a check which detects whether
> anyone wants to grab mmap_lock for write attempts.
>
> Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> ---
>  fs/proc/task_mmu.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index dbda449..4b51f25 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -856,6 +856,27 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
>  	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
>  		smap_gather_stats(vma, &mss);
>  		last_vma_end = vma->vm_end;
> +
> +		/*
> +		 * Release mmap_lock temporarily if someone wants to
> +		 * access it for write request.
> +		 */
> +		if (mmap_lock_is_contended(mm)) {
> +			mmap_read_unlock(mm);
> +			ret = mmap_read_lock_killable(mm);
> +			if (ret) {
> +				release_task_mempolicy(priv);
> +				goto out_put_mm;
> +			}
> +
> +			/* Check whether current vma is available */
> +			vma = find_vma(mm, last_vma_end - 1);
> +			if (vma && vma->vm_start < last_vma_end)

I may be wrong, but this looks like it could return incorrect results.
For example if we start reading with the following VMAs:

  +------+------+-----------+
  | VMA1 | VMA2 | VMA3      |
  +------+------+-----------+
  |      |      |           |
 4k     8k     16k         400k

Then after reading VMA2 we drop the lock due to contention. So:

  last_vma_end = 16k

Then if VMA2 is freed while the lock is dropped, so we have:

  +------+      +-----------+
  | VMA1 |      | VMA3      |
  +------+      +-----------+
  |      |      |           |
 4k     8k     16k         400k

find_vma(mm, 16k-1) will then return VMA3 and the condition vm_start <
last_vma_end will be false.

> +				continue;
> +
> +			/* Current vma is not available, just break */
> +			break;

Which means we break out here and report an incomplete output (the
numbers will be much smaller than reality).

Would it be better to have a loop like:

	for (vma = priv->mm->mmap; vma;) {
		smap_gather_stats(vma, &mss);
		last_vma_end = vma->vm_end;

		if (contended) {
			/* drop/acquire lock */

			vma = find_vma(mm, last_vma_end - 1);
			if (!vma)
				break;
			if (vma->vm_start >= last_vma_end)
				continue;
		}
		vma = vma->vm_next;
	}

that way if the VMA is removed while the lock is dropped the loop can
just continue from the next VMA.

Or perhaps I missed something obvious? I haven't actually tested
anything above.

Steve

> +		}
>  	}
>
>  	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
>

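The failure described in the review above can be reproduced outside the kernel. What follows is a minimal userspace sketch in C, not kernel code. struct vma, struct mm, unmap_at(), build_example(), walk_posted() and walk_suggested() are hypothetical stand-ins invented for illustration; contention is simulated by a drop_at_end argument instead of a real lock, and a plain byte count stands in for smap_gather_stats(). Only find_vma() is meant to mirror the kernel contract (return the first VMA whose vm_end is greater than addr).

```c
#include <stddef.h>

#define KB 1024UL

/* Toy stand-in for vm_area_struct: half-open range [vm_start, vm_end). */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
	struct vma *vm_next;
};

/* Toy stand-in for mm_struct: head of an address-sorted VMA list. */
struct mm {
	struct vma *mmap;
};

/* Mirrors the kernel contract: first VMA with vm_end > addr, or NULL. */
static struct vma *find_vma(struct mm *mm, unsigned long addr)
{
	struct vma *v;

	for (v = mm->mmap; v; v = v->vm_next)
		if (v->vm_end > addr)
			return v;
	return NULL;
}

/* Hypothetical: unlink the VMA starting at 'start', as a writer might. */
static void unmap_at(struct mm *mm, unsigned long start)
{
	struct vma **pp;

	for (pp = &mm->mmap; *pp; pp = &(*pp)->vm_next)
		if ((*pp)->vm_start == start) {
			*pp = (*pp)->vm_next;
			return;
		}
}

/* Hypothetical: the 4k/8k/16k/400k layout from the review. */
static void build_example(struct mm *mm, struct vma v[3])
{
	v[0] = (struct vma){  4 * KB,   8 * KB, &v[1] };
	v[1] = (struct vma){  8 * KB,  16 * KB, &v[2] };
	v[2] = (struct vma){ 16 * KB, 400 * KB, NULL };
	mm->mmap = &v[0];
}

/* Loop shape of the posted patch; returns the bytes accounted for. */
static unsigned long walk_posted(struct mm *mm, unsigned long drop_at_end,
				 unsigned long victim_start)
{
	unsigned long total = 0, last_vma_end;
	struct vma *vma;

	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		total += vma->vm_end - vma->vm_start;
		last_vma_end = vma->vm_end;
		if (last_vma_end == drop_at_end) {	/* simulated contention */
			unmap_at(mm, victim_start);	/* writer runs here */
			vma = find_vma(mm, last_vma_end - 1);
			if (vma && vma->vm_start < last_vma_end)
				continue;
			break;		/* current VMA gone: walk stops early */
		}
	}
	return total;
}

/* Loop shape suggested in the review; returns the bytes accounted for. */
static unsigned long walk_suggested(struct mm *mm, unsigned long drop_at_end,
				    unsigned long victim_start)
{
	unsigned long total = 0, last_vma_end;
	struct vma *vma;

	for (vma = mm->mmap; vma;) {
		total += vma->vm_end - vma->vm_start;
		last_vma_end = vma->vm_end;
		if (last_vma_end == drop_at_end) {	/* simulated contention */
			unmap_at(mm, victim_start);	/* writer runs here */
			vma = find_vma(mm, last_vma_end - 1);
			if (!vma)
				break;
			if (vma->vm_start >= last_vma_end)
				continue;	/* already the next VMA */
		}
		vma = vma->vm_next;
	}
	return total;
}
```

With the layout from build_example(), dropping the lock after VMA2 (drop_at_end = 16 * KB) and removing VMA2 during the drop (victim_start = 8 * KB), walk_posted() accounts for 12k and stops, while walk_suggested() carries on to VMA3 and accounts for 396k (VMA2 had already been counted before it went away). When nothing is unmapped during the drop, both walks account for 396k.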
* Re: [PATCH 2/2] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock
From: Chinwen Chang @ 2020-08-12 9:26 UTC
To: Steven Price
Cc: Matthias Brugger, Michel Lespinasse, Andrew Morton, Vlastimil Babka, Daniel Jordan, Davidlohr Bueso, Alexey Dobriyan, Matthew Wilcox (Oracle), Jason Gunthorpe, Song Liu, Jimmy Assarsson, Huang Ying, linux-kernel, linux-arm-kernel, linux-mediatek, linux-fsdevel, wsd_upstream

On Wed, 2020-08-12 at 09:39 +0100, Steven Price wrote:
> On 11/08/2020 05:42, Chinwen Chang wrote:
> > smaps_rollup will try to grab mmap_lock and go through the whole vma
> > list until it finishes the iterating. When encountering large processes,
> > the mmap_lock will be held for a longer time, which may block other
> > write requests like mmap and munmap from progressing smoothly.
> >
> > There are upcoming mmap_lock optimizations like range-based locks, but
> > the lock applied to smaps_rollup would be the coarse type, which doesn't
> > avoid the occurrence of unpleasant contention.
> >
> > To solve aforementioned issue, we add a check which detects whether
> > anyone wants to grab mmap_lock for write attempts.
> >
> > Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
> > ---
> >  fs/proc/task_mmu.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index dbda449..4b51f25 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -856,6 +856,27 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
> >  	for (vma = priv->mm->mmap; vma; vma = vma->vm_next) {
> >  		smap_gather_stats(vma, &mss);
> >  		last_vma_end = vma->vm_end;
> > +
> > +		/*
> > +		 * Release mmap_lock temporarily if someone wants to
> > +		 * access it for write request.
> > +		 */
> > +		if (mmap_lock_is_contended(mm)) {
> > +			mmap_read_unlock(mm);
> > +			ret = mmap_read_lock_killable(mm);
> > +			if (ret) {
> > +				release_task_mempolicy(priv);
> > +				goto out_put_mm;
> > +			}
> > +
> > +			/* Check whether current vma is available */
> > +			vma = find_vma(mm, last_vma_end - 1);
> > +			if (vma && vma->vm_start < last_vma_end)
>
> I may be wrong, but this looks like it could return incorrect results.
> For example if we start reading with the following VMAs:
>
>   +------+------+-----------+
>   | VMA1 | VMA2 | VMA3      |
>   +------+------+-----------+
>   |      |      |           |
>  4k     8k     16k         400k
>
> Then after reading VMA2 we drop the lock due to contention. So:
>
>   last_vma_end = 16k
>
> Then if VMA2 is freed while the lock is dropped, so we have:
>
>   +------+      +-----------+
>   | VMA1 |      | VMA3      |
>   +------+      +-----------+
>   |      |      |           |
>  4k     8k     16k         400k
>
> find_vma(mm, 16k-1) will then return VMA3 and the condition vm_start <
> last_vma_end will be false.
>
Hi Steve,

Thank you for reviewing this patch.

You are correct. If contention is detected and the current vma (here,
VMA2) is freed while the lock is dropped, the loop will report an
incomplete result.

> > +				continue;
> > +
> > +			/* Current vma is not available, just break */
> > +			break;
>
> Which means we break out here and report an incomplete output (the
> numbers will be much smaller than reality).
>
> Would it be better to have a loop like:
>
> 	for (vma = priv->mm->mmap; vma;) {
> 		smap_gather_stats(vma, &mss);
> 		last_vma_end = vma->vm_end;
>
> 		if (contended) {
> 			/* drop/acquire lock */
>
> 			vma = find_vma(mm, last_vma_end - 1);
> 			if (!vma)
> 				break;
> 			if (vma->vm_start >= last_vma_end)
> 				continue;
> 		}
> 		vma = vma->vm_next;
> 	}
>
> that way if the VMA is removed while the lock is dropped the loop can
> just continue from the next VMA.
>
Thanks a lot for your great suggestion.

> Or perhaps I missed something obvious? I haven't actually tested
> anything above.
>
> Steve

I will prepare a new patch series for further review.

Thank you.
Chinwen
>
> > +		}
> >  	}
> >
> >  	show_vma_header_prefix(m, priv->mm->mmap->vm_start,
> >
>

Thread overview: 5+ messages
2020-08-11 4:42 [PATCH 0/2] Try to release mmap_lock temporarily in smaps_rollup Chinwen Chang
2020-08-11 4:42 ` [PATCH 1/2] mmap locking API: add mmap_lock_is_contended() Chinwen Chang
2020-08-11 4:42 ` [PATCH 2/2] mm: proc: smaps_rollup: do not stall write attempts on mmap_lock Chinwen Chang
2020-08-12 8:39   ` Steven Price
2020-08-12 9:26     ` Chinwen Chang