* Re: A possible bug: Calling mutex_lock while holding spinlock
[not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
@ 2017-08-04 13:49 ` Kirill A. Shutemov
2017-08-04 14:03 ` axie
0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-04 13:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: axie, Alex Deucher, Writer, Tim, linux-mm
On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>
> (cc Kirill)
>
> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>
> > Hi Andrew,
> >
> >
> > I got a report yesterday with "BUG: sleeping function called from
> > invalid context at kernel/locking/mutex.c"
> >
> > I checked the relevant functions for the issue. Function
> > page_vma_mapped_walk did acquire spinlock. Later, in MMU notifier,
> > amdgpu_mn_invalidate_page called function mutex_lock, which triggered
> > the "bug".
> >
> > Function page_vma_mapped_walk was introduced recently by you in commit
> > c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
> > ace71a19cec5eb430207c3269d8a2683f0574306.
> >
> > Would you advise how to proceed with this bug? Change
> > page_vma_mapped_walk not to use spinlock? Or change
> > amdgpu_mn_invalidate_page to use spinlock to meet the change, or
> > something else?
> >
>
> hm, as far as I can tell this was an unintended side-effect of
> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
> page_vma_mapped_walk()"). Before that patch,
> mmu_notifier_invalidate_page() was not called under page_table_lock.
> After that patch, mmu_notifier_invalidate_page() is called under
> page_table_lock.
>
> Perhaps Kirill can suggest a fix?
Sorry for this.
What about the patch below?
* Re: A possible bug: Calling mutex_lock while holding spinlock
2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
@ 2017-08-04 14:03 ` axie
2017-08-08 16:51 ` axie
0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-04 14:03 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Morton
Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin
Hi Kirill,
Thanks for the patch. I have sent the patch to the user asking whether
he can give it a try.
Regards,
Alex (Bin) Xie
On 2017-08-04 09:49 AM, Kirill A. Shutemov wrote:
> On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>> (cc Kirill)
>>
>> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>>
>>> Hi Andrew,
>>>
>>>
>>> I got a report yesterday with "BUG: sleeping function called from
>>> invalid context at kernel/locking/mutex.c"
>>>
>>> I checked the relevant functions for the issue. Function
>>> page_vma_mapped_walk did acquire spinlock. Later, in MMU notifier,
>>> amdgpu_mn_invalidate_page called function mutex_lock, which triggered
>>> the "bug".
>>>
>>> Function page_vma_mapped_walk was introduced recently by you in commit
>>> c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
>>> ace71a19cec5eb430207c3269d8a2683f0574306.
>>>
>>> Would you advise how to proceed with this bug? Change
>>> page_vma_mapped_walk not to use spinlock? Or change
>>> amdgpu_mn_invalidate_page to use spinlock to meet the change, or
>>> something else?
>>>
>> hm, as far as I can tell this was an unintended side-effect of
>> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
>> page_vma_mapped_walk()"). Before that patch,
>> mmu_notifier_invalidate_page() was not called under page_table_lock.
>> After that patch, mmu_notifier_invalidate_page() is called under
>> page_table_lock.
>>
>> Perhaps Kirill can suggest a fix?
> Sorry for this.
>
> What about the patch below?
>
> From f48dbcdd0ed83dee9a157062b7ca1e2915172678 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Fri, 4 Aug 2017 16:37:26 +0300
> Subject: [PATCH] rmap: do not call mmu_notifier_invalidate_page() under ptl
>
> MMU notifiers can sleep, but in page_mkclean_one() we call
> mmu_notifier_invalidate_page() under page table lock.
>
> Let's instead use mmu_notifier_invalidate_range() outside the
> page_vma_mapped_walk() loop.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> ---
>  mm/rmap.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index ced14f1af6dc..b4b711a82c01 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -852,10 +852,10 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		.flags = PVMW_SYNC,
>  	};
>  	int *cleaned = arg;
> +	bool invalidation_needed = false;
>
>  	while (page_vma_mapped_walk(&pvmw)) {
>  		int ret = 0;
> -		address = pvmw.address;
>  		if (pvmw.pte) {
>  			pte_t entry;
>  			pte_t *pte = pvmw.pte;
> @@ -863,11 +863,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pte_dirty(*pte) && !pte_write(*pte))
>  				continue;
>
> -			flush_cache_page(vma, address, pte_pfn(*pte));
> -			entry = ptep_clear_flush(vma, address, pte);
> +			flush_cache_page(vma, pvmw.address, pte_pfn(*pte));
> +			entry = ptep_clear_flush(vma, pvmw.address, pte);
>  			entry = pte_wrprotect(entry);
>  			entry = pte_mkclean(entry);
> -			set_pte_at(vma->vm_mm, address, pte, entry);
> +			set_pte_at(vma->vm_mm, pvmw.address, pte, entry);
>  			ret = 1;
>  		} else {
>  #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
> @@ -877,11 +877,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
>  				continue;
>
> -			flush_cache_page(vma, address, page_to_pfn(page));
> -			entry = pmdp_huge_clear_flush(vma, address, pmd);
> +			flush_cache_page(vma, pvmw.address, page_to_pfn(page));
> +			entry = pmdp_huge_clear_flush(vma, pvmw.address, pmd);
>  			entry = pmd_wrprotect(entry);
>  			entry = pmd_mkclean(entry);
> -			set_pmd_at(vma->vm_mm, address, pmd, entry);
> +			set_pmd_at(vma->vm_mm, pvmw.address, pmd, entry);
>  			ret = 1;
>  #else
>  			/* unexpected pmd-mapped page? */
> @@ -890,11 +890,16 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		}
>
>  		if (ret) {
> -			mmu_notifier_invalidate_page(vma->vm_mm, address);
>  			(*cleaned)++;
> +			invalidation_needed = true;
>  		}
>  	}
>
> +	if (invalidation_needed) {
> +		mmu_notifier_invalidate_range(vma->vm_mm, address,
> +				address + (1UL << compound_order(page)));
> +	}
> +
>  	return true;
>  }
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org
* Re: A possible bug: Calling mutex_lock while holding spinlock
2017-08-04 14:03 ` axie
@ 2017-08-08 16:51 ` axie
2017-08-08 17:01 ` Kirill A. Shutemov
0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-08 16:51 UTC (permalink / raw)
To: Kirill A. Shutemov, Andrew Morton
Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin
Hi Kirill,
Here is the result from the user: "This patch does appear to fix the issue."
Thanks,
Alex (Bin) Xie
On 2017-08-04 10:03 AM, axie wrote:
> Hi Kirill,
>
>
> Thanks for the patch. I have sent the patch to the user asking whether
> he can give it a try.
>
>
> Regards,
>
> Alex (Bin) Xie
>
>
>
> On 2017-08-04 09:49 AM, Kirill A. Shutemov wrote:
>> On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>>> (cc Kirill)
>>>
>>> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>>
>>>> I got a report yesterday with "BUG: sleeping function called from
>>>> invalid context at kernel/locking/mutex.c"
>>>>
>>>> I checked the relevant functions for the issue. Function
>>>> page_vma_mapped_walk did acquire spinlock. Later, in MMU notifier,
>>>> amdgpu_mn_invalidate_page called function mutex_lock, which triggered
>>>> the "bug".
>>>>
>>>> Function page_vma_mapped_walk was introduced recently by you in commit
>>>> c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
>>>> ace71a19cec5eb430207c3269d8a2683f0574306.
>>>>
>>>> Would you advise how to proceed with this bug? Change
>>>> page_vma_mapped_walk not to use spinlock? Or change
>>>> amdgpu_mn_invalidate_page to use spinlock to meet the change, or
>>>> something else?
>>>>
>>> hm, as far as I can tell this was an unintended side-effect of
>>> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
>>> page_vma_mapped_walk()"). Before that patch,
>>> mmu_notifier_invalidate_page() was not called under page_table_lock.
>>> After that patch, mmu_notifier_invalidate_page() is called under
>>> page_table_lock.
>>>
>>> Perhaps Kirill can suggest a fix?
>> Sorry for this.
>>
>> What about the patch below?
>>
>> From f48dbcdd0ed83dee9a157062b7ca1e2915172678 Mon Sep 17 00:00:00 2001
>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>> Date: Fri, 4 Aug 2017 16:37:26 +0300
>> Subject: [PATCH] rmap: do not call mmu_notifier_invalidate_page() under ptl
>>
>> MMU notifiers can sleep, but in page_mkclean_one() we call
>> mmu_notifier_invalidate_page() under page table lock.
>>
>> Let's instead use mmu_notifier_invalidate_range() outside the
>> page_vma_mapped_walk() loop.
>>
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>> ---
>>  mm/rmap.c | 21 +++++++++++++--------
>>  1 file changed, 13 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index ced14f1af6dc..b4b711a82c01 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -852,10 +852,10 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>>  		.flags = PVMW_SYNC,
>>  	};
>>  	int *cleaned = arg;
>> +	bool invalidation_needed = false;
>>
>>  	while (page_vma_mapped_walk(&pvmw)) {
>>  		int ret = 0;
>> -		address = pvmw.address;
>>  		if (pvmw.pte) {
>>  			pte_t entry;
>>  			pte_t *pte = pvmw.pte;
>> @@ -863,11 +863,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>>  			if (!pte_dirty(*pte) && !pte_write(*pte))
>>  				continue;
>>
>> -			flush_cache_page(vma, address, pte_pfn(*pte));
>> -			entry = ptep_clear_flush(vma, address, pte);
>> +			flush_cache_page(vma, pvmw.address, pte_pfn(*pte));
>> +			entry = ptep_clear_flush(vma, pvmw.address, pte);
>>  			entry = pte_wrprotect(entry);
>>  			entry = pte_mkclean(entry);
>> -			set_pte_at(vma->vm_mm, address, pte, entry);
>> +			set_pte_at(vma->vm_mm, pvmw.address, pte, entry);
>>  			ret = 1;
>>  		} else {
>>  #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
>> @@ -877,11 +877,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>>  			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
>>  				continue;
>>
>> -			flush_cache_page(vma, address, page_to_pfn(page));
>> -			entry = pmdp_huge_clear_flush(vma, address, pmd);
>> +			flush_cache_page(vma, pvmw.address, page_to_pfn(page));
>> +			entry = pmdp_huge_clear_flush(vma, pvmw.address, pmd);
>>  			entry = pmd_wrprotect(entry);
>>  			entry = pmd_mkclean(entry);
>> -			set_pmd_at(vma->vm_mm, address, pmd, entry);
>> +			set_pmd_at(vma->vm_mm, pvmw.address, pmd, entry);
>>  			ret = 1;
>>  #else
>>  			/* unexpected pmd-mapped page? */
>> @@ -890,11 +890,16 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>>  		}
>>
>>  		if (ret) {
>> -			mmu_notifier_invalidate_page(vma->vm_mm, address);
>>  			(*cleaned)++;
>> +			invalidation_needed = true;
>>  		}
>>  	}
>>
>> +	if (invalidation_needed) {
>> +		mmu_notifier_invalidate_range(vma->vm_mm, address,
>> +				address + (1UL << compound_order(page)));
>> +	}
>> +
>>  	return true;
>>  }
>
* Re: A possible bug: Calling mutex_lock while holding spinlock
2017-08-08 16:51 ` axie
@ 2017-08-08 17:01 ` Kirill A. Shutemov
2017-08-08 20:29 ` Kirill A. Shutemov
0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 17:01 UTC (permalink / raw)
To: axie
Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
linux-mm, Xie, AlexBin
On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> Hi Kirill,
>
> Here is the result from the user: "This patch does appear to fix the issue."
Hm. Could you get logs from the failure on the patched kernel?
--
Kirill A. Shutemov
* Re: A possible bug: Calling mutex_lock while holding spinlock
2017-08-08 17:01 ` Kirill A. Shutemov
@ 2017-08-08 20:29 ` Kirill A. Shutemov
0 siblings, 0 replies; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 20:29 UTC (permalink / raw)
To: axie
Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
linux-mm, Xie, AlexBin
On Tue, Aug 08, 2017 at 08:01:27PM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> > Hi Kirill,
> >
> > Here is the result from the user: "This patch does appear to fix the issue."
>
> Hm. Could you get logs from the failure on the patched kernel?
Please ignore. I've misread what you wrote. %)
--
Kirill A. Shutemov
end of thread, other threads:[~2017-08-08 20:29 UTC | newest]
Thread overview: 5+ messages
-- links below jump to the message on this page --
[not found] <2d442de2-c5d4-ecce-2345-4f8f34314247@amd.com>
[not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
2017-08-04 14:03 ` axie
2017-08-08 16:51 ` axie
2017-08-08 17:01 ` Kirill A. Shutemov
2017-08-08 20:29 ` Kirill A. Shutemov