All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>, Roman Gushchin <guro@fb.com>,
	Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeelb@google.com>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	HORIGUCHI NAOYA <naoya.horiguchi@nec.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Waiman Long <longman@redhat.com>, Peter Xu <peterx@redhat.com>,
	Mina Almasry <almasrymina@google.com>,
	Hillf Danton <hdanton@sina.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Barry Song <song.bao.hua@hisilicon.com>,
	Will Deacon <will@kernel.org>
Subject: Re: [External] [PATCH v3 7/8] hugetlb: make free_huge_page irq safe
Date: Sat, 3 Apr 2021 17:00:42 -0700	[thread overview]
Message-ID: <9e146133-16f4-cdfc-0174-fe1ed53cddc7@oracle.com> (raw)
In-Reply-To: <CAMZfGtXjhq41iHDy-9gB5Uf8iYrXV1G_eeDgH1XPL-6WgROcuw@mail.gmail.com>

On 4/2/21 10:59 PM, Muchun Song wrote:
> On Sat, Apr 3, 2021 at 4:56 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> On 4/2/21 5:47 AM, Muchun Song wrote:
>>> On Wed, Mar 31, 2021 at 11:42 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>
>>>> Commit c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in
>>>> non-task context") was added to address the issue of free_huge_page
>>>> being called from irq context.  That commit hands off free_huge_page
>>>> processing to a workqueue if !in_task.  However, this doesn't cover
>>>> all the cases as pointed out by 0day bot lockdep report [1].
>>>>
>>>> :  Possible interrupt unsafe locking scenario:
>>>> :
>>>> :        CPU0                    CPU1
>>>> :        ----                    ----
>>>> :   lock(hugetlb_lock);
>>>> :                                local_irq_disable();
>>>> :                                lock(slock-AF_INET);
>>>> :                                lock(hugetlb_lock);
>>>> :   <Interrupt>
>>>> :     lock(slock-AF_INET);
>>>>
>>>> Shakeel has later explained that this is very likely TCP TX zerocopy
>>>> from hugetlb pages scenario when the networking code drops a last
>>>> reference to hugetlb page while having IRQ disabled. Hugetlb freeing
>>>> path doesn't disable IRQ while holding hugetlb_lock so a lock dependency
>>>> chain can lead to a deadlock.
>>>>
>>>> This commit addresses the issue by doing the following:
>>>> - Make hugetlb_lock irq safe.  This is mostly a simple process of
>>>>   changing spin_*lock calls to spin_*lock_irq* calls.
>>>> - Make subpool lock irq safe in a similar manner.
>>>> - Revert the !in_task check and workqueue handoff.
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/000000000000f1c03b05bc43aadc@google.com/
>>>>
>>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>>>
>>> Hi Mike,
>>>
>>> Today I pulled the newest code (next-20210401). I found that
>>> alloc_and_dissolve_huge_page is not updated. In this function,
>>> hugetlb_lock is still non-irq safe. Maybe you or Oscar need
>>> to fix.
>>>
>>> Thanks.
>>
>> Thank you Muchun,
>>
>> Oscar's changes were not in Andrew's tree when I started on this series
>> and I failed to notice their inclusion.  In addition,
>> isolate_or_dissolve_huge_page also needs updating as well as a change in
>> set_max_huge_pages that was omitted while rebasing.
>>
>> Andrew, the following patch addresses those missing changes.  Ideally,
>> the changes should be combined/included in this patch.  If you want me
>> to sent another version of this patch or another series, let me know.
>>
>> From 450593eb3cea895f499ddc343c22424c552ea502 Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz <mike.kravetz@oracle.com>
>> Date: Fri, 2 Apr 2021 13:18:13 -0700
>> Subject: [PATCH] hugetlb: fix irq locking omissions
>>
>> The pach "hugetlb: make free_huge_page irq safe" changed spin_*lock
>> calls to spin_*lock_irq* calls.  However, it missed several places
>> in the file hugetlb.c.  Add the overlooked changes.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Thanks MIke. But there are still some places that need
> improvement. See below.
> 

Correct.  My apologies again for not fully taking into account the new
code from Oscar's series when working on this.

>> ---
>>  mm/hugetlb.c | 16 ++++++++--------
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index c22111f3da20..a6bfc6bcbc81 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2284,7 +2284,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>          */
>>         page_ref_dec(new_page);
>>  retry:
>> -       spin_lock(&hugetlb_lock);
>> +       spin_lock_irq(&hugetlb_lock);
>>         if (!PageHuge(old_page)) {
>>                 /*
>>                  * Freed from under us. Drop new_page too.
>> @@ -2297,7 +2297,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>                  * Fail with -EBUSY if not possible.
>>                  */
>>                 update_and_free_page(h, new_page);
> 
> Now update_and_free_page can be called without holding
> hugetlb_lock. We can move it out of hugetlb_lock. In this
> function, there are 3 places which call update_and_free_page().
> We can move all of them out of hugetlb_lock. Right?

We will need to do more than that.
The call to update_and_free_page in alloc_and_dissolve_huge_page
assumes old functionality.  i.e. It assumes h->nr_huge_pages will be
decremented in update_and_free_page.  This is no longer the case.

This will need to be fixed in patch 4 of my series which changes the
functionality of update_and_free_page.  I'm afraid a change there will
end up requiring changes in subsequent patches due to context.

I will have an update on Monday.
-- 
Mike Kravetz

  reply	other threads:[~2021-04-04  0:02 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-31  3:41 [PATCH v3 0/8] make hugetlb put_page safe for all calling contexts Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 1/8] mm/cma: change cma mutex to irq safe spinlock Mike Kravetz
2021-03-31  7:25   ` Michal Hocko
2021-03-31  7:52   ` David Hildenbrand
2021-03-31  3:41 ` [PATCH v3 2/8] hugetlb: no need to drop hugetlb_lock to call cma_release Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 3/8] hugetlb: add per-hstate mutex to synchronize user adjustments Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 4/8] hugetlb: create remove_hugetlb_page() to separate functionality Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 5/8] hugetlb: call update_and_free_page without hugetlb_lock Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 6/8] hugetlb: change free_pool_huge_page to remove_pool_huge_page Mike Kravetz
2021-03-31  3:41 ` [PATCH v3 7/8] hugetlb: make free_huge_page irq safe Mike Kravetz
2021-04-02 12:47   ` [External] " Muchun Song
2021-04-02 12:47     ` Muchun Song
2021-04-02 20:55     ` Mike Kravetz
2021-04-03  5:59       ` Muchun Song
2021-04-03  5:59         ` Muchun Song
2021-04-04  0:00         ` Mike Kravetz [this message]
2021-03-31  3:41 ` [PATCH v3 8/8] hugetlb: add lockdep_assert_held() calls for hugetlb_lock Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9e146133-16f4-cdfc-0174-fe1ed53cddc7@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=david@redhat.com \
    --cc=guro@fb.com \
    --cc=hdanton@sina.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longman@redhat.com \
    --cc=mhocko@suse.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=shakeelb@google.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=songmuchun@bytedance.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.