linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Michal Hocko <mhocko@kernel.org>,
	aneesh.kumar@linux.ibm.com
Subject: Re: [PATCH v2] mm/hugetlb: defer free_huge_page() to a workqueue
Date: Thu, 12 Dec 2019 16:01:03 -0500	[thread overview]
Message-ID: <295a82ae-a575-b6a0-ae89-3196fea45b9f@redhat.com> (raw)
In-Reply-To: <20191212190427.ouyohviijf5inhur@linux-p48b>

On 12/12/19 2:04 PM, Davidlohr Bueso wrote:
> There have been deadlock reports[1, 2] where put_page is called
> from softirq context and this causes trouble with the hugetlb_lock,
> as well as potentially the subpool lock.
>
> For such an unlikely scenario, lets not add irq dancing overhead
> to the lock+unlock operations, which could incur in expensive
> instruction dependencies, particularly when considering hard-irq
> safety. For example PUSHF+POPF on x86.
>
> Instead, just use a workqueue and do the free_huge_page() in regular
> task context.
>
> [1]
> https://lore.kernel.org/lkml/20191211194615.18502-1-longman@redhat.com/
> [2]
> https://lore.kernel.org/lkml/20180905112341.21355-1-aneesh.kumar@linux.ibm.com/
>
> Reported-by: Waiman Long <longman@redhat.com>
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>
> - Changes from v1: Only use wq when in_interrupt(), otherwise business
>    as usual. Also include the proper header file.
>
> - While I have not reproduced this issue, the v1 using wq passes all
> hugetlb
>    related tests in ltp.
>
> mm/hugetlb.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ac65bb5e38ac..f28cf601938d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -27,6 +27,7 @@
> #include <linux/swapops.h>
> #include <linux/jhash.h>
> #include <linux/numa.h>
> +#include <linux/workqueue.h>
>
> #include <asm/page.h>
> #include <asm/pgtable.h>
> @@ -1136,7 +1137,13 @@ static inline void
> ClearPageHugeTemporary(struct page *page)
>     page[2].mapping = NULL;
> }
>
> -void free_huge_page(struct page *page)
> +static struct workqueue_struct *hugetlb_free_page_wq;
> +struct hugetlb_free_page_work {
> +    struct page *page;
> +    struct work_struct work;
> +};
> +
> +static void __free_huge_page(struct page *page)
> {
>     /*
>      * Can't pass hstate in here because it is called from the
> @@ -1199,6 +1206,36 @@ void free_huge_page(struct page *page)
>     spin_unlock(&hugetlb_lock);
> }
>
> +static void free_huge_page_workfn(struct work_struct *work)
> +{
> +    struct page *page;
> +
> +    page = container_of(work, struct hugetlb_free_page_work,
> work)->page;
> +    __free_huge_page(page);
> +}
> +
> +void free_huge_page(struct page *page)
> +{
> +    if (unlikely(in_interrupt())) {

in_interrupt() also include context where softIRQ is disabled. So maybe
!in_task() is a better fit here.


> +        /*
> +         * While uncommon, free_huge_page() can be at least
> +         * called from softirq context, defer freeing such
> +         * that the hugetlb_lock and spool->lock need not have
> +         * to deal with irq dances just for this.
> +         */
> +        struct hugetlb_free_page_work work;
> +
> +        work.page = page;
> +        INIT_WORK_ONSTACK(&work.work, free_huge_page_workfn);
> +        queue_work(hugetlb_free_page_wq, &work.work);
> +
> +        /* wait until the huge page freeing is done */
> +        flush_work(&work.work);
> +        destroy_work_on_stack(&work.work);

The problem I see is that you don't want to wait too long while in the
hardirq context. However, the latency for the work to finish is
indeterminate.

Cheers,
Longman


  parent reply	other threads:[~2019-12-12 21:01 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-11 19:46 [PATCH v2] hugetlbfs: Disable softIRQ when taking hugetlb_lock Waiman Long
2019-12-11 22:04 ` Mike Kravetz
2019-12-11 22:19   ` Waiman Long
2019-12-12  1:11     ` Mike Kravetz
2019-12-12  6:06       ` Davidlohr Bueso
2019-12-12  6:30         ` Davidlohr Bueso
2019-12-12 19:04           ` [PATCH v2] mm/hugetlb: defer free_huge_page() to a workqueue Davidlohr Bueso
2019-12-12 19:22             ` Mike Kravetz
2019-12-12 19:36               ` Davidlohr Bueso
2019-12-12 20:52               ` Waiman Long
2019-12-12 21:04                 ` Waiman Long
2019-12-16 13:26                 ` Michal Hocko
2019-12-16 15:38                   ` Waiman Long
2019-12-16 18:44                     ` Waiman Long
2019-12-17  9:00                       ` Michal Hocko
2019-12-17 18:05                         ` Mike Kravetz
2019-12-18 12:18                           ` hugetlbfs testing coverage (was: Re: [PATCH v2] mm/hugetlb: defer free_huge_page() to a workqueue) Michal Hocko
2019-12-12 21:01             ` Waiman Long [this message]
2019-12-16 13:37             ` [PATCH v2] mm/hugetlb: defer free_huge_page() to a workqueue Michal Hocko
2019-12-16 16:17               ` Waiman Long
2019-12-16 16:17               ` Davidlohr Bueso
2019-12-16 17:18                 ` Matthew Wilcox
2019-12-16 19:08                 ` Mike Kravetz
2019-12-16 19:13                   ` Waiman Long
2019-12-12 21:32         ` [PATCH v2] hugetlbfs: Disable softIRQ when taking hugetlb_lock Andi Kleen
2019-12-12 22:42           ` Davidlohr Bueso
2019-12-11 23:05 ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=295a82ae-a575-b6a0-ae89-3196fea45b9f@redhat.com \
    --to=longman@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH v2] mm/hugetlb: defer free_huge_page() to a workqueue' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).