From: Yang Shi
Date: Thu, 9 Jan 2020 16:28:40 -0800
In-Reply-To: <20200109225646.22983-1-xiyou.wangcong@gmail.com>
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
To: Cong Wang
Cc: Linux Kernel Mailing List, Andrew Morton, Michal Hocko, Linux MM

On Thu, Jan 9, 2020 at 2:57 PM Cong Wang wrote:
>
> We observed kcompactd hung at __lock_page():
>
>  INFO: task kcompactd0:57 blocked for more than 120 seconds.
>        Not tainted 4.19.56.x86_64 #1
>  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>  kcompactd0 D 0 57 2 0x80000000
>  Call Trace:
>   ? __schedule+0x236/0x860
>   schedule+0x28/0x80
>   io_schedule+0x12/0x40
>   __lock_page+0xf9/0x120
>   ? page_cache_tree_insert+0xb0/0xb0
>   ? update_pageblock_skip+0xb0/0xb0
>   migrate_pages+0x88c/0xb90
>   ? isolate_freepages_block+0x3b0/0x3b0
>   compact_zone+0x5f1/0x870
>   kcompactd_do_work+0x130/0x2c0
>   ? __switch_to_asm+0x35/0x70
>   ? __switch_to_asm+0x41/0x70
>   ? kcompactd_do_work+0x2c0/0x2c0
>   ? kcompactd+0x73/0x180
>   kcompactd+0x73/0x180
>   ? finish_wait+0x80/0x80
>   kthread+0x113/0x130
>   ? kthread_create_worker_on_cpu+0x50/0x50
>   ret_from_fork+0x35/0x40
>
> which faddr2line maps to:
>
>  migrate_pages+0x88c/0xb90:
>  lock_page at include/linux/pagemap.h:483
>  (inlined by) __unmap_and_move at mm/migrate.c:1024
>  (inlined by) unmap_and_move at mm/migrate.c:1189
>  (inlined by) migrate_pages at mm/migrate.c:1419
>
> Sometimes kcompactd eventually got out of this situation, sometimes not.
>
> I think for memory compaction, it is a best effort to migrate the pages,
> so it doesn't have to wait for I/O to complete. It is fine to call
> trylock_page() here, which is pretty much similar to
> buffer_migrate_lock_buffers().
>
> Given MIGRATE_SYNC_LIGHT is used on compaction path, just relax the
> check for it.

But this changes the semantics of MIGRATE_SYNC_LIGHT, which means
blocking on most operations but not on ->writepage. MIGRATE_SYNC_LIGHT
is used when the compaction priority has been raised (the initial
priority is ASYNC) for whatever reason (e.g. not enough clean,
non-writeback and non-locked pages to migrate). So it has to wait for
some pages, so that it does not back off prematurely.

If I read the code correctly, buffer_migrate_lock_buffers() also blocks
on page lock with non-ASYNC mode.

Mel Gorman has improved compaction a lot since v5.1, so I'm wondering
whether this still happens on the latest upstream.

Also, did you figure out who is holding the page lock for such a long
time? Or might there be too many waiters on the list for this page?

>
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: linux-mm@kvack.org
> Signed-off-by: Cong Wang
> ---
>  mm/migrate.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 86873b6f38a7..df60026779d2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1010,7 +1010,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>         bool is_lru = !__PageMovable(page);
>
>         if (!trylock_page(page)) {
> -               if (!force || mode == MIGRATE_ASYNC)
> +               if (!force || mode == MIGRATE_ASYNC
> +                   || mode == MIGRATE_SYNC_LIGHT)
>                         goto out;
>
>                 /*
> --
> 2.21.1
>
>
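
To make the semantic change concrete, here is a small userspace model of
the bail-out condition from the quoted hunk. It is only a sketch, not
kernel code: the enum and both function names are made up for
illustration, and it models nothing beyond the single "give up after
trylock_page() fails" test.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's migrate modes (illustration only). */
enum model_migrate_mode {
	MODEL_MIGRATE_ASYNC,
	MODEL_MIGRATE_SYNC_LIGHT,
	MODEL_MIGRATE_SYNC,
};

/*
 * Condition before the patch: once trylock_page() has failed, give up
 * (goto out) unless the caller forces the migration and the mode is
 * not ASYNC. Returning false means the caller goes on to block.
 */
static bool gives_up_before_patch(bool force, enum model_migrate_mode mode)
{
	return !force || mode == MODEL_MIGRATE_ASYNC;
}

/*
 * Condition with the patch applied: SYNC_LIGHT now gives up as well,
 * so on a contended page lock it behaves like ASYNC.
 */
static bool gives_up_after_patch(bool force, enum model_migrate_mode mode)
{
	return !force || mode == MODEL_MIGRATE_ASYNC ||
	       mode == MODEL_MIGRATE_SYNC_LIGHT;
}
```

With force set, the only input where the two functions disagree is
MODEL_MIGRATE_SYNC_LIGHT: before the patch it waits for the page lock,
after the patch it bails out. That is the semantic change I am worried
about above.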