From: Qian Cai
Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
Date: Mon, 7 Oct 2019 07:04:00 -0400
To: Michal Hocko
Cc: akpm@linux-foundation.org, sergey.senozhatsky.work@gmail.com,
 pmladek@suse.com, rostedt@goodmis.org, peterz@infradead.org,
 david@redhat.com, john.ogness@linutronix.de, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
In-Reply-To: <20191007080742.GD2381@dhcp22.suse.cz>
References: <20191007080742.GD2381@dhcp22.suse.cz>

> On Oct 7, 2019, at 4:07 AM, Michal Hocko <mhocko@kernel.org> wrote:
>
> I do not think that removing the printk is the right long term solution.
> While I do agree that removing the debugging printk __offline_isolated_pages
> does make sense because it is essentially of a very limited use, this
> doesn't really solve the underlying problem. There are likely other
> printks from zone->lock. It would be much more saner to actually
> disallow consoles to allocate any memory while printk is called from an
> atomic context.

No, there are only a handful of places that call printk() with
zone->lock held. Normally, callers quietly modify "struct zone" in a
short section with zone->lock held.

No, it is not about "allocate any memory while printk is called from an
atomic context". It is about opposite lock chains on different
processors, which have the same effect. For example,

CPU0:            CPU1:            CPU2:
console_owner
                 sclp_lock
sclp_lock                         zone_lock
                 zone_lock
                                  console_owner

This is a deadlock: every CPU holds one lock while waiting for a lock
that another CPU holds.

>
>> The problem is probably there forever, but neither many developers will
>> run memory offline with the lockdep enabled nor admins in the field are
>> lucky enough yet to hit a perfect timing which required to trigger a
>> real deadlock. In addition, there aren't many places that call printk()
>> while zone->lock was held.
>>
>> WARNING: possible circular locking dependency detected
>> ------------------------------------------------------
>> test.sh/1724 is trying to acquire lock:
>> 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x328/0xa30
>>
>> but task is already holding lock:
>> 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_isolate_page_range+0x216/0x538
>
> I am also wondering what does this lockdep report actually say. How come
> we have a dependency between a start_kernel path and a syscall?

Petr explained it correctly.
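
The three-CPU lock chain above can be checked mechanically. What follows is only a minimal sketch of the idea, not how lockdep is implemented: it records an edge "A -> B" whenever a CPU takes lock B while holding lock A, then looks for a cycle in that graph. The function name find_lock_cycle and the dictionary layout are hypothetical, chosen for illustration; only the lock names and the per-CPU ordering come from the example in this message.

```python
def find_lock_cycle(chains):
    """Return one cycle of lock names if the orderings conflict, else None.

    `chains` maps a CPU name to the ordered list of locks it acquires.
    (Hypothetical helper for illustration; not a kernel or lockdep API.)
    """
    # Edge held -> wanted: some CPU acquires `wanted` while holding `held`.
    # Consecutive pairs are enough here because reachability is transitive.
    graph = {}
    for locks in chains.values():
        for held, wanted in zip(locks, locks[1:]):
            graph.setdefault(held, set()).add(wanted)

    visiting = []   # locks on the current depth-first path
    done = set()    # locks already proven cycle-free

    def dfs(lock):
        if lock in visiting:
            # Reached a lock already on the path: that path segment is a cycle.
            return visiting[visiting.index(lock):] + [lock]
        if lock in done:
            return None
        visiting.append(lock)
        for nxt in sorted(graph.get(lock, ())):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(lock)
        return None

    for lock in sorted(graph):
        cycle = dfs(lock)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    # Acquisition orders taken from the CPU0/CPU1/CPU2 example above.
    chains = {
        "CPU0": ["console_owner", "sclp_lock"],
        "CPU1": ["sclp_lock", "zone_lock"],
        "CPU2": ["zone_lock", "console_owner"],
    }
    print(" -> ".join(find_lock_cycle(chains)))
```

With the three chains from the example, the sketch reports the cycle console_owner -> sclp_lock -> zone_lock -> console_owner. Dropping any one of the three chains removes the cycle, which is why no single CPU's ordering looks wrong on its own.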