Date: Wed, 25 Sep 2019 18:45:27 +0200
From: Peter Zijlstra
To: Qian Cai
Cc: akpm@linux-foundation.org, bigeasy@linutronix.de, tglx@linutronix.de,
	thgarnie@google.com, tytso@mit.edu, cl@linux.com, penberg@kernel.org,
	rientjes@google.com, mingo@redhat.com, will@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	keescook@chromium.org
Subject: Re: [PATCH] mm/slub: fix a deadlock in shuffle_freelist()
Message-ID: <20190925164527.GG4553@hirez.programming.kicks-ass.net>
In-Reply-To: <1569424727.5576.221.camel@lca.pw>
References: <1568392064-3052-1-git-send-email-cai@lca.pw>
 <20190925093153.GC4553@hirez.programming.kicks-ass.net>
 <1569424727.5576.221.camel@lca.pw>

On Wed, Sep 25, 2019 at 11:18:47AM -0400, Qian Cai wrote:
> On Wed, 2019-09-25 at 11:31 +0200, Peter Zijlstra wrote:
> > On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote:
> > > -> #3 (batched_entropy_u32.lock){-.-.}:
> > >        lock_acquire+0x31c/0x360
> > >        _raw_spin_lock_irqsave+0x7c/0x9c
> > >        get_random_u32+0x6c/0x1dc
> > >        new_slab+0x234/0x6c0
> > >        ___slab_alloc+0x3c8/0x650
> > >        kmem_cache_alloc+0x4b0/0x590
> > >        __debug_object_init+0x778/0x8b4
> > >        debug_object_init+0x40/0x50
> > >        debug_init+0x30/0x29c
> > >        hrtimer_init+0x30/0x50
> > >        init_dl_task_timer+0x24/0x44
> > >        __sched_fork+0xc0/0x168
> > >        init_idle+0x78/0x26c
> > >        fork_idle+0x12c/0x178
> > >        idle_threads_init+0x108/0x178
> > >        smp_init+0x20/0x1bc
> > >        kernel_init_freeable+0x198/0x26c
> > >        kernel_init+0x18/0x334
> > >        ret_from_fork+0x10/0x18
> > >
> > > -> #2 (&rq->lock){-.-.}:
> >
> > This relation is silly..
> >
> > I suspect the below 'works'...
>
> Unfortunately, the relation is still there,
>
> copy_process()->rt_mutex_init_task()->"&p->pi_lock"
>
> [24438.676716][    T2] -> #2 (&rq->lock){-.-.}:
> [24438.676727][    T2]        __lock_acquire+0x5b4/0xbf0
> [24438.676736][    T2]        lock_acquire+0x130/0x360
> [24438.676754][    T2]        _raw_spin_lock+0x54/0x80
> [24438.676771][    T2]        task_fork_fair+0x60/0x190
> [24438.676788][    T2]        sched_fork+0x128/0x270
> [24438.676806][    T2]        copy_process+0x7a4/0x1bf0
> [24438.676823][    T2]        _do_fork+0xac/0xac0
> [24438.676841][    T2]        kernel_thread+0x70/0xa0
> [24438.676858][    T2]        rest_init+0x4c/0x42c
> [24438.676884][    T2]        start_kernel+0x778/0x7c0
> [24438.676902][    T2]        start_here_common+0x1c/0x334

That is the 'where we took #2 while holding #1' stacktrace and is not
relevant to our discussion.
> [24438.675836][    T2] -> #4 (batched_entropy_u64.lock){-...}:
> [24438.675860][    T2]        __lock_acquire+0x5b4/0xbf0
> [24438.675878][    T2]        lock_acquire+0x130/0x360
> [24438.675906][    T2]        _raw_spin_lock_irqsave+0x70/0xa0
> [24438.675923][    T2]        get_random_u64+0x60/0x100
> [24438.675944][    T2]        add_to_free_area_random+0x164/0x1b0
> [24438.675962][    T2]        free_one_page+0xb24/0xcf0
> [24438.675980][    T2]        __free_pages_ok+0x448/0xbf0
> [24438.675999][    T2]        deferred_init_maxorder+0x404/0x4a4
> [24438.676018][    T2]        deferred_grow_zone+0x158/0x1f0
> [24438.676035][    T2]        get_page_from_freelist+0x1dc8/0x1e10
> [24438.676063][    T2]        __alloc_pages_nodemask+0x1d8/0x1940
> [24438.676083][    T2]        allocate_slab+0x130/0x2740
> [24438.676091][    T2]        new_slab+0xa8/0xe0
> [24438.676101][    T2]        kmem_cache_open+0x254/0x660
> [24438.676119][    T2]        __kmem_cache_create+0x44/0x2a0
> [24438.676136][    T2]        create_boot_cache+0xcc/0x110
> [24438.676154][    T2]        kmem_cache_init+0x90/0x1f0
> [24438.676173][    T2]        start_kernel+0x3b8/0x7c0
> [24438.676191][    T2]        start_here_common+0x1c/0x334
> [24438.676208][    T2]
> [24438.676208][    T2] -> #3 (&(&zone->lock)->rlock){-.-.}:
> [24438.676221][    T2]        __lock_acquire+0x5b4/0xbf0
> [24438.676247][    T2]        lock_acquire+0x130/0x360
> [24438.676264][    T2]        _raw_spin_lock+0x54/0x80
> [24438.676282][    T2]        rmqueue_bulk.constprop.23+0x64/0xf20
> [24438.676300][    T2]        get_page_from_freelist+0x718/0x1e10
> [24438.676319][    T2]        __alloc_pages_nodemask+0x1d8/0x1940
> [24438.676339][    T2]        alloc_page_interleave+0x34/0x170
> [24438.676356][    T2]        allocate_slab+0xd1c/0x2740
> [24438.676374][    T2]        new_slab+0xa8/0xe0
> [24438.676391][    T2]        ___slab_alloc+0x580/0xef0
> [24438.676408][    T2]        __slab_alloc+0x64/0xd0
> [24438.676426][    T2]        kmem_cache_alloc+0x5c4/0x6c0
> [24438.676444][    T2]        fill_pool+0x280/0x540
> [24438.676461][    T2]        __debug_object_init+0x60/0x6b0
> [24438.676479][    T2]        hrtimer_init+0x5c/0x310
> [24438.676497][    T2]        init_dl_task_timer+0x34/0x60
> [24438.676516][    T2]        __sched_fork+0x8c/0x110
> [24438.676535][    T2]        init_idle+0xb4/0x3c0
> [24438.676553][    T2]        idle_thread_get+0x78/0x120
> [24438.676572][    T2]        bringup_cpu+0x30/0x230
> [24438.676590][    T2]        cpuhp_invoke_callback+0x190/0x1580
> [24438.676618][    T2]        do_cpu_up+0x248/0x460
> [24438.676636][    T2]        smp_init+0x118/0x1c0
> [24438.676662][    T2]        kernel_init_freeable+0x3f8/0x8dc
> [24438.676681][    T2]        kernel_init+0x2c/0x154
> [24438.676699][    T2]        ret_from_kernel_thread+0x5c/0x74
> [24438.676716][    T2]
> [24438.676716][    T2] -> #2 (&rq->lock){-.-.}:

This then shows we now have:

	rq->lock
	  zone->lock.rlock
	    batched_entropy_u64.lock

Which, to me, appears to be distinctly different from the last time,
which was:

	rq->lock
	  batched_entropy_u32.lock

Notable: "u32" != "u64".

But #3 has:

> [24438.676516][    T2]        __sched_fork+0x8c/0x110
> [24438.676535][    T2]        init_idle+0xb4/0x3c0

Which seems to suggest you didn't actually apply the patch; or rather,
if you did, I'm not immediately seeing where #2 is acquired.