From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FE03C3F68F for ; Fri, 3 Jan 2020 18:36:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4728122522 for ; Fri, 3 Jan 2020 18:36:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sWAv4YQS" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728319AbgACSf7 (ORCPT ); Fri, 3 Jan 2020 13:35:59 -0500 Received: from mail-pg1-f193.google.com ([209.85.215.193]:40153 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728297AbgACSf7 (ORCPT ); Fri, 3 Jan 2020 13:35:59 -0500 Received: by mail-pg1-f193.google.com with SMTP id k25so23766064pgt.7; Fri, 03 Jan 2020 10:35:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=gxmBgrmn5KtAxI4sjTOIEP0PiEZjeLGN8jMBIc75LbE=; b=sWAv4YQSICEIAxpfihjQ9TyjRy/OJGu85DklbRRzRlh3w50aebZezwY0RFEwHg4yHm oUMu1RASS9KJRBjQxvvFw29bI7FPVdCOdKU1XsYCAn4zz8S2dGCW3JO//oS7QTocGV2d LilxCOM7G8XFSiucWf00WqqCRxN2SsEtsAfuIRJIPIRXZ17uv6s+2B3YmAxRTgom3r2s w2RMtz9KNvJwCPAKscaDwUyCS+rh0i8CtUBEvWXihoNSU/xnLoch97yIZDdBQwCUYDse /JLdUhXh75KKLcNqCn28YLOapbMSL/D7Z8rrgVR60C4vnK/lnmIuqd+gU8PvQ9UXSz64 TUHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=gxmBgrmn5KtAxI4sjTOIEP0PiEZjeLGN8jMBIc75LbE=; b=AC2ckR2C71y2uqgazSL6RKxaX/GX6WFy+H6Wz0dzX4gdIFF3UikO7kUDdwNnJAWL7B q3xYmKhZ7XC/nVPxuajZUfFhXcArtS0wK9/rdsc6/wNXCa5rU0CpqVJlwAgbjJVEv3JG XDwvThO4BSNtB2LFXq3ABA6n2ZPaEBmtA0FIbFbeA+bH7cRW6vBJNcf72mNHppfSH3so 5Ew1e3M4RBCWwFFr/T2B4P3y0EDld5v/5Pqctr4UXpfooNzcqTs3YQCXGc2fwq3ZVdvi rThV9fdTm0N1vqrixxvrtUUBiM3k1+DATE0p/RM7B/1N7/vPe4IJT74zYQNk6ov/T+ei 3CUw== X-Gm-Message-State: APjAAAUvx3C0cixYas0PyYkeIajMQmb06S0/F8uJ6YWyFTEAr+OaLVcw B/bWrPMvaDLVDVS5DY7+105XZq8r X-Google-Smtp-Source: APXvYqzcox5xB9DwP7boYmuMXeNJ3NHuigD3kc6fea6Dfkf87yERGMBnAoA785kVIj4dBke8ipdXMg== X-Received: by 2002:a63:28a:: with SMTP id 132mr96212102pgc.165.1578076558098; Fri, 03 Jan 2020 10:35:58 -0800 (PST) Received: from localhost ([2600:1700:e321:62f0:329c:23ff:fee3:9d7c]) by smtp.gmail.com with ESMTPSA id o31sm64267738pgb.56.2020.01.03.10.35.56 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 03 Jan 2020 10:35:57 -0800 (PST) Date: Fri, 3 Jan 2020 10:35:55 -0800 From: Guenter Roeck To: Petr Mladek Cc: Sakari Ailus , linux-kernel@vger.kernel.org, rafael@kernel.org, Rob Herring , Andy Shevchenko , linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, Heikki Krogerus , Joe Perches Subject: Re: [PATCH v9 09/12] lib/vsprintf: Make use of fwnode API to obtain node names and separators Message-ID: <20200103183555.GA28369@roeck-us.net> References: <20191003123219.11237-1-sakari.ailus@linux.intel.com> <20191003123219.11237-10-sakari.ailus@linux.intel.com> <20200102222041.GA29067@roeck-us.net> <20200103112145.GM19828@paasikivi.fi.intel.com> <20200103144253.y6fnw44oe7asyniz@pathway.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200103144253.y6fnw44oe7asyniz@pathway.suse.cz> User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org On Fri, Jan 03, 2020 at 03:42:53PM +0100, Petr Mladek wrote: > On Fri 2020-01-03 13:21:45, Sakari Ailus wrote: > > Hi Guenter, > > > > On Thu, Jan 02, 2020 at 02:20:41PM -0800, Guenter Roeck wrote: > > > Hi, > > > > > > On Thu, Oct 03, 2019 at 03:32:16PM +0300, Sakari Ailus wrote: > > > > Instead of implementing our own means of discovering parent nodes, node > > > > names or counting how many parents a node has, use the newly added > > > > functions in the fwnode API to obtain that information. > > > > > > > > Signed-off-by: Sakari Ailus > > > > Reviewed-by: Andy Shevchenko > > > > Reviewed-by: Petr Mladek > > > > --- > > > > > > This patch results in a lockdep splat when running one of my qemu > > > emulations. See below for log and bisect results. A complete log > > > is available at > > > https://kerneltests.org/builders/qemu-arm-master/builds/1408/steps/qemubuildcommand/logs/stdio > > > > > > Guenter > > > > Thank you for reporting this. > > > > I looked into the issue, and indeed I can conform the patch introduces this > > as it takes the devtree_lock for printing the name of the fwnode. There is > > I guess that you meant "is not". > > > > however chance of a deadlock in practice as the code in mm/slub.c does not > > deal with fwnodes (in which case acquiring devtree_lock could be possible), > > maybe for other reasons as well. The patch however introduces an unpleasant > > source of such warnings. > > I agree that it is a false positive. alloc/free is called in OF code > under devtree_lock. But OF code is not called from alloc/free (slub.c) > and it should not happen. > Assuming that memory allocation is indeed called from code holding devtree_lock: The problem, as I see it, is that the order of acquiring locks is different. In OF code, the order is devtree_lock (&n->list_lock)->rlock Elsewhere, in %pOF print sequences, it is (&n->list_lock)->rlock devtree_lock The OF code, while holding devtree_lock, may try to allocate or release memory and is waiting for (&n->list_lock)->rlock. At the same time, some other thread may try to print %pOF, has acquired (&n->list_lock)->rlock, and is waiting for devtree_lock. Are you sure that this can not happen ? Thanks, Guenter > lockdep sees the cycle only because the chains are connected via > printk() and logbuf_lock. > > > > One approach to address this could be not allocating memory while holding > > devtree_lock spinlock. That seems entirely feasible. But could also > > releasing memory cause something to be printed, effectively causing the > > same problem? > > I expect that &n->list_lock)->rlock will be needed in kfree() as well. > > Anyway, IMHO, allocation outside devtree_lock spinlock would create hairy > and tricky code. > > The number of needed "cache_entries" need to be counted under > devtree_lock before the allocation. It means that we would need to > take and release the lock twice. It might create a bunch of possible > races. For example, when new entries are added in the mean time. Or > when this function is called twice in parallel. > > > > Perhaps a safest way to fix this could be returning to use dn->full_name > > for printing node names, in which case the devtree_lock would no longer be > > taken for printing names. The effect would be though that there would be > > again one more user for the full_name field, information that can be > > reconstructed from the node's parents. > > Would this avoid using devtree_lock in all %pO? modifiers? > > Removing the lock usage in vsprintf() would make sense > from two reasons: > > + It would allow to use %pOF from inside drivers/of/ > code called under devtree_lock. The current implementation > would cause a deadlock because of the recursion. > > + There is a huge effort to make printk() lockless. Any lock > in vsprintf() is just a call for troubles. We probably should > not have allowed this in the first place. > > > Finally, this problem will be gone when printk uses a lockless > ringbuffer. I hope that it will happen either for-5.6 or 5.7. > The most tricky part, the ringbuffer itself is in good shape now, see > https://lkml.kernel.org/r/20191128015235.12940-1-john.ogness@linutronix.de > > Temporary solution would be to disable lockdep in vsprintf() code. > But I would really prefer to avoid the lock in vsprintf() at all. > > Best Regards, > Petr > > > > --- > > > ====================================================== > > > WARNING: possible circular locking dependency detected > > > 5.5.0-rc4-00066-g738d2902773e #1 Not tainted > > > ------------------------------------------------------ > > > swapper/0/1 is trying to acquire lock: > > > c1313b00 (logbuf_lock){-.-.}, at: vprintk_emit+0x68/0x2d4 > > > > > > but task is already holding lock: > > > ef030b90 (&(&n->list_lock)->rlock){..-.}, at: free_debug_processing+0x38/0x418 > > > > > > which lock already depends on the new lock. > > > > > > > > > the existing dependency chain (in reverse order) is: > > > > > > -> #2 (&(&n->list_lock)->rlock){..-.}: > > > ___slab_alloc.constprop.23+0x12c/0x798 > > > __slab_alloc.constprop.22+0x44/0x70 > > > __kmalloc+0x384/0x41c > > > of_populate_phandle_cache+0xcc/0x148 > > > of_core_init+0x8/0xbc > > > driver_init+0x1c/0x2c > > > kernel_init_freeable+0xac/0x1b4 > > > kernel_init+0x8/0x118 > > > ret_from_fork+0x14/0x20 > > > 0x0 > > > > > > -> #1 (devtree_lock){....}: > > > of_get_parent+0x18/0x34 > > > of_fwnode_get_parent+0x34/0x40 > > > fwnode_count_parents+0x28/0x58 > > > fwnode_full_name_string+0x18/0xa0 > > > device_node_string+0x490/0x4f0 > > > pointer+0x440/0x4d8 > > > vsnprintf+0x1bc/0x3d8 > > > vscnprintf+0xc/0x24 > > > vprintk_store+0x34/0x204 > > > vprintk_emit+0x94/0x2d4 > > > vprintk_default+0x20/0x28 > > > printk+0x30/0x54 > > > exynos4_pm_init_power_domain+0x220/0x258 > > > do_one_initcall+0x8c/0x440 > > > kernel_init_freeable+0x150/0x1b4 > > > kernel_init+0x8/0x118 > > > ret_from_fork+0x14/0x20 > > > 0x0 > > > > > > -> #0 (logbuf_lock){-.-.}: > > > lock_acquire+0xec/0x290 > > > _raw_spin_lock+0x38/0x48 > > > vprintk_emit+0x68/0x2d4 > > > vprintk_default+0x20/0x28 > > > printk+0x30/0x54 > > > unwind_frame+0x6a8/0x6fc > > > walk_stackframe+0x2c/0x38 > > > __save_stack_trace+0x84/0x8c > > > stack_trace_save+0x3c/0x5c > > > set_track+0x40/0x9c > > > free_debug_processing+0x1a4/0x418 > > > __slab_free+0x2d4/0x510 > > > kmem_cache_free+0x44c/0x49c > > > rcu_core+0x348/0x994 > > > __do_softirq+0x164/0x668 > > > irq_exit+0x16c/0x170 > > > __handle_domain_irq+0x80/0xec > > > gic_handle_irq+0x58/0x9c > > > __irq_svc+0x70/0xb0 > > > raid6_neon8_gen_syndrome_real+0x264/0x39c > > > raid6_neon8_gen_syndrome_real+0x264/0x39c > > > > > > other info that might help us debug this: > > > > > > Chain exists of: > > > logbuf_lock --> devtree_lock --> &(&n->list_lock)->rlock > > > > > > Possible unsafe locking scenario: > > > > > > CPU0 CPU1 > > > ---- ---- > > > lock(&(&n->list_lock)->rlock); > > > lock(devtree_lock); > > > lock(&(&n->list_lock)->rlock); > > > lock(logbuf_lock); > > > > > > *** DEADLOCK *** > > > > > > 2 locks held by swapper/0/1: > > > #0: c131466c (rcu_callback){....}, at: rcu_core+0x304/0x994 > > > #1: ef030b90 (&(&n->list_lock)->rlock){..-.}, at: free_debug_processing+0x38/0x418 > > > > > > stack backtrace: > > > CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc4-00066-g738d2902773e #1 > > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > > > [] (show_stack) from [] (dump_stack+0xa4/0xd0) > > > [] (dump_stack) from [] (check_noncircular+0x258/0x274) > > > [] (check_noncircular) from [] (__lock_acquire+0x1870/0x2860) > > > [] (__lock_acquire) from [] (lock_acquire+0xec/0x290) > > > [] (lock_acquire) from [] (_raw_spin_lock+0x38/0x48) > > > [] (_raw_spin_lock) from [] (vprintk_emit+0x68/0x2d4) > > > [] (vprintk_emit) from [] (vprintk_default+0x20/0x28) > > > [] (vprintk_default) from [] (printk+0x30/0x54) > > > [] (printk) from [] (unwind_frame+0x6a8/0x6fc) > > > [] (unwind_frame) from [] (walk_stackframe+0x2c/0x38) > > > [] (walk_stackframe) from [] (__save_stack_trace+0x84/0x8c) > > > [] (__save_stack_trace) from [] (stack_trace_save+0x3c/0x5c) > > > [] (stack_trace_save) from [] (set_track+0x40/0x9c) > > > [] (set_track) from [] (free_debug_processing+0x1a4/0x418) > > > [] (free_debug_processing) from [] (__slab_free+0x2d4/0x510) > > > [] (__slab_free) from [] (kmem_cache_free+0x44c/0x49c) > > > [] (kmem_cache_free) from [] (rcu_core+0x348/0x994) > > > [] (rcu_core) from [] (__do_softirq+0x164/0x668) > > > [] (__do_softirq) from [] (irq_exit+0x16c/0x170) > > > [] (irq_exit) from [] (__handle_domain_irq+0x80/0xec) > > > [] (__handle_domain_irq) from [] (gic_handle_irq+0x58/0x9c) > > > [] (gic_handle_irq) from [] (__irq_svc+0x70/0xb0) > > > Exception stack(0xef19bd30 to 0xef19bd78) > > > bd20: c0d43e50 c0d43e60 ef19bebc c0d43e70 > > > bd40: c0d43e20 c0d43e10 00000400 00000430 00000440 00000450 00000460 00000470 > > > bd60: c0d43e40 ef19bd80 c0d43e30 c0625c00 20000013 ffffffff > > > [] (__irq_svc) from [] (raid6_neon8_gen_syndrome_real+0x264/0x39c) > > > > > > --- > > > Bisect: > > > > > > # bad: [738d2902773e30939a982c8df7a7f94293659810] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net > > > # good: [219d54332a09e8d8741c1e1982f5eae56099de85] Linux 5.4 > > > git bisect start 'HEAD' 'v5.4' > > > # bad: [8c39f71ee2019e77ee14f88b1321b2348db51820] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net > > > git bisect bad 8c39f71ee2019e77ee14f88b1321b2348db51820 > > > # good: [3b397c7ccafe0624018cb09fc96729f8f6165573] Merge tag 'regmap-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap > > > git bisect good 3b397c7ccafe0624018cb09fc96729f8f6165573 > > > # bad: [89d57dddd7d319ded00415790a0bb3c954b7e386] Merge tag 'media/v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media > > > git bisect bad 89d57dddd7d319ded00415790a0bb3c954b7e386 > > > # good: [9e7a03233e02afd3ee061e373355f34d7254f1e6] Merge tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm > > > git bisect good 9e7a03233e02afd3ee061e373355f34d7254f1e6 > > > # good: [09578eacaaa44149738267083ccc050990409f86] Merge tag 'asoc-v5.5-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus > > > git bisect good 09578eacaaa44149738267083ccc050990409f86 > > > # good: [1c7ae4a51298d52a21f63b2214657982036c7498] media: ad5820: Add support for of-autoload > > > git bisect good 1c7ae4a51298d52a21f63b2214657982036c7498 > > > # good: [99cf8a7074c4ce3ff3685cd389f54e7bd4bbf510] media: dt-bindings: Fix building error for dt_binding_check > > > git bisect good 99cf8a7074c4ce3ff3685cd389f54e7bd4bbf510 > > > # good: [782b59711e1561ee0da06bc478ca5e8249aa8d09] Merge branch 'acpi-mm' > > > git bisect good 782b59711e1561ee0da06bc478ca5e8249aa8d09 > > > # good: [0ca40f41d795fd91811e44506bb73d0b9ca33bdd] Merge branch 'patchwork' into v4l_for_linus > > > git bisect good 0ca40f41d795fd91811e44506bb73d0b9ca33bdd > > > # bad: [a00351687f8a05773c1c57be80a5bbca68fa9ae8] software node: remove DEV_PROP_MAX > > > git bisect bad a00351687f8a05773c1c57be80a5bbca68fa9ae8 > > > # good: [9af7706492f985867d070861fe39fee0fe41326f] lib/vsprintf: Remove support for %pF and %pf in favour of %pS and %ps > > > git bisect good 9af7706492f985867d070861fe39fee0fe41326f > > > # bad: [83abc5a77f3b028b8c845c39ce4053119e1de35b] lib/vsprintf: OF nodes are first and foremost, struct device_nodes > > > git bisect bad 83abc5a77f3b028b8c845c39ce4053119e1de35b > > > # bad: [a92eb7621b9fb2c28a588ce333d226f56fab6a85] lib/vsprintf: Make use of fwnode API to obtain node names and separators > > > git bisect bad a92eb7621b9fb2c28a588ce333d226f56fab6a85 > > > # good: [1586c5ae2f9310235b5e70abe712c73fc32eb98f] lib/vsprintf: Add a note on re-using %pf or %pF > > > git bisect good 1586c5ae2f9310235b5e70abe712c73fc32eb98f > > > # first bad commit: [a92eb7621b9fb2c28a588ce333d226f56fab6a85] lib/vsprintf: Make use of fwnode API to obtain node names and separators > > > > -- > > Kind regards, > > > > Sakari Ailus