Date: Tue, 11 Jun 2019 13:56:51 +0200
From: Borislav Petkov
To: Benjamin Herrenschmidt
Cc: James Morse, "Hawa, Hanna", "robh+dt@kernel.org", "Woodhouse, David",
 "paulmck@linux.ibm.com", "mchehab@kernel.org", "mark.rutland@arm.com",
 "gregkh@linuxfoundation.org", "davem@davemloft.net",
 "nicolas.ferre@microchip.com", "devicetree@vger.kernel.org",
 "Shenhar, Talel", "linux-kernel@vger.kernel.org", "Chocron, Jonathan",
 "Krupnik, Ronen", "linux-edac@vger.kernel.org", "Hanoch, Uri"
Subject: Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC
Message-ID: <20190611115651.GD31772@zn.tnic>
In-Reply-To: <68446361fd1e742b284555b96b638fe6b5218b8b.camel@kernel.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 11, 2019 at 05:21:39PM +1000, Benjamin Herrenschmidt wrote:
> So looking again ... all the registration/removal of edac devices seem
> to already be protected by mutexes, so that's not a problem.
>
> Tell me more about what specific races you think we might have here,
> I'm not sure I follow...

Well, as I said "it might work or it might set your cat on fire."
For example, one of the error logging paths is edac_mc_handle_error() and
that thing mostly operates using the *mci pointer which should be ok
but then it calls the "trace_mc_event" tracepoint and I'd suppose that
tracepoints can do lockless but I'm not sure.

So what needs to happen is for paths which weren't called by multiple
EDAC agents in parallel but need to get called in parallel now due to
ARM drivers wanting to do that, to get audited that they're safe.

Situation is easy if you have one platform driver where you can
synchronize things in the driver but since you guys need to do separate
drivers for whatever reason, then that would need to be done prior.

Makes more sense?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
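
Editor's illustration of the "one driver synchronizes things itself" case mentioned above. This is only a userspace Python analogy, not kernel code and not anything from the patch under review: ErrorLog, report() and run_agents() are hypothetical names, and the threads merely stand in for several agents reporting errors in parallel. The point it sketches is that when every reporter goes through one lock-protected entry point, the shared state stays consistent; separate drivers have no such common lock, which is why the shared paths would need auditing.

```python
import threading


class ErrorLog:
    """Hypothetical shared error sink that several reporters update."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0
        self.events = []

    def report(self, agent, msg):
        # Single entry point: every caller takes the same lock, so the
        # counter update and the list append never interleave.
        with self._lock:
            self.count += 1
            self.events.append((agent, msg))


def run_agents(n_agents=4, n_errors=1000):
    """Start n_agents threads that each report n_errors errors."""
    log = ErrorLog()

    def agent(name):
        for i in range(n_errors):
            log.report(name, f"error {i}")

    threads = [
        threading.Thread(target=agent, args=(f"agent{k}",))
        for k in range(n_agents)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling run_agents(4, 500) returns a log whose count and number of recorded events both equal the total number of reports made, because all updates were serialized by the one lock.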