Subject: Re: MCE handler gets NIP wrong on MPC8378
To: Radu Rendec
From: Christophe Leroy
Date: Thu, 20 Feb 2020 17:25:19 +0100
Message-ID: <8008403c-49cd-29bc-712d-2e13b601041c@c-s.fr>
Cc: linuxppc-dev@lists.ozlabs.org

On 20/02/2020 at 17:02, Radu Rendec wrote:
> On 02/20/2020 at 3:38 AM Christophe Leroy wrote:
>> On 02/19/2020 10:39 PM, Radu Rendec wrote:
>>> On 02/19/2020 at 4:21 PM Christophe Leroy wrote:
>>>>> Interesting.
>>>>>
>>>>> 0x900 is the address of the timer interrupt.
>>>>>
>>>>> Would the MCE occur just after the timer interrupt?
>>>
>>> I doubt that. I'm using a small test module to artificially trigger the
>>> MCE. Basically it's just this (the full code is in my original post):
>>>
>>>     bad_addr_base = ioremap(0xf0000000, 0x100);
>>>     x = ioread32(bad_addr_base);
>>>
>>> I find it hard to believe that every time I load the module the lwbrx
>>> instruction that triggers the MCE is executed exactly after the timer
>>> interrupt (or that the timer interrupt always occurs close to the lwbrx
>>> instruction).
>>
>> Can you try to see how much time there is between your read and the MCE?
>> The below should allow it, you'll see the first value in r13 and the other
>> in r14 (mce.c is your test code)
>>
>> Also provide the timebase frequency as reported in /proc/cpuinfo
>
> I just ran a test: r13 is 0xda8e0f91 and r14 is 0xdaae0f9c.
>
> # cat /proc/cpuinfo
> processor : 0
> cpu       : e300c4
> clock     : 800.000004MHz
> revision  : 1.1 (pvr 8086 1011)
> bogomips  : 200.00
> timebase  : 100000000
>
> The difference between r14 and r13 is 0x20000b. Assuming TB is
> incremented with 'timebase' frequency, that means 20.97 milliseconds
> (although the e300 manual says TB is "incremented once every four core
> input clock cycles").

I wouldn't be surprised if the internal CPU clock were twice the input
clock.

So that is long enough to be sure of getting a timer interrupt during
every bad access.

Now we have to understand why SRR0 contains the address of the timer
exception entry and not the address of the bad access.

The value of SRR1 confirms that it comes from 0x900, as MSR[IR] and
MSR[DR] are cleared when interrupts are enabled.

Maybe you should file a support case at NXP. They are usually quite
professional at responding.

Christophe
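
For reference, the timebase arithmetic quoted above can be checked with a small user-space sketch. The helper name tb_delta_to_us is hypothetical, and the sketch assumes that r13 and r14 hold 32-bit reads of the lower timebase register and that the timebase ticks at the 100000000 Hz reported by /proc/cpuinfo; neither assumption is confirmed beyond what the thread states.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert the distance between two 32-bit timebase
 * samples into microseconds, given the timebase frequency in Hz.
 * Assumes both samples come from the lower timebase register (TBL).
 */
static uint64_t tb_delta_to_us(uint32_t start, uint32_t end, uint64_t tb_hz)
{
	/* Unsigned 32-bit subtraction stays correct across one TBL wrap. */
	uint32_t delta = end - start;

	/* Widen before multiplying so the product cannot overflow. */
	return (uint64_t)delta * 1000000ULL / tb_hz;
}
```

With the values reported in the thread, tb_delta_to_us(0xda8e0f91, 0xdaae0f9c, 100000000) gives 20971 microseconds, which matches the 20.97 ms figure. If the timebase actually ran at a different rate, only the tb_hz argument would change.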