Date: Fri, 11 Jun 2021 11:34:57 +0100
From: Will Deacon
To: Veronika Kabatova
Cc: Ard Biesheuvel, Mark Rutland, Lorenzo Pieralisi, Anshuman Khandual, Marc Zyngier, Memory Management, skt-results-master@redhat.com, Jeff Bastian, CKI Project, Catalin Marinas, Jan Stancek, Linux ARM
Subject: Re: ❌ FAIL: Test report for kernel 5.13.0-rc4 (arm-next, 8124c8a6)
Message-ID: <20210611103457.GC15274@willie-the-truck>
References: <20210602101227.GB30503@willie-the-truck> <20210602105135.GC30593@willie-the-truck> <20210602171033.GA31957@willie-the-truck>

On Thu, Jun 10, 2021 at 01:59:12PM +0200, Veronika Kabatova wrote:
> On Thu, Jun 3, 2021 at 12:44 PM Veronika Kabatova wrote:
> >
> > On Wed, Jun 2, 2021 at 7:10 PM Will Deacon wrote:
> > >
> > > On Wed, Jun 02, 2021 at 01:00:47PM +0200, Veronika Kabatova wrote:
> > > > On Wed, Jun 2, 2021 at 12:51 PM Will Deacon wrote:
> > > > > On Wed, Jun 02, 2021 at 12:40:07PM +0200, Ard Biesheuvel wrote:
> > > > > > On Wed, 2 Jun 2021 at 12:12, Will Deacon wrote:
> > > > > > > On Wed, Jun 02, 2021 at 01:35:01AM -0000, CKI Project wrote:
> > > > > > > > stress: stress-ng
> > > > > > >
> > > > > > > This explodes pretty badly. Some CPUs detect RCU stalls when trying to use
> > > > > > > the EFI "efi_read_time" service, which eventually fails but soon after we
> > > > > > > explode trying to access memory which I think is mapped by
> > > > > > > acpi_os_ioremap(), so it looks like the f/w might be the culprit here. Is
> > > > > > > the "HPE Apollo 70" machine known to have bad EFI firmware?
> > > > > > >
> > > > > > > https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/datawarehouse-public/2021/06/01/313156257/build_aarch64_redhat%3A1310052388/tests/stress_stress_ng/10079827_aarch64_2_dmesg.log
> > > > > > >
> > > > > > > (scroll to the end for the fireworks)
> > > > > > >
> > > > > >
> > > > > > Wow that looks pretty horrible. I take it this tree has your MAIR changes?
> > > > >
> > > > > Nope, this is just vanilla -rc4! I'm trying to get a "known good" base
> > > > > before I throw all the new things at it :)
> > > > >
> > > > > > Would be useful to have a log with efi=debug, to see what the EFI
> > > > > > memory map looks like.
> > > > >
> > > > > Veronika -- please could you help us with that?
> > > >
> > > > Sure, I'll get a rerun with that option and report back when I have any
> > > > results. I am also planning just a plain rerun on the machine to see if it
> > > > reproduces somewhat reliably, however the machine is taken up by
> > > > other automation now so it will take a while.
> > >
> > > Thanks. In the meantime, I've pushed a bunch of new stuff into for-kernelci,
> > > so I can at least see if it regresses when compared to the three failures
> > > we're seeing here.
> > >
> >
> > Hi,
> >
> > I don't have very good news so far. We did 4 targeted runs with the machine
> > and weren't able to reproduce the panic. However, there was a panic hit in
> > the new test run you should have in the inbox and it also reproduced in a
> > completely unrelated test run with *this* kernel (not the new one). In all 3
> > cases the HW model is the same, but they were all different machines.
> >
> > I'm currently doing a full run which includes all tests from the run instead
> > of just stress-ng to see if it reproduces that way - there was a panic case
> > last year (not ARM specific :) that we weren't able to pinpoint to a nice
> > reproducer and had to run multiple tests to trigger it so it's possible this
> > one is similar. I'll try to pare down the tests if this strategy works and
> > keep you updated.
> >
>
> I just wanted to follow up here. Outside of the single run I mentioned
> previously, we are still unable to reproduce the panic. We tried a lot of
> runs on the various machines of the model that hit it, with both full test
> runs and stress-ng test only.
>
> We'll still reach out if we manage to hit it in the future, but it looks like
> a race condition that's not easy to reproduce. Of course if anyone has
> an idea we should try (whether it's about reproducing or debugging what
> the problem is) we can try that.

Thanks for the follow-up, Veronika.
I also noticed that it seems to have disappeared from subsequent runs :/

Will