Subject: Re: Question about SEA handling process happened in user space
From: James Morse
To: Xiaofei Tan
Cc: Catalin Marinas, Linuxarm, Will Deacon, Dave Martin, linux-arm-kernel@lists.infradead.org, Shiju Jose
Date: Tue, 7 Apr 2020 17:37:27 +0100
Message-ID: <558ffd42-74d7-e364-2b79-93ab0998ab6e@arm.com>
In-Reply-To: <5E8587A3.6030101@huawei.com>
References: <5E81EFCD.6020605@huawei.com> <2b0e5507-ad75-9af1-6afe-aa87d8cf597f@arm.com> <5E8587A3.6030101@huawei.com>

Hi Xiaofei,

On 02/04/2020 07:35, Xiaofei Tan wrote:
> On 2020/3/31 0:49, James Morse wrote:
>> If the CPU doesn't tell us the address, we can't tell user-space what it is. The
>> alternative is to upgrade to SIGKILL in that case.
>>
>>
>> If you see this instead of the address provided via firmware-first, there is a
>> series to improve that here:
>> https://lore.kernel.org/linux-acpi/20200228174817.74278-1-james.morse@arm.com/
>>
>> (We skip this signal code if APEI promises it did all the work. This lets you
>> take the signal from memory_failure() instead, which may have better information.)

> There may be an competition issue.
> APEI run memory_failure() in an bottom half for memory errors. Then it may be not finished
> before here SEA handling end, and application process may back to run.

I'm not sure what you mean by 'bottom half'; isn't that a softirq term?

With that series, memory_failure() runs in process-context as task-work.
memory_failure() needs to sleep, so it has to run in process-context. Doing
it as task-work means it runs before the thread returns to user-space.

If another thread in the same process accesses the affected memory, I'd
expect it to take a second external abort. If another process has the page
mapped, it could also access the affected memory, again taking an external
abort. Both of these could happen while the first CPU is still in firmware
generating the CPER records, so it's not a race we can fix. It should be
harmless: the recovery action is the same, it's just that the error counters
count more events than errors.

If you actually see it happen, we can try to make the window smaller...


Thanks,

James
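
For readers on the user-space side of this discussion, here is a minimal sketch of how a process might tell apart the SIGBUS variants mentioned above. It assumes Linux and glibc, where memory_failure() reports a consumed error with si_code BUS_MCEERR_AR and a not-yet-consumed one with BUS_MCEERR_AO. The names classify_sigbus(), on_sigbus(), install_sigbus_handler() and the enum are hypothetical, made up for illustration, and are not taken from the thread or from any kernel series. The handler treats a NULL si_addr as "address unknown", matching the case where the CPU does not report one.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical classification of a SIGBUS, based only on si_code. */
enum bus_kind {
    BUS_KIND_OTHER,           /* any other SIGBUS (alignment, bad mapping, ...) */
    BUS_KIND_ACTION_REQUIRED, /* BUS_MCEERR_AR: this thread consumed the error */
    BUS_KIND_ACTION_OPTIONAL  /* BUS_MCEERR_AO: error detected, not yet consumed */
};

enum bus_kind classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AR:
        return BUS_KIND_ACTION_REQUIRED;
    case BUS_MCEERR_AO:
        return BUS_KIND_ACTION_OPTIONAL;
    default:
        return BUS_KIND_OTHER;
    }
}

/* Illustrative bookkeeping: the last address reported for a memory error. */
static void *volatile last_poisoned_addr;

static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    static const char msg[] = "SIGBUS: unrecoverable in this sketch, exiting\n";

    (void)sig;
    (void)ucontext;

    if (classify_sigbus(info->si_code) == BUS_KIND_ACTION_OPTIONAL) {
        /* The data was not consumed yet: note the address (which may be
         * NULL if it is unknown) and carry on. */
        last_poisoned_addr = info->si_addr;
        return;
    }

    /* Action-required or any other SIGBUS: returning would re-execute the
     * faulting access, so this sketch gives up. A real application might
     * discard and rebuild the affected data instead. */
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
    _exit(1);
}

/* Installs the handler with SA_SIGINFO so si_code and si_addr are delivered.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A program would call install_sigbus_handler() once at start-up. Because of the race described above, the same underlying error may be reported more than once, so a handler like this should tolerate repeated signals for the same address.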