From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B040C352A3 for ; Tue, 11 Feb 2020 14:51:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 42B5420675 for ; Tue, 11 Feb 2020 14:51:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=shutemov-name.20150623.gappssmtp.com header.i=@shutemov-name.20150623.gappssmtp.com header.b="fx57XQQs" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 42B5420675 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=shutemov.name Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DF7C46B02E6; Tue, 11 Feb 2020 09:51:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DCE5B6B02E7; Tue, 11 Feb 2020 09:51:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CBD056B02E8; Tue, 11 Feb 2020 09:51:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0164.hostedemail.com [216.40.44.164]) by kanga.kvack.org (Postfix) with ESMTP id B0FC66B02E6 for ; Tue, 11 Feb 2020 09:51:43 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 498322491 for ; Tue, 11 Feb 2020 14:51:43 +0000 (UTC) X-FDA: 76478135286.13.unit34_6637d1d4cd90d X-HE-Tag: unit34_6637d1d4cd90d X-Filterd-Recvd-Size: 6811 Received: from mail-lf1-f65.google.com (mail-lf1-f65.google.com [209.85.167.65]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Tue, 11 Feb 2020 14:51:42 +0000 (UTC) Received: by mail-lf1-f65.google.com with SMTP id f24so7211311lfh.3 for ; Tue, 11 Feb 2020 06:51:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov-name.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=AEcjp8yWRc6AsAsI9CsO9gvHIgoVpPy7RzOlsPkD1sg=; b=fx57XQQsvMGFBEDlCcZFB4jNa6gQjIt9TW041yRHopOFP6Y5GSRaDPlgxp1vzBp3/V 4JOLn5FW+S/rCnhmna9g1JkIRCfw9ZMHKm0/myXoujNdAIhyMuNSfQz+wm0yjsdarck2 TBcpv6QuQxQSs31VWiLoyOr2/YzTn6zHf8p/MQP7i4aE3koJATycYEM6T/kV8JFrfw2t 5zu7UPVZHoXtp3K07L7pwBTx/HhIcp37lpTLCbCdsq2v9LFh1vKpfc59cFcrkLbnlBXE 4hJHGsi3TL+WhvEU0l9Jv4W1PqX2SxSudNdhlhT8PXu9Wkee7cwgmmNspwfA6orXJwVz U5Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=AEcjp8yWRc6AsAsI9CsO9gvHIgoVpPy7RzOlsPkD1sg=; b=qOdiLeddtg6seE+f00Ip5ytFRgwU/e9mdsXYoVPwQ6t7S8GUmXjxxRG7u4UAuiqMUB Xr+jcD3AKMcOXt30oVzwmNO9ObknILVbJLDBqyJ2Mk1wazpOkM2OpRgqDOl4JbINEgy6 5NLJFKeFxpseOjl2qIt/tfyd5raAkK+/cNIr98kf2qettdepB+8KrIsLrrtpsAQcNDdH jS7eZVwFoIlx1d9GKmljcJW9Ukj70FRX5wiFWRYPdAl/4t8tMNP2SeWyWyT8lnCbI0Cr eGEo6Ajwh2W6Gk4gvoJsxEyAMlbvO2EjTUTf25xewjjtp67uQlkstaMyHbs3c3Dz7FWb iljw== X-Gm-Message-State: APjAAAUH4GLtQKeKk1uEdKv34vWGqhbz6OoDydoqrxqyA4K1synLssSS A5WQrZ7oMouFMcp6gawfOef1qA== X-Google-Smtp-Source: APXvYqz5RPZoC4V3JI54AWHSA2OQys9kubHigJzSy2U/14AXIp9EKLXs9DvfEImsbvslOj1EZkyN0Q== X-Received: by 2002:a05:6512:284:: with SMTP id j4mr3796051lfp.109.1581432700673; Tue, 11 Feb 2020 06:51:40 -0800 (PST) Received: from box.localdomain ([86.57.175.117]) by smtp.gmail.com with ESMTPSA id r20sm1922944lfi.91.2020.02.11.06.51.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Feb 2020 06:51:40 -0800 (PST) Received: by box.localdomain (Postfix, from userid 1000) id D0844100AFB; Tue, 11 Feb 2020 17:51:58 +0300 (+03) Date: Tue, 11 Feb 2020 17:51:58 +0300 From: "Kirill A. Shutemov" To: Jeff Moyer Cc: Jia He , Catalin Marinas , "Kirill A. Shutemov" , linux-mm@kvack.org Subject: Re: bug: data corruption introduced by commit 83d116c53058 ("mm: fix double page fault on arm64 if PTE_AF is cleared") Message-ID: <20200211145158.5wt7nepe3flx25bj@box> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Feb 10, 2020 at 05:51:50PM -0500, Jeff Moyer wrote: > Hi, > > While running xfstests test generic/437, I noticed that the following > WARN_ON_ONCE inside cow_user_page() was triggered: > > /* > * This really shouldn't fail, because the page is there > * in the page tables. But it might just be unreadable, > * in which case we just give up and fill the result with > * zeroes. > */ > if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) { > /* > * Give a warn in case there can be some obscure > * use-case > */ > WARN_ON_ONCE(1); > clear_page(kaddr); > } > > Just clearing the page, in this case, will result in data corruption. I > think the assumption that the copy fails because the memory is > inaccessible may be wrong. In this instance, it may be the other thread > issuing the madvise call? > > I reverted the commit in question, and the data corruption is gone. > > Below is the (ppc64) stack trace (this will reproduce on x86 as well). > I've attached the reproducer, which is a modified version of the > xfs test. You'll need to setup a file system on pmem, and mount it with > -o dax. Then issue "./t_mmap_cow_race /mnt/pmem/foo". > > Any help tracking this down is appreciated. My guess is that MADV_DONTNEED get the page unmapped under you and __copy_from_user_inatomic() sees empty PTE instead of the populated PTE it expects. Below is my completely untested attempt to fix it. It is going to hurt perfomance in common case, but it should be good enough to test my idea. The real solution would be to retry __copy_from_user_inatomic() under ptl if the first attempt fails. I expect it to be ugly. diff --git a/mm/memory.c b/mm/memory.c index 0bccc622e482..362a791f47fd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2257,7 +2257,6 @@ static inline bool cow_user_page(struct page *dst, struct page *src, bool ret; void *kaddr; void __user *uaddr; - bool force_mkyoung; struct vm_area_struct *vma = vmf->vma; struct mm_struct *mm = vma->vm_mm; unsigned long addr = vmf->address; @@ -2278,27 +2277,18 @@ static inline bool cow_user_page(struct page *dst, struct page *src, kaddr = kmap_atomic(dst); uaddr = (void __user *)(addr & PAGE_MASK); + vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + if (!pte_same(*vmf->pte, vmf->orig_pte)) { + ret = false; + goto pte_unlock; + } + /* * On architectures with software "accessed" bits, we would * take a double page fault, so mark it accessed here. */ - force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte); - if (force_mkyoung) { - pte_t entry; - - vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); - if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) { - /* - * Other thread has already handled the fault - * and we don't need to do anything. If it's - * not the case, the fault will be triggered - * again on the same address. - */ - ret = false; - goto pte_unlock; - } - - entry = pte_mkyoung(vmf->orig_pte); + if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) { + pte_t entry = pte_mkyoung(vmf->orig_pte); if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0)) update_mmu_cache(vma, addr, vmf->pte); } @@ -2321,8 +2311,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src, ret = true; pte_unlock: - if (force_mkyoung) - pte_unmap_unlock(vmf->pte, vmf->ptl); + pte_unmap_unlock(vmf->pte, vmf->ptl); kunmap_atomic(kaddr); flush_dcache_page(dst); -- Kirill A. Shutemov