From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Gerald Schaefer, Linus Torvalds, peterx@redhat.com, Andrew Morton, Will Deacon, Andrea Arcangeli, David Rientjes, John Hubbard, Michael Ellerman
Subject: [PATCH v5 01/25] mm: Do page fault accounting in handle_mm_fault
Date: Tue, 7 Jul 2020 18:49:57 -0400
Message-Id: <20200707225021.200906-2-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200707225021.200906-1-peterx@redhat.com>
References: <20200707225021.200906-1-peterx@redhat.com>
MIME-Version: 1.0

This is a preparation patch to move page fault accounting into the general
code in handle_mm_fault().  This includes both the per-task maj_flt/min_flt
counters and the major/minor page fault perf events.  To do this, the
pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointers passed into handle_mm_fault() are NULL,
which means this patch should have no intended functional change.
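To illustrate where the series ends up (a sketch only: this patch passes
NULL everywhere, and the variable names differ per arch handler), a
converted arch page fault handler goes from open-coded accounting, roughly
what x86 does today:

	fault = handle_mm_fault(vma, address, flags);
	...
	if (major) {
		current->maj_flt++;
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
	} else {
		current->min_flt++;
		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
	}

to simply forwarding its own pt_regs and letting the core code do both the
per-task and the perf accounting:

	fault = handle_mm_fault(vma, address, flags, regs);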
Suggested-by: Linus Torvalds
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 arch/alpha/mm/fault.c         |  2 +-
 arch/arc/mm/fault.c           |  2 +-
 arch/arm/mm/fault.c           |  2 +-
 arch/arm64/mm/fault.c         |  2 +-
 arch/csky/mm/fault.c          |  3 +-
 arch/hexagon/mm/vm_fault.c    |  2 +-
 arch/ia64/mm/fault.c          |  2 +-
 arch/m68k/mm/fault.c          |  2 +-
 arch/microblaze/mm/fault.c    |  2 +-
 arch/mips/mm/fault.c          |  2 +-
 arch/nds32/mm/fault.c         |  2 +-
 arch/nios2/mm/fault.c         |  2 +-
 arch/openrisc/mm/fault.c      |  2 +-
 arch/parisc/mm/fault.c        |  2 +-
 arch/powerpc/mm/copro_fault.c |  2 +-
 arch/powerpc/mm/fault.c       |  2 +-
 arch/riscv/mm/fault.c         |  2 +-
 arch/s390/mm/fault.c          |  2 +-
 arch/sh/mm/fault.c            |  2 +-
 arch/sparc/mm/fault_32.c      |  4 +--
 arch/sparc/mm/fault_64.c      |  2 +-
 arch/um/kernel/trap.c         |  2 +-
 arch/x86/mm/fault.c           |  2 +-
 arch/xtensa/mm/fault.c        |  2 +-
 drivers/iommu/amd/iommu_v2.c  |  2 +-
 drivers/iommu/intel/svm.c     |  3 +-
 include/linux/mm.h            |  7 ++--
 mm/gup.c                      |  4 +--
 mm/hmm.c                      |  3 +-
 mm/ksm.c                      |  3 +-
 mm/memory.c                   | 64 ++++++++++++++++++++++++++++++++++-
 31 files changed, 103 insertions(+), 34 deletions(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index c2303a8c2b9f..1983e43a5e2f 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -148,7 +148,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	/* If for any reason at all we couldn't handle the fault,
 	   make sure we exit gracefully rather than endlessly redo
 	   the fault.  */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 7287c793d1c9..587dea524e6b 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -130,7 +130,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index c6550eddfce1..01a8e0f8fef7 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -224,7 +224,7 @@ __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
 		goto out;
 	}
 
-	return handle_mm_fault(vma, addr & PAGE_MASK, flags);
+	return handle_mm_fault(vma, addr & PAGE_MASK, flags, NULL);
 
 check_stack:
 	/* Don't allow expansion below FIRST_USER_ADDRESS */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5e832b3387f1..f885940035ce 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -428,7 +428,7 @@ static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
 	 */
 	if (!(vma->vm_flags & vm_flags))
 		return VM_FAULT_BADACCESS;
-	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags);
+	return handle_mm_fault(vma, addr & PAGE_MASK, mm_flags, NULL);
 }
 
 static bool is_el0_instruction_abort(unsigned int esr)
diff --git a/arch/csky/mm/fault.c b/arch/csky/mm/fault.c
index 0b9cbf2cf6a9..7137e2e8dc57 100644
--- a/arch/csky/mm/fault.c
+++ b/arch/csky/mm/fault.c
@@ -150,7 +150,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0);
+	fault = handle_mm_fault(vma, address, write ? FAULT_FLAG_WRITE : 0,
+				NULL);
 	if (unlikely(fault & VM_FAULT_ERROR)) {
 		if (fault & VM_FAULT_OOM)
 			goto out_of_memory;
diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
index cd3808f96b93..f12f330e7946 100644
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -88,7 +88,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
 		break;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 3a4dec334cc5..abf2808f9b4b 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -143,7 +143,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *regs)
 	 * sure we exit gracefully rather than endlessly redo the
 	 * fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 508abb63da67..08b35a318ebe 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -134,7 +134,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
 	if (fault_signal_pending(fault, regs))
diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index a2bfe587b491..1a3d4c4ca28b 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -214,7 +214,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
index 01b168a90434..b1db39784db9 100644
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -152,7 +152,7 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index 8fb73f6401a0..d0ecc8fb5b23 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -206,7 +206,7 @@ void do_page_fault(unsigned long entry, unsigned long addr,
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, addr, flags);
+	fault = handle_mm_fault(vma, addr, flags, NULL);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 4112ef0e247e..86beb9a2698e 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -131,7 +131,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index d2224ccca294..3daa491d1edb 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -159,7 +159,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * the fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index 66ac0719bd49..e32d06928c24 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -302,7 +302,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 	 * fault.
 	 */
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index b83abbead4a2..2d0276abe0a6 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -64,7 +64,7 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
 	}
 
 	ret = 0;
-	*flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0);
+	*flt = handle_mm_fault(vma, ea, is_write ? FAULT_FLAG_WRITE : 0, NULL);
 	if (unlikely(*flt & VM_FAULT_ERROR)) {
 		if (*flt & VM_FAULT_OOM) {
 			ret = -ENOMEM;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 641fc5f3d7dd..25dee001d8e1 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -607,7 +607,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	major |= fault & VM_FAULT_MAJOR;
 
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 5873835a3e6b..30c1124d0fb6 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -109,7 +109,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, addr, flags);
+	fault = handle_mm_fault(vma, addr, flags, NULL);
 
 	/*
 	 * If we need to retry but a fatal signal is pending, handle the
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index d53c2e2ea1fd..fc14df0b4d6e 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -478,7 +478,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	if (fault_signal_pending(fault, regs)) {
 		fault = VM_FAULT_SIGNAL;
 		if (flags & FAULT_FLAG_RETRY_NOWAIT)
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index fbe1f2fe9a8c..3c0a11827f7e 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -482,7 +482,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (unlikely(fault & (VM_FAULT_RETRY | VM_FAULT_ERROR)))
 		if (mm_fault_error(regs, error_code, address, fault))
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index cfef656eda0f..06af03db4417 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -234,7 +234,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
@@ -410,7 +410,7 @@ static void force_user_fault(unsigned long address, int write)
 		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
 			goto bad_area;
 	}
-	switch (handle_mm_fault(vma, address, flags)) {
+	switch (handle_mm_fault(vma, address, flags, NULL)) {
 	case VM_FAULT_SIGBUS:
 	case VM_FAULT_OOM:
 		goto do_sigbus;
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index a3806614e4dc..9ebee14ee893 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -422,7 +422,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 		goto bad_area;
 	}
 
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		goto exit_exception;
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 2b3afa354a90..8d9870d76da1 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -71,7 +71,7 @@ int handle_page_fault(unsigned long address, unsigned long ip,
 	do {
 		vm_fault_t fault;
 
-		fault = handle_mm_fault(vma, address, flags);
+		fault = handle_mm_fault(vma, address, flags, NULL);
 
 		if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
 			goto out_nosemaphore;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 02536b04d9f3..0adbff41adec 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1291,7 +1291,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * userland). The return to userland is identified whenever
 	 * FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 	major |= fault & VM_FAULT_MAJOR;
 
 	/* Quick path to respond to signals */
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index c128dcc7c85b..e72c8c1359a6 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -107,7 +107,7 @@ void do_page_fault(struct pt_regs *regs)
 	 * make sure we exit gracefully rather than endlessly redo
 	 * the fault.
 	 */
-	fault = handle_mm_fault(vma, address, flags);
+	fault = handle_mm_fault(vma, address, flags, NULL);
 
 	if (fault_signal_pending(fault, regs))
 		return;
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index e4b025c5637c..c259108ab6dd 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -495,7 +495,7 @@ static void do_fault(struct work_struct *work)
 	if (access_error(vma, fault))
 		goto out;
 
-	ret = handle_mm_fault(vma, address, flags);
+	ret = handle_mm_fault(vma, address, flags, NULL);
 out:
 	mmap_read_unlock(mm);
 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c87c807a0ab..5ae59a6ad681 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -872,7 +872,8 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 			goto invalid;
 
 		ret = handle_mm_fault(vma, address,
-				      req->wr_req ? FAULT_FLAG_WRITE : 0);
+				      req->wr_req ? FAULT_FLAG_WRITE : 0,
+				      NULL);
 		if (ret & VM_FAULT_ERROR)
 			goto invalid;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 809cbbf98fbc..33f8236a68a2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -39,6 +39,7 @@ struct file_ra_state;
 struct user_struct;
 struct writeback_control;
 struct bdi_writeback;
+struct pt_regs;
 
 void init_mm_internals(void);
 
@@ -1659,7 +1660,8 @@ int invalidate_inode_page(struct page *page);
 
 #ifdef CONFIG_MMU
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
-			unsigned long address, unsigned int flags);
+			unsigned long address, unsigned int flags,
+			struct pt_regs *regs);
 extern int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 			    unsigned long address, unsigned int fault_flags,
 			    bool *unlocked);
@@ -1669,7 +1671,8 @@ void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
 #else
 static inline vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
-		unsigned long address, unsigned int flags)
+		unsigned long address, unsigned int flags,
+		struct pt_regs *regs)
 {
 	/* should never happen if there's no MMU */
 	BUG();
diff --git a/mm/gup.c b/mm/gup.c
index 6ec1807cd2a7..80fd1610d43e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -884,7 +884,7 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_TRIED;
 	}
 
-	ret = handle_mm_fault(vma, address, fault_flags);
+	ret = handle_mm_fault(vma, address, fault_flags, NULL);
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, *flags);
 
@@ -1238,7 +1238,7 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 	    fatal_signal_pending(current))
 		return -EINTR;
 
-	ret = handle_mm_fault(vma, address, fault_flags);
+	ret = handle_mm_fault(vma, address, fault_flags, NULL);
 	major |= ret & VM_FAULT_MAJOR;
 	if (ret & VM_FAULT_ERROR) {
 		int err = vm_fault_to_errno(ret, 0);
diff --git a/mm/hmm.c b/mm/hmm.c
index e9a545751108..0be32b8a47be 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -75,7 +75,8 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 	}
 
 	for (; addr < end; addr += PAGE_SIZE)
-		if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
+		if (handle_mm_fault(vma, addr, fault_flags, NULL) &
+		    VM_FAULT_ERROR)
 			return -EFAULT;
 	return -EBUSY;
 }
diff --git a/mm/ksm.c b/mm/ksm.c
index 5fb176d497ea..90a625b02a1d 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -480,7 +480,8 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
-					      FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
+					      FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE,
+					      NULL);
 		else
 			ret = VM_FAULT_WRITE;
 		put_page(page);
diff --git a/mm/memory.c b/mm/memory.c
index 072c72d88471..bb7ba127661a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -71,6 +71,8 @@
 #include <linux/dax.h>
 #include <linux/oom.h>
 #include <linux/numa.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
 
 #include <trace/events/kmem.h>
 
@@ -4360,6 +4362,64 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	return handle_pte_fault(&vmf);
 }
 
+/**
+ * mm_account_fault - Do page fault accounting
+ *
+ * @regs: the pt_regs struct pointer.  When set to NULL, will skip accounting
+ *        of perf event counters, but we'll still do the per-task accounting to
+ *        the task who triggered this page fault.
+ * @address: the faulted address.
+ * @flags: the fault flags.
+ * @ret: the fault retcode.
+ *
+ * This will take care of most of the page fault accounting.  Meanwhile, it
+ * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
+ * updates.  However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
+ * still be in per-arch page fault handlers at the entry of page fault.
+ */
+static inline void mm_account_fault(struct pt_regs *regs,
+				    unsigned long address, unsigned int flags,
+				    vm_fault_t ret)
+{
+	bool major;
+
+	/*
+	 * We don't do accounting for some specific faults:
+	 *
+	 * - Unsuccessful faults (e.g. when the address wasn't valid).  That
+	 *   includes arch_vma_access_permitted() failing before reaching here.
+	 *   So this is not a "this many hardware page faults" counter.  We
+	 *   should use the hw profiling for that.
+	 *
+	 * - Incomplete faults (VM_FAULT_RETRY).  They will only be counted
+	 *   once they're completed.
+	 */
+	if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
+		return;
+
+	/*
+	 * We define the fault as a major fault when the final successful fault
+	 * is VM_FAULT_MAJOR, or if it retried (which implies that we couldn't
+	 * handle it immediately previously).
+	 */
+	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
+
+	/*
+	 * If the fault is done for GUP, regs will be NULL, and we will skip
+	 * the fault accounting.
+	 */
+	if (!regs)
+		return;
+
+	if (major) {
+		current->maj_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+	} else {
+		current->min_flt++;
+		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
+	}
+}
+
 /*
  * By the time we get here, we already hold the mm semaphore
  *
@@ -4367,7 +4427,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
  * return value.  See filemap_fault() and __lock_page_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
-		unsigned int flags)
+		unsigned int flags, struct pt_regs *regs)
 {
 	vm_fault_t ret;
 
@@ -4408,6 +4468,8 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 		mem_cgroup_oom_synchronize(false);
 	}
 
+	mm_account_fault(regs, address, flags, ret);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
-- 
2.26.2
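
As a quick cross-check (not part of this patch), the per-task min_flt/maj_flt
counters consolidated above are what userspace reads back through
getrusage(2) as ru_minflt/ru_majflt, so the accounting can be sanity-checked
with a minimal standalone program; sizes and names below are illustrative:

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/resource.h>

	int main(void)
	{
		struct rusage ru;
		size_t len = 64 << 20;	/* 64 MiB of anonymous memory */
		char *buf = malloc(len);

		if (!buf)
			return 1;
		/* Touch every page once: each first touch is a minor fault. */
		memset(buf, 0, len);

		if (getrusage(RUSAGE_SELF, &ru))
			return 1;
		printf("minor faults: %ld, major faults: %ld\n",
		       ru.ru_minflt, ru.ru_majflt);
		free(buf);
		return 0;
	}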