From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1040630AbdDUOQi (ORCPT );
        Fri, 21 Apr 2017 10:16:38 -0400
Received: from mail-wm0-f68.google.com ([74.125.82.68]:32914 "EHLO
        mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1036447AbdDUOQb (ORCPT );
        Fri, 21 Apr 2017 10:16:31 -0400
Date: Fri, 21 Apr 2017 17:16:28 +0300
From: "Kirill A. Shutemov"
To: Dan Williams
Cc: Catalin Marinas, aneesh.kumar@linux.vnet.ibm.com,
        steve.capper@linaro.org, Thomas Gleixner, Peter Zijlstra,
        Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
        "Kirill A. Shutemov", "H. Peter Anvin", dave.hansen@intel.com,
        Borislav Petkov, Rik van Riel, dann.frazier@canonical.com,
        Linus Torvalds, Michal Hocko, linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/mm] x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
Message-ID: <20170421141628.ruxxnq54jvuhiqnz@node.shutemov.name>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: NeoMutt/20170306 (1.8.0)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 20, 2017 at 02:46:51PM -0700, Dan Williams wrote:
> On Sat, Mar 18, 2017 at 2:52 AM, tip-bot for Kirill A. Shutemov
> wrote:
> > Commit-ID:  2947ba054a4dabbd82848728d765346886050029
> > Gitweb:     http://git.kernel.org/tip/2947ba054a4dabbd82848728d765346886050029
> > Author:     Kirill A. Shutemov
> > AuthorDate: Fri, 17 Mar 2017 00:39:06 +0300
> > Committer:  Ingo Molnar
> > CommitDate: Sat, 18 Mar 2017 09:48:03 +0100
> >
> > x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation
> >
> > This patch provides all required callbacks required by the generic
> > get_user_pages_fast() code and switches x86 over - and removes
> > the platform specific implementation.
> >
> > Signed-off-by: Kirill A. Shutemov
> > Cc: Andrew Morton
> > Cc: Aneesh Kumar K . V
> > Cc: Borislav Petkov
> > Cc: Catalin Marinas
> > Cc: Dann Frazier
> > Cc: Dave Hansen
> > Cc: H. Peter Anvin
> > Cc: Linus Torvalds
> > Cc: Peter Zijlstra
> > Cc: Rik van Riel
> > Cc: Steve Capper
> > Cc: Thomas Gleixner
> > Cc: linux-arch@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Link: http://lkml.kernel.org/r/20170316213906.89528-1-kirill.shutemov@linux.intel.com
> > [ Minor readability edits. ]
> > Signed-off-by: Ingo Molnar
>
> I'm still trying to spot the bug, but bisect points to this patch as
> the point at which my unit tests start failing with the following
> signature:

I can't find the issue either. Is it something reproducible without
hardware? In KVM?

If yes, could you share the test-case?

> [ 35.423841] WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155
> percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
> [ 35.425328] percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0
> (0) after switching to atomic
> [ 35.425329] Modules linked in: ip6t_rpfilter ip6t_REJECT
> nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc
> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip
> 6table_mangle ip6table_raw ip6table_security iptable_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
> iptable_mangle iptable_raw iptable_security ebtable_filter ebtables
> ip6table_filter ip6_tables crct10dif_pclmul crc32_pclmul crc32c_intel
> ghash_clmulni_intel nd_pmem(O) dax_pmem(O) nd_btt(O) dax(O) serio_raw
> nfit(O) nd_e820(O) libnvdimm(O) tpm_tis tpm_tis_co
> re tpm nfit_test_iomap(O) nfsd nfs_acl
> [ 35.433683] CPU: 8 PID: 245 Comm: rcuos/29 Tainted: G O
> 4.11.0-rc2+ #55
> [ 35.435538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.9.3-1.fc25 04/01/2014
> [ 35.437500] Call Trace:
> [ 35.438270] dump_stack+0x86/0xc3
> [ 35.439156] __warn+0xcb/0xf0
> [ 35.439995] warn_slowpath_fmt+0x5f/0x80
> [ 35.440962] ? rcu_nocb_kthread+0x27a/0x500
> [ 35.441957] ? dax_pmem_percpu_exit+0x50/0x50 [dax_pmem]
> [ 35.443107] percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
> [ 35.444251] ? percpu_ref_exit+0x60/0x60
> [ 35.445206] rcu_nocb_kthread+0x327/0x500
> [ 35.446186] ? rcu_nocb_kthread+0x27a/0x500
> [ 35.447188] kthread+0x10c/0x140
> [ 35.448058] ? rcu_eqs_enter+0x50/0x50
> [ 35.448990] ? kthread_create_on_node+0x60/0x60
> [ 35.450038] ret_from_fork+0x31/0x40
> [ 35.450976] ---[ end trace eaa40898a09519b5 ]---
>
> This is similar to the backtrace when we were not properly handling
> pud faults and was fixed with this commit: 220ced1676c4 "mm: fix
> get_user_pages() vs device-dax pud mappings"
>
> I've found some missing _devmap checks in the generic
> get_user_pages_fast() path, but this does not fix the regression:

I don't see these in x86 GUP. Was the bug there too?

-- 
 Kirill A. Shutemov