From: Naoya Horiguchi <naoya.horiguchi@linux.dev>
To: ville.syrjala@linux.intel.com
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Liu Shixin <liushixin2@huawei.com>,
	Yang Shi <shy828301@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org,
	regressions@lists.linux.dev, naoya.horiguchi@nec.com
Subject: Re: [mm-unstable PATCH v7 2/8] mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry
Date: Sat, 5 Nov 2022 00:59:30 +0900
Message-ID: <20221104155930.GA527246@ik1-406-35019.vs.sakura.ne.jp>
In-Reply-To: <Y2LYXItKQyaJTv8j@intel.com>

On Wed, Nov 02, 2022 at 10:51:40PM +0200, Ville Syrjälä wrote:
> On Thu, Jul 14, 2022 at 01:24:14PM +0900, Naoya Horiguchi wrote:
> > +/*
> > + * pud_huge() returns 1 if @pud is hugetlb related entry, that is normal
> > + * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry.
> > + * Otherwise, returns 0.
> > + */
> >  int pud_huge(pud_t pud)
> >  {
> > -	return !!(pud_val(pud) & _PAGE_PSE);
> > +	return !pud_none(pud) &&
> > +		(pud_val(pud) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
> >  }
> 
> Hi,
> 
> This causes i915 to trip a BUG_ON() on x86-32 when I start X.

Hello,

Thank you for finding and reporting the issue.

x86-32 does not enable CONFIG_ARCH_HAS_GIGANTIC_PAGE, so pud_huge() is
supposed to be false on x86-32.  Something like the change below looks
like a fix to me (it restores the original behavior on x86-32):


diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 6b3033845c6d..bf73f25aaa32 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -37,8 +37,12 @@ int pmd_huge(pmd_t pmd)
  */
 int pud_huge(pud_t pud)
 {
+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
        return !pud_none(pud) &&
                (pud_val(pud) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
+#else
+       return !!(pud_val(pud) & _PAGE_PSE);    // or "return 0;" ?
+#endif
 }

 #ifdef CONFIG_HUGETLB_PAGE


Let me guess what the PUD entry was when the issue was triggered.
Assuming that the original code (before 3a194f3f8ad0) was correct, the PSE
bit in pud_val(pud) should always be clear.  So, if pud_huge() returns
true since 3a194f3f8ad0, the PRESENT bit must be clear and some other
bits (other than PRESENT and PSE) must be set so that pud_none() is false.
I'm not sure what such a non-present PUD entry means.
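
To make the guess above concrete, here is a minimal userspace sketch (not
kernel code) that compares the two predicates on plain integers.  The bit
positions of PRESENT (bit 0) and PSE (bit 7) are the x86 ones; everything
named model_* is hypothetical, model_pud_none() is a simplification that
treats only an all-zero entry as empty, and the entry value 0x1000 is an
invented example, not the value seen in the report.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-table flag bits: PRESENT is bit 0, PSE is bit 7. */
#define MODEL_PAGE_PRESENT 0x001ULL
#define MODEL_PAGE_PSE     0x080ULL

/* Assumption: simplified stand-in for pud_none(), true only for 0. */
int model_pud_none(uint64_t val)
{
	return val == 0;
}

/* Predicate as it was before 3a194f3f8ad0: only the PSE bit matters. */
int model_pud_huge_old(uint64_t val)
{
	return !!(val & MODEL_PAGE_PSE);
}

/* Predicate since 3a194f3f8ad0: any non-empty entry that is not
 * "present without PSE" is reported as huge. */
int model_pud_huge_new(uint64_t val)
{
	return !model_pud_none(val) &&
		(val & (MODEL_PAGE_PRESENT | MODEL_PAGE_PSE)) != MODEL_PAGE_PRESENT;
}

/* Walk through the cases discussed in the paragraph above. */
void model_check_cases(void)
{
	/* Empty entry: neither predicate fires. */
	assert(!model_pud_huge_old(0) && !model_pud_huge_new(0));

	/* Present, PSE clear (ordinary table pointer): neither fires. */
	assert(!model_pud_huge_old(0x067ULL) && !model_pud_huge_new(0x067ULL));

	/* Present with PSE set: both fire. */
	assert(model_pud_huge_old(0x081ULL) && model_pud_huge_new(0x081ULL));

	/* Hypothetical non-present, non-empty entry with PSE clear:
	 * only the new predicate fires. */
	assert(!model_pud_huge_old(0x1000ULL) && model_pud_huge_new(0x1000ULL));
}
```

Only the last case distinguishes the two versions, which matches the
reasoning above: a non-empty entry with both PRESENT and PSE clear.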

Thanks,
Naoya Horiguchi

> 
> [  225.777375] kernel BUG at mm/memory.c:2664!
> [  225.777391] invalid opcode: 0000 [#1] PREEMPT SMP
> [  225.777405] CPU: 0 PID: 2402 Comm: Xorg Not tainted 6.1.0-rc3-bdg+ #86
> [  225.777415] Hardware name:  /8I865G775-G, BIOS F1 08/29/2006
> [  225.777421] EIP: __apply_to_page_range+0x24d/0x31c
> [  225.777437] Code: ff ff 8b 55 e8 8b 45 cc e8 0a 11 ec ff 89 d8 83 c4 28 5b 5e 5f 5d c3 81 7d e0 a0 ef 96 c1 74 ad 8b 45 d0 e8 2d 83 49 00 eb a3 <0f> 0b 25 00 f0 ff ff 81 eb 00 00 00 40 01 c3 8b 45 ec 8b 00 e8 76
> [  225.777446] EAX: 00000001 EBX: c53a3b58 ECX: b5c00000 EDX: c258aa00
> [  225.777454] ESI: b5c00000 EDI: b5900000 EBP: c4b0fdb4 ESP: c4b0fd80
> [  225.777462] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010202
> [  225.777470] CR0: 80050033 CR2: b5900000 CR3: 053a3000 CR4: 000006d0
> [  225.777479] Call Trace:
> [  225.777486]  ? i915_memcpy_init_early+0x63/0x63 [i915]
> [  225.777684]  apply_to_page_range+0x21/0x27
> [  225.777694]  ? i915_memcpy_init_early+0x63/0x63 [i915]
> [  225.777870]  remap_io_mapping+0x49/0x75 [i915]
> [  225.778046]  ? i915_memcpy_init_early+0x63/0x63 [i915]
> [  225.778220]  ? mutex_unlock+0xb/0xd
> [  225.778231]  ? i915_vma_pin_fence+0x6d/0xf7 [i915]
> [  225.778420]  vm_fault_gtt+0x2a9/0x8f1 [i915]
> [  225.778644]  ? lock_is_held_type+0x56/0xe7
> [  225.778655]  ? lock_is_held_type+0x7a/0xe7
> [  225.778663]  ? 0xc1000000
> [  225.778670]  __do_fault+0x21/0x6a
> [  225.778679]  handle_mm_fault+0x708/0xb21
> [  225.778686]  ? mt_find+0x21e/0x5ae
> [  225.778696]  exc_page_fault+0x185/0x705
> [  225.778704]  ? doublefault_shim+0x127/0x127
> [  225.778715]  handle_exception+0x130/0x130
> [  225.778723] EIP: 0xb700468a
> [  225.778730] Code: 44 24 40 8b 7c 24 1c 89 47 54 8b 44 24 5c 65 2b 05 14 00 00 00 0f 85 8a 01 00 00 83 c4 6c 5b 5e 5f 5d c3 8b 44 24 1c 8b 40 28 <c7> 00 00 00 00 00 8b 44 24 20 8d 90 20 1b 00 00 8b 02 83 e8 01 89
> [  225.778738] EAX: b5900000 EBX: b7148000 ECX: 00000000 EDX: 00000000
> [  225.778745] ESI: 0103eb60 EDI: b7148000 EBP: b6cf7000 ESP: bfd76650
> [  225.778752] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00010246
> [  225.778761]  ? doublefault_shim+0x127/0x127
> [  225.778769] Modules linked in: i915 prime_numbers i2c_algo_bit iosf_mbi drm_buddy video wmi drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm drm_panel_orientation_quirks backlight cfg80211 rfkill sch_fq_codel xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc i2c_dev iTCO_wdt snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer psmouse i2c_i801 snd i2c_smbus uhci_hcd i2c_core pcspkr soundcore lpc_ich mfd_core ehci_pci ehci_hcd skge intel_agp intel_gtt usbcore agpgart usb_common rng_core parport_pc parport evdev
> [  225.778899] ---[ end trace 0000000000000000 ]---
> [  225.778906] EIP: __apply_to_page_range+0x24d/0x31c
> [  225.778916] Code: ff ff 8b 55 e8 8b 45 cc e8 0a 11 ec ff 89 d8 83 c4 28 5b 5e 5f 5d c3 81 7d e0 a0 ef 96 c1 74 ad 8b 45 d0 e8 2d 83 49 00 eb a3 <0f> 0b 25 00 f0 ff ff 81 eb 00 00 00 40 01 c3 8b 45 ec 8b 00 e8 76
> [  225.778924] EAX: 00000001 EBX: c53a3b58 ECX: b5c00000 EDX: c258aa00
> [  225.778931] ESI: b5c00000 EDI: b5900000 EBP: c4b0fdb4 ESP: c4b0fd80
> [  225.778938] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010202
> [  225.778946] CR0: 80050033 CR2: b5900000 CR3: 053a3000 CR4: 000006d0
> 
> -- 
> Ville Syrjälä
> Intel

