linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josef Johansson <josef@oderland.se>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com, helgaas@kernel.org, jgross@suse.com,
	linux-pci@vger.kernel.org, maz@kernel.org,
	xen-devel@lists.xenproject.org,
	Jason Andryuk <jandryuk@gmail.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Peter Jones <pjones@redhat.com>,
	linux-fbdev@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org
Subject: Re: [PATCH] PCI/MSI: Move non-mask check back into low level accessors
Date: Thu, 11 Nov 2021 00:13:42 +0100	[thread overview]
Message-ID: <607f86a1-9295-aa97-1765-43050be29d8b@oderland.se> (raw)
In-Reply-To: <19176a3c-e554-0ff4-2e0b-5813d353d15e@oderland.se>

On 11/10/21 21:30, Josef Johansson wrote:
> On 11/4/21 00:45, Thomas Gleixner wrote:
>> On Wed, Oct 27 2021 at 17:29, Josef Johansson wrote:
>>
>> CC+: EFIFB and scheduler folks
>>
>>> On 10/27/21 14:01, Josef Johansson wrote:
>>>
>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>> [drm] free PSP TMR buffer
>>> PM: suspend devices took 0.428 seconds
>>> ACPI: EC: interrupt blocked
>>> ACPI: PM: Preparing to enter system sleep state S3
>>> ACPI: EC: event blocked
>>> ACPI: EC: EC stopped
>>> ACPI: PM: Saving platform NVS memory
>>> Disabling non-boot CPUs ...
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:522  switch_mm_irqs_off+0x3c5/0x400
>> 	if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) {
>>
>>> Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer nf_tables nfnetlink vfat fat intel_rapl_msr think_lmi firmware_attributes_class wmi_bmof intel_rapl_common pcspkr uvcvideo videobuf2_vmalloc videobuf2_memops joydev videobuf2_v4l2 sp5100_tco k10temp videobuf2_common i2c_piix4 iwlwifi videodev mc cfg80211 thinkpad_acpi ipmi_devintf ucsi_acpi platform_profile typec_ucsi ledtrig_audio ipmi_msghandler r8169 rfkill typec snd wmi soundcore video i2c_scmi fuse xenfs ip_tables dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt trusted asn1_encoder hid_multitouch amdgpu crct10dif_pclmul crc32_pclmul crc32c_intel gpu_sched i2c_algo_bit drm_ttm_helper ghash_clmulni_intel ttm serio_raw drm_kms_helper cec sdhci_pci cqhci sdhci xhci_pci drm xhci_pci_renesas nvme xhci_hcd ehci_pci mmc_core ehci_hcd nvme_core xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
>>> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W        --------- ---  5.15.0-0.rc7.0.fc32.qubes.x86_64 #1
>>> Hardware name: LENOVO 20Y1S02400/20Y1S02400, BIOS R1BET65W(1.34 ) 06/17/2021
>>> RIP: e030:switch_mm_irqs_off+0x3c5/0x400
>>> Code: f0 41 80 65 01 fb ba 01 00 00 00 49 8d b5 60 23 00 00 4c 89 ef 49 c7 85 68 23 00 00 60 1d 08 81 e8 a0 f3 08 00 e9 15 fd ff ff <0f> 0b e8 34 fa ff ff e9 ad fc ff ff 0f 0b e9 31 fe ff ff 0f 0b e9
>>> RSP: e02b:ffffc900400f3eb0 EFLAGS: 00010006
>>> RAX: 00000001336c6000 RBX: ffff888140660000 RCX: 0000000000000040
>>> RDX: ffff8881003027c0 RSI: 0000000000000000 RDI: ffff8881b36c6000
>>> RBP: ffffffff829d91c0 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000008 R11: 0000000000000000 R12: ffff888104e88440
>>> R13: ffff8881003027c0 R14: 0000000000000000 R15: 0000000000000001
>>> FS:  0000000000000000(0000) GS:ffff888140640000(0000) knlGS:0000000000000000
>>> CS:  10000e030 DS: 002b ES: 002b CR0: 0000000080050033
>>> CR2: 000060b7d78bf198 CR3: 0000000002810000 CR4: 0000000000050660
>>> Call Trace:
>>>  switch_mm+0x1c/0x30
>>>  idle_task_exit+0x55/0x60
>>>  play_dead_common+0xa/0x20
>>>  xen_pv_play_dead+0xa/0x60
>> So this is when bringing the non boot CPUs down and the switch_mm() code
>> discovers inconsistency between CR3 and the expected value.
>>
>> Would probably be interesting to print the actual values, but XEN folks
>> might have an idea.
> I can install some print-statements showing some more info here.
> I guess I will be getting memory addresses, we already know that CR3 is
> 0000000002810000
>
> If you have any hints on how to do an effective print statement for this
> please do say so :)
> I'll try though and see what I find out.

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 59ba2968af1b..511792852e9e 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -520,6 +520,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	 */
 #ifdef CONFIG_DEBUG_VM
 	if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) {
+		printk("josef-debug: cr3: %lx, build_cr3: %lx, (%px, %x)",
+			__read_cr3(),
+			build_cr3(real_prev->pgd, prev_asid),
+			real_prev->pgd, prev_asid);
 		/*
 		 * If we were to BUG here, we'd be very likely to kill
 		 * the system so hard that we don't see the call trace.

this patch gave me the three values which where already known,
__read_cr3 = CR3 = 0000000002810000
build_cr3 = RAX = 00000001336c6000
real_prev->pgd = RDI = ffff8881b36c6000
prev_asid = RSI = 0

Not sure what conclusions I should draw though.

>>>  do_idle+0xd1/0xe0
>>>  cpu_startup_entry+0x19/0x20
>>>  asm_cpu_bringup_and_idle+0x5/0x1000
>>> ---[ end trace b068d3cd1b7f5f4b ]---
>>> smpboot: CPU 1 is now offline
>>> smpboot: CPU 2 is now offline
>>> smpboot: CPU 3 is now offline
>>> smpboot: CPU 4 is now offline
>>> smpboot: CPU 5 is now offline
>>> smpboot: CPU 6 is now offline
>>> smpboot: CPU 7 is now offline
>>> ACPI: PM: Low-level resume complete
>>> ACPI: EC: EC started
>>> ACPI: PM: Restoring platform NVS memory
>>> xen_acpi_processor: Uploading Xen processor PM info
>>> xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU1
>>> xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU3
>>> xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU5
>>> xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU7
>>> xen_acpi_processor: (_PXX): Hypervisor error (-19) for ACPI CPU9
>>>


  reply	other threads:[~2021-11-10 23:13 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-18  6:22 [PATCH] PCI/MSI: Re-add checks for skip masking MSI-X on Xen PV Josef Johansson
2021-10-19 19:57 ` Bjorn Helgaas
2021-10-19 20:15   ` Josef Johansson
2021-10-19 20:29     ` Bjorn Helgaas
2021-10-19 21:48       ` [PATCH v2] " Josef Johansson
2021-10-20 12:51         ` Marc Zyngier
2021-10-20 14:03           ` Jason Andryuk
2021-10-21  8:25             ` Josef Johansson
2021-10-24 18:55               ` Josef Johansson
2021-10-25  1:25                 ` [PATCH] PCI/MSI: Fix masking MSI/MSI-X " Jason Andryuk
2021-10-25  7:44                   ` David Woodhouse
2021-10-25 11:43                     ` Roger Pau Monné
2021-10-25 11:53                       ` David Woodhouse
2021-10-25 12:58                         ` Roger Pau Monné
2021-10-25 13:02                           ` David Woodhouse
2021-10-25 14:12                             ` Roger Pau Monné
2021-10-25 12:31                     ` Jason Andryuk
2021-10-25 12:27                   ` Jason Andryuk
2021-10-25 16:46                     ` Josef Johansson
2021-10-26 21:17                       ` Josef Johansson
2021-10-27  8:45                   ` Thomas Gleixner
2021-10-27  9:50                     ` [PATCH] PCI/MSI: Move non-mask check back into low level accessors Thomas Gleixner
2021-10-27  9:54                       ` Josef Johansson
2021-10-27 12:01                         ` Josef Johansson
2021-10-27 15:29                           ` Josef Johansson
2021-11-03 23:26                             ` Thomas Gleixner
2021-11-03 23:27                               ` [PATCH v2] " Thomas Gleixner
2021-11-09 14:53                                 ` Thomas Gleixner
2021-11-10 13:31                                   ` Josef Johansson
2021-11-10 16:05                                     ` Josef Johansson
2021-11-03 23:45                             ` [PATCH] " Thomas Gleixner
2021-11-04  9:00                               ` Josef Johansson
2021-11-04 17:12                               ` Peter Zijlstra
2021-11-04 17:31                               ` Vincent Guittot
2021-11-10 20:30                               ` Josef Johansson
2021-11-10 23:13                                 ` Josef Johansson [this message]
2021-10-27  9:57                       ` David Woodhouse
2021-10-25  1:25                 ` [PATCH v2] PCI/MSI: Re-add checks for skip masking MSI-X on Xen PV Jason Andryuk
2021-10-25 19:21                   ` Josef Johansson
2021-10-27  6:24                     ` David Woodhouse
2021-10-27  8:13                       ` Josef Johansson
2021-10-27  8:26                         ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=607f86a1-9295-aa97-1765-43050be29d8b@oderland.se \
    --to=josef@oderland.se \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dwmw2@infradead.org \
    --cc=helgaas@kernel.org \
    --cc=jandryuk@gmail.com \
    --cc=jgross@suse.com \
    --cc=linux-fbdev@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjones@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).