From: Takao Indoh <indou.takao@jp.fujitsu.com>
To: trenn@suse.de
Cc: yinghai@kernel.org, muneda.takahiro@jp.fujitsu.com,
    linux-pci@vger.kernel.org, x86@kernel.org,
    linux-kernel@vger.kernel.org, andi@firstfloor.org,
    tokunaga.keiich@jp.fujitsu.com, kexec@lists.infradead.org,
    hbabu@us.ibm.com, mingo@redhat.com, ddutile@redhat.com,
    vgoyal@redhat.com, ishii.hironobu@jp.fujitsu.com, hpa@zytor.com,
    bhelgaas@google.com, tglx@linutronix.de, khalid@gonehiking.org
Subject: Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu
Date: Mon, 04 Mar 2013 09:56:45 +0900
Message-ID: <5133F14D.4060906@jp.fujitsu.com>
In-Reply-To: <1508535.gXZDAVy6sT@hammer82.arch.suse.de>

(2013/01/23 9:47), Thomas Renninger wrote:
> On Monday, January 21, 2013 10:11:04 AM Takao Indoh wrote:
>> (2013/01/08 4:09), Thomas Renninger wrote:
> ...
>>> I tried the provided patches first on 2.6.32, then I verified with 3.8-rc2
>>> and in both cases the disk is not detected anymore in
>>> reset_devices (kexec'ed/kdump) case (but things work fine without these
>>> patches).
>>
>> So the problem that the disk is not detected was caused by exactmap
>> problem you guys are discussing? Or still not detected even if exactmap
>> problem is fixed?
> This problem is related to the 5 PCI resetting patches.
> Dumping worked with a 2.6.32 and a 3.8-rc2 kernel, adding the PCI resetting
> patches broke both. I first tried 2.6.32 and verified with 3.8-rc2 to make sure
> I didn't mess up the backport adjustments of the patches to 2.6.32.
>
> Unfortunately this Dell platform takes really long to boot.
> I can give it the one or other test, but please do not bomb me with patches.
>
> For info:
> About the interrupt remapping error interrupt storm in kdump case I tried to
> reproduce on this machine, but never could: The guys who saw that also cannot
> reproduce this anymore.
>
> Two ideas I had about this:
> - As said already, (also) try to catch the error case and try to reset the
>   device in AER/Specific interrupt remapping error interrupt caught.

I tried this idea, but it did not work on megaraid_sas. I made an
experimental patch so that a device is reset when a DMAR error is
detected on it. What happened is:

1) The megaraid_sas module is loaded.
2) A DMAR error is detected during the driver initialization.
3) The device is reset.
4) kdump fails because the disk is not found.

When I tested the patches which reset all devices at early boot time,
the disk was recognized correctly, so it seems that resetting a device
while its driver is loading breaks something. I think we need to reset
the device at least before its driver is loaded.

Thanks,
Takao Indoh

> - Have a look at coreboot, these guys should know how to initialize the PCI
>   subsystem from scratch and might have some well tested PCI resetting
>   code in place already (no idea, just a thought).
>
> Thomas