All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Dave Young <dyoung@redhat.com>
Cc: "Li, ZhenHua" <zhen-hual@hp.com>,
	dwmw2@infradead.org, indou.takao@jp.fujitsu.com, joro@8bytes.org,
	vgoyal@redhat.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kexec@lists.infradead.org, alex.williamson@redhat.com,
	ddutile@redhat.com, ishii.hironobu@jp.fujitsu.com,
	bhelgaas@google.com, doug.hatch@hp.com, jerry.hoemann@hp.com,
	tom.vaden@hp.com, li.zhang6@hp.com, lisa.mitchell@hp.com,
	billsumnerlinux@gmail.com, rwright@hp.com
Subject: Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Date: Fri, 24 Apr 2015 16:01:47 +0800	[thread overview]
Message-ID: <20150424080147.GC4458@dhcp-16-116.nay.redhat.com> (raw)
In-Reply-To: <20150415064803.GF19051@localhost.localdomain>

On 04/15/15 at 02:48pm, Dave Young wrote:
> On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> > On 04/15/2015 08:57 AM, Dave Young wrote:
> > >Again, I think it is bad to use old page table, below issues need consider:
> > >1) make sure old page table are reliable across crash
> > >2) do not allow writing oldmem after crash
> > >
> > >Please correct me if I'm wrong, or if above is not doable I think I will vote for
> > >resetting pci bus.
> > >
> > >Thanks
> > >Dave
> > >
> > Hi Dave,
> > 
> > When updating the context tables, we have to write their address to root
> > tables, this will cause writing to old mem.
> > 
> > Resetting the pci bus has been discussed, please check this:
> > http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> > https://lkml.org/lkml/2014/10/21/890

I support this patchset.

We should not fear oldmem since reserved crashkernel region is similar.
No one can guarantee that any crazy code won't step into crashkernel
region just because 1st kernel says it's reversed for kdump kernel. Here
the root table and context tables are also not built to allow legal code
to danamge. Both of them has the risk to be corrupted, for trying our
best to get a dumped vmcore the risk is worth being taken.

And the resetting pci way has been NACKed by David Woodhouse, the
maintainer of intel iommu. Because the place calling the resetting pci
code is ugly before kdump kernel or in kdump kernel. And as he said a
certain device made mistakes why we blame on all devices. We should fix
that device who made mistakes. 

As for me, periodically poked by customers to ask how iommu fix is
going, I really think this patchset is good enough. Aren't we going to
do thing just because there's a risk with tiny possibility or not perfect
enough. I think people won't agree. Otherwise kdump could have been
killed when author proposed it since crashkernel reserved region is
risky and could be corrupted by 1st kernel.

Anyway, let's comprimise a little. At worst it can be reverted if it's
not satisfactory.

Personal opinion.

By the way, I tested it and it works well on my HP z420 workstation.

Thanks
Baoquan


> 
> I know one reason to use old pgtable is this looks better because it fixes the
> real problem, but it is not a good way if it introduce more problems because of
> it have to use oldmem. I will be glad if this is not a problem but I have not
> been convinced.
> 
> OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
> their own fixes, so it looks not that elegant.
> 
> For pci reset, it is not perfect, but it has another advantage, the patch is
> simpler. The problem I see from the old discusssion is, reset bus in 2nd kernel
> is acceptable but it does not fix things on sparc platform. AFAIK current reported
> problems are intel and amd iommu, at least pci reset stuff does not make it worse.
> 
> Thanks
> Dave

WARNING: multiple messages have this Message-ID (diff)
From: Baoquan He <bhe@redhat.com>
To: Dave Young <dyoung@redhat.com>
Cc: alex.williamson@redhat.com, indou.takao@jp.fujitsu.com,
	tom.vaden@hp.com, rwright@hp.com, linux-pci@vger.kernel.org,
	joro@8bytes.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, lisa.mitchell@hp.com,
	jerry.hoemann@hp.com, iommu@lists.linux-foundation.org, "Li,
	ZhenHua" <zhen-hual@hp.com>,
	ddutile@redhat.com, doug.hatch@hp.com,
	ishii.hironobu@jp.fujitsu.com, bhelgaas@google.com,
	billsumnerlinux@gmail.com, li.zhang6@hp.com, dwmw2@infradead.org,
	vgoyal@redhat.com
Subject: Re: [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Date: Fri, 24 Apr 2015 16:01:47 +0800	[thread overview]
Message-ID: <20150424080147.GC4458@dhcp-16-116.nay.redhat.com> (raw)
In-Reply-To: <20150415064803.GF19051@localhost.localdomain>

On 04/15/15 at 02:48pm, Dave Young wrote:
> On 04/15/15 at 01:47pm, Li, ZhenHua wrote:
> > On 04/15/2015 08:57 AM, Dave Young wrote:
> > >Again, I think it is bad to use old page table, below issues need consider:
> > >1) make sure old page table are reliable across crash
> > >2) do not allow writing oldmem after crash
> > >
> > >Please correct me if I'm wrong, or if above is not doable I think I will vote for
> > >resetting pci bus.
> > >
> > >Thanks
> > >Dave
> > >
> > Hi Dave,
> > 
> > When updating the context tables, we have to write their address to root
> > tables, this will cause writing to old mem.
> > 
> > Resetting the pci bus has been discussed, please check this:
> > http://lists.infradead.org/pipermail/kexec/2014-October/012752.html
> > https://lkml.org/lkml/2014/10/21/890

I support this patchset.

We should not fear oldmem since reserved crashkernel region is similar.
No one can guarantee that any crazy code won't step into crashkernel
region just because 1st kernel says it's reversed for kdump kernel. Here
the root table and context tables are also not built to allow legal code
to danamge. Both of them has the risk to be corrupted, for trying our
best to get a dumped vmcore the risk is worth being taken.

And the resetting pci way has been NACKed by David Woodhouse, the
maintainer of intel iommu. Because the place calling the resetting pci
code is ugly before kdump kernel or in kdump kernel. And as he said a
certain device made mistakes why we blame on all devices. We should fix
that device who made mistakes. 

As for me, periodically poked by customers to ask how iommu fix is
going, I really think this patchset is good enough. Aren't we going to
do thing just because there's a risk with tiny possibility or not perfect
enough. I think people won't agree. Otherwise kdump could have been
killed when author proposed it since crashkernel reserved region is
risky and could be corrupted by 1st kernel.

Anyway, let's comprimise a little. At worst it can be reverted if it's
not satisfactory.

Personal opinion.

By the way, I tested it and it works well on my HP z420 workstation.

Thanks
Baoquan


> 
> I know one reason to use old pgtable is this looks better because it fixes the
> real problem, but it is not a good way if it introduce more problems because of
> it have to use oldmem. I will be glad if this is not a problem but I have not
> been convinced.
> 
> OTOH, there's many types of iommu, intel, amd, a lot of other types. They need
> their own fixes, so it looks not that elegant.
> 
> For pci reset, it is not perfect, but it has another advantage, the patch is
> simpler. The problem I see from the old discusssion is, reset bus in 2nd kernel
> is acceptable but it does not fix things on sparc platform. AFAIK current reported
> problems are intel and amd iommu, at least pci reset stuff does not make it worse.
> 
> Thanks
> Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  parent reply	other threads:[~2015-04-24  8:02 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-10  8:42 [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel Li, Zhen-Hua
2015-04-10  8:42 ` Li, Zhen-Hua
2015-04-10  8:42 ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 01/10] iommu/vt-d: New function to attach domain with id Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 02/10] iommu/vt-d: Items required for kdump Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 03/10] iommu/vt-d: Function to get old context entry Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 04/10] iommu/vt-d: functions to copy data from old mem Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-05-07  7:49   ` Baoquan He
2015-05-07  7:49     ` Baoquan He
2015-05-07  8:33     ` Li, ZhenHua
2015-05-07  8:33       ` Li, ZhenHua
2015-05-07  8:33       ` Li, ZhenHua
2015-04-10  8:42 ` [PATCH v10 05/10] iommu/vt-d: Add functions to load and save old re Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 06/10] iommu/vt-d: datatypes and functions used for kdump Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 07/10] iommu/vt-d: enable kdump support in iommu module Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 08/10] iommu/vt-d: assign new page table for dma_map Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 09/10] iommu/vt-d: Copy functions for irte Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42 ` [PATCH v10 10/10] iommu/vt-d: Use old irte in kdump kernel Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-10  8:42   ` Li, Zhen-Hua
2015-04-15  0:57 ` [PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults " Dave Young
2015-04-15  5:47   ` Li, ZhenHua
2015-04-15  5:47     ` Li, ZhenHua
2015-04-15  5:47     ` Li, ZhenHua
2015-04-15  6:48     ` Dave Young
2015-04-15  6:48       ` Dave Young
2015-04-21  1:39       ` Li, ZhenHua
2015-04-21  1:39         ` Li, ZhenHua
2015-04-21  2:53         ` Dave Young
2015-04-21  2:53           ` Dave Young
2015-04-21  2:53           ` Dave Young
2015-04-24  8:01       ` Baoquan He [this message]
2015-04-24  8:01         ` Baoquan He
2015-04-24  8:25         ` Dave Young
2015-04-24  8:25           ` Dave Young
2015-04-24  8:35           ` Baoquan He
2015-04-24  8:35             ` Baoquan He
2015-04-24  8:49             ` Dave Young
2015-04-24  8:49               ` Dave Young
2015-04-28  8:54               ` Baoquan He
2015-04-28  8:54                 ` Baoquan He
2015-04-28  9:00                 ` Li, ZhenHua
2015-04-28  9:00                   ` Li, ZhenHua
2015-05-04 16:23               ` Joerg Roedel
2015-05-04 16:23                 ` Joerg Roedel
2015-05-05  6:14                 ` Dave Young
2015-05-05  6:14                   ` Dave Young
2015-05-05 15:31                   ` Joerg Roedel
2015-05-05 15:31                     ` Joerg Roedel
2015-05-06  1:51                     ` Dave Young
2015-05-06  1:51                       ` Dave Young
2015-05-06  1:51                       ` Dave Young
2015-05-06  2:37                       ` Li, ZhenHua
2015-05-06  2:37                         ` Li, ZhenHua
2015-05-06  2:37                         ` Li, ZhenHua
2015-05-06  8:25                       ` Joerg Roedel
2015-05-06  8:25                         ` Joerg Roedel
2015-04-23  8:35 ` Li, ZhenHua
2015-04-23  8:35   ` Li, ZhenHua
2015-04-23  8:35   ` Li, ZhenHua
2015-04-23  8:38   ` Li, ZhenHua
2015-04-23  8:38     ` Li, ZhenHua
2015-04-23  8:38     ` Li, ZhenHua
2015-04-29 11:20 ` Baoquan He
2015-04-29 11:20   ` Baoquan He
2015-04-29 11:20   ` Baoquan He
2015-05-03  8:55   ` Baoquan He
2015-05-03  8:55     ` Baoquan He
2015-05-03  8:55     ` Baoquan He
2015-05-04  3:06     ` Li, ZhenHua
2015-05-04  3:06       ` Li, ZhenHua
2015-05-04  3:06       ` Li, ZhenHua
2015-05-04  3:17       ` Baoquan He
2015-05-04  3:17         ` Baoquan He
2015-05-07 17:32         ` Joerg Roedel
2015-05-07 17:32           ` Joerg Roedel
2015-05-08  1:00           ` Li, ZhenHua
2015-05-08  1:00             ` Li, ZhenHua
2015-05-08  1:00             ` Li, ZhenHua
2015-06-11 15:40 ` David Woodhouse
2015-06-11 15:40   ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150424080147.GC4458@dhcp-16-116.nay.redhat.com \
    --to=bhe@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=billsumnerlinux@gmail.com \
    --cc=ddutile@redhat.com \
    --cc=doug.hatch@hp.com \
    --cc=dwmw2@infradead.org \
    --cc=dyoung@redhat.com \
    --cc=indou.takao@jp.fujitsu.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=ishii.hironobu@jp.fujitsu.com \
    --cc=jerry.hoemann@hp.com \
    --cc=joro@8bytes.org \
    --cc=kexec@lists.infradead.org \
    --cc=li.zhang6@hp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lisa.mitchell@hp.com \
    --cc=rwright@hp.com \
    --cc=tom.vaden@hp.com \
    --cc=vgoyal@redhat.com \
    --cc=zhen-hual@hp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.