From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753184AbbDHDew (ORCPT ); Tue, 7 Apr 2015 23:34:52 -0400
Received: from mx1.redhat.com ([209.132.183.28]:39285 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751122AbbDHDeu (ORCPT ); Tue, 7 Apr 2015 23:34:50 -0400
Date: Wed, 8 Apr 2015 11:33:51 +0800
From: Dave Young 
To: "Li, ZhenHua" 
Cc: Baoquan He , dwmw2@infradead.org, indou.takao@jp.fujitsu.com,
	joro@8bytes.org, vgoyal@redhat.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kexec@lists.infradead.org, alex.williamson@redhat.com,
	ddutile@redhat.com, ishii.hironobu@jp.fujitsu.com, bhelgaas@google.com,
	doug.hatch@hp.com, jerry.hoemann@hp.com, tom.vaden@hp.com,
	li.zhang6@hp.com, lisa.mitchell@hp.com, billsumnerlinux@gmail.com,
	rwright@hp.com
Subject: Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Message-ID: <20150408033351.GI7213@localhost.localdomain>
References: <1426743388-26908-1-git-send-email-zhen-hual@hp.com>
	<20150403084031.GF22579@dhcp-128-53.nay.redhat.com>
	<551E56F6.60503@hp.com>
	<20150403092111.GG22579@dhcp-128-53.nay.redhat.com>
	<20150405015453.GB1562@dhcp-17-102.nay.redhat.com>
	<20150407034622.GB7213@localhost.localdomain>
	<20150407090837.GE7213@localhost.localdomain>
	<5523A977.1030707@hp.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5523A977.1030707@hp.com>
User-Agent: Mutt/1.5.22.1-rc1 (2013-10-16)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/07/15 at 05:55pm, Li, ZhenHua wrote:
> On 04/07/2015 05:08 PM, Dave Young wrote:
> >On 04/07/15 at 11:46am, Dave Young wrote:
> >>On 04/05/15 at 09:54am, Baoquan He wrote:
> >>>On 04/03/15 at 05:21pm, Dave Young wrote:
> >>>>On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> >>>>>Hi Dave,
> >>>>>
> >>>>>There is some possibility that the old iommu data was corrupted by
> >>>>>some other module. Currently we do not have a better solution for the
> >>>>>dmar faults.
> >>>>>
> >>>>>But I think when this happens, we need to fix the module that
> >>>>>corrupted the old iommu data. I once met a similar problem in the
> >>>>>normal kernel: the queue used by the qi_* functions was overwritten
> >>>>>by another module. The fix was in that module, not in the iommu
> >>>>>module.
> >>>>
> >>>>By then it is too late; there will be no chance to save the vmcore.
> >>>>
> >>>>Also, if using the old iommu tables makes it possible to keep
> >>>>corrupting other areas of oldmem, that will cause even more problems.
> >>>>
> >>>>So I think the tables at least need some verification before being
> >>>>used.
> >>>>
> >>>
> >>>Yes, it's good thinking, and verification is also an interesting idea.
> >>>kexec/kdump does a sha256 calculation on the loaded kernel and then
> >>>verifies it again in purgatory when a panic happens. This checks
> >>>whether any code has stomped into the region reserved for kexec and
> >>>corrupted the loaded kernel.
> >>>
> >>>If we decide to do this, it should be an enhancement to the current
> >>>patchset, not an approach change. Since this patchset is getting very
> >>>close to what the maintainers expected, maybe it can be merged first
> >>>and the enhancement considered afterwards. After all, without this
> >>>patchset vt-d often raises error messages and hangs.
> >>
> >>That does not convince me; we should do it right from the beginning
> >>instead of introducing something wrong.
> >>
> >>I wonder why the old dma cannot be remapped to a specific page in the
> >>kdump kernel so that it will not corrupt more memory. But I may have
> >>missed something; I will look through the old threads and catch up.
> >
> >I have read the old discussion; the above approach was dropped because
> >it could corrupt the filesystem. Apologies for the late comment.
> >
> >But the current solution sounds bad to me because it uses old memory,
> >which is not reliable.
> >
> >Thanks
> >Dave
> >
> It seems we do not have a better solution for the dmar faults. But I
> believe we can find out how to verify the iommu data which is located in
> old memory.

That will be great, thanks.

So there are two things:

1) Make sure the old page tables are right; this is what we were talking
   about.
2) Avoid writing to old memory. I suppose only a DMA read could corrupt
   the filesystem, right? So how about creating a scratch page in the
   second kernel's memory for any DMA writes, and only using the old page
   tables for DMA reads?

Thanks
Dave
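To make point 1 concrete, here is a minimal C sketch of the seal-then-verify pattern Baoquan compares to the kexec/kdump sha256 check, as it might apply to an old IOMMU table page. Everything here is hypothetical: the digest is a stand-in (FNV-1a instead of sha256), and `struct old_table_page`, `seal_page`, and `page_still_intact` are illustrative names, not the intel-iommu API. The open question the thread leaves unanswered also remains open here: the first kernel would have to record the trusted digest somewhere the kdump kernel can find it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in digest (FNV-1a). The kexec/kdump check described above uses
 * sha256; a real implementation would too, it is just too long to inline. */
static uint64_t digest(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint64_t h = 0xcbf29ce484222325ULL;

	while (len--) {
		h ^= *p++;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Hypothetical record for one old IOMMU table page mapped from oldmem. */
struct old_table_page {
	const void *va;		/* mapping of the page in the kdump kernel */
	size_t len;		/* normally 4096 */
	uint64_t expected;	/* digest recorded while the page was trusted */
};

/* Seal the page at the moment it is still believed intact. */
static void seal_page(struct old_table_page *pg)
{
	pg->expected = digest(pg->va, pg->len);
}

/* Re-verify just before pointing the hardware at the page again. */
static bool page_still_intact(const struct old_table_page *pg)
{
	return digest(pg->va, pg->len) == pg->expected;
}
```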
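For point 2, a hedged sketch of what importing one old VT-d leaf entry might look like if writable mappings were redirected to a scratch page owned by the kdump kernel. The PTE layout is deliberately simplified, and every name (`vtd_pte`, `pfn_is_in_old_ram`, `import_old_pte`) is illustrative rather than the real intel-iommu implementation; real code would walk the old root, context, and page tables through oldmem mappings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a VT-d second-level PTE: bit 0 = read,
 * bit 1 = write, bits 12+ = page frame. Real layouts are more involved. */
struct vtd_pte {
	uint64_t val;
};

#define VTD_PTE_READ	(1ULL << 0)
#define VTD_PTE_WRITE	(1ULL << 1)
#define VTD_PFN_MASK	(~0xfffULL)

/* Hypothetical helper: is the target frame plausible first-kernel RAM?
 * Real code would consult the first kernel's memory map from the ELF
 * core headers; this illustration just caps the frame number. */
static bool pfn_is_in_old_ram(uint64_t pfn)
{
	return pfn < 0x100000;	/* first 4 GiB, illustration only */
}

/*
 * Copy one old PTE into a fresh table. Readable mappings are kept so
 * in-flight DMA reads still see the data the device expects; writable
 * mappings are redirected to a scratch page so a stale write can no
 * longer land in old memory (point 2 in the mail above). Note a page
 * mapped both readable and writable would lose its read data here;
 * how to handle that split is left open in the thread.
 */
static void import_old_pte(struct vtd_pte *new_pte,
			   const struct vtd_pte *old_pte,
			   uint64_t scratch_pfn)
{
	uint64_t pfn = (old_pte->val & VTD_PFN_MASK) >> 12;

	if ((old_pte->val & (VTD_PTE_READ | VTD_PTE_WRITE)) &&
	    !pfn_is_in_old_ram(pfn)) {
		new_pte->val = 0;	/* drop implausible entries: a DMAR
					 * fault beats silent corruption */
		return;
	}

	new_pte->val = old_pte->val;
	if (new_pte->val & VTD_PTE_WRITE) {
		new_pte->val &= ~VTD_PFN_MASK;
		new_pte->val |= scratch_pfn << 12;	/* writes hit scratch */
	}
}
```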