From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Tue, 13 Aug 2013 01:43:48 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A8E1E39@SHSMSX104.ccr.corp.intel.com>
References: <51A908CA.7050604@citrix.com><51F8CB15.1070608@digithi.de><51F8DD40.2090207@citrix.com><51FC37A9.9090809@digithi.de><51FC418D.8020708@citrix.com><51FFBA8502000078000E9462@nat28.tlf.novell.com><51FFBC08.6070804@citrix.com>
	<52055EC9.8030207@digithi.de><520561E1.8020809@citrix.com>
	<520562C8.8080703@citrix.com> <5207CE0C.1000502@digithi.de>
	<A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com>
	<5208E933.1020609@digithi.de> <5208EBEC.9000308@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5208EBEC.9000308@citrix.com>
Content-Language: en-US
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper <andrew.cooper3@citrix.com>, Thimo E <abc@digithi.de>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>, "Dong,
	Eddie" <eddie.dong@intel.com>, Xen-develList <xen-devel@lists.xen.org>, "Nakajima,
	Jun" <jun.nakajima@intel.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>
List-Id: xen-devel@lists.xenproject.org

Andrew Cooper wrote on 2013-08-12:
> On 12/08/13 14:54, Thimo E wrote:
> 
> 
> 	Hello Yang,
> 
> 	Attached is the next crash dump, which occurred today, only a few
> minutes after I created the log files I sent in my previous mail.
> 	Perhaps together with the log files from that mail it gives you a
> better understanding of what is going on.
> 
> 	I've disabled interrupt remapping now.
> 
> 	> 4.....
> 	> Can you add some debug messages in the guest EOI code path (like
> _irq_guest_eoi()) to track the EOI?
> 	@Andrew: Is it possible for you to integrate the changes Yang
> requested into your Xen debugging version?
> 
> 
> 
> I already have.  That would be "Marked {foo} ready" debugging in the
> PEOI stack section.
I couldn't find your debug patch that adds PEOI stack tracing. Could you resend it? Thanks.

> 
> ~Andrew
> 
> 
> 
> 
> 	Best regards
> 	  Thimo
> 
> 	Am 12.08.2013 10:49, schrieb Zhang, Yang Z:
> 
> 
> 		Hi Thimo,
> 
> 		Your previous experience and logs show the following:
> 
> 		1.       The interrupt that triggers the issue is an MSI.
> 
> 		2.       MSIs are normally treated as edge-triggered interrupts,
> except when there is no way to mask the device. In this case, your
> previous log indicates the device is unmaskable. (What special device
> are you using? A modern PCI device should be maskable.)
> 
> 		3.       IRQ 29 belongs to dom0, so this does not seem to be an
> HVM-related issue.
> 
> 		4.       The status of IRQ 29 is 10, which means the guest has
> already issued the EOI because the IRQ_GUEST_EOI_PENDING bit is cleared,
> so there should be no pending EOI in the EOI stack. If possible, can you
> add some debug messages in the guest EOI code path (like
> _irq_guest_eoi()) to track the EOI?
> 
> 		5.       Both logs show that when the issue occurred, most of the
> other interrupts owned by dom0 were in IRQ_MOVE_PENDING status.
> Is that a coincidence? Or does it happen only under special conditions,
> such as heavy IRQ migration? Perhaps you can disable irq balance in dom0
> and pin the IRQs manually.
> 
> 		6.       I guess interrupt remapping is enabled on your machine.
> Can you try disabling IR to see whether the issue is still reproducible?
> 
> 		Also, please provide the whole Xen log.
> 
> 
> 
> 		Best regards,
> 
> 		Yang
> 
> 
>


Best regards,
Yang