From: "Zhang, Yang Z"
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Tue, 13 Aug 2013 01:43:48 +0000
References: <51A908CA.7050604@citrix.com> <51F8CB15.1070608@digithi.de>
 <51F8DD40.2090207@citrix.com> <51FC37A9.9090809@digithi.de>
 <51FC418D.8020708@citrix.com> <51FFBA8502000078000E9462@nat28.tlf.novell.com>
 <51FFBC08.6070804@citrix.com> <52055EC9.8030207@digithi.de>
 <520561E1.8020809@citrix.com> <520562C8.8080703@citrix.com>
 <5207CE0C.1000502@digithi.de> <5208E933.1020609@digithi.de>
 <5208EBEC.9000308@citrix.com>
In-Reply-To: <5208EBEC.9000308@citrix.com>
To: Andrew Cooper, Thimo E
Cc: Keir Fraser, Jan Beulich, "Dong, Eddie", Xen-develList,
 "Nakajima, Jun", "Zhang, Xiantao"
List-Id: xen-devel@lists.xenproject.org

Andrew Cooper wrote on 2013-08-12:
> On 12/08/13 14:54, Thimo E wrote:
>> Hello Yang,
>>
>> attached is the next crash dump, which occurred today, only a few
>> minutes after I created the logfiles I sent in the previous mail.
>> Perhaps together with the logfiles from that mail it gives you a
>> better understanding of what is going on.
>>
>> I've disabled interrupt remapping now.
>>
>>> 4. ...
>>> can you add some debug message in the guest EOI code path (like
>>> _irq_guest_eoi()) to track the EOI?
>>
>> @Andrew: Is it possible for you to integrate the requested changes
>> from Yang into your Xen debugging version?
>
> I already have. That would be the "Marked {foo} ready" debugging in
> the PEOI stack section.

I didn't find your debug patch that adds PEOI stack tracing. Could you
resend it? Thanks.

> ~Andrew
>
>> Best regards
>> Thimo
>>
>> On 12.08.2013 10:49, Zhang, Yang Z wrote:
>>> Hi Thimo,
>>>
>>> Your previous experience and logs show:
>>>
>>> 1. The interrupt that triggers the issue is an MSI.
>>>
>>> 2. MSIs are normally treated as edge-triggered interrupts, except
>>>    when there is no way to mask the device. In this case, your
>>>    previous log indicates the device is unmaskable. (What special
>>>    device are you using? A modern PCI device should be maskable.)
>>>
>>> 3. IRQ 29 belongs to dom0, so this does not seem to be an HVM
>>>    related issue.
>>>
>>> 4. The status of IRQ 29 is 10, which means the guest has already
>>>    issued the EOI, because the IRQ_GUEST_EOI_PENDING bit is cleared.
>>>    So there should be no pending EOI in the EOI stack. If possible,
>>>    can you add some debug message in the guest EOI code path (like
>>>    _irq_guest_eoi()) to track the EOI?
>>>
>>> 5. Both logs show that when the issue occurred, most of the other
>>>    interrupts owned by dom0 were in IRQ_MOVE_PENDING status. Is it a
>>>    coincidence? Or does it happen only under a special condition
>>>    such as heavy IRQ migration? Perhaps you can disable irq balance
>>>    in dom0 and pin the IRQ manually.
>>>
>>> 6. I guess interrupt remapping is enabled on your machine. Can you
>>>    try disabling IR to see whether the issue is still reproducible?
>>>
>>> Also, please provide the whole Xen log.
>>>
>>> Best regards,
>>> Yang

Best regards,
Yang
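
For reference, the status reading in point 4 of the quoted analysis can be sketched as a small standalone decoder. This is only an illustration, not Xen code: the bit positions below are an assumption based on the layout in xen/include/xen/irq.h of that era and should be checked against the tree actually in use, the status value is assumed to be printed in hexadecimal, and decode_irq_status() and eoi_still_pending() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* ASSUMED bit layout (verify against xen/include/xen/irq.h in your tree). */
#define IRQ_INPROGRESS        (1u << 0)
#define IRQ_DISABLED          (1u << 1)
#define IRQ_PENDING           (1u << 2)
#define IRQ_REPLAY            (1u << 3)
#define IRQ_GUEST             (1u << 4)
#define IRQ_MOVE_PENDING      (1u << 5)
#define IRQ_PER_CPU           (1u << 6)
#define IRQ_GUEST_EOI_PENDING (1u << 7)

struct flag_name {
    unsigned int mask;
    const char *name;
};

static const struct flag_name irq_flags[] = {
    { IRQ_INPROGRESS,        "IRQ_INPROGRESS" },
    { IRQ_DISABLED,          "IRQ_DISABLED" },
    { IRQ_PENDING,           "IRQ_PENDING" },
    { IRQ_REPLAY,            "IRQ_REPLAY" },
    { IRQ_GUEST,             "IRQ_GUEST" },
    { IRQ_MOVE_PENDING,      "IRQ_MOVE_PENDING" },
    { IRQ_PER_CPU,           "IRQ_PER_CPU" },
    { IRQ_GUEST_EOI_PENDING, "IRQ_GUEST_EOI_PENDING" },
};

/* Hypothetical helper: write the set flags as "A|B|C" into buf, return buf. */
static const char *decode_irq_status(unsigned int status, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(irq_flags) / sizeof(irq_flags[0]); i++) {
        if (!(status & irq_flags[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "|" : "", irq_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated, stop appending */
        used += (size_t)n;
    }
    return buf;
}

/* Hypothetical helper: non-zero while Xen still waits for the guest's EOI. */
static int eoi_still_pending(unsigned int status)
{
    return (status & IRQ_GUEST_EOI_PENDING) != 0;
}
```

Under these assumptions, a status of 0x10 decodes to IRQ_GUEST alone and eoi_still_pending(0x10) is false, which matches the reading that the guest has already issued its EOI. A status of 0x30 would additionally carry IRQ_MOVE_PENDING, the state noted in point 5 for the other dom0 interrupts.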