From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Thimo E."
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Sun, 08 Sep 2013 01:37:34 +0200
Message-ID: <522BB8BE.8060800@digithi.de>
References: <51A908CA.7050604@citrix.com><51F8CB15.1070608@digithi.de>
 <51F8DD40.2090207@citrix.com><51FC37A9.9090809@digithi.de>
 <51FC418D.8020708@citrix.com><51FFBA8502000078000E9462@nat28.tlf.novell.com>
 <51FFBC08.6070804@citrix.com><52055EC9.8030207@digithi.de>
 <5208B6DE02000078000EB08E@nat28.tlf.novell.com><5208AACF.7050901@citrix.com>
 <5208CF9702000078000EB1F2@nat28.tlf.novell.com><5208B88F.1030002@citrix.com>
 <520B5345.4000508@citrix.com> <522B29BC.3010003@digithi.de>
 <522B5C1A.1050100@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
 by lists.xen.org with esmtp (Exim 4.72) id 1VIS4m-0003id-CT
 for xen-devel@lists.xenproject.org; Sat, 07 Sep 2013 23:38:00 +0000
In-Reply-To: <522B5C1A.1050100@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andrew Cooper
Cc: "Zhang, Yang Z", xen-devel, Keir Fraser, Jan Beulich
List-Id: xen-devel@lists.xenproject.org

Hello Andrew,

ok, thanks. This is what I assumed.

The output of "xl debug-keys iMQ" is empty.

[root@localhost ~]# dmesg |grep arcmsr
[ 8.159321] arcmsr 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 8.159413] arcmsr 0000:01:00.0: setting latency timer to 64
[ 8.170316] arcmsr 0000:01:00.0: get owner: 7ff0
[ 8.170414] arcmsr 0000:01:00.0: irq 1276 (276) for MSI/MSI-X
[ 8.170421] IRQ 1276/arcmsr: IRQF_DISABLED is not guaranteed on shared IRQs
[ 8.170654] arcmsr0: msi enabled

[root@localhost /]# cat /proc/irq/1276/spurious
count 61007
unhandled 8
last_unhandled 36736990 ms

arcmsr is the driver of the Areca Storage Raid Controller.
Used it already before with Xenserver 6.0.2 for years, no problems.

The messages "IRQF_DISABLED is not guaranteed...." and "8 unhandled
interrupts" look interesting.

I am not a kernel hacker, but this is how I read
http://lxr.free-electrons.com/source/kernel/irq/manage.c?v=2.6.32:

1025         if ((irqflags & (IRQF_SHARED|IRQF_DISABLED)) ==
1026                                         (IRQF_SHARED|IRQF_DISABLED)) {
1027                 pr_warning(
1028                   "IRQ %d/%s: IRQF_DISABLED is not guaranteed on shared IRQs\n",
1029                         irq, devname);
...
 738          * Force MSI interrupts to run with interrupts
 739          * disabled. The multi vector cards can cause stack
 740          * overflows due to nested interrupts when enough of
 741          * them are directed to a core and fire at the same
 742          * time.
 743          */
 744         if (desc->msi_desc)
 745                 new->flags |= IRQF_DISABLED;

--> The "IRQF_DISABLED is not guaranteed on shared IRQs" warning is only
    printed when both IRQF_SHARED and IRQF_DISABLED are set in irqflags.
--> Is the stack overflow that the comment in lines 738-742 talks about
    what we see in the kernel oops?!
--> IRQF_SHARED is set, so MSI interrupt 1276 is shared?! I thought that
    MSI interrupts could not be shared.

Attached you'll see my /proc/interrupts.

So what I will do now is disable MSI for the arcmsr driver. Could this
be the source of the problem?! But why is 1276 shared?!

Best regards
   Thimo

On 07.09.2013 19:02, Andrew Cooper wrote:
>
> irq 29 is just an internal Xen number for accounting all interrupts. It
> doesn't mean anything specific regarding hardware etc. The vector and
> affinity would expect to change as dom0's vcpus are moved around by the
> scheduler.
>
> domain-list=0 means that this interrupt is targeted at dom0 (It is a
> list because certain interrupts have to be shared by more than 1
> domain). Helpfully, the keyhandler truncates the pirq field, so 276 is
> unlikely to be correct. As it is a dom0 MSI, I am guessing it actually
> matches up with interrupt 1276 in /proc/interrupts, if there is one.
>
> Can you provide the results of `xl debug-keys iMQ`, and attach
> /proc/interrupts to this email (just in case the setup has changed after
> playing with your BIOS)
>
> ~Andrew
>
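
The two manage.c excerpts quoted above can be modelled in a few lines to see which flag combinations trigger the warning. This is only a sketch under assumptions: the flag values are taken to be those of the 2.6.32 include/linux/interrupt.h and should be checked against the actual tree, and both function names are hypothetical helpers, not kernel APIs.

```python
# Assumed flag values (2.6.32 include/linux/interrupt.h); verify in your tree.
IRQF_DISABLED = 0x00000020
IRQF_SHARED = 0x00000080


def warns_shared_disabled(irqflags: int) -> bool:
    """Model of the check at lines 1025-1026: true only if BOTH bits are set."""
    both = IRQF_SHARED | IRQF_DISABLED
    return (irqflags & both) == both


def flags_after_msi_forcing(irqflags: int, is_msi: bool) -> int:
    """Model of lines 744-745: an MSI action gets IRQF_DISABLED OR-ed in."""
    return irqflags | IRQF_DISABLED if is_msi else irqflags


if __name__ == "__main__":
    cases = {
        "no flags": 0,
        "IRQF_SHARED only": IRQF_SHARED,
        "IRQF_DISABLED only": IRQF_DISABLED,
        "IRQF_SHARED|IRQF_DISABLED": IRQF_SHARED | IRQF_DISABLED,
    }
    for name, flags in cases.items():
        forced = flags_after_msi_forcing(flags, is_msi=True)
        print(f"{name}: warns={warns_shared_disabled(flags)} "
              f"flags_if_msi={forced:#x}")
```

As far as I understand it, IRQF_SHARED is a flag the driver passes when it registers its handler, saying that it is willing to share the line. On its own it does not show that a second device is actually attached to IRQ 1276.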