Subject: Re: system hung up when offlining CPUs
From: YASUAKI ISHIMATSU
To: Thomas Gleixner, Kashyap Desai
Cc: Hannes Reinecke, Marc Zyngier, Christoph Hellwig, axboe@kernel.dk,
    mpe@ellerman.id.au, keith.busch@intel.com, peterz@infradead.org, LKML,
    linux-scsi@vger.kernel.org, Sumit Saxena, Shivasharan Srikanteshwara,
    yasu.isimatu@gmail.com
Date: Thu, 14 Sep 2017 12:28:11 -0400
Message-ID: <3ce6837a-9aba-0ff4-64b9-7ebca5afca13@gmail.com>
References: <20170809124213.0d9518bb@why.wild-wind.fr.eu.org>
    <20170821131809.GA17564@lst.de>
    <8e0d76cd-7cd4-3a98-12ba-815f00d4d772@gmail.com>
    <2f2ae1bc-4093-d083-6a18-96b9aaa090c9@gmail.com>
    <8cb26204cb5402824496bbb6b636e0af@mail.gmail.com>
List-ID: linux-kernel@vger.kernel.org

On 09/13/2017 09:33 AM, Thomas Gleixner wrote:
> On Wed, 13 Sep 2017, Kashyap Desai wrote:
>>> On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote:
>>>> + linux-scsi and maintainers of megasas
>
>>>>> In my server, IRQ#66-89 are sent to CPU#24-29. And if I offline
>>>>> CPU#24-29, I/O does not work, showing the following messages.
>
> ....
>
>>> This indeed looks like a problem.
>>> We're going to great lengths to submit and complete I/O on the same CPU,
>>> so if the CPU is offlined while I/O is in flight we won't be getting a
>>> completion for this particular I/O.
>>> However, the megasas driver should be able to cope with this situation;
>>> after all, the firmware maintains completion queues, so it would be dead
>>> easy to look at _other_ completion queues, too, if a timeout occurs.
>>
>> In case of an IO timeout, the megaraid_sas driver does check the other
>> queues as well. That is why the IO was completed in this case and further
>> IOs were resumed.
>>
>> The driver completes commands via the following code, executed from
>> megasas_wait_for_outstanding_fusion():
>>
>>     for (MSIxIndex = 0; MSIxIndex < count; MSIxIndex++)
>>         complete_cmd_fusion(instance, MSIxIndex);
>>
>> Because this code ran in the driver, we see only one message like the
>> following in these logs:
>>
>>     megaraid_sas 0000:02:00.0: [ 0]waiting for 2 commands to complete for scsi0
>>
>> According to the link below, CPU hotplug should take care of this: "All
>> interrupts targeted to this CPU are migrated to a new CPU."
>> https://www.kernel.org/doc/html/v4.11/core-api/cpu_hotplug.html
>>
>> BTW - we are also able to reproduce this issue locally. The reason for the
>> IO timeout is: "The IO is completed, but the corresponding interrupt did
>> not arrive on an online CPU. It was presumably missed because the CPU was
>> in the transient state of being offlined. I am not sure which component
>> should take care of this."
>>
>> Question: "What happens once __cpu_disable is called while some queued
>> interrupt still has affinity to that particular CPU?"
>> I assume those pending/queued interrupts should ideally be migrated to the
>> remaining online CPUs. They should not go unhandled if we want to avoid
>> such IO timeouts.
>
> Can you please provide the following information, before and after
> offlining the last CPU in the affinity set:
>
> # cat /proc/irq/$IRQNUM/smp_affinity_list
> # cat /proc/irq/$IRQNUM/effective_affinity
> # cat /sys/kernel/debug/irq/irqs/$IRQNUM
>
> The last one requires: CONFIG_GENERIC_IRQ_DEBUGFS=y

Here is the info for one megasas IRQ:

- Before offlining the CPUs

  /proc/irq/70/smp_affinity_list
  24-29

  /proc/irq/70/effective_affinity
  00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,3f000000

  /sys/kernel/debug/irq/irqs/70
  handler:  handle_edge_irq
  status:   0x00004000
  istate:   0x00000000
  ddepth:   0
  wdepth:   0
  dstate:   0x00609200
              IRQD_ACTIVATED
              IRQD_IRQ_STARTED
              IRQD_MOVE_PCNTXT
              IRQD_AFFINITY_SET
              IRQD_AFFINITY_MANAGED
  node:     1
  affinity: 24-29
  effectiv: 24-29
  pending:
  domain:  INTEL-IR-MSI-0-2
   hwirq:   0x100018
   chip:    IR-PCI-MSI
    flags:   0x10
               IRQCHIP_SKIP_SET_WAKE
   parent:
      domain:  INTEL-IR-0
       hwirq:   0x400000
       chip:    INTEL-IR
        flags:   0x0
       parent:
          domain:  VECTOR
           hwirq:   0x46
           chip:    APIC
            flags:   0x0

- After offlining CPU#24-29

  /proc/irq/70/smp_affinity_list
  29

  /proc/irq/70/effective_affinity
  00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,20000000

  /sys/kernel/debug/irq/irqs/70
  handler:  handle_edge_irq
  status:   0x00004000
  istate:   0x00000000
  ddepth:   1
  wdepth:   0
  dstate:   0x00a39000
              IRQD_IRQ_DISABLED
              IRQD_IRQ_MASKED
              IRQD_MOVE_PCNTXT
              IRQD_AFFINITY_SET
              IRQD_AFFINITY_MANAGED
              IRQD_MANAGED_SHUTDOWN
  node:     1
  affinity: 29
  effectiv: 29
  pending:
  domain:  INTEL-IR-MSI-0-2
   hwirq:   0x100018
   chip:    IR-PCI-MSI
    flags:   0x10
               IRQCHIP_SKIP_SET_WAKE
   parent:
      domain:  INTEL-IR-0
       hwirq:   0x400000
       chip:    INTEL-IR
        flags:   0x0
       parent:
          domain:  VECTOR
           hwirq:   0x46
           chip:    APIC
            flags:   0x0

Thanks,
Yasuaki Ishimatsu

>
> Thanks,
>
> 	tglx
>