From: Kashyap Desai <kashyap.desai@broadcom.com>
To: Hannes Reinecke <hare@suse.de>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Christoph Hellwig <hch@lst.de>
Cc: tglx@linutronix.de, axboe@kernel.dk, mpe@ellerman.id.au,
	keith.busch@intel.com, peterz@infradead.org,
	LKML <linux-kernel@vger.kernel.org>,
	linux-scsi@vger.kernel.org,
	Sumit Saxena <sumit.saxena@broadcom.com>,
	Shivasharan Srikanteshwara 
	<shivasharan.srikanteshwara@broadcom.com>
Subject: RE: system hung up when offlining CPUs
Date: Wed, 13 Sep 2017 17:05:53 +0530
Message-ID: <8cb26204cb5402824496bbb6b636e0af@mail.gmail.com>
In-Reply-To: <b3e88f4d-8ca4-e265-5e09-437285cb18f5@suse.de>

>
> On 09/12/2017 08:15 PM, YASUAKI ISHIMATSU wrote:
> > + linux-scsi and maintainers of megasas
> >
> > When offlining CPU, I/O stops. Do you have any ideas?
> >
> > On 09/07/2017 04:23 PM, YASUAKI ISHIMATSU wrote:
> >> Hi Marc and Christoph,
> >>
> >> Sorry for the late reply. I appreciate that you fixed the issue in the
> >> kvm environment. But the issue still occurs on a physical server.
> >>
> >> Here is the irq information for the megasas irqs, which I summarized
> >> from /proc/interrupts and /proc/irq/*/smp_affinity_list on my server:
> >>
> >> ---
> >> IRQ affinity_list IRQ_TYPE
> >>  42        0-5    IR-PCI-MSI 1048576-edge megasas
> >>  43        0-5    IR-PCI-MSI 1048577-edge megasas
> >>  44        0-5    IR-PCI-MSI 1048578-edge megasas
> >>  45        0-5    IR-PCI-MSI 1048579-edge megasas
> >>  46        0-5    IR-PCI-MSI 1048580-edge megasas
> >>  47        0-5    IR-PCI-MSI 1048581-edge megasas
> >>  48        0-5    IR-PCI-MSI 1048582-edge megasas
> >>  49        0-5    IR-PCI-MSI 1048583-edge megasas
> >>  50        0-5    IR-PCI-MSI 1048584-edge megasas
> >>  51        0-5    IR-PCI-MSI 1048585-edge megasas
> >>  52        0-5    IR-PCI-MSI 1048586-edge megasas
> >>  53        0-5    IR-PCI-MSI 1048587-edge megasas
> >>  54        0-5    IR-PCI-MSI 1048588-edge megasas
> >>  55        0-5    IR-PCI-MSI 1048589-edge megasas
> >>  56        0-5    IR-PCI-MSI 1048590-edge megasas
> >>  57        0-5    IR-PCI-MSI 1048591-edge megasas
> >>  58        0-5    IR-PCI-MSI 1048592-edge megasas
> >>  59        0-5    IR-PCI-MSI 1048593-edge megasas
> >>  60        0-5    IR-PCI-MSI 1048594-edge megasas
> >>  61        0-5    IR-PCI-MSI 1048595-edge megasas
> >>  62        0-5    IR-PCI-MSI 1048596-edge megasas
> >>  63        0-5    IR-PCI-MSI 1048597-edge megasas
> >>  64        0-5    IR-PCI-MSI 1048598-edge megasas
> >>  65        0-5    IR-PCI-MSI 1048599-edge megasas
> >>  66      24-29    IR-PCI-MSI 1048600-edge megasas
> >>  67      24-29    IR-PCI-MSI 1048601-edge megasas
> >>  68      24-29    IR-PCI-MSI 1048602-edge megasas
> >>  69      24-29    IR-PCI-MSI 1048603-edge megasas
> >>  70      24-29    IR-PCI-MSI 1048604-edge megasas
> >>  71      24-29    IR-PCI-MSI 1048605-edge megasas
> >>  72      24-29    IR-PCI-MSI 1048606-edge megasas
> >>  73      24-29    IR-PCI-MSI 1048607-edge megasas
> >>  74      24-29    IR-PCI-MSI 1048608-edge megasas
> >>  75      24-29    IR-PCI-MSI 1048609-edge megasas
> >>  76      24-29    IR-PCI-MSI 1048610-edge megasas
> >>  77      24-29    IR-PCI-MSI 1048611-edge megasas
> >>  78      24-29    IR-PCI-MSI 1048612-edge megasas
> >>  79      24-29    IR-PCI-MSI 1048613-edge megasas
> >>  80      24-29    IR-PCI-MSI 1048614-edge megasas
> >>  81      24-29    IR-PCI-MSI 1048615-edge megasas
> >>  82      24-29    IR-PCI-MSI 1048616-edge megasas
> >>  83      24-29    IR-PCI-MSI 1048617-edge megasas
> >>  84      24-29    IR-PCI-MSI 1048618-edge megasas
> >>  85      24-29    IR-PCI-MSI 1048619-edge megasas
> >>  86      24-29    IR-PCI-MSI 1048620-edge megasas
> >>  87      24-29    IR-PCI-MSI 1048621-edge megasas
> >>  88      24-29    IR-PCI-MSI 1048622-edge megasas
> >>  89      24-29    IR-PCI-MSI 1048623-edge megasas
> >> ---
> >>
> >> In my server, IRQ#66-89 are sent to CPU#24-29. And if I offline
> >> CPU#24-29, I/O does not work, showing the following messages.
> >>
> >> ---
> >> [...] sd 0:2:0:0: [sda] tag#1 task abort called for scmd(ffff8820574d7560)
> >> [...] sd 0:2:0:0: [sda] tag#1 CDB: Read(10) 28 00 0d e8 cf 78 00 00 08 00
> >> [...] sd 0:2:0:0: task abort: FAILED scmd(ffff8820574d7560)
> >> [...] sd 0:2:0:0: [sda] tag#0 task abort called for scmd(ffff882057426560)
> >> [...] sd 0:2:0:0: [sda] tag#0 CDB: Write(10) 2a 00 0d 58 37 00 00 00 08 00
> >> [...] sd 0:2:0:0: task abort: FAILED scmd(ffff882057426560)
> >> [...] sd 0:2:0:0: target reset called for scmd(ffff8820574d7560)
> >> [...] sd 0:2:0:0: [sda] tag#1 megasas: target reset FAILED!!
> >> [...] sd 0:2:0:0: [sda] tag#0 Controller reset is requested due to IO timeout
> >> [...] SCSI command pointer: (ffff882057426560)   SCSI host state: 5 SCSI
> >> [...] IO request frame:
> >> [...]
> >> <snip>
> >> [...]
> >> [...] megaraid_sas 0000:02:00.0: [ 0]waiting for 2 commands to complete for scsi0
> >> [...] INFO: task auditd:1200 blocked for more than 120 seconds.
> >> [...]       Not tainted 4.13.0+ #15
> >> [...] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> [...] auditd          D    0  1200      1 0x00000000
> >> [...] Call Trace:
> >> [...]  __schedule+0x28d/0x890
> >> [...]  schedule+0x36/0x80
> >> [...]  io_schedule+0x16/0x40
> >> [...]  wait_on_page_bit_common+0x109/0x1c0
> >> [...]  ? page_cache_tree_insert+0xf0/0xf0
> >> [...]  __filemap_fdatawait_range+0x127/0x190
> >> [...]  ? __filemap_fdatawrite_range+0xd1/0x100
> >> [...]  file_write_and_wait_range+0x60/0xb0
> >> [...]  xfs_file_fsync+0x67/0x1d0 [xfs]
> >> [...]  vfs_fsync_range+0x3d/0xb0
> >> [...]  do_fsync+0x3d/0x70
> >> [...]  SyS_fsync+0x10/0x20
> >> [...]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> >> [...] RIP: 0033:0x7f0bd9633d2d
> >> [...] RSP: 002b:00007f0bd751ed30 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
> >> [...] RAX: ffffffffffffffda RBX: 00005590566d0080 RCX: 00007f0bd9633d2d
> >> [...] RDX: 00005590566d1260 RSI: 0000000000000000 RDI: 0000000000000005
> >> [...] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000017
> >> [...] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
> >> [...] R13: 00007f0bd751f9c0 R14: 00007f0bd751f700 R15: 0000000000000000
> >> ---
> >>
> >> Thanks,
> >> Yasuaki Ishimatsu
> >>
>
> This indeed looks like a problem.
> We're going to great lengths to submit and complete I/O on the same CPU, so
> if the CPU is offlined while I/O is in flight we won't be getting a
> completion for this particular I/O.
> However, the megasas driver should be able to cope with this situation;
> after all, the firmware maintains completion queues, so it would be dead
> easy to look at _other_ completion queues, too, if a timeout occurs.
On an IO timeout, the megaraid_sas driver does check the other reply queues
as well. That is why the IO was completed in this case and further IOs
resumed.

The driver completes commands with the code below, which is executed from
megasas_wait_for_outstanding_fusion():
    for (MSIxIndex = 0 ; MSIxIndex < count; MSIxIndex++)
        complete_cmd_fusion(instance, MSIxIndex);

Because this code runs in the driver, we see only one print like the one
below in these logs:
megaraid_sas 0000:02:00.0: [ 0]waiting for 2 commands to complete for scsi0
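
To make the point above concrete, here is a minimal user-space sketch of
"walk every reply queue on timeout". It is not the driver code: struct
reply_queue, drain_queue(), drain_all_queues(), NUM_QUEUES and QUEUE_DEPTH
are all hypothetical names made up for illustration. It only shows that a
completion sitting in a queue whose interrupt never arrived is still picked
up when all queues are polled.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_QUEUES  4   /* hypothetical; a real driver derives this from its MSI-x count */
#define QUEUE_DEPTH 8   /* hypothetical capacity of one queue */

/* Hypothetical stand-in for one firmware reply (completion) queue. */
struct reply_queue {
    int pending[QUEUE_DEPTH]; /* tags of completed but not yet processed commands */
    size_t count;             /* number of valid entries in pending[] */
};

/* Process every pending completion in one queue; return how many were handled. */
static size_t drain_queue(struct reply_queue *q)
{
    size_t handled = q->count;

    assert(q->count <= QUEUE_DEPTH);
    q->count = 0; /* a real driver would complete each command here */
    return handled;
}

/* Poll all queues, regardless of which CPU each queue's interrupt is bound to. */
static size_t drain_all_queues(struct reply_queue *queues, size_t nqueues)
{
    size_t total = 0;

    for (size_t i = 0; i < nqueues; i++)
        total += drain_queue(&queues[i]);
    return total;
}
```

With two commands parked in two different queues, one drain_all_queues()
call returns 2 and a second call returns 0, which matches seeing the
"waiting for 2 commands" message only once.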

As per the link below, CPU hotplug should take care of this: "All interrupts
targeted to this CPU are migrated to a new CPU"
https://www.kernel.org/doc/html/v4.11/core-api/cpu_hotplug.html

BTW, we are also able to reproduce this issue locally. The reason for the IO
timeout is: "The IO is completed, but the corresponding interrupt did not
arrive on an online CPU. It may have been missed because the CPU was in the
transient state of being offlined. I am not sure which component should take
care of this."

Question: "What happens once __cpu_disable is called and some of the queued
interrupts have affinity to that particular CPU?"
I assume that, ideally, those pending/queued interrupts should be migrated to
the remaining online CPUs. They should not be left unhandled if we want to
avoid such IO timeouts.
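
As a rough illustration of the migration I have in mind, here is a toy
user-space model. It is only a sketch under assumptions, not how the kernel
implements it: struct toy_irq, first_cpu() and toy_migrate_irqs() are
hypothetical names, CPUs are limited to 0-63 so that a mask fits in one
uint64_t, and the fallback to "any online CPU" when the affinity mask has no
online CPU left is an assumption of this model.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_CPU (-1)

/* Hypothetical descriptor: allowed target CPUs and the current target. */
struct toy_irq {
    uint64_t affinity; /* bit n set => CPU n is an allowed target */
    int target_cpu;    /* CPU currently receiving this IRQ, or NO_CPU */
};

/* Index of the lowest set bit in mask, or NO_CPU if the mask is empty. */
static int first_cpu(uint64_t mask)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (mask & (UINT64_C(1) << cpu))
            return cpu;
    return NO_CPU;
}

/*
 * Mark dying_cpu (0-63) offline and retarget every IRQ that points at it.
 * Prefer an online CPU inside the IRQ's affinity mask; if none is left,
 * fall back to any online CPU (assumption of this toy model).
 * Returns the number of IRQs that were moved.
 */
static size_t toy_migrate_irqs(struct toy_irq *irqs, size_t n,
                               int dying_cpu, uint64_t *online)
{
    size_t moved = 0;

    *online &= ~(UINT64_C(1) << dying_cpu);
    for (size_t i = 0; i < n; i++) {
        if (irqs[i].target_cpu != dying_cpu)
            continue;
        int cpu = first_cpu(irqs[i].affinity & *online);
        if (cpu == NO_CPU)
            cpu = first_cpu(*online);
        irqs[i].target_cpu = cpu;
        moved++;
    }
    return moved;
}
```

In this model an IRQ with affinity 24-29 that targets CPU#24 moves to CPU#25
when CPU#24 goes offline, and ends up on a CPU outside its mask once
CPU#24-29 are all offline. The open question is what happens to an interrupt
that is already pending on the dying CPU at that moment.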

Kashyap

> Also the IRQ affinity looks bogus (we should spread IRQs to _all_ CPUs, not
> just a subset), and the driver should make sure to receive completions even
> if the respective CPUs are offlined.
> Alternatively it should not try to submit a command abort via an offlined
> CPU; that's guaranteed to run into the same problems.
>
> So it looks more like a driver issue to me...
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke		   Teamlead Storage & Networking
> hare@suse.de			               +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284
> (AG Nürnberg)
