From: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: tglx@linutronix.de, axboe@kernel.dk, mpe@ellerman.id.au,
	keith.busch@intel.com, peterz@infradead.org,
	LKML <linux-kernel@vger.kernel.org>,
	yasu.isimatu@gmail.com
Subject: Re: system hung up when offlining CPUs
Date: Wed, 9 Aug 2017 15:09:33 -0400
Message-ID: <cd524af7-1f20-1956-1e44-92a451053387@gmail.com>
In-Reply-To: <20170809124213.0d9518bb@why.wild-wind.fr.eu.org>

Hi Marc,

On 08/09/2017 07:42 AM, Marc Zyngier wrote:
> On Tue, 8 Aug 2017 15:25:35 -0400
> YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> wrote:
> 
>> Hi Thomas,
>>
>> When offlining all CPUs except cpu0, the system hung up with the following message.
>>
>> [...] INFO: task kworker/u384:1:1234 blocked for more than 120 seconds.
>> [...]       Not tainted 4.12.0-rc6+ #19
>> [...] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [...] kworker/u384:1  D    0  1234      2 0x00000000
>> [...] Workqueue: writeback wb_workfn (flush-253:0)
>> [...] Call Trace:
>> [...]  __schedule+0x28a/0x880
>> [...]  schedule+0x36/0x80
>> [...]  schedule_timeout+0x249/0x300
>> [...]  ? __schedule+0x292/0x880
>> [...]  __down_common+0xfc/0x132
>> [...]  ? _xfs_buf_find+0x2bb/0x510 [xfs]
>> [...]  __down+0x1d/0x1f
>> [...]  down+0x41/0x50
>> [...]  xfs_buf_lock+0x3c/0xf0 [xfs]
>> [...]  _xfs_buf_find+0x2bb/0x510 [xfs]
>> [...]  xfs_buf_get_map+0x2a/0x280 [xfs]
>> [...]  xfs_buf_read_map+0x2d/0x180 [xfs]
>> [...]  xfs_trans_read_buf_map+0xf5/0x310 [xfs]
>> [...]  xfs_btree_read_buf_block.constprop.35+0x78/0xc0 [xfs]
>> [...]  xfs_btree_lookup_get_block+0x88/0x160 [xfs]
>> [...]  xfs_btree_lookup+0xd0/0x3b0 [xfs]
>> [...]  ? xfs_allocbt_init_cursor+0x41/0xe0 [xfs]
>> [...]  xfs_alloc_ag_vextent_near+0xaf/0xaa0 [xfs]
>> [...]  xfs_alloc_ag_vextent+0x13c/0x150 [xfs]
>> [...]  xfs_alloc_vextent+0x425/0x590 [xfs]
>> [...]  xfs_bmap_btalloc+0x448/0x770 [xfs]
>> [...]  xfs_bmap_alloc+0xe/0x10 [xfs]
>> [...]  xfs_bmapi_write+0x61d/0xc10 [xfs]
>> [...]  ? kmem_zone_alloc+0x96/0x100 [xfs]
>> [...]  xfs_iomap_write_allocate+0x199/0x3a0 [xfs]
>> [...]  xfs_map_blocks+0x1e8/0x260 [xfs]
>> [...]  xfs_do_writepage+0x1ca/0x680 [xfs]
>> [...]  write_cache_pages+0x26f/0x510
>> [...]  ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
>> [...]  ? blk_mq_dispatch_rq_list+0x305/0x410
>> [...]  ? deadline_remove_request+0x7d/0xc0
>> [...]  xfs_vm_writepages+0xb6/0xd0 [xfs]
>> [...]  do_writepages+0x1c/0x70
>> [...]  __writeback_single_inode+0x45/0x320
>> [...]  writeback_sb_inodes+0x280/0x570
>> [...]  __writeback_inodes_wb+0x8c/0xc0
>> [...]  wb_writeback+0x276/0x310
>> [...]  ? get_nr_dirty_inodes+0x4d/0x80
>> [...]  wb_workfn+0x2d4/0x3b0
>> [...]  process_one_work+0x149/0x360
>> [...]  worker_thread+0x4d/0x3c0
>> [...]  kthread+0x109/0x140
>> [...]  ? rescuer_thread+0x380/0x380
>> [...]  ? kthread_park+0x60/0x60
>> [...]  ret_from_fork+0x25/0x30
>>
>>
>> I bisected the upstream kernel and found that the following commit
>> leads to the issue.
>>
>> commit c5cb83bb337c25caae995d992d1cdf9b317f83de
>> Author: Thomas Gleixner <tglx@linutronix.de>
>> Date:   Tue Jun 20 01:37:51 2017 +0200
>>
>>     genirq/cpuhotplug: Handle managed IRQs on CPU hotplug
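
For reference, the bisection itself was an ordinary git bisect run; the
endpoints below are only placeholders for the kernels actually tested:

  git bisect start
  git bisect bad  <first-kernel-that-hangs>    # placeholder
  git bisect good <last-kernel-that-works>     # placeholder
  # At each step: build and boot the suggested commit, offline all CPUs
  # except cpu0, and mark the result until the first bad commit is printed.
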
> 
> Can you please post your /proc/interrupts and details of which
> interrupt you think goes wrong? This backtrace is not telling us much
> in terms of where to start looking...

Thank you for the advice.

The issue is easily reproduced on a physical or virtual machine by offlining all CPUs except cpu0 (see the sketch below).
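
A minimal sketch of the offlining step, assuming the standard sysfs CPU
hotplug interface (run as root):

  # Take each hotpluggable CPU (cpu1, cpu2, ...) offline; cpu0 is not
  # matched by the glob and stays online.
  for cpu in /sys/devices/system/cpu/cpu[1-9]*; do
          [ -f "$cpu/online" ] && echo 0 > "$cpu/online"
  done
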
Here is my /proc/interrupts on a KVM guest before reproducing the issue. The hang occurred
when offlining cpu1, but did not occur when offlining cpu0.

           CPU0       CPU1
  0:        127          0   IO-APIC   2-edge      timer
  1:         10          0   IO-APIC   1-edge      i8042
  4:        227          0   IO-APIC   4-edge      ttyS0
  6:          3          0   IO-APIC   6-edge      floppy
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 10:      10822          0   IO-APIC  10-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2, virtio3
 11:         23          0   IO-APIC  11-fasteoi   uhci_hcd:usb3, uhci_hcd:usb4, qxl
 12:         15          0   IO-APIC  12-edge      i8042
 14:        218          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 24:          0          0   PCI-MSI 49152-edge      virtio0-config
 25:        359          0   PCI-MSI 49153-edge      virtio0-input.0
 26:          1          0   PCI-MSI 49154-edge      virtio0-output.0
 27:          0          0   PCI-MSI 114688-edge      virtio2-config
 28:          1       3639   PCI-MSI 114689-edge      virtio2-req.0
 29:          0          0   PCI-MSI 98304-edge      virtio1-config
 30:          4          0   PCI-MSI 98305-edge      virtio1-virtqueues
 31:        189          0   PCI-MSI 65536-edge      snd_hda_intel:card0
NMI:          0          0   Non-maskable interrupts
LOC:      16115      12845   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:       3016       2135   Rescheduling interrupts
CAL:       3666        557   Function call interrupts
TLB:         65         12   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event
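
For what it's worth, the affinity of a given interrupt can be checked
through /proc/irq/<irq>/ before and after the offline, e.g. for IRQ 28
(virtio2-req.0), the only device interrupt above with a non-zero count
on CPU1:

  # Affinity configured for IRQ 28:
  cat /proc/irq/28/smp_affinity_list
  # Effective affinity, if this kernel already exposes it:
  cat /proc/irq/28/effective_affinity_list 2>/dev/null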


Thanks,
Yasuaki Ishimatsu

> 
> Thanks,
> 
> 	M.
> 

Thread overview: 28+ messages
2017-08-08 19:25 system hung up when offlining CPUs YASUAKI ISHIMATSU
2017-08-09 11:42 ` Marc Zyngier
2017-08-09 19:09   ` YASUAKI ISHIMATSU [this message]
2017-08-10 11:54     ` Marc Zyngier
2017-08-21 12:07       ` Christoph Hellwig
2017-08-21 13:18       ` Christoph Hellwig
2017-08-21 13:37         ` Marc Zyngier
2017-09-07 20:23           ` YASUAKI ISHIMATSU
2017-09-12 18:15             ` YASUAKI ISHIMATSU
2017-09-13 11:13               ` Hannes Reinecke
2017-09-13 11:35                 ` Kashyap Desai
2017-09-13 13:33                   ` Thomas Gleixner
2017-09-14 16:28                     ` YASUAKI ISHIMATSU
2017-09-16 10:15                       ` Thomas Gleixner
2017-09-16 15:02                         ` Thomas Gleixner
2017-10-02 16:36                           ` YASUAKI ISHIMATSU
2017-10-03 21:44                             ` Thomas Gleixner
2017-10-04 21:04                               ` Thomas Gleixner
2017-10-09 11:35                                 ` [tip:irq/urgent] genirq/cpuhotplug: Add sanity check for effective affinity mask tip-bot for Thomas Gleixner
2017-10-09 11:35                                 ` [tip:irq/urgent] genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs tip-bot for Thomas Gleixner
2017-10-10 16:30                                 ` system hung up when offlining CPUs YASUAKI ISHIMATSU
2017-10-16 18:59                                   ` YASUAKI ISHIMATSU
2017-10-16 20:27                                     ` Thomas Gleixner
2017-10-30  9:08                                       ` Shivasharan Srikanteshwara
2017-11-01  0:47                                         ` Thomas Gleixner
2017-11-01 11:01                                           ` Hannes Reinecke
2017-10-04 21:10                             ` Thomas Gleixner