* Re: kernel hangs and crashes with 4.15 (also 4.13, 4.14)
@ 2018-02-12 11:29 tang.junhui
  0 siblings, 0 replies; 5+ messages in thread
From: tang.junhui @ 2018-02-12 11:29 UTC
  To: bcache; +Cc: linux-bcache, tang.junhui

From: Tang Junhui <tang.junhui@zte.com.cn>


Hello Matthias,

Thanks for the detailed information. I will try to reproduce this issue
on my machine, but I am sorry to say that it will be 10 days before I can
do so, because the Spring Festival holiday is coming. I also hope somebody
can resolve this problem before I am back.


> > On Mon, Feb 12, 2018 at 04:57:52PM +0800, tang.junhui@zte.com.cn wrote:
> > > From: Tang Junhui <tang.junhui@zte.com.cn>
> > > 
> > > 
> > > Hello Matthias,
> > > 
> > > What kind of caching mode do you use? Writeback or Writethrough?
> > 
> > Hi, thanks for looking into this.
> > 
> > This is with writeback caching (haven't tried writethrough yet).
> > 
> > Some more info about the disk drives used:
> > 
> > # ls -l /dev/disk/by-id  | grep -v -- -part | grep -v wwn-
> > total 0
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-INTEL_SSDSC2BA200G4_BTHV7374017P200MGN -> ../../sdf
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K0SHRT -> ../../sdc
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K2KNK4 -> ../../sda
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K5KHNS -> ../../sdb
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K7212Z -> ../../sde
> > lrwxrwxrwx 1 root root  9 Feb  8 00:48 usb-Samsung_Flash_Drive_0352315110002396-0:0 -> ../../sdd
> 
> More de-obfuscation:
>    - Intel SSD DC S3710 SATA 200GB for caching
>    - WD Gold SATA 2TB for backing device
> 
> I have now also tested the alternatives, writethrough caching and drive
> write cache enabled ("hdparm -W 1"); both also crash.
> 
> Perhaps more relevant: fio is started immediately after setup of md
> raid and bcache, while md sync speed is set very low:
> 
>     echo "1" >/sys/block/md127/md/sync_speed_min
>     echo "1" >/sys/block/md127/md/sync_speed_max
> 
> 
> I attached the script I use for producing the crashes, hope this helps.

Thanks.
Tang Junhui


* Re: kernel hangs and crashes with 4.15 (also 4.13, 4.14)
  2018-02-12  9:41 ` Matthias Ferdinand
@ 2018-02-12 10:55   ` Matthias Ferdinand
  0 siblings, 0 replies; 5+ messages in thread
From: Matthias Ferdinand @ 2018-02-12 10:55 UTC
  To: tang.junhui; +Cc: linux-bcache

[-- Attachment #1: Type: text/plain, Size: 1633 bytes --]

On Mon, Feb 12, 2018 at 10:41:05AM +0100, Matthias Ferdinand wrote:
> On Mon, Feb 12, 2018 at 04:57:52PM +0800, tang.junhui@zte.com.cn wrote:
> > From: Tang Junhui <tang.junhui@zte.com.cn>
> > 
> > 
> > Hello Matthias,
> > 
> > What kind of caching mode do you use? Writeback or Writethrough?
> 
> Hi, thanks for looking into this.
> 
> This is with writeback caching (haven't tried writethrough yet).
> 
> Some more info about the disk drives used:
> 
> # ls -l /dev/disk/by-id  | grep -v -- -part | grep -v wwn-
> total 0
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-INTEL_SSDSC2BA200G4_BTHV7374017P200MGN -> ../../sdf
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K0SHRT -> ../../sdc
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K2KNK4 -> ../../sda
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K5KHNS -> ../../sdb
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K7212Z -> ../../sde
> lrwxrwxrwx 1 root root  9 Feb  8 00:48 usb-Samsung_Flash_Drive_0352315110002396-0:0 -> ../../sdd

More de-obfuscation:
   - Intel SSD DC S3710 SATA 200GB for caching
   - WD Gold SATA 2TB for backing device

I have now also tested the alternatives, writethrough caching and drive
write cache enabled ("hdparm -W 1"); both also crash.

Perhaps more relevant: fio is started immediately after setup of md
raid and bcache, while md sync speed is set very low:

    echo "1" >/sys/block/md127/md/sync_speed_min
    echo "1" >/sys/block/md127/md/sync_speed_max
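The two writes above could be wrapped in a small helper. This is only a sketch and is not taken from the attached script: the function name set_md_sync_limits and the SYSFS_ROOT override are assumptions added for illustration (the override exists only so the function can be tried against a scratch directory instead of the real /sys).

```shell
# Hypothetical helper (not part of the attached script).
# SYSFS_ROOT is an assumed override for dry testing; it defaults to /sys.
set_md_sync_limits() {
    md_dev=$1       # md device name, e.g. md127
    min_speed=$2    # value written to sync_speed_min
    max_speed=$3    # value written to sync_speed_max
    md_dir="${SYSFS_ROOT:-/sys}/block/${md_dev}/md"

    # Refuse to continue if the array's sysfs directory does not exist.
    if [ ! -d "$md_dir" ]; then
        echo "no md sysfs directory: $md_dir" >&2
        return 1
    fi

    echo "$min_speed" > "$md_dir/sync_speed_min" &&
    echo "$max_speed" > "$md_dir/sync_speed_max"
}
```

With the values from this message, the call would be "set_md_sync_limits md127 1 1", run as root.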


I attached the script I use for producing the crashes, hope this helps.

Regards
Matthias Ferdinand

[-- Attachment #2: crash-with-fio.sh --]
[-- Type: application/x-sh, Size: 1700 bytes --]


* Re: kernel hangs and crashes with 4.15 (also 4.13, 4.14)
  2018-02-12  8:57 tang.junhui
@ 2018-02-12  9:41 ` Matthias Ferdinand
  2018-02-12 10:55   ` Matthias Ferdinand
  0 siblings, 1 reply; 5+ messages in thread
From: Matthias Ferdinand @ 2018-02-12  9:41 UTC
  To: tang.junhui; +Cc: linux-bcache

On Mon, Feb 12, 2018 at 04:57:52PM +0800, tang.junhui@zte.com.cn wrote:
> From: Tang Junhui <tang.junhui@zte.com.cn>
> 
> 
> Hello Matthias,
> 
> What kind of caching mode do you use? Writeback or Writethrough?

Hi, thanks for looking into this.

This is with writeback caching (haven't tried writethrough yet).

Some more info about the disk drives used:

# ls -l /dev/disk/by-id  | grep -v -- -part | grep -v wwn-
total 0
lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-INTEL_SSDSC2BA200G4_BTHV7374017P200MGN -> ../../sdf
lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K0SHRT -> ../../sdc
lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K2KNK4 -> ../../sda
lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K5KHNS -> ../../sdb
lrwxrwxrwx 1 root root  9 Feb  8 00:48 ata-WDC_WD2005FBYZ-01YCBB2_WD-WMC6N0K7212Z -> ../../sde
lrwxrwxrwx 1 root root  9 Feb  8 00:48 usb-Samsung_Flash_Drive_0352315110002396-0:0 -> ../../sdd


This is with the drives' write cache disabled (hdparm -W 0 /dev/sdx).

The flash drive only contains the OS (Ubuntu 14.04) and is not otherwise
involved in the benchmark (it is a really slow device :-).

Caching device is a 15GB partition on the SSD (/dev/sdf1), backing
device is an md raid built from a 32GB partition on each of the HDDs.
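A layout like the one described might be created roughly as follows. This is a sketch under assumptions, not the commands actually used: the HDD partition names in the usage note (/dev/sda1 and so on) are guesses, the helper names run and setup_bcache_on_md are hypothetical, and /dev/md127 and /dev/bcache0 are taken from paths mentioned elsewhere in this thread. By default the helper only prints the commands; it executes them only when DRY_RUN=0 is set.

```shell
# Sketch only. With DRY_RUN unset or 1, commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_bcache_on_md LEVEL CACHE_PARTITION MEMBER_PARTITION...
setup_bcache_on_md() {
    level=$1         # md raid level, e.g. 5 or 10
    cache_part=$2    # SSD partition used as the caching device
    shift 2          # the remaining arguments are the md member partitions

    # Build the md array that serves as the backing device.
    run mdadm --create /dev/md127 --level="$level" --raid-devices="$#" "$@"

    # Format the cache and backing device together so they are attached.
    run make-bcache -C "$cache_part" -B /dev/md127

    # Switch the resulting bcache device to writeback caching.
    run sh -c 'echo writeback > /sys/block/bcache0/bcache/cache_mode'
}
```

For the raid5 case this would be invoked as "setup_bcache_on_md 5 /dev/sdf1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1"; depending on the distribution, the devices may still need to be registered before /dev/bcache0 appears.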

Regards
Matthias Ferdinand


* Re: kernel hangs and crashes with 4.15 (also 4.13, 4.14)
@ 2018-02-12  8:57 tang.junhui
  2018-02-12  9:41 ` Matthias Ferdinand
  0 siblings, 1 reply; 5+ messages in thread
From: tang.junhui @ 2018-02-12  8:57 UTC
  To: bcache; +Cc: linux-bcache, tang.junhui

From: Tang Junhui <tang.junhui@zte.com.cn>


Hello Matthias,

What kind of caching mode do you use? Writeback or Writethrough?


> this seems to not have gone through on the first attempt, sorry if you
> get this twice.
> 
> Running some fio jobs directly on a writeback bcache device (no fs
> mounted), I can hang or crash the kernel with high probability (although
> not deterministically).
> 
>     for n in $(seq 1 8); do
>         fio --filename=/dev/bcache0 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=${n} --iodepth=1 --runtime=30 --time_based --group_reporting --name=journal-test
>     done
> 
> If the backing device is a md raid5, bcache locks up with a kernel BUG
> message, "shutdown -r now" hangs, but I can still reboot using magic
> sysrq (reisub).
> 
> If the backing device is a md raid10, the machine also shows a kernel
> BUG but completely freezes after that. On the VGA console, there is
> often another kernel BUG message, which hasn't made it to the serial
> console, but which is very similar to the first message.
> 
> There is no hang or crash if I run those tests on the md devices
> directly with no bcache.
> 
> This happens with kernel versions 4.15.0 (kernel.org), 4.14.15
> (kernel.org), 4.13.0-32 (Ubuntu Xenial HWE kernel). With an older
> 4.10.1, no crashes happen.
> 
> the kernel BUGs seem to happen at different places (from different
> runs with 4.15.0):
> 
>     [  438.880774] kernel BUG at block/bio.c:560!
> 
>     [  440.012034] kernel BUG at block/blk-ioc.c:146!
> 
>     [ 5378.266726] Kernel BUG at 000000006726b688 [verbose debug info unavailable]
> 
>     [  103.315681] Kernel BUG at 00000000af7724c9 [verbose debug info unavailable]
> 
> 
> Kernel BUG messages from a hang with raid5 and a lockup with raid10
> caught with USB serial console follow below.
> 
> Regards
> Matthias Ferdinand
> 
> -----------------------------------------------------------------------
> 
> raid5:
> 
> [  440.012034] kernel BUG at block/blk-ioc.c:146!
> [  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
> [  440.013355] Modules linked in: bcache ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables intel_rapl ttm x86_pkg_temp_thermal intel_powerclamp drm_kms_helper coretemp kvm_intel drm kvm ipmi_ssif i2c_algo_bit fb_sys_fops syscopyarea sysfillrect irqbypass sysimgblt crct10dif_pclmul crc32_pclmul hpilo ghash_clmulni_intel gpio_ich shpchp serio_raw pcbc aesni_intel ipmi_si aes_x86_64 crypto_simd ipmi_msghandler dm_multipath glue_helper cryptd acpi_power_meter intel_cstate intel_rapl_perf input_leds lpc_ich mfd_core ie31200_edac btrfs zstd_decompress zstd_compress xxhash
> [  440.023485]  raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq hid_generic libcrc32c uas raid1 usbhid psmouse raid0 usb_storage tg3 ahci hid libahci multipath linear [last unloaded: bcache]
> [  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
> [  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
> [  440.028615] RIP: 0010:put_io_context+0x8b/0x90
> [  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
> [  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 00000000000f4240
> [  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c88294fca0
> [  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb80c1700
> [  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 0000000000001000
> [  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11bd1e0
> [  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:0000000000000000
> [  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001606e0
> [  440.110498] Call Trace:
> [  440.135443]  bio_disassociate_task+0x1b/0x60
> [  440.160355]  bio_free+0x1b/0x60
> [  440.184666]  bio_put+0x23/0x30
> [  440.208272]  search_free+0x23/0x40 [bcache]
> [  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
> [  440.254468]  closure_put+0xb6/0xd0 [bcache]
> [  440.277087]  request_endio+0x30/0x40 [bcache]
> [  440.298703]  bio_endio+0xa1/0x120
> [  440.319644]  handle_stripe+0x418/0x2270 [raid456]
> [  440.340614]  ? load_balance+0x17b/0x9c0
> [  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
> [  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
> [  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
> [  440.419193]  ? schedule+0x36/0x80
> [  440.437932]  ? schedule_timeout+0x1d2/0x2f0
> [  440.456136]  md_thread+0x122/0x150
> [  440.473687]  ? wait_woken+0x80/0x80
> [  440.491411]  kthread+0x102/0x140
> [  440.508636]  ? find_pers+0x70/0x70
> [  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
> [  440.541791]  ret_from_fork+0x35/0x40
> [  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2 48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41 
> [  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
> [  440.628575] ---[ end trace a1fd79d85643a73e ]---
> 
> -----------------------------------------------------------------------
> 
> raid10:
> 
> [  438.880774] kernel BUG at block/bio.c:560!
> [  438.881378] invalid opcode: 0000 [#1] SMP NOPTI
> [  438.882197] Modules linked in: bcache ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp ttm coretemp kvm_intel drm_kms_helper kvm drm irqbypass crct10dif_pclmul ipmi_ssif crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper ipmi_si i2c_algo_bit ipmi_msghandler gpio_ich fb_sys_fops lpc_ich hpilo mfd_core acpi_power_meter ie31200_edac syscopyarea sysfillrect sysimgblt serio_raw dm_multipath input_leds cryptd shpchp intel_cstate intel_rapl_perf btrfs zstd_decompress zstd_compress xxhash
> [  438.892634]  raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c hid_generic uas raid1 usbhid usb_storage tg3 raid0 psmouse ahci hid libahci multipath linear [last unloaded: bcache]
> [  438.895674] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.15.0 #8
> [  438.896516] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
> [  438.897524] RIP: 0010:bio_put+0x25/0x30
> [  438.898069] RSP: 0018:ffff8ca2ba303c70 EFLAGS: 00010246
> [  438.898987] RAX: 0000000000000000 RBX: ffff8ca2a1adf788 RCX: 00000000000f4240
> [  438.900003] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffffa87782d67c08
> [  438.901020] RBP: ffff8ca2ba303c70 R08: 0000000000000000 R09: ffffffff85012480
> [  438.902036] R10: 000000000000012b R11: 000000000000013c R12: ffff8ca28fb30010
> [  438.903053] R13: 0000000000000000 R14: ffff8ca291327460 R15: 0000000000000000
> [  438.904237] FS:  0000000000000000(0000) GS:ffff8ca2ba300000(0000) knlGS:0000000000000000
> [  438.928279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  438.952654] CR2: 00007fe51a454090 CR3: 00000002c420a004 CR4: 00000000001606e0
> [  438.977670] Call Trace:
> [  439.002472]  <IRQ>
> [  439.026414]  search_free+0x23/0x40 [bcache]
> [  439.050258]  cached_dev_write_complete+0x31/0x70 [bcache]
> [  439.074040]  closure_put+0xb6/0xd0 [bcache]
> [  439.097246]  request_endio+0x30/0x40 [bcache]
> [  439.119920]  bio_endio+0xa1/0x120
> [  439.141667]  raid_end_bio_io+0x30/0xc0 [raid10]
> [  439.163411]  one_write_done+0x35/0x50 [raid10]
> [  439.184486]  raid10_end_write_request+0x112/0x250 [raid10]
> [  439.205551]  bio_endio+0xa1/0x120
> [  439.225647]  blk_update_request+0xb7/0x310
> [  439.245295]  scsi_end_request+0x34/0x200
> [  439.264379]  scsi_io_completion+0x10d/0x5c0
> [  439.283087]  scsi_finish_command+0xd9/0x120
> [  439.301348]  scsi_softirq_done+0x144/0x170
> [  439.318718]  blk_done_softirq+0x7c/0x90
> [  439.335596]  __do_softirq+0xc9/0x26a
> [  439.352176]  irq_exit+0xa5/0xb0
> [  439.368242]  do_IRQ+0x51/0xd0
> [  439.383765]  common_interrupt+0x9f/0x9f
> [  439.399217]  </IRQ>
> [  439.414466] RIP: 0010:cpuidle_enter_state+0xeb/0x290
> [  439.430078] RSP: 0018:ffffa87781917e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
> [  439.462221] RAX: ffff8ca2ba3229c0 RBX: ffff8ca2ba32b700 RCX: 000000000000001f
> [  439.479067] RDX: 0000000000000000 RSI: ffffffe1cfb49a99 RDI: 0000000000000000
> [  439.495611] RBP: ffffa87781917eb8 R08: 00000000000003ff R09: 0000000000000187
> [  439.511910] R10: 000000000000008a R11: 0000000000000018 R12: 0000000000000002
> [  439.528158] R13: 0000000000000004 R14: ffffffff85171380 R15: 000000662f545620
> [  439.543608]  cpuidle_enter+0x17/0x20
> [  439.559325]  call_cpuidle+0x23/0x40
> [  439.574921]  do_idle+0x185/0x210
> [  439.590218]  cpu_startup_entry+0x1d/0x30
> [  439.605566]  start_secondary+0x133/0x170
> [  439.620897]  secondary_startup_64+0xa5/0xb0
> [  439.635885] Code: ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 f6 47 15 01 48 89 e5 74 0f 8b 47 74 85 c0 74 0f f0 ff 4f 74 74 02 5d c3 e8 7d ff ff ff 5d c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
> [  439.684209] RIP: bio_put+0x25/0x30 RSP: ffff8ca2ba303c70
> [  439.700579] ---[ end trace ce9f8201937066d4 ]---
> [  439.720653] Kernel panic - not syncing: Fatal exception in interrupt
> 
> About 2 seconds later a similar message appeared, but it no longer
> reached the serial console.

Thanks.
Tang Junhui


* kernel hangs and crashes with 4.15 (also 4.13, 4.14)
@ 2018-02-08 20:22 Matthias Ferdinand
  0 siblings, 0 replies; 5+ messages in thread
From: Matthias Ferdinand @ 2018-02-08 20:22 UTC
  To: linux-bcache

Hi,

this seems to not have gone through on the first attempt, sorry if you
get this twice.

Running some fio jobs directly on a writeback bcache device (no fs
mounted), I can hang or crash the kernel with high probability (although
not deterministically).

    for n in $(seq 1 8); do
        fio --filename=/dev/bcache0 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=${n} --iodepth=1 --runtime=30 --time_based --group_reporting --name=journal-test
    done

If the backing device is a md raid5, bcache locks up with a kernel BUG
message, "shutdown -r now" hangs, but I can still reboot using magic
sysrq (reisub).

If the backing device is a md raid10, the machine also shows a kernel
BUG but completely freezes after that. On the VGA console, there is
often another kernel BUG message, which hasn't made it to the serial
console, but which is very similar to the first message.

There is no hang or crash if I run those tests on the md devices
directly with no bcache.

This happens with kernel versions 4.15.0 (kernel.org), 4.14.15
(kernel.org), 4.13.0-32 (Ubuntu Xenial HWE kernel). With an older
4.10.1, no crashes happen.

the kernel BUGs seem to happen at different places (from different
runs with 4.15.0):

    [  438.880774] kernel BUG at block/bio.c:560!

    [  440.012034] kernel BUG at block/blk-ioc.c:146!

    [ 5378.266726] Kernel BUG at 000000006726b688 [verbose debug info unavailable]

    [  103.315681] Kernel BUG at 00000000af7724c9 [verbose debug info unavailable]
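When comparing many runs, the distinct BUG locations in a captured console log could be tallied with a small filter like the one below. This is a sketch only: the function name summarize_kernel_bugs is hypothetical, and it assumes the log lines look like the four shown above.

```shell
# Hypothetical helper: count distinct "kernel BUG at ..." locations.
# Reads the named log files, or standard input if none are given.
summarize_kernel_bugs() {
    # -h: no file name prefixes; -i: match "kernel BUG" and "Kernel BUG";
    # -o: print only the matched part; -E: extended regular expressions.
    # The location ends at the first space or "!" after "BUG at ".
    grep -hioE 'kernel BUG at [^ !]+' "$@" |
        sed 's/^[Kk]ernel BUG at //' |
        sort | uniq -c | sort -rn
}
```

Each output line is a count followed by either a file:line location such as block/bio.c:560 or the raw address printed when verbose debug info is unavailable.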


Kernel BUG messages from a hang with raid5 and a lockup with raid10
caught with USB serial console follow below.

Regards
Matthias Ferdinand

-----------------------------------------------------------------------

raid5:

[  440.012034] kernel BUG at block/blk-ioc.c:146!
[  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
[  440.013355] Modules linked in: bcache ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables intel_rapl ttm x86_pkg_temp_thermal intel_powerclamp drm_kms_helper coretemp kvm_intel drm kvm ipmi_ssif i2c_algo_bit fb_sys_fops syscopyarea sysfillrect irqbypass sysimgblt crct10dif_pclmul crc32_pclmul hpilo ghash_clmulni_intel gpio_ich shpchp serio_raw pcbc aesni_intel ipmi_si aes_x86_64 crypto_simd ipmi_msghandler dm_multipath glue_helper cryptd acpi_power_meter intel_cstate intel_rapl_perf input_leds lpc_ich mfd_core ie31200_edac btrfs zstd_decompress zstd_compress xxhash
[  440.023485]  raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq hid_generic libcrc32c uas raid1 usbhid psmouse raid0 usb_storage tg3 ahci hid libahci multipath linear [last unloaded: bcache]
[  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
[  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
[  440.028615] RIP: 0010:put_io_context+0x8b/0x90
[  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
[  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 00000000000f4240
[  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c88294fca0
[  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb80c1700
[  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 0000000000001000
[  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11bd1e0
[  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:0000000000000000
[  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001606e0
[  440.110498] Call Trace:
[  440.135443]  bio_disassociate_task+0x1b/0x60
[  440.160355]  bio_free+0x1b/0x60
[  440.184666]  bio_put+0x23/0x30
[  440.208272]  search_free+0x23/0x40 [bcache]
[  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
[  440.254468]  closure_put+0xb6/0xd0 [bcache]
[  440.277087]  request_endio+0x30/0x40 [bcache]
[  440.298703]  bio_endio+0xa1/0x120
[  440.319644]  handle_stripe+0x418/0x2270 [raid456]
[  440.340614]  ? load_balance+0x17b/0x9c0
[  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
[  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
[  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
[  440.419193]  ? schedule+0x36/0x80
[  440.437932]  ? schedule_timeout+0x1d2/0x2f0
[  440.456136]  md_thread+0x122/0x150
[  440.473687]  ? wait_woken+0x80/0x80
[  440.491411]  kthread+0x102/0x140
[  440.508636]  ? find_pers+0x70/0x70
[  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
[  440.541791]  ret_from_fork+0x35/0x40
[  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2 48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41 
[  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
[  440.628575] ---[ end trace a1fd79d85643a73e ]---

-----------------------------------------------------------------------

raid10:

[  438.880774] kernel BUG at block/bio.c:560!
[  438.881378] invalid opcode: 0000 [#1] SMP NOPTI
[  438.882197] Modules linked in: bcache ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp ttm coretemp kvm_intel drm_kms_helper kvm drm irqbypass crct10dif_pclmul ipmi_ssif crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper ipmi_si i2c_algo_bit ipmi_msghandler gpio_ich fb_sys_fops lpc_ich hpilo mfd_core acpi_power_meter ie31200_edac syscopyarea sysfillrect sysimgblt serio_raw dm_multipath input_leds cryptd shpchp intel_cstate intel_rapl_perf btrfs zstd_decompress zstd_compress xxhash
[  438.892634]  raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c hid_generic uas raid1 usbhid usb_storage tg3 raid0 psmouse ahci hid libahci multipath linear [last unloaded: bcache]
[  438.895674] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.15.0 #8
[  438.896516] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16/2015
[  438.897524] RIP: 0010:bio_put+0x25/0x30
[  438.898069] RSP: 0018:ffff8ca2ba303c70 EFLAGS: 00010246
[  438.898987] RAX: 0000000000000000 RBX: ffff8ca2a1adf788 RCX: 00000000000f4240
[  438.900003] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffffa87782d67c08
[  438.901020] RBP: ffff8ca2ba303c70 R08: 0000000000000000 R09: ffffffff85012480
[  438.902036] R10: 000000000000012b R11: 000000000000013c R12: ffff8ca28fb30010
[  438.903053] R13: 0000000000000000 R14: ffff8ca291327460 R15: 0000000000000000
[  438.904237] FS:  0000000000000000(0000) GS:ffff8ca2ba300000(0000) knlGS:0000000000000000
[  438.928279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  438.952654] CR2: 00007fe51a454090 CR3: 00000002c420a004 CR4: 00000000001606e0
[  438.977670] Call Trace:
[  439.002472]  <IRQ>
[  439.026414]  search_free+0x23/0x40 [bcache]
[  439.050258]  cached_dev_write_complete+0x31/0x70 [bcache]
[  439.074040]  closure_put+0xb6/0xd0 [bcache]
[  439.097246]  request_endio+0x30/0x40 [bcache]
[  439.119920]  bio_endio+0xa1/0x120
[  439.141667]  raid_end_bio_io+0x30/0xc0 [raid10]
[  439.163411]  one_write_done+0x35/0x50 [raid10]
[  439.184486]  raid10_end_write_request+0x112/0x250 [raid10]
[  439.205551]  bio_endio+0xa1/0x120
[  439.225647]  blk_update_request+0xb7/0x310
[  439.245295]  scsi_end_request+0x34/0x200
[  439.264379]  scsi_io_completion+0x10d/0x5c0
[  439.283087]  scsi_finish_command+0xd9/0x120
[  439.301348]  scsi_softirq_done+0x144/0x170
[  439.318718]  blk_done_softirq+0x7c/0x90
[  439.335596]  __do_softirq+0xc9/0x26a
[  439.352176]  irq_exit+0xa5/0xb0
[  439.368242]  do_IRQ+0x51/0xd0
[  439.383765]  common_interrupt+0x9f/0x9f
[  439.399217]  </IRQ>
[  439.414466] RIP: 0010:cpuidle_enter_state+0xeb/0x290
[  439.430078] RSP: 0018:ffffa87781917e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
[  439.462221] RAX: ffff8ca2ba3229c0 RBX: ffff8ca2ba32b700 RCX: 000000000000001f
[  439.479067] RDX: 0000000000000000 RSI: ffffffe1cfb49a99 RDI: 0000000000000000
[  439.495611] RBP: ffffa87781917eb8 R08: 00000000000003ff R09: 0000000000000187
[  439.511910] R10: 000000000000008a R11: 0000000000000018 R12: 0000000000000002
[  439.528158] R13: 0000000000000004 R14: ffffffff85171380 R15: 000000662f545620
[  439.543608]  cpuidle_enter+0x17/0x20
[  439.559325]  call_cpuidle+0x23/0x40
[  439.574921]  do_idle+0x185/0x210
[  439.590218]  cpu_startup_entry+0x1d/0x30
[  439.605566]  start_secondary+0x133/0x170
[  439.620897]  secondary_startup_64+0xa5/0xb0
[  439.635885] Code: ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 f6 47 15 01 48 89 e5 74 0f 8b 47 74 85 c0 74 0f f0 ff 4f 74 74 02 5d c3 e8 7d ff ff ff 5d c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
[  439.684209] RIP: bio_put+0x25/0x30 RSP: ffff8ca2ba303c70
[  439.700579] ---[ end trace ce9f8201937066d4 ]---
[  439.720653] Kernel panic - not syncing: Fatal exception in interrupt

About 2 seconds later a similar message appeared, but it no longer
reached the serial console.


Thread overview: 5+ messages
2018-02-12 11:29 kernel hangs and crashes with 4.15 (also 4.13, 4.14) tang.junhui
  -- strict thread matches above, loose matches on Subject: below --
2018-02-12  8:57 tang.junhui
2018-02-12  9:41 ` Matthias Ferdinand
2018-02-12 10:55   ` Matthias Ferdinand
2018-02-08 20:22 Matthias Ferdinand
