All of lore.kernel.org
 help / color / mirror / Atom feed
* NULL deref around xfs in v4.0-rc1–rc7
@ 2015-04-08 13:41 Jan Engelhardt
  2015-04-08 15:20 ` Jan Engelhardt
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-08 13:41 UTC (permalink / raw)
  To: torvalds; +Cc: Linux Kernel Mailing List


Starting somewhere around v4.0-rc1 and persisting through commit 
v4.0-rc7, there is a new NULL deference apparently happening in 
conjunction with xfs. This inhibits this machine's booting,
as xfs is used for the root filesystem.

First bisection points at first-bad commit v4.0-rc1~8, and since that is 
a merge commit, I'll be investigating some more hand-chosen commits (and 
then people to Cc) as we speak.


Boot log of v4.0-rc1~8:

 Fusion MPT base driver 3.04.20
 Copyright (c) 1999-2008 LSI Corporation
 Fusion MPT SAS Host driver 3.04.20
 mptbase: ioc0: Initiating bringup
 ioc0: LSISAS1068 A0: Capabilities={Initiator}
 scsi host0: ioc0: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=22
 mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x1060504030201a0
 scsi 0:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
 scsi 0:0:0:0: Attached scsi generic sg0 type 0
 mptbase: ioc1: Initiating bringup
 ioc1: LSISAS1068 A0: Capabilities={Initiator}
 scsi host1: ioc1: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=17
 mptsas: ioc1: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x60504030201a0
 scsi 1:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
 scsi 1:0:0:0: Attached scsi generic sg1 type 0
 sd 0:0:0:0: [sda] 12582912 512-byte logical blocks: (6.44 GB/6.00 GiB)
 sd 1:0:0:0: [sdb] 16777216 512-byte logical blocks: (8.58 GB/8.00 GiB)
 sd 0:0:0:0: [sda] Write Protect is off
 sd 0:0:0:0: [sda] Incomplete mode parameter data
 sd 0:0:0:0: [sda] Assuming drive cache: write through
 sd 1:0:0:0: [sdb] Write Protect is off
 sd 1:0:0:0: [sdb] Incomplete mode parameter data
 sd 1:0:0:0: [sdb] Assuming drive cache: write through
  sda: sda1 sda2
 sd 0:0:0:0: [sda] Attached SCSI disk
  sdb: sdb1 sdb2
 sd 1:0:0:0: [sdb] Attached SCSI disk
 audit: type=1130 audit(1428456646.877:11): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-vconsole-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
^[[1;39mPlease enter passphrase for disk HARDDISK (sfroot)! ^[[0m
 NET: Registered protocol family 38
 audit_printk_skb: 3 callbacks suppressed
 audit: type=1130 audit(1428456653.677:13): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@sfroot comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1130 audit(1428456653.941:14): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=dracut-initqueue comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1130 audit(1428456654.369:15): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-fsck@dev-mapper-sfroot comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 SGI XFS with ACLs, security attributes, realtime, no debug enabled
 XFS (dm-0): Mounting V5 Filesystem
 XFS (dm-0): Ending clean mount
 audit: type=1130 audit(1428456654.705:16): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1131 audit(1428456654.761:17): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1130 audit(1428456655.077:18): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=dracut-pre-pivot comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1130 audit(1428456655.157:19): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-ask-password-console comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1131 audit(1428456655.417:20): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-ask-password-console comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1130 audit(1428456655.437:21): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-cleanup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 audit: type=1131 audit(1428456655.453:22): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-cleanup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
 systemd-journald[155]: Received SIGTERM from PID 1 (systemd).
 BUG: unable to handle kernel paging request at 0000000000001000
 IP: [<ffffffff812718d0>] scsi_init_cmd_errh+0x26/0x5d
 PGD 0 
 Oops: 0002 [#1] SMP 
 Modules linked in: xfs crc32c_generic libcrc32c dm_crypt xts gf128mul algif_skcipher af_alg sd_mod mptsas scsi_transport_sas mptscsih mptbase dm_mod sg ipv6
 CPU: 0 PID: 447 Comm: systemd-cgroups Not tainted 4.0.0-rc1 #21
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
 task: ffff88007acceeb0 ti: ffff88007bcc0000 task.ti: ffff88007bcc0000
 RIP: 0010:[<ffffffff812718d0>]  [<ffffffff812718d0>] scsi_init_cmd_errh+0x26/0x5d
 RSP: 0018:ffff88007bcc3730  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88007b078800 RCX: 0000000000000018
 RDX: ffff88007b02d370 RSI: 0000000000000000 RDI: 0000000000001000
 RBP: ffff88007b02d370 R08: ffff88007bf1a440 R09: 00000000000000fa
 R10: 0000000000000000 R11: ffffea0001eef880 R12: 0000000000000000
 R13: ffff88007b078800 R14: ffff88007bc35000 R15: ffff88007b02d200
 FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000001000 CR3: 000000007cb79000 CR4: 00000000000007f0
 Stack:
  ffffffff812731ac ffff88007acceeb0 ffff88007bc35000 ffff88007b075400
  ffff88007b078968 ffff88007b02d378 ffff88007fc18e88 ffff88007b070f20
  ffff88007bcc37c0 ffff88007b075408 0000000000000000 0000000000000000
 Call Trace:
  [<ffffffff812731ac>] ? scsi_queue_rq+0x2e5/0x3d3
  [<ffffffff8118d833>] ? __blk_mq_run_hw_queue+0x19a/0x29f
  [<ffffffff8118da01>] ? blk_mq_alloc_request+0xc9/0x102
  [<ffffffffa00f774b>] ? __xfs_get_blocks+0x321/0x321 [xfs]
  [<ffffffff8118df7c>] ? blk_mq_run_hw_queue+0x4a/0x93
  [<ffffffff8118ebfa>] ? blk_sq_make_request+0x166/0x171
  [<ffffffff8118638e>] ? generic_make_request+0x8f/0xcc
  [<ffffffff811864ce>] ? submit_bio+0x103/0x121
  [<ffffffff810cc0ae>] ? get_page+0x9/0x25
  [<ffffffff810cc49f>] ? __lru_cache_add+0x1a/0x3a
  [<ffffffff8113629a>] ? mpage_bio_submit+0x1f/0x25
  [<ffffffff81136f07>] ? mpage_readpages+0xe2/0xf6
  [<ffffffffa00f774b>] ? __xfs_get_blocks+0x321/0x321 [xfs]
  [<ffffffff810c8d19>] ? get_page_from_freelist+0x3b3/0x785
  [<ffffffff810f860a>] ? alloc_pages_current+0xad/0xca
  [<ffffffff810cb5f9>] ? __do_page_cache_readahead+0x116/0x1af
  [<ffffffff810c4731>] ? filemap_fault+0x18e/0x393
  [<ffffffff810df31c>] ? __do_fault+0x3b/0x8f
  [<ffffffff810e2033>] ? handle_mm_fault+0x50e/0xfc9
  [<ffffffff810daf1c>] ? vma_interval_tree_insert+0x1f/0x82
  [<ffffffff810e5909>] ? vma_set_page_prot+0x33/0x52
  [<ffffffff81034345>] ? __do_page_fault+0x212/0x3ba
  [<ffffffff81337de8>] ? page_fault+0x28/0x30
  [<ffffffff811adf85>] ? __clear_user+0x1c/0x3d
  [<ffffffff811455f9>] ? padzero+0x1b/0x28
  [<ffffffff81147144>] ? load_elf_binary+0x7d5/0x1009
  [<ffffffff811135dc>] ? search_binary_handler+0x6b/0x170
  [<ffffffff81114780>] ? do_execveat_common.isra.31+0x45a/0x5dd
  [<ffffffff81114926>] ? do_execve+0x23/0x28
  [<ffffffff8104d4f7>] ? ____call_usermodehelper+0x100/0x128
  [<ffffffff8104d3f7>] ? call_usermodehelper+0x47/0x47
  [<ffffffff8133637c>] ? ret_from_fork+0x7c/0xb0
  [<ffffffff8104d3f7>] ? call_usermodehelper+0x47/0x47
 Code: c2 89 d0 5b c3 48 c7 87 b0 00 00 00 00 00 00 00 c7 87 f4 00 00 00 00 00 00 00 48 89 fa 48 8b bf 10 01 00 00 31 c0 b9 18 00 00 00 <f3> ab 66 83 ba cc 00 00 00 00 75 2a 48 8b 8a d8 00 00 00 8a 01 
 RIP  [<ffffffff812718d0>] scsi_init_cmd_errh+0x26/0x5d
  RSP <ffff88007bcc3730>
 CR2: 0000000000001000
 ---[ end trace ef687cac676b43b0 ]---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around xfs in v4.0-rc1–rc7
  2015-04-08 13:41 NULL deref around xfs in v4.0-rc1–rc7 Jan Engelhardt
@ 2015-04-08 15:20 ` Jan Engelhardt
  2015-04-09 17:38   ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-08 15:20 UTC (permalink / raw)
  To: torvalds; +Cc: Linux Kernel Mailing List

On Wednesday 2015-04-08 15:41, Jan Engelhardt wrote:

>Starting somewhere around v4.0-rc1 and persisting through commit 
>v4.0-rc7, there is a new NULL deference apparently happening in 
>conjunction with xfs. This inhibits this machine's booting,
>as xfs is used for the root filesystem.
>
>First bisection points at first-bad commit v4.0-rc1~8, and since that is 
>a merge commit, I'll be investigating some more hand-chosen commits (and 
>then people to Cc) as we speak.

I reran bisect just to be sure.
It now shows v4.0-rc1~9 is bad, v4.0-rc1~9^1 is ok, and v4.0-rc~9^2 is 
ok too. So this means that the combination of the both ~9 childs work
badly together.


# good: [2bfedd1d9f470506d98cb5662ced381c38225968] Merge branch 'for-linus' of git://git.kernel.dk/linux-block
git bisect good 2bfedd1d9f470506d98cb5662ced381c38225968
# bad: [cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae] Merge tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae
# good: [9d0de5a63a4a22abfd2bd70694a610d18350cf87] Merge branches 'acpi-ec', 'acpi-soc', 'acpi-video' and 'acpi-resources'
git bisect good 9d0de5a63a4a22abfd2bd70694a610d18350cf87
# good: [67fadaa2768716209ee19a8b8bf05bc3ac399445] cpufreq: s3c: remove last use of resume_clocks callback
git bisect good 67fadaa2768716209ee19a8b8bf05bc3ac399445
# good: [70734a786acfd1998e47d40df19cba5c29469bdf] cpuidle: powernv: Avoid endianness conversions while parsing DT
git bisect good 70734a786acfd1998e47d40df19cba5c29469bdf
# good: [3466b547e37b988723dc93465b7cb06b4b1f731f] Merge branches 'pnp', 'pm-cpuidle' and 'pm-cpufreq'
git bisect good 3466b547e37b988723dc93465b7cb06b4b1f731f
# first bad commit: [cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae] Merge tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
# good: [2bfedd1d9f470506d98cb5662ced381c38225968] Merge branch 'for-linus' of git://git.kernel.dk/linux-block
git bisect good 2bfedd1d9f470506d98cb5662ced381c38225968
# bad: [cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae] Merge tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae
# good: [9d0de5a63a4a22abfd2bd70694a610d18350cf87] Merge branches 'acpi-ec', 'acpi-soc', 'acpi-video' and 'acpi-resources'
git bisect good 9d0de5a63a4a22abfd2bd70694a610d18350cf87
# good: [67fadaa2768716209ee19a8b8bf05bc3ac399445] cpufreq: s3c: remove last use of resume_clocks callback
git bisect good 67fadaa2768716209ee19a8b8bf05bc3ac399445
# good: [70734a786acfd1998e47d40df19cba5c29469bdf] cpuidle: powernv: Avoid endianness conversions while parsing DT
git bisect good 70734a786acfd1998e47d40df19cba5c29469bdf
# good: [3466b547e37b988723dc93465b7cb06b4b1f731f] Merge branches 'pnp', 'pm-cpuidle' and 'pm-cpufreq'
git bisect good 3466b547e37b988723dc93465b7cb06b4b1f731f
# first bad commit: [cd50b70ccd5c87794ec28bfb87b7fba9961eb0ae] Merge tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm



 BUG: unable to handle kernel paging request at 0000000000001000
 IP: [<ffffffff81269d9e>] scsi_init_cmd_errh+0x26/0x5d
 PGD 0 
 Oops: 0002 [#1] SMP 
 Modules linked in: xfs crc32c_generic libcrc32c dm_crypt xts gf128mul algif_skcipher af_alg sd_mod mptsas scsi_transport_sas mptscsih mptbase dm_mod sg ipv6
 CPU: 0 PID: 406 Comm: kworker/u2:0 Not tainted 3.19.0+ #53
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
 task: ffff88007bf73c60 ti: ffff88007cb20000 task.ti: ffff88007cb20000
 RIP: 0010:[<ffffffff81269d9e>]  [<ffffffff81269d9e>] scsi_init_cmd_errh+0x26/0x5d
 RSP: 0018:ffff88007cb23870  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88007bfa6800 RCX: 0000000000000018
 RDX: ffff88007bfec970 RSI: 0000000000000000 RDI: 0000000000001000
 RBP: ffff88007bfec970 R08: ffff88007be345c0 R09: 00000000000000fa
 R10: 0000000000000001 R11: ffffea0001ec8c40 R12: 0000000000000000
 R13: ffff88007bfa6800 R14: ffff88007bc04000 R15: ffff88007bfec800
 FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000001000 CR3: 000000007bc9b000 CR4: 00000000000007f0
 Stack:
  ffffffff8126b67a ffff88007bf73c60 ffff88007bc04000 ffff88007bf13400
  ffff88007bfa6968 ffff88007bfec978 ffff88007fc18e48 ffff88007bfb4f20
  ffff88007cb23900 ffff88007bf13408 0000000000000000 0000000000000000
 Call Trace:
  [<ffffffff8126b67a>] ? scsi_queue_rq+0x2e5/0x3d3
  [<ffffffff8118d840>] ? __blk_mq_run_hw_queue+0x19a/0x29f
  [<ffffffff8118da01>] ? blk_mq_alloc_request+0xbc/0x102
  [<ffffffffa00f974b>] ? __xfs_get_blocks+0x321/0x321 [xfs]
  [<ffffffff8118df89>] ? blk_mq_run_hw_queue+0x4a/0x93
  [<ffffffff8118ec07>] ? blk_sq_make_request+0x166/0x171
  [<ffffffff8118639b>] ? generic_make_request+0x8f/0xcc
  [<ffffffff811864db>] ? submit_bio+0x103/0x121
  [<ffffffff810cc0ae>] ? get_page+0x9/0x25
  [<ffffffff810cc49f>] ? __lru_cache_add+0x1a/0x3a
  [<ffffffff81136312>] ? mpage_bio_submit+0x1f/0x25
  [<ffffffff81136f7f>] ? mpage_readpages+0xe2/0xf6
  [<ffffffffa00f974b>] ? __xfs_get_blocks+0x321/0x321 [xfs]
  [<ffffffff810f85f9>] ? alloc_pages_current+0xad/0xca
  [<ffffffff810cb5f9>] ? __do_page_cache_readahead+0x116/0x1af
  [<ffffffff811a8a21>] ? radix_tree_lookup_slot+0x10/0x23
  [<ffffffff810cb88b>] ? ondemand_readahead+0x1f9/0x20a
  [<ffffffff810c3231>] ? pagecache_get_page+0x22/0x138
  [<ffffffff810c3dab>] ? generic_file_read_iter+0x17a/0x4d4
  [<ffffffffa00d430e>] ? xfs_attr_get+0x52/0x113 [xfs]
  [<ffffffffa01013d8>] ? xfs_file_read_iter+0x1bb/0x20d [xfs]
  [<ffffffff8110e8c9>] ? new_sync_read+0x67/0x8b
  [<ffffffff8110f539>] ? vfs_read+0x6d/0xb7
  [<ffffffff81112ff7>] ? kernel_read+0x39/0x47
  [<ffffffff811146f3>] ? do_execveat_common.isra.31+0x3b7/0x5dd
  [<ffffffff8111493c>] ? do_execve+0x23/0x28
  [<ffffffff8104d4f7>] ? ____call_usermodehelper+0x100/0x128
  [<ffffffff8104d3f7>] ? call_usermodehelper+0x47/0x47
  [<ffffffff813188fc>] ? ret_from_fork+0x7c/0xb0
  [<ffffffff8104d3f7>] ? call_usermodehelper+0x47/0x47
 Code: c2 89 d0 5b c3 48 c7 87 b0 00 00 00 00 00 00 00 c7 87 f4 00 00 00 00 00 00 00 48 89 fa 48 8b bf 10 01 00 00 31 c0 b9 18 00 00 00 <f3> ab 66 83 ba cc 00 00 00 00 75 2a 48 8b 8a d8 00 00 00 8a 01 
 RIP  [<ffffffff81269d9e>] scsi_init_cmd_errh+0x26/0x5d
  RSP <ffff88007cb23870>
 CR2: 0000000000001000
 ---[ end trace 54414923d584f14b ]---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around xfs in v4.0-rc1–rc7
  2015-04-08 15:20 ` Jan Engelhardt
@ 2015-04-09 17:38   ` Linus Torvalds
  2015-04-09 18:24     ` NULL deref around blkmq " Jan Engelhardt
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2015-04-09 17:38 UTC (permalink / raw)
  To: Jan Engelhardt, Rafael J. Wysocki, Jens Axboe; +Cc: Linux Kernel Mailing List

On Wed, Apr 8, 2015 at 8:20 AM, Jan Engelhardt <jengelh@inai.de> wrote:
> On Wednesday 2015-04-08 15:41, Jan Engelhardt wrote:
>
>>Starting somewhere around v4.0-rc1 and persisting through commit
>>v4.0-rc7, there is a new NULL deference apparently happening in
>>conjunction with xfs. This inhibits this machine's booting,
>>as xfs is used for the root filesystem.
>>
>>First bisection points at first-bad commit v4.0-rc1~8, and since that is
>>a merge commit, I'll be investigating some more hand-chosen commits (and
>>then people to Cc) as we speak.
>
> I reran bisect just to be sure.
> It now shows v4.0-rc1~9 is bad, v4.0-rc1~9^1 is ok, and v4.0-rc~9^2 is
> ok too. So this means that the combination of the both ~9 childs work
> badly together.

Ok, that's just _odd_.

That v4.0-rc1~9 is just the pm+acpi merge, and has absolutely nothing
to do with XFS or the block code. In fact, looking at the diff from
it's direct parent, it doesn't even really change any relevant code.

So I get the feeling that the oops you are seeing is likely not
consistent, and may depend on allocation patterns or similar. Because
the bisect doesn't make any sense at all.

It looks much more like a pure block-mq bug, but one that needs some
very special condition to trigger.

Jens, does this look familiar or trigger any ideas:

    BUG: unable to handle kernel paging request at 0000000000001000
    IP: [<ffffffff812718d0>] scsi_init_cmd_errh+0x26/0x5d

(The whole oops is on lkml).

Jan, can you reproduce the oops with frame pointers so that we get a
better call trace? Although it looks fairly normal: the trapping code
is

    rep stos %eax,%es:*(%rdi)

and %rdi is 0x1000. It seems to be simply

         memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);

where 'cmd->sense_buffer' has some insane value ("PAGE_SIZE" or just a
flipped bit, or whatever)

Jens?

                              Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 17:38   ` Linus Torvalds
@ 2015-04-09 18:24     ` Jan Engelhardt
  2015-04-09 21:12       ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-09 18:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List


On Thursday 2015-04-09 19:38, Linus Torvalds wrote:
>>
>> I reran bisect just to be sure.
>> It now shows v4.0-rc1~9 is bad, v4.0-rc1~9^1 is ok, and v4.0-rc~9^2 is
>> ok too. So this means that the combination of the both ~9 childs work
>> badly together.
>
>Ok, that's just _odd_.
>[...]
>So I get the feeling that the oops you are seeing is likely not
>consistent, and may depend on allocation patterns or similar.

It's fairly consistent (reproducible?). Only 1 in 15 or so (have not kept track
really) attempts does it not die.

With frame pointers:
BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff812853c9>] scsi_init_cmd_errh+0x2a/0x62
PGD 0 
Oops: 0002 [#1] SMP 
Modules linked in: xfs crc32c_generic libcrc32c dm_crypt xts gf128mul algif_skcipher af_alg sd_mod mptsas scsi_transport_sas mptscsih mptbase dm_mod sg ipv6
CPU: 0 PID: 403 Comm: kworker/u2:1 Not tainted 4.0.0-rc7+ #55
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: ffff88007b686f60 ti: ffff88007bcb4000 task.ti: ffff88007bcb4000
RIP: 0010:[<ffffffff812853c9>]  [<ffffffff812853c9>] scsi_init_cmd_errh+0x2a/0x62
RSP: 0018:ffff88007bcb77a8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88007bf8d800 RCX: 0000000000000018
RDX: ffff88007bf7ab70 RSI: 0000000000000000 RDI: 0000000000001000
RBP: ffff88007bcb77a8 R08: ffff88007beb9c40 R09: 0000000000000000
R10: 0000000000000000 R11: ffffea0001fe17c0 R12: ffff88007bf7ab70
R13: 0000000000000000 R14: ffff88007bf8d800 R15: ffff88007bf7aa00
FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000001000 CR3: 000000007cb0d000 CR4: 00000000000007f0
Stack:
 ffff88007bcb7818 ffffffff81286d59 ffff88007b686f60 ffff88007bc24000
 ffff88007bf7ab78 ffff88007bf8d968 ffff88007be56c00 ffff88007bc24000
 ffff88007cbfb400 ffff88007bcb7850 ffff88007be56c08 ffff88007bf7aa00
Call Trace:
 [<ffffffff81286d59>] scsi_queue_rq+0x2e8/0x3d2
 [<ffffffff8119e64d>] __blk_mq_run_hw_queue+0x19b/0x2a2
 [<ffffffff8119e901>] ? blk_mq_merge_queue_io+0x75/0x147
 [<ffffffffa00fa34a>] ? __xfs_get_blocks+0x2f9/0x2f9 [xfs]
 [<ffffffff8119edeb>] blk_mq_run_hw_queue+0x4f/0x99
 [<ffffffff8119fab9>] blk_sq_make_request+0x163/0x170
 [<ffffffff81196a8b>] generic_make_request+0x97/0xd6
 [<ffffffff81196bd7>] submit_bio+0x10d/0x12c
 [<ffffffff810d5e15>] ? __lru_cache_add+0x1e/0x3f
 [<ffffffff81142af5>] mpage_bio_submit+0x25/0x2c
 [<ffffffff8114387d>] mpage_readpages+0xf8/0x10c
 [<ffffffffa00fa34a>] ? __xfs_get_blocks+0x2f9/0x2f9 [xfs]
 [<ffffffffa00f9c45>] xfs_vm_readpages+0x18/0x1a [xfs]
 [<ffffffff810d4e5c>] __do_page_cache_readahead+0x137/0x1d3
 [<ffffffff810d5102>] ondemand_readahead+0x20a/0x21b
 [<ffffffff810d5262>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff810cd1c5>] generic_file_read_iter+0x191/0x4fb
 [<ffffffffa010b2cf>] ? xfs_ilock+0x32/0x5d [xfs]
 [<ffffffffa01023c2>] xfs_file_read_iter+0x1c2/0x213 [xfs]
 [<ffffffff81118e63>] new_sync_read+0x74/0x98
 [<ffffffff81119aef>] __vfs_read+0x14/0x3b
 [<ffffffff81119b8a>] vfs_read+0x74/0xc1
 [<ffffffff8111d977>] kernel_read+0x3c/0x4a
 [<ffffffff8111dbd2>] prepare_binprm+0x117/0x11f
 [<ffffffff8111f10d>] do_execveat_common.isra.31+0x3b2/0x5d8
 [<ffffffff8111f35a>] do_execve+0x27/0x29
 [<ffffffff81050e07>] ____call_usermodehelper+0x10a/0x138
 [<ffffffff81050cfd>] ? call_usermodehelper+0x49/0x49
 [<ffffffff8133b3d8>] ret_from_fork+0x58/0x90
 [<ffffffff81050cfd>] ? call_usermodehelper+0x49/0x49
Code: c3 55 48 89 fa 48 c7 87 b0 00 00 00 00 00 00 00 c7 87 f4 00 00 00 00 00 00 00 48 8b bf 10 01 00 00 31 c0 b9 18 00 00 00 48 89 e5 <f3> ab 66 83 ba cc 00 00 00 00 75 2a 48 8b 8a d8 00 00 00 8a 01 
RIP  [<ffffffff812853c9>] scsi_init_cmd_errh+0x2a/0x62
 RSP <ffff88007bcb77a8>
CR2: 0000000000001000
---[ end trace fbec0fe487830b1d ]---



>and %rdi is 0x1000. It seems to be simply
>
>         memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
>
>where 'cmd->sense_buffer' has some insane value ("PAGE_SIZE" or just a
>flipped bit, or whatever)

Having been observed on two isolated different systems, I don't
think so much that it would be a broken HW-induced bitflip.

Oh yeah, if anybody likes, I can hand out the virtualbox image.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 18:24     ` NULL deref around blkmq " Jan Engelhardt
@ 2015-04-09 21:12       ` Linus Torvalds
  2015-04-09 21:25         ` Jens Axboe
  2015-04-09 22:29         ` Jan Engelhardt
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2015-04-09 21:12 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1871 bytes --]

On Thu, Apr 9, 2015 at 11:24 AM, Jan Engelhardt <jengelh@inai.de> wrote:
>
> It's fairly consistent (reproducible?). Only 1 in 15 or so (have not kept track
> really) attempts does it not die.
>
> With frame pointers:
>  [<ffffffff81286d59>] scsi_queue_rq+0x2e8/0x3d2
>  [<ffffffff8119e64d>] __blk_mq_run_hw_queue+0x19b/0x2a2
>  [<ffffffff8119e901>] ? blk_mq_merge_queue_io+0x75/0x147
>  [<ffffffffa00fa34a>] ? __xfs_get_blocks+0x2f9/0x2f9 [xfs]
>  [<ffffffff8119edeb>] blk_mq_run_hw_queue+0x4f/0x99
>  [<ffffffff8119fab9>] blk_sq_make_request+0x163/0x170

Ok, good.

So the cmd comes from

        struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);

which in turn is just

        return (void *) rq + sizeof(*rq);

which in turn is written by some crazy monkey on crack. That's some
shit code. Why the hell you'd write it that way, when the natural
thing to do would be just

        return rq + 1;

without the sizeof, and without the cast.

The particular crazy monkey on crack is Jens Axboe, in commit 320ae51feed5c.

Jens, really. This code is shit.

That ->sense_buffer thing is supposed to be initialized by the
blk_mq_ops.init_request() function, which is called - if it exists =
when the array of requests ('->rqs[]') is initialized.

And that code too looks like crap. It seems to be very clever, trying
to allocaet big contiguous chunks of RAM for the requests, but then
the initialization sequence is questionable as hell. It takes that
nonzeroed allocation, and zeroes a few fields randomly. The rest will
contain whatever garbage data they used to.

Does this entirely untested patch make any difference?

And Jens, this all really looks very fishy. When I look at these kinds
of core functions, and find just *stupid* code like this, it makes me
unhappy.

Anyway, I assume that the actual "disk" in question is mpt fusion?

                             Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 870 bytes --]

 block/blk-mq.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b7b8933ec241..33c428530193 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1457,7 +1457,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 
 		do {
 			page = alloc_pages_node(set->numa_node,
-				GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
+				GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
 				this_order);
 			if (page)
 				break;
@@ -1479,8 +1479,6 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
 		left -= to_do * rq_size;
 		for (j = 0; j < to_do; j++) {
 			tags->rqs[i] = p;
-			tags->rqs[i]->atomic_flags = 0;
-			tags->rqs[i]->cmd_flags = 0;
 			if (set->ops->init_request) {
 				if (set->ops->init_request(set->driver_data,
 						tags->rqs[i], hctx_idx, i,

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 21:12       ` Linus Torvalds
@ 2015-04-09 21:25         ` Jens Axboe
  2015-04-09 21:37           ` Linus Torvalds
  2015-04-09 22:29         ` Jan Engelhardt
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-04-09 21:25 UTC (permalink / raw)
  To: Linus Torvalds, Jan Engelhardt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List

On 04/09/2015 03:12 PM, Linus Torvalds wrote:
> On Thu, Apr 9, 2015 at 11:24 AM, Jan Engelhardt <jengelh@inai.de> wrote:
>>
>> It's fairly consistent (reproducible?). Only 1 in 15 or so (have not kept track
>> really) attempts does it not die.
>>
>> With frame pointers:
>>   [<ffffffff81286d59>] scsi_queue_rq+0x2e8/0x3d2
>>   [<ffffffff8119e64d>] __blk_mq_run_hw_queue+0x19b/0x2a2
>>   [<ffffffff8119e901>] ? blk_mq_merge_queue_io+0x75/0x147
>>   [<ffffffffa00fa34a>] ? __xfs_get_blocks+0x2f9/0x2f9 [xfs]
>>   [<ffffffff8119edeb>] blk_mq_run_hw_queue+0x4f/0x99
>>   [<ffffffff8119fab9>] blk_sq_make_request+0x163/0x170
>
> Ok, good.
>
> So the cmd comes from
>
>          struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
>
> which in turn is just
>
>          return (void *) rq + sizeof(*rq);
>
> which in turn is written by some crazy monkey on crack. That's some
> shit code. Why the hell you'd write it that way, when the natural
> thing to do would be just
>
>          return rq + 1;
>
> without the sizeof, and without the cast.
>
> The particular crazy monkey on crack is Jens Axboe, in commit 320ae51feed5c.
>
> Jens, really. This code is shit.

That's a bit rough on a single line of code like that, don't you think? 
But yes, rq + 1 is identical and cleaner...

> That ->sense_buffer thing is supposed to be initialized by the
> blk_mq_ops.init_request() function, which is called - if it exists =
> when the array of requests ('->rqs[]') is initialized.
>
> And that code too looks like crap. It seems to be very clever, trying
> to allocaet big contiguous chunks of RAM for the requests, but then
> the initialization sequence is questionable as hell. It takes that
> nonzeroed allocation, and zeroes a few fields randomly. The rest will
> contain whatever garbage data they used to.
>
> Does this entirely untested patch make any difference?
>
> And Jens, this all really looks very fishy. When I look at these kinds
> of core functions, and find just *stupid* code like this, it makes me
> unhappy.

Not sure why it isn't all zeroed, definitely the saner thing to do at 
init time. So patch looks fine, should be applied regardless of whether 
or not it fixes this issue.

But it's _always_ been like that, so it's not a change in behavior. It 
is fragile, so perhaps some SCSI change modified alloc behavior and it's 
not causing and issue.

And if this is mpt, we recently ran into some list corruption issues due 
to a bug in the driver. It hit on reboot, but it was scan related, so 
could be a boot issue as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 21:25         ` Jens Axboe
@ 2015-04-09 21:37           ` Linus Torvalds
  2015-04-09 21:42             ` Jens Axboe
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2015-04-09 21:37 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jan Engelhardt, Rafael J. Wysocki, Linux Kernel Mailing List

On Thu, Apr 9, 2015 at 2:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
>
> Not sure why it isn't all zeroed, definitely the saner thing to do at init
> time.

So practically speaking, it might well often be zeroed just because
the BIOS may have initialized memory that way (and big multi-page
allocations have probably not gotten re-used).

> And if this is mpt, we recently ran into some list corruption issues due to
> a bug in the driver. It hit on reboot, but it was scan related, so could be
> a boot issue as well.

So one of the earlier emails had this:

   Copyright (c) 1999-2008 LSI Corporation
   Fusion MPT SAS Host driver 3.04.20
   mptbase: ioc0: Initiating bringup
   ioc0: LSISAS1068 A0: Capabilities={Initiator}
   scsi host0: ioc0: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=22
   mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1,
sas_addr 0x1060504030201a0
   scsi 0:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
   scsi 0:0:0:0: Attached scsi generic sg0 type 0
   mptbase: ioc1: Initiating bringup
   ioc1: LSISAS1068 A0: Capabilities={Initiator}
   scsi host1: ioc1: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=17
   mptsas: ioc1: attaching ssp device: fw_channel 0, fw_id 0, phy 0,
sas_addr 0x60504030201a0
   scsi 1:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
   scsi 1:0:0:0: Attached scsi generic sg1 type 0

and I'm assuming that that is the backing storage.

And yes, memory corruption sounds like a more likely cause than
anything else. I don't like how the request data wasn't fully
initialized, but the cmd->sense_buffer pointer itself *should* have
been initialized by the ->init_request() call.

So I don't actually expect my patch to really make any difference,
although I do think that code should be looked at.

                            Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 21:37           ` Linus Torvalds
@ 2015-04-09 21:42             ` Jens Axboe
  2015-04-09 22:32               ` Jan Engelhardt
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-04-09 21:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Rafael J. Wysocki, Linux Kernel Mailing List

On 04/09/2015 03:37 PM, Linus Torvalds wrote:
> On Thu, Apr 9, 2015 at 2:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Not sure why it isn't all zeroed, definitely the saner thing to do at init
>> time.
>
> So practically speaking, it might well often be zeroed just because
> the BIOS may have initialized memory that way (and big multi-page
> allocations have probably not gotten re-used).
>
>> And if this is mpt, we recently ran into some list corruption issues due to
>> a bug in the driver. It hit on reboot, but it was scan related, so could be
>> a boot issue as well.
>
> So one of the earlier emails had this:
>
>     Copyright (c) 1999-2008 LSI Corporation
>     Fusion MPT SAS Host driver 3.04.20
>     mptbase: ioc0: Initiating bringup
>     ioc0: LSISAS1068 A0: Capabilities={Initiator}
>     scsi host0: ioc0: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=22
>     mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1,
> sas_addr 0x1060504030201a0
>     scsi 0:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
>     scsi 0:0:0:0: Attached scsi generic sg0 type 0
>     mptbase: ioc1: Initiating bringup
>     ioc1: LSISAS1068 A0: Capabilities={Initiator}
>     scsi host1: ioc1: LSISAS1068 A0, FwRev=00000000h, Ports=8, MaxQ=256, IRQ=17
>     mptsas: ioc1: attaching ssp device: fw_channel 0, fw_id 0, phy 0,
> sas_addr 0x60504030201a0
>     scsi 1:0:0:0: Direct-Access     VBOX     HARDDISK         1.0  PQ: 0 ANSI: 5
>     scsi 1:0:0:0: Attached scsi generic sg1 type 0
>
> and I'm assuming that that is the backing storage.

mpt is a maze of roughly duplicate, crazy drivers. The bug in question 
impacted mpt2sas and mpt3sas, and this looks like the mpt fusion driver. 
So it's probably not that.

> And yes, memory corruption sounds like a more likely cause than
> anything else. I don't like how the request data wasn't fully
> initialized, but the cmd->sense_buffer pointer itself *should* have
> been initialized by the ->init_request() call.

The block request state should be sane, we clear what we need, and 
at-alloc init will ensure that state is safe across request reuse. But 
it really should just be cleared unconditionally at allocation time.

And ->init_request() should take care of the SCSI command init. It does 
look like it's relying on zeroes already, so either adding the memset() 
or just adding the __GFP_ZERO would be prudent.

> So I don't actually expect my patch to really make any difference,
> although I do think that code should be looked at.

Jan, is it always clearing in a page size? That seems odd, especially if 
we're considering random gunk in memory.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 21:12       ` Linus Torvalds
  2015-04-09 21:25         ` Jens Axboe
@ 2015-04-09 22:29         ` Jan Engelhardt
  2015-04-10  0:07           ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-09 22:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List


On Thursday 2015-04-09 23:12, Linus Torvalds wrote:
>That ->sense_buffer thing is supposed to be initialized by the
>blk_mq_ops.init_request() function, which is called - if it exists =
>when the array of requests ('->rqs[]') is initialized.
>[...]
>Does this entirely untested patch make any difference?

Yes, this seems to solve it for me.


>Anyway, I assume that the actual "disk" in question is mpt fusion?

I had two setups, one non-virtual AHCI, and one emulated mpt,
both of which had failed and are now fixed for what I can tell.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 21:42             ` Jens Axboe
@ 2015-04-09 22:32               ` Jan Engelhardt
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-09 22:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List


On Thursday 2015-04-09 23:42, Jens Axboe wrote:
>
>> So I don't actually expect my patch to really make any difference,
>> although I do think that code should be looked at.
>
> Jan, is it always clearing in a page size? That seems odd, especially if we're
> considering random gunk in memory.

The particular memset in scsi_init_cmd_errh would only clear
SCSI_SENSE_BUFFERSIZE. The problem really was that cmd->sense_buffer
was nonsensical (I did a hexdump) and read 0x1000. In one instance,
it even read 0x10000 because I used that value in an if clause
preceding a printk statement I added. Together with __GFP_ZERO
apparently fixing the issue for today, the noninitialized memory
theory looks like the most applicable one to me currently.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-09 22:29         ` Jan Engelhardt
@ 2015-04-10  0:07           ` Linus Torvalds
  2015-04-11 17:46             ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2015-04-10  0:07 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List

On Thu, Apr 9, 2015 at 3:29 PM, Jan Engelhardt <jengelh@inai.de> wrote:
>
> On Thursday 2015-04-09 23:12, Linus Torvalds wrote:
>>That ->sense_buffer thing is supposed to be initialized by the
>>blk_mq_ops.init_request() function, which is called - if it exists =
>>when the array of requests ('->rqs[]') is initialized.
>>[...]
>>Does this entirely untested patch make any difference?
>
> Yes, this seems to solve it for me.

Can you humor me, and try that patch on top of a few different kernel
versions that you had trouble with.

The fact that your bisection ended on a completely unrelated commit
makes me think that some particular memory aligment (code or initial
data or whatever) ends up showing or hiding the bug, so I'm a bit
worried that my ->rqs[] initialization patch also happened to just
hide the problem rather than really fix it.

That said, at least it's a lot more likely to be relevant than the
ACPI merge commit you bisected to, so while I'm somewhat cautious
about that patch (not that it would be wrong, but that it would matter
that much) I at least can see it make a real difference.

                        Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-10  0:07           ` Linus Torvalds
@ 2015-04-11 17:46             ` Linus Torvalds
  2015-04-11 17:47               ` Jens Axboe
  2015-04-11 19:48               ` Jan Engelhardt
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2015-04-11 17:46 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List

On Thu, Apr 9, 2015 at 5:07 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Apr 9, 2015 at 3:29 PM, Jan Engelhardt <jengelh@inai.de> wrote:
>>
>> Yes, this seems to solve it for me.
>
> Can you humor me, and try that patch on top of a few different kernel
> versions that you had trouble with.

Jan? I'm inclined to commit the patch for 4.0 even though it is late
(it's not like it can hurt), but since you seem to be the only one
seeing this issue, I'd really like you to double-check it..

                          Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-11 17:46             ` Linus Torvalds
@ 2015-04-11 17:47               ` Jens Axboe
  2015-04-11 19:48               ` Jan Engelhardt
  1 sibling, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2015-04-11 17:47 UTC (permalink / raw)
  To: Linus Torvalds, Jan Engelhardt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List

On 04/11/2015 11:46 AM, Linus Torvalds wrote:
> On Thu, Apr 9, 2015 at 5:07 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Thu, Apr 9, 2015 at 3:29 PM, Jan Engelhardt <jengelh@inai.de> wrote:
>>>
>>> Yes, this seems to solve it for me.
>>
>> Can you humor me, and try that patch on top of a few different kernel
>> versions that you had trouble with.
>
> Jan? I'm inclined to commit the patch for 4.0 even though it is late
> (it's not like it can hurt), but since you seem to be the only one
> seeing this issue, I'd really like you to double-check it..

I already committed for 4.1 (marked stable), but I'd be fine with adding 
it to this series as well obviously.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: NULL deref around blkmq in v4.0-rc1–rc7
  2015-04-11 17:46             ` Linus Torvalds
  2015-04-11 17:47               ` Jens Axboe
@ 2015-04-11 19:48               ` Jan Engelhardt
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Engelhardt @ 2015-04-11 19:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rafael J. Wysocki, Jens Axboe, Linux Kernel Mailing List

On Saturday 2015-04-11 19:46, Linus Torvalds wrote:

>On Thu, Apr 9, 2015 at 5:07 PM, Linus Torvalds
><torvalds@linux-foundation.org> wrote:
>> On Thu, Apr 9, 2015 at 3:29 PM, Jan Engelhardt <jengelh@inai.de> wrote:
>>>
>>> Yes, this seems to solve it for me.
>>
>> Can you humor me, and try that patch on top of a few different kernel
>> versions that you had trouble with.
>
>Jan? I'm inclined to commit the patch for 4.0 even though it is late
>(it's not like it can hurt), but since you seem to be the only one
>seeing this issue, I'd really like you to double-check it..

Yeah, just commit. I'm currently "stuck" in the weekend.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-04-11 19:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-08 13:41 NULL deref around xfs in v4.0-rc1–rc7 Jan Engelhardt
2015-04-08 15:20 ` Jan Engelhardt
2015-04-09 17:38   ` Linus Torvalds
2015-04-09 18:24     ` NULL deref around blkmq " Jan Engelhardt
2015-04-09 21:12       ` Linus Torvalds
2015-04-09 21:25         ` Jens Axboe
2015-04-09 21:37           ` Linus Torvalds
2015-04-09 21:42             ` Jens Axboe
2015-04-09 22:32               ` Jan Engelhardt
2015-04-09 22:29         ` Jan Engelhardt
2015-04-10  0:07           ` Linus Torvalds
2015-04-11 17:46             ` Linus Torvalds
2015-04-11 17:47               ` Jens Axboe
2015-04-11 19:48               ` Jan Engelhardt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.