* rb_erase oops.
From: Dave Jones @ 2014-06-09 22:30 UTC
  To: Linux Kernel; +Cc: linux-mm, Linus Torvalds

Kernel based on v3.15-7257-g963649d735c8

	Dave

Oops: 0000 [#1] PREEMPT SMP 
Modules linked in: dlci 8021q garp snd_seq_dummy bnep llc2 af_key bridge stp fuse tun scsi_transport_iscsi ipt_ULOG nfnetlink rfcomm can_raw hidp can_bcm nfc caif_socket caif af_802154 ieee802154 phonet af_rxrpc can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 coretemp cfg80211 hwmon x86_pkg_temp_thermal kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb bluetooth snd_hda_intel xfs snd_hda_controller snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_pcm snd_timer snd 6lowpan_iphc usb_debug rfkill libcrc32c ptp pps_core microcode shpchp pcspkr serio_raw soundcore
CPU: 3 PID: 2049 Comm: kworker/3:1 Not tainted 3.15.0+ #231
Workqueue: events free_work
task: ffff880100944260 ti: ffff88009ed9c000 task.ti: ffff88009ed9c000
RIP: 0010:[<ffffffff8a326619>]  [<ffffffff8a326619>] rb_erase+0xb9/0x380
RSP: 0000:ffff88009ed9fcc0  EFLAGS: 00010202
RAX: ffff8802396b0018 RBX: ffff88024176b008 RCX: 0000000000000000
RDX: ffffc90010fe1bf0 RSI: ffffffff8afb3178 RDI: 0000000000000001
RBP: ffff88009ed9fcc0 R08: ffff88023b122e58 R09: ffff88024176ae58
R10: 0000000000000000 R11: ffff880245801dc0 R12: ffff88024176b020
R13: ffff88009ed9fd80 R14: ffff88009ed9fd88 R15: ffff88024e397100
FS:  0000000000000000(0000) GS:ffff88024e380000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000001 CR3: 000000000ac10000 CR4: 00000000001407e0
DR0: 00000000024cc000 DR1: 00000000024c2000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88009ed9fce0 ffffffff8a195674 ffff88024176ae40 ffffffff8ac64260
 ffff88009ed9fd70 ffffffff8a19631d ffff88009ed9fd88 ffff88009ed9fd80
 ffff88009ed9fd10 ffff880244b93750 0000000000000000 ffff880244b93750
Call Trace:
 [<ffffffff8a195674>] __free_vmap_area+0x54/0xf0
 [<ffffffff8a19631d>] __purge_vmap_area_lazy+0x15d/0x4a0
 [<ffffffff8a1966e1>] free_vmap_area_noflush+0x81/0x90
 [<ffffffff8a197dae>] remove_vm_area+0x5e/0x70
 [<ffffffff8a197dee>] __vunmap+0x2e/0x100
 [<ffffffff8a197eed>] free_work+0x2d/0x40
 [<ffffffff8a08df11>] process_one_work+0x171/0x4d0
 [<ffffffff8a08eeac>] worker_thread+0x12c/0x3d0
 [<ffffffff8a0bc4b2>] ? complete+0x42/0x50
 [<ffffffff8a08ed80>] ? manage_workers.isra.25+0x2d0/0x2d0
 [<ffffffff8a095b3f>] kthread+0xff/0x120
 [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
 [<ffffffff8a741eec>] ret_from_fork+0x7c/0xb0
 [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
Code: 85 d2 74 0e 48 83 c8 01 48 89 0a 49 89 02 5d c3 66 90 48 8b 3a 48 89 0a 83 e7 01 74 f1 31 c9 eb 40 90 48 8b 7a 08 48 85 ff 74 09 <f6> 07 01 0f 84 a3 01 00 00 48 8b 4a 10 48 85 c9 74 09 f6 01 01 
RIP  [<ffffffff8a326619>] rb_erase+0xb9/0x380
 RSP <ffff88009ed9fcc0>


* Re: rb_erase oops.
From: David Rientjes @ 2014-06-09 23:09 UTC
  To: Dave Jones, Joonsoo Kim; +Cc: Linus Torvalds, linux-kernel, linux-mm

On Mon, 9 Jun 2014, Dave Jones wrote:

> Kernel based on v3.15-7257-g963649d735c8
> 
> 	Dave
> 
> Oops: 0000 [#1] PREEMPT SMP 
> Modules linked in: dlci 8021q garp snd_seq_dummy bnep llc2 af_key bridge stp fuse tun scsi_transport_iscsi ipt_ULOG nfnetlink rfcomm can_raw hidp can_bcm nfc caif_socket caif af_802154 ieee802154 phonet af_rxrpc can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 coretemp cfg80211 hwmon x86_pkg_temp_thermal kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb bluetooth snd_hda_intel xfs snd_hda_controller snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_pcm snd_timer snd 6lowpan_iphc usb_debug rfkill libcrc32c ptp pps_core microcode shpchp pcspkr serio_raw soundcore
> CPU: 3 PID: 2049 Comm: kworker/3:1 Not tainted 3.15.0+ #231
> Workqueue: events free_work
> task: ffff880100944260 ti: ffff88009ed9c000 task.ti: ffff88009ed9c000
> RIP: 0010:[<ffffffff8a326619>]  [<ffffffff8a326619>] rb_erase+0xb9/0x380
> RSP: 0000:ffff88009ed9fcc0  EFLAGS: 00010202
> RAX: ffff8802396b0018 RBX: ffff88024176b008 RCX: 0000000000000000
> RDX: ffffc90010fe1bf0 RSI: ffffffff8afb3178 RDI: 0000000000000001
> RBP: ffff88009ed9fcc0 R08: ffff88023b122e58 R09: ffff88024176ae58
> R10: 0000000000000000 R11: ffff880245801dc0 R12: ffff88024176b020
> R13: ffff88009ed9fd80 R14: ffff88009ed9fd88 R15: ffff88024e397100
> FS:  0000000000000000(0000) GS:ffff88024e380000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000001 CR3: 000000000ac10000 CR4: 00000000001407e0
> DR0: 00000000024cc000 DR1: 00000000024c2000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
>  ffff88009ed9fce0 ffffffff8a195674 ffff88024176ae40 ffffffff8ac64260
>  ffff88009ed9fd70 ffffffff8a19631d ffff88009ed9fd88 ffff88009ed9fd80
>  ffff88009ed9fd10 ffff880244b93750 0000000000000000 ffff880244b93750
> Call Trace:
>  [<ffffffff8a195674>] __free_vmap_area+0x54/0xf0
>  [<ffffffff8a19631d>] __purge_vmap_area_lazy+0x15d/0x4a0
>  [<ffffffff8a1966e1>] free_vmap_area_noflush+0x81/0x90
>  [<ffffffff8a197dae>] remove_vm_area+0x5e/0x70
>  [<ffffffff8a197dee>] __vunmap+0x2e/0x100
>  [<ffffffff8a197eed>] free_work+0x2d/0x40
>  [<ffffffff8a08df11>] process_one_work+0x171/0x4d0
>  [<ffffffff8a08eeac>] worker_thread+0x12c/0x3d0
>  [<ffffffff8a0bc4b2>] ? complete+0x42/0x50
>  [<ffffffff8a08ed80>] ? manage_workers.isra.25+0x2d0/0x2d0
>  [<ffffffff8a095b3f>] kthread+0xff/0x120
>  [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
>  [<ffffffff8a741eec>] ret_from_fork+0x7c/0xb0
>  [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
> Code: 85 d2 74 0e 48 83 c8 01 48 89 0a 49 89 02 5d c3 66 90 48 8b 3a 48 89 0a 83 e7 01 74 f1 31 c9 eb 40 90 48 8b 7a 08 48 85 ff 74 09 <f6> 07 01 0f 84 a3 01 00 00 48 8b 4a 10 48 85 c9 74 09 f6 01 01 
> RIP  [<ffffffff8a326619>] rb_erase+0xb9/0x380
>  RSP <ffff88009ed9fcc0>
> 

Adding Joonsoo to the cc.

I haven't looked very closely, and it may be unrelated, but perhaps this 
is a race caused by a failed radix_tree_preload() in new_vmap_block() 
under low-memory conditions. It would be tough to reproduce, because it 
depends on a race between the RCU-protected vmap_area_list iteration in 
__purge_vmap_area_lazy() and the actual freeing of the vmap_area under 
vmap_area_lock.

* Re: rb_erase oops.
From: Linus Torvalds @ 2014-06-09 23:30 UTC
  To: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds, Joonsoo Kim

On Mon, Jun 9, 2014 at 3:30 PM, Dave Jones <davej@redhat.com> wrote:
>
> Oops: 0000 [#1] PREEMPT SMP

Dave, for some reason your oops is missing the first line. There
should have been something like

 "Unable to handle kernel NULL pointer access at 00000001"

or something.

Anyway, the code decodes to

  22: 48 8b 7a 08           mov    0x8(%rdx),%rdi
  26: 48 85 ff              test   %rdi,%rdi
  29: 74 09                 je     0x34
  2b:* f6 07 01             testb  $0x1,(%rdi) <-- trapping instruction

and %rdi has the value "1", which isn't NULL (so the test against
zero doesn't trigger), but obviously traps when dereferenced.

That *looks* like the test for

                        if (!tmp1 || rb_is_black(tmp1)) {

(the "testb $0x1,(%rdi)" is actually testing the low bit of the
"entry->__rb_parent_color" word, and there's a subsequent "testb"
too, so it looks like one of the two cases where we look at the two
children of the sibling and see if they are both NULL or black)

> Workqueue: events free_work

That's the "vfree was done in irq context, so we delay it into a
workqueue" worker.

> RIP: rb_erase+0xb9/0x380
> RAX: ffff8802396b0018 RBX: ffff88024176b008 RCX: 0000000000000000
> RDX: ffffc90010fe1bf0 RSI: ffffffff8afb3178 RDI: 0000000000000001
> RBP: ffff88009ed9fcc0 R08: ffff88023b122e58 R09: ffff88024176ae58
> R10: 0000000000000000 R11: ffff880245801dc0 R12: ffff88024176b020
> R13: ffff88009ed9fd80 R14: ffff88009ed9fd88 R15: ffff88024e397100
> FS:  0000000000000000(0000) GS:ffff88024e380000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000001

.. and obviously the faulting address in CR2 is that invalid pointer "1".

The rbtree pointers have the two low bits reserved for color (only one
bit used - the low bit says whether it's red or black), so I'm
wondering if this is a "black NULL" pointer that hit some code
sequence that didn't properly mask off the color. But that really
shouldn't exist, afaik.

Perhaps just memory corruption (i.e. a NULL having been randomly
incremented to "1").

          Linus

* Re: rb_erase oops.
From: Dave Jones @ 2014-06-09 23:34 UTC
  To: Linus Torvalds; +Cc: Linux Kernel, linux-mm, Joonsoo Kim

On Mon, Jun 09, 2014 at 04:30:21PM -0700, Linus Torvalds wrote:
 > On Mon, Jun 9, 2014 at 3:30 PM, Dave Jones <davej@redhat.com> wrote:
 > >
 > > Oops: 0000 [#1] PREEMPT SMP
 > 
 > Dave, for some reason your oops is missing the first line. There
 > should have been something like
 > 
 >  "Unable to handle kernel NULL pointer access at 00000001"
 > 
 > or something.

For some reason, that line never made it over usb-serial.

[56274.041989] trinity-c22 (7025) used greatest stack depth: 9440 bytes left
[77373.915561] Oops: 0000 [#1] PREEMPT SMP

Weird.

	Dave


* Re: rb_erase oops.
From: Joonsoo Kim @ 2014-06-10  3:23 UTC
  To: David Rientjes; +Cc: Dave Jones, Linus Torvalds, linux-kernel, linux-mm

On Mon, Jun 09, 2014 at 04:09:38PM -0700, David Rientjes wrote:
> On Mon, 9 Jun 2014, Dave Jones wrote:
> 
> > Kernel based on v3.15-7257-g963649d735c8
> > 
> > 	Dave
> > 
> > Oops: 0000 [#1] PREEMPT SMP 
> > Modules linked in: dlci 8021q garp snd_seq_dummy bnep llc2 af_key bridge stp fuse tun scsi_transport_iscsi ipt_ULOG nfnetlink rfcomm can_raw hidp can_bcm nfc caif_socket caif af_802154 ieee802154 phonet af_rxrpc can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 coretemp cfg80211 hwmon x86_pkg_temp_thermal kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb bluetooth snd_hda_intel xfs snd_hda_controller snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device crct10dif_pclmul crc32c_intel ghash_clmulni_intel snd_pcm snd_timer snd 6lowpan_iphc usb_debug rfkill libcrc32c ptp pps_core microcode shpchp pcspkr serio_raw soundcore
> > CPU: 3 PID: 2049 Comm: kworker/3:1 Not tainted 3.15.0+ #231
> > Workqueue: events free_work
> > task: ffff880100944260 ti: ffff88009ed9c000 task.ti: ffff88009ed9c000
> > RIP: 0010:[<ffffffff8a326619>]  [<ffffffff8a326619>] rb_erase+0xb9/0x380
> > RSP: 0000:ffff88009ed9fcc0  EFLAGS: 00010202
> > RAX: ffff8802396b0018 RBX: ffff88024176b008 RCX: 0000000000000000
> > RDX: ffffc90010fe1bf0 RSI: ffffffff8afb3178 RDI: 0000000000000001
> > RBP: ffff88009ed9fcc0 R08: ffff88023b122e58 R09: ffff88024176ae58
> > R10: 0000000000000000 R11: ffff880245801dc0 R12: ffff88024176b020
> > R13: ffff88009ed9fd80 R14: ffff88009ed9fd88 R15: ffff88024e397100
> > FS:  0000000000000000(0000) GS:ffff88024e380000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000001 CR3: 000000000ac10000 CR4: 00000000001407e0
> > DR0: 00000000024cc000 DR1: 00000000024c2000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> > Stack:
> >  ffff88009ed9fce0 ffffffff8a195674 ffff88024176ae40 ffffffff8ac64260
> >  ffff88009ed9fd70 ffffffff8a19631d ffff88009ed9fd88 ffff88009ed9fd80
> >  ffff88009ed9fd10 ffff880244b93750 0000000000000000 ffff880244b93750
> > Call Trace:
> >  [<ffffffff8a195674>] __free_vmap_area+0x54/0xf0
> >  [<ffffffff8a19631d>] __purge_vmap_area_lazy+0x15d/0x4a0
> >  [<ffffffff8a1966e1>] free_vmap_area_noflush+0x81/0x90
> >  [<ffffffff8a197dae>] remove_vm_area+0x5e/0x70
> >  [<ffffffff8a197dee>] __vunmap+0x2e/0x100
> >  [<ffffffff8a197eed>] free_work+0x2d/0x40
> >  [<ffffffff8a08df11>] process_one_work+0x171/0x4d0
> >  [<ffffffff8a08eeac>] worker_thread+0x12c/0x3d0
> >  [<ffffffff8a0bc4b2>] ? complete+0x42/0x50
> >  [<ffffffff8a08ed80>] ? manage_workers.isra.25+0x2d0/0x2d0
> >  [<ffffffff8a095b3f>] kthread+0xff/0x120
> >  [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
> >  [<ffffffff8a741eec>] ret_from_fork+0x7c/0xb0
> >  [<ffffffff8a095a40>] ? kthread_create_on_node+0x1c0/0x1c0
> > Code: 85 d2 74 0e 48 83 c8 01 48 89 0a 49 89 02 5d c3 66 90 48 8b 3a 48 89 0a 83 e7 01 74 f1 31 c9 eb 40 90 48 8b 7a 08 48 85 ff 74 09 <f6> 07 01 0f 84 a3 01 00 00 48 8b 4a 10 48 85 c9 74 09 f6 01 01 
> > RIP  [<ffffffff8a326619>] rb_erase+0xb9/0x380
> >  RSP <ffff88009ed9fcc0>
> > 
> 
> Adding Joonsoo to the cc.
> 
> I haven't looked very closely, and it may be unrelated, but perhaps this 
> is a race caused by a failed radix_tree_preload() in new_vmap_block() 
> under low-memory conditions. It would be tough to reproduce, because it 
> depends on a race between the RCU-protected vmap_area_list iteration in 
> __purge_vmap_area_lazy() and the actual freeing of the vmap_area under 
> vmap_area_lock.

Hello, David.

I think that the race you mentioned is impossible. When
radix_tree_preload() fails in new_vmap_block(), the flags of that
vmap_area would still be 0, so __purge_vmap_area_lazy() would not try
to purge it.

Anyway, I currently have no idea what causes this problem.

Thanks.
