All of lore.kernel.org
 help / color / mirror / Atom feed
* [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null)
@ 2010-08-26  9:53 Carlos R. Mafra
  2010-08-26 11:53 ` Chris Mason
  2010-08-27  9:38 ` liubo
  0 siblings, 2 replies; 5+ messages in thread
From: Carlos R. Mafra @ 2010-08-26  9:53 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-kernel

Hi,

I've just got this BUG: message in dmesg which I think is btrfs related.

I have a btrfs filesystem in a memory card which I use to contain the 
cache and config of chromium (to avoid writing to much in the SSD):

/dev/mmcblk0p1 on /home/mafra/mmc type btrfs (rw,noexec,nosuid,nodev,noatime,noacl,compress)

and the last message before the bug is about the memory card.

[  148.149179] mmcblk0: retrying using single block read
[  148.152014] BUG: unable to handle kernel NULL pointer dereference at (null)
[  148.152021] IP: [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
[  148.152030] PGD 72c37067 PUD 7d64b067 PMD 0 
[  148.152035] Oops: 0000 [#1] SMP 
[  148.152038] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
[  148.152041] CPU 1 
[  148.152043] Modules linked in: snd_seq snd_seq_device snd_hda_codec_idt i2c_i801 snd_hda_intel snd_hda_codec sky2 uvcvideo iwlagn snd_hwdep evdev
[  148.152053] 
[  148.152057] Pid: 2731, comm: btrfs-endio-met Not tainted 2.6.35.3+ #12 VAIO/VGN-FZ240E
[  148.152059] RIP: 0010:[<ffffffff811b1301>]  [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
[  148.152064] RSP: 0018:ffff880079acddd0  EFLAGS: 00010246
[  148.152066] RAX: 0000000000000000 RBX: 0000000043eba000 RCX: 0000000000000000
[  148.152068] RDX: 0000000000000001 RSI: 0000000000043eba RDI: 0000000000000001
[  148.152071] RBP: ffff880079acddf0 R08: 0000000000000000 R09: 0000000000000000
[  148.152073] R10: 0000000000001000 R11: 0000000000000000 R12: ffff88007b40eb18
[  148.152076] R13: 0000000043ebadff R14: ffff880079ab0000 R15: ffff880079acde80
[  148.152079] FS:  0000000000000000(0000) GS:ffff880001b00000(0000) knlGS:0000000000000000
[  148.152081] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  148.152084] CR2: 0000000000000000 CR3: 0000000070be9000 CR4: 00000000000006e0
[  148.152086] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  148.152088] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  148.152091] Process btrfs-endio-met (pid: 2731, threadinfo ffff880079acc000, task ffff880079aca8c0)
[  148.152093] Stack:
[  148.152094]  ffffffff8104e730 ffff880069b9c8f8 ffff88007e87ca40 ffff880069b9c8c0
[  148.152098] <0> ffff880079acde20 ffffffff8118a42d ffff88007bbc70c0 ffff880069b9c8f8
[  148.152102] <0> ffff88007bbc70d8 ffff88007bbc7110 ffff880079acdee0 ffffffff811b98f0
[  148.152107] Call Trace:
[  148.152113]  [<ffffffff8104e730>] ? process_timeout+0x0/0x10
[  148.152118]  [<ffffffff8118a42d>] end_workqueue_fn+0x10d/0x130
[  148.152122]  [<ffffffff811b98f0>] worker_loop+0xb0/0x5a0
[  148.152126]  [<ffffffff811b9840>] ? worker_loop+0x0/0x5a0
[  148.152130]  [<ffffffff8105b60e>] kthread+0x8e/0xa0
[  148.152135]  [<ffffffff81003a54>] kernel_thread_helper+0x4/0x10
[  148.152138]  [<ffffffff8105b580>] ? kthread+0x0/0xa0
[  148.152142]  [<ffffffff81003a50>] ? kernel_thread_helper+0x0/0x10
[  148.152144] Code: d3 ff ff 89 c2 b8 01 00 00 00 85 d2 75 56 4c 39 eb 77 51 0f 1f 80 00 00 00 00 48 89 de 49 8b 7c 24 10 48 c1 ee 0c e8 3f 21 ef ff <f6> 00 08 74 2a 48 89 c7 48 81 c3 00 10 00 00 e8 4b af ef ff 49 
[  148.152176] RIP  [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
[  148.152180]  RSP <ffff880079acddd0>
[  148.152182] CR2: 0000000000000000
[  148.152185] ---[ end trace c11a5009b12451d7 ]---


Full dmesg is here

http://www.aei.mpg.de/~crmafra/dmesg-2.6.35.3.txt

Is there something else I should provide to help debug this?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null)
  2010-08-26  9:53 [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null) Carlos R. Mafra
@ 2010-08-26 11:53 ` Chris Mason
  2010-08-26 12:16   ` Carlos R. Mafra
  2010-08-27  9:38 ` liubo
  1 sibling, 1 reply; 5+ messages in thread
From: Chris Mason @ 2010-08-26 11:53 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: linux-btrfs, linux-kernel

On Thu, Aug 26, 2010 at 11:53:57AM +0200, Carlos R. Mafra wrote:
> Hi,
> 
> I've just got this BUG: message in dmesg which I think is btrfs related.
> 
> I have a btrfs filesystem in a memory card which I use to contain the 
> cache and config of chromium (to avoid writing to much in the SSD):
> 
> /dev/mmcblk0p1 on /home/mafra/mmc type btrfs (rw,noexec,nosuid,nodev,noatime,noacl,compress)
> 
> and the last message before the bug is about the memory card.

Hmmm it is definitely btrfs related.  Do you continue to get it after
rebooting?

-chris

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null)
  2010-08-26 11:53 ` Chris Mason
@ 2010-08-26 12:16   ` Carlos R. Mafra
  0 siblings, 0 replies; 5+ messages in thread
From: Carlos R. Mafra @ 2010-08-26 12:16 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs, linux-kernel

On Do 26.Aug'10 at  7:53:36 -0400, Chris Mason wrote:
> On Thu, Aug 26, 2010 at 11:53:57AM +0200, Carlos R. Mafra wrote:
> > Hi,
> > 
> > I've just got this BUG: message in dmesg which I think is btrfs related.
> > 
> > I have a btrfs filesystem in a memory card which I use to contain the 
> > cache and config of chromium (to avoid writing to much in the SSD):
> > 
> > /dev/mmcblk0p1 on /home/mafra/mmc type btrfs (rw,noexec,nosuid,nodev,noatime,noacl,compress)
> > 
> > and the last message before the bug is about the memory card.
> 
> Hmmm it is definitely btrfs related.  Do you continue to get it after
> rebooting?

No, after rebooting chromium refused to open (like it did during the BUG) 
but no BUG: message appeared.

Then I deleted all chromium cache+config from the memory card and chromium
worked again, but after suspending to RAM I noticed this:

[ 6035.423238] PM: Syncing filesystems ... done.
[ 6036.524889] Freezing user space processes ... 
[ 6056.533079] Freezing of tasks failed after 20.00 seconds (1 tasks refusing to freeze):
[ 6056.533119] chrome        D 00000000ffffffff     0  7028   7015 0x00800004
[ 6056.533124]  ffff8800630a1c08 0000000000000086 0000000000000000 0000000000012d40
[ 6056.533129]  ffff8800630a1fd8 ffff8800630a1fd8 ffff88006303e0c0 0000000000012d40
[ 6056.533133]  0000000000012d40 0000000000004000 0000000000004000 ffff8800630a1fd8
[ 6056.533137] Call Trace:
[ 6056.533145]  [<ffffffff81064029>] ? ktime_get_ts+0xa9/0xe0
[ 6056.533150]  [<ffffffff810a3610>] ? sync_page_killable+0x0/0x40
[ 6056.533155]  [<ffffffff81506bde>] io_schedule+0x6e/0xb0
[ 6056.533158]  [<ffffffff810a35f8>] sync_page+0x38/0x50
[ 6056.533161]  [<ffffffff810a3619>] sync_page_killable+0x9/0x40
[ 6056.533164]  [<ffffffff81507332>] __wait_on_bit_lock+0x52/0xb0
[ 6056.533168]  [<ffffffff810a3532>] __lock_page_killable+0x62/0x70
[ 6056.533172]  [<ffffffff8105bb40>] ? wake_bit_function+0x0/0x40
[ 6056.533175]  [<ffffffff810a3459>] ? find_get_page+0x19/0x90
[ 6056.533178]  [<ffffffff810a5144>] generic_file_aio_read+0x494/0x6d0
[ 6056.533183]  [<ffffffff810d4b22>] do_sync_read+0xd2/0x110
[ 6056.533186]  [<ffffffff81025406>] ? do_page_fault+0x186/0x390
[ 6056.533190]  [<ffffffff810d5273>] vfs_read+0xb3/0x160
[ 6056.533193]  [<ffffffff810d536c>] sys_read+0x4c/0x80
[ 6056.533197]  [<ffffffff81002d2b>] system_call_fastpath+0x16/0x1b
[ 6056.533207] 
[ 6056.533208] Restarting tasks ... done.

but everything is working well AFAICT.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null)
  2010-08-26  9:53 [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null) Carlos R. Mafra
  2010-08-26 11:53 ` Chris Mason
@ 2010-08-27  9:38 ` liubo
  2010-08-27 18:03   ` Carlos R. Mafra
  1 sibling, 1 reply; 5+ messages in thread
From: liubo @ 2010-08-27  9:38 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: linux-btrfs, linux-kernel, Chris Mason

On 08/26/2010 05:53 PM, Carlos R. Mafra wrote:
> Hi,
>
> I've just got this BUG: message in dmesg which I think is btrfs related.
>   
[...]

> [  148.152107] Call Trace:
> [  148.152113]  [<ffffffff8104e730>] ? process_timeout+0x0/0x10
> [  148.152118]  [<ffffffff8118a42d>] end_workqueue_fn+0x10d/0x130
> [  148.152122]  [<ffffffff811b98f0>] worker_loop+0xb0/0x5a0
> [  148.152126]  [<ffffffff811b9840>] ? worker_loop+0x0/0x5a0
> [  148.152130]  [<ffffffff8105b60e>] kthread+0x8e/0xa0
> [  148.152135]  [<ffffffff81003a54>] kernel_thread_helper+0x4/0x10
> [  148.152138]  [<ffffffff8105b580>] ? kthread+0x0/0xa0
> [  148.152142]  [<ffffffff81003a50>] ? kernel_thread_helper+0x0/0x10
> [  148.152144] Code: d3 ff ff 89 c2 b8 01 00 00 00 85 d2 75 56 4c 39 eb 77 51 0f 1f 80 00 00 00 00 48 89 de 49 8b 7c 24 10 48 c1 ee 0c e8 3f 21 ef ff <f6> 00 08 74 2a 48 89 c7 48 81 c3 00 10 00 00 e8 4b af ef ff 49 
> [  148.152176] RIP  [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
> [  148.152180]  RSP <ffff880079acddd0>
> [  148.152182] CR2: 0000000000000000
> [  148.152185] ---[ end trace c11a5009b12451d7 ]---
>
>   

Hi, Carlos,
Did you hit this bug under heavy memory stress?

And, could you reproduce the bug? or show some reproduce steps for us?

After digging into extent_range_uptodate(), IMO, this NULL pointer bug that issued page 
can barely be hit. 

Maybe, due to heavy memory stress, a page of the extent_buffer has been freed before, 
which leads that it is missing in page_cache and return NULL.


thanks,
liubo


> Full dmesg is here
>
> http://www.aei.mpg.de/~crmafra/dmesg-2.6.35.3.txt
>
> Is there something else I should provide to help debug this?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>   

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null)
  2010-08-27  9:38 ` liubo
@ 2010-08-27 18:03   ` Carlos R. Mafra
  0 siblings, 0 replies; 5+ messages in thread
From: Carlos R. Mafra @ 2010-08-27 18:03 UTC (permalink / raw)
  To: liubo; +Cc: linux-btrfs, linux-kernel, Chris Mason

On Fr 27.Aug'10 at 17:38:01 +0800, liubo wrote:
> On 08/26/2010 05:53 PM, Carlos R. Mafra wrote:
> > Hi,
> >
> > I've just got this BUG: message in dmesg which I think is btrfs related.
> >   
> [...]
> 
> > [  148.152107] Call Trace:
> > [  148.152113]  [<ffffffff8104e730>] ? process_timeout+0x0/0x10
> > [  148.152118]  [<ffffffff8118a42d>] end_workqueue_fn+0x10d/0x130
> > [  148.152122]  [<ffffffff811b98f0>] worker_loop+0xb0/0x5a0
> > [  148.152126]  [<ffffffff811b9840>] ? worker_loop+0x0/0x5a0
> > [  148.152130]  [<ffffffff8105b60e>] kthread+0x8e/0xa0
> > [  148.152135]  [<ffffffff81003a54>] kernel_thread_helper+0x4/0x10
> > [  148.152138]  [<ffffffff8105b580>] ? kthread+0x0/0xa0
> > [  148.152142]  [<ffffffff81003a50>] ? kernel_thread_helper+0x0/0x10
> > [  148.152144] Code: d3 ff ff 89 c2 b8 01 00 00 00 85 d2 75 56 4c 39 eb 77 51 0f 1f 80 00 00 00 00 48 89 de 49 8b 7c 24 10 48 c1 ee 0c e8 3f 21 ef ff <f6> 00 08 74 2a 48 89 c7 48 81 c3 00 10 00 00 e8 4b af ef ff 49 
> > [  148.152176] RIP  [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
> > [  148.152180]  RSP <ffff880079acddd0>
> > [  148.152182] CR2: 0000000000000000
> > [  148.152185] ---[ end trace c11a5009b12451d7 ]---
> >
> >   
> 
> Hi, Carlos,
> Did you hit this bug under heavy memory stress?

Not at all! 

My laptop had been recently booted (see the timings in the dmesg)
and it was basically idle: just a couple xterms and WindowMaker running,
apart from the chromium which I tried to open and led to the bug.
 
> And, could you reproduce the bug? or show some reproduce steps for us?

Unfortunately not. That was the first time, and it didn't repeat so far.

> After digging into extent_range_uptodate(), IMO, this NULL pointer bug that issued page 
> can barely be hit. 
> 
> Maybe, due to heavy memory stress, a page of the extent_buffer has been freed before, 
> which leads that it is missing in page_cache and return NULL.

Could it be that the memory card had some bad block and btrfs could not
recover from the failure?


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-08-27 18:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-26  9:53 [2.6.35.3] BUG: unable to handle kernel NULL pointer dereference at (null) Carlos R. Mafra
2010-08-26 11:53 ` Chris Mason
2010-08-26 12:16   ` Carlos R. Mafra
2010-08-27  9:38 ` liubo
2010-08-27 18:03   ` Carlos R. Mafra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.