From: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
To: Josef Bacik <josef@redhat.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: BUG: unable to handle kernel NULL pointer dereference at (null)
Date: Wed, 6 Apr 2011 00:00:10 +0200
Message-ID: <201104060000.11403.johannes.hirte@fem.tu-ilmenau.de>
In-Reply-To: <20110405211227.GD484@dhcp231-156.rdu.redhat.com>

On Tuesday 05 April 2011 23:12:27 Josef Bacik wrote:
> On Tue, Apr 05, 2011 at 11:08:52PM +0200, Johannes Hirte wrote:
> > On Tuesday 05 April 2011 21:31:43 Josef Bacik wrote:
> > > On Tue, Apr 05, 2011 at 09:21:55PM +0200, Johannes Hirte wrote:
> > > > On Tuesday 05 April 2011 20:53:24 Josef Bacik wrote:
> > > > > On Tue, Apr 05, 2011 at 08:52:21PM +0200, Johannes Hirte wrote:
> > > > > > On Tuesday 05 April 2011 19:42:03 Josef Bacik wrote:
> > > > > > > On Tue, Apr 05, 2011 at 07:38:13PM +0200, Johannes Hirte wrote:
> > > > > > > > With the latest btrfs changes, I got this Oops when doing rm
> > > > > > > > on a large directory:
> > > > > > > > 
> > > > > > > > BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > > > > > IP: [<c101c838>] kunmap+0x46/0x46
> > > > > > > > *pdpt = 0000000034a85001 *pde = 0000000000000000
> > > > > > > > Oops: 0000 [#1] PREEMPT SMP
> > > > > > > > last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
> > > > > > > > Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq
> > > > > > > > snd_seq_device snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod
> > > > > > > > usbhid snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer
> > > > > > > > sr_mod cdrom sg snd fschmd e1000 uhci_hcd snd_page_alloc
> > > > > > > > i2c_i801 [last unloaded: microcode]
> > > > > > > > 
> > > > > > > > Pid: 1156, comm: btrfs-transacti Tainted: G        W
> > > > > > > > 2.6.39-rc1-00262-gc53813f #20 FUJITSU SIEMENS SCENIC P /
> > > > > > > > SCENICO P/D1561
> > > > > > > > EIP: 0060:[<c101c838>] EFLAGS: 00010296 CPU: 1
> > > > > > > > EIP is at kmap+0x0/0x38
> > > > > > > > EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000010
> > > > > > > > ESI: f5bc6400 EDI: f3c75520 EBP: f3c755f0 ESP: f58f9e10
> > > > > > > > 
> > > > > > > >  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > > > > > > > 
> > > > > > > > Process btrfs-transacti (pid: 1156, ti=f58f8000 task=f6516f40
> > > > > > > > task.ti=f58f8000)
> > > > > > > > 
> > > > > > > > Stack:
> > > > > > > >  c1186d15 ffc22000 f58f9ec0 00000010 f3c75610 00000000
> > > > > > > >  f5885780 f52339e8 00000009 f5bc6400 00010000 00000000
> > > > > > > >  f6415800 f3c75638 000008bb f5bc63c0 f58857b4 f60b68a0
> > > > > > > >  00000040 f52338e8 ffc22000 00000000 00000008 00000010
> > > > > > > > 
> > > > > > > > Call Trace:
> > > > > > > >  [<c1186d15>] ? btrfs_write_out_cache+0x60c/0xa3c
> > > > > > > >  [<c114a815>] ? btrfs_write_dirty_block_groups+0x400/0x494
> > > > > > > >  [<c11566a7>] ? commit_cowonly_roots+0xa9/0x180
> > > > > > > >  [<c1157799>] ? btrfs_commit_transaction+0x2ee/0x59c
> > > > > > > >  [<c1037c85>] ? wake_up_bit+0x16/0x16
> > > > > > > >  [<c1152a83>] ? transaction_kthread+0x149/0x1d6
> > > > > > > >  [<c101d1b9>] ? complete+0x28/0x36
> > > > > > > >  [<c115293a>] ? btrfs_congested_fn+0x5d/0x5d
> > > > > > > >  [<c10379c4>] ? kthread+0x63/0x68
> > > > > > > >  [<c1037961>] ? kthread_worker_fn+0xeb/0xeb
> > > > > > > >  [<c13cba36>] ? kernel_thread_helper+0x6/0xd
> > > > > > > > 
> > > > > > > > Code: 8d 8a 00 e4 54 c1 2b 8a 8c e7 54 c1 81 f9 00 08 00 00
> > > > > > > > 74 11 81 f9 00 0c 00 00 75 0e 83 3d 10 2f 60 c1 02 75 05 e9
> > > > > > > > 5e a3 04 00 c3 <8b> 10 c1 ea 1e c1 e2 0a 8d 8a 00 e4 54 c1
> > > > > > > > 2b 8a 8c e7 54 c1 81
> > > > > > > > EIP: [<c101c838>] kmap+0x0/0x38 SS:ESP 0068:f58f9e10
> > > > > > > > CR2: 0000000000000000
> > > > > > > > ---[ end trace c8511126ee91dfdf ]---
> > > > > > > > 
> > > > > > > > This is the second Oops. I wasn't able to catch the backtrace
> > > > > > > > of the first one, but IIRC the bug happened in kmap, not
> > > > > > > > kunmap, the first time.
> > > > > > > 
> > > > > > > Yeah I think I know what this is but I need somebody to verify
> > > > > > > it for me. Can you run with this patch and let me know what
> > > > > > > happens? Thanks,
> > > > > > > 
> > > > > > > Josef
> > > > > > > 
> > > > > > > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > > > > > > index 74bc432..5e6f4b3 100644
> > > > > > > --- a/fs/btrfs/free-space-cache.c
> > > > > > > +++ b/fs/btrfs/free-space-cache.c
> > > > > > > @@ -624,6 +624,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > > > > > 
> > > > > > >  		next_page = false;
> > > > > > > 
> > > > > > > +		BUG_ON(index > last_index);
> > > > > > > 
> > > > > > >  		if (index == 0) {
> > > > > > >  		
> > > > > > >  			start_offset = first_page_offset;
> > > > > > >  			offset = start_offset;
> > > > > > > 
> > > > > > > @@ -732,6 +733,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > > > > > 
> > > > > > >  		struct btrfs_free_space *entry =
> > > > > > >  		
> > > > > > >  			list_entry(pos, struct btrfs_free_space, list);
> > > > > > > 
> > > > > > > +		BUG_ON(index > last_index);
> > > > > > > 
> > > > > > >  		page = find_get_page(inode->i_mapping, index);
> > > > > > >  		
> > > > > > >  		addr = kmap(page);
> > > > > > 
> > > > > > Hm, I tried but now I hit the
> > > > > > BUG_ON(block_group->total_bitmaps >= max_bitmaps); in
> > > > > > add_new_bitmap in fs/btrfs/free-space-cache.c:1255 when booting
> > > > > > the system.
> > > > > 
> > > > > > Can you mount -o clear_cache to make sure it's not the cache that's
> > > > > > causing that? Thanks,
> > > > > 
> > > > > Josef
> > > > 
> > > > > Mounting with clear_cache under 2.6.38 helped. I was able to boot
> > > > > and test with your patch and hit the second BUG_ON at
> > > > > free-space-cache.c:738.
> > > 
> > > > Perfect, can you try this and verify you don't panic anymore please?
> > > > Thanks,
> > > 
> > > Josef
> > > 
> > > ---
> > > 
> > >  fs/btrfs/free-space-cache.c |   18 ++++++++++++++++++
> > >  1 files changed, 18 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > > index 74bc432..33287e8 100644
> > > --- a/fs/btrfs/free-space-cache.c
> > > +++ b/fs/btrfs/free-space-cache.c
> > > @@ -522,6 +522,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >  	int bitmaps = 0;
> > >  	int ret = 0;
> > >  	bool next_page = false;
> > > 
> > > +	bool out_of_space = false;
> > > 
> > >  	root = root->fs_info->tree_root;
> > > 
> > > @@ -629,6 +630,11 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >  			offset = start_offset;
> > >  		
> > >  		}
> > > 
> > > +		if (index > last_index) {
> > > +			out_of_space = true;
> > > +			break;
> > > +		}
> > > +
> > > 
> > >  		page = find_get_page(inode->i_mapping, index);
> > >  		
> > >  		addr = kmap(page);
> > > 
> > > @@ -732,6 +738,10 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >  		struct btrfs_free_space *entry =
> > >  		
> > >  			list_entry(pos, struct btrfs_free_space, list);
> > > 
> > > +		if (index > last_index) {
> > > +			out_of_space = true;
> > > +			break;
> > > +		}
> > > 
> > >  		page = find_get_page(inode->i_mapping, index);
> > >  		
> > >  		addr = kmap(page);
> > > 
> > > @@ -754,6 +764,14 @@ int btrfs_write_out_cache(struct btrfs_root *root,
> > > 
> > >  		index++;
> > >  	
> > >  	}
> > > 
> > > +	if (out_of_space) {
> > > +		ret = 0;
> > > +		unlock_extent_cached(&BTRFS_I(inode)->io_tree, 0,
> > > +				     i_size_read(inode) - 1, &cached_state,
> > > +				     GFP_NOFS);
> > > +		goto out_free;
> > > +	}
> > > +
> > > 
> > >  	/* Zero out the rest of the pages just to make sure */
> > >  	while (index <= last_index) {
> > >  	
> > >  		void *addr;
> > 
> > > With this patch the system doesn't panic anymore, it just hangs. The
> > > output from sysrq-t after the hang is attached. I also got several
> > > Oopses from BUG_ON(block_group->total_bitmaps >= max_bitmaps) again when
> > > switching from 2.6.38 to 2.6.39-rc.
> 
> Balls, sorry about that, this should do the trick, thanks,
> 
> Josef
> 
> ---
>  fs/btrfs/free-space-cache.c |   23 +++++++++++++++++++++++
>  1 files changed, 23 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index 74bc432..fc2bbbe 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -522,6 +522,7 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>  	int bitmaps = 0;
>  	int ret = 0;
>  	bool next_page = false;
> +	bool out_of_space = false;
> 
>  	root = root->fs_info->tree_root;
> 
> @@ -629,6 +630,11 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>  			offset = start_offset;
>  		}
> 
> +		if (index > last_index) {
> +			out_of_space = true;
> +			break;
> +		}
> +
>  		page = find_get_page(inode->i_mapping, index);
> 
>  		addr = kmap(page);
> @@ -732,6 +738,10 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>  		struct btrfs_free_space *entry =
>  			list_entry(pos, struct btrfs_free_space, list);
> 
> +		if (index > last_index) {
> +			out_of_space = true;
> +			break;
> +		}
>  		page = find_get_page(inode->i_mapping, index);
> 
>  		addr = kmap(page);
> @@ -754,6 +764,19 @@ int btrfs_write_out_cache(struct btrfs_root *root,
>  		index++;
>  	}
> 
> +	if (out_of_space) {
> +		page = find_get_page(inode->i_mapping, 0);
> +		unlock_page(page);
> +		page_cache_release(page);
> +		page_cache_release(page);
> +
> +		ret = 0;
> +		unlock_extent_cached(&BTRFS_I(inode)->io_tree, 0,
> +				     i_size_read(inode) - 1, &cached_state,
> +				     GFP_NOFS);
> +		goto out_free;
> +	}
> +
>  	/* Zero out the rest of the pages just to make sure */
>  	while (index <= last_index) {
>  		void *addr;
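
For readers following along, the guard this patch adds can be modelled in user space. This is only a rough sketch under assumptions, not btrfs code: `struct cache_file`, `write_entries`, `PAGE_BYTES` and `ENTRIES_PER_PAGE` are made-up names, and a plain array stands in for the pages preallocated for the cache inode. The idea it illustrates is the one in the hunks above: check `index > last_index` before looking up the next page, and stop instead of mapping a page that was never allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 64	/* toy page size, not the kernel's PAGE_SIZE */
#define ENTRIES_PER_PAGE (PAGE_BYTES / sizeof(unsigned long))

/* Hypothetical stand-in for the pages preallocated for the cache inode. */
struct cache_file {
	unsigned char (*pages)[PAGE_BYTES];	/* pages[0..last_index] are valid */
	size_t last_index;
};

/*
 * Copy `count` entries into the preallocated pages.
 * Returns true if everything fit and false if the writer ran out of
 * pages (the analogue of out_of_space in the patch). A page beyond
 * last_index is never touched.
 */
static bool write_entries(struct cache_file *f,
			  const unsigned long *entries, size_t count)
{
	size_t index = 0;
	size_t done = 0;

	assert(f != NULL && f->pages != NULL);

	while (done < count) {
		size_t n = count - done;

		/* The guard: bail out before using a page that does not exist. */
		if (index > f->last_index)
			return false;

		if (n > ENTRIES_PER_PAGE)
			n = ENTRIES_PER_PAGE;

		memset(f->pages[index], 0, PAGE_BYTES);
		memcpy(f->pages[index], entries + done, n * sizeof(*entries));
		done += n;
		index++;
	}
	return true;
}
```

A caller that gets `false` back has a partially written file and has to drop it, which corresponds to the `out_of_space` branch in the patch unlocking and jumping to `out_free` instead of continuing with the write.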

Now it hits this:
------------[ cut here ]------------
kernel BUG at fs/btrfs/inode.c:1565!
invalid opcode: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/virtual/vtconsole/vtcon1/uevent
Modules linked in: snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss fuse dm_crypt dm_mod usbhid snd_intel8x0 
snd_ac97_codec ac97_bus snd_pcm sr_mod snd_timer sg cdrom snd fschmd uhci_hcd 
e1000 snd_page_alloc i2c_i801 [last unloaded: microcode]

Pid: 1147, comm: btrfs-fixup-0 Tainted: G        W
2.6.39-rc1-00262-gc53813f-dirty #24 FUJITSU SIEMENS SCENIC P / SCENICO P/D1561
EIP: 0060:[<c1158999>] EFLAGS: 00010246 CPU: 1
EIP is at btrfs_writepage_fixup_worker+0xf1/0x12d
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: f58b1f3c
ESI: 00000000 EDI: f7aef080 EBP: f58b1f6c ESP: f58b1f54
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process btrfs-fixup-0 (pid: 1147, ti=f58b0000 task=f65206c0 task.ti=f58b0000)
Stack:
 f4e354e0 00000fff 00000000 f4e353e0 00000000 f49d97c0 00000000 f6645c40
 f49d97d8 f49d97c4 f6645c4c c117c57f f58b1f9c f6645c6c f65206c0 f65206c0
 f65206c0 f6645c4c f58b1f9c f58b1f9c f49d9858 f49d9e98 00000000 f645fd6c
Call Trace:
 [<c117c57f>] ? worker_loop+0x117/0x3ac
 [<c117c468>] ? btrfs_queue_worker+0x1ee/0x1ee
 [<c10379c4>] ? kthread+0x63/0x68
 [<c1037961>] ? kthread_worker_fn+0xeb/0xeb
 [<c13cbab6>] ? kernel_thread_helper+0x6/0xd
Code: f1 8b 44 24 1c e8 83 92 01 00 89 f8 e8 38 e0 ef ff b9 01 00 00 00 8b 54 
24 20 8b 44 24 10 e8 61 6a 01 00 83 c4 10 e9 2c ff ff ff <0f> 0b 6a 50 55 ff 74 24 
10 ff 74 24 10 89 da 89 f1 8b 44 24 1c 
EIP: [<c1158999>] btrfs_writepage_fixup_worker+0xf1/0x12d SS:ESP 0068:f58b1f54
---[ end trace 60f7cf60a0edd44b ]---

The code at this location also looks confusing to me: btrfs_set_extent_delalloc
and ClearPageChecked can never be reached. Why are they still there?

regards,
  Johannes
