* Single disk performance
@ 2009-06-26 14:28 Steven Pratt
  2009-06-26 20:56 ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Pratt @ 2009-06-26 14:28 UTC (permalink / raw)
  To: linux-btrfs

Upgraded the btrfs tree to 6-17 and all of the stability problems went 
away on the single disk system, so not sure if this was a code problem 
or hardware, but at least stable now.
Performance results updated at:
 http://btrfs.boxacle.net/repository/single-disk/History/History.html

The fixes to the COW path are obvious for random write, although even on 
a single disk the CPU overhead is very noticeable, as the efficiency 
graphs show.

The good news is that now the only workload that Btrfs is not at or near 
the top in performance for single disk is MailServer.

Steve

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-26 14:28 Single disk performance Steven Pratt
@ 2009-06-26 20:56 ` Chris Mason
  2009-06-27  2:26   ` Steven Pratt
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2009-06-26 20:56 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
> Upgraded the btrfs tree to 6-17 and all of the stability problems went  
> away on the single disk system, so not sure if this was a code problem  
> or hardware, but at least stable now.
> Performance results updated at:
> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>
> The fixed to the cow path are obvious for random write, although even on  
> single disk the CPU overhead is very noticeable as the efficiency graphs  
> show.
>
> The good news is that now the only workload that Btrfs is not at or near  
> the top in performance for single disk is MailServer.

Thanks Steve, glad to hear the stability problems are gone.

Could you please try this one liner to see if our big CPU problem during
streaming writes goes away?

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 126477e..7c3cd24 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -151,7 +151,10 @@ static noinline int dirty_and_release_pages(struct btrfs_trans_handle *trans,
 	}
 	if (end_pos > isize) {
 		i_size_write(inode, end_pos);
-		btrfs_update_inode(trans, root, inode);
+		/* we've only changed i_size in ram, and we haven't updated
+		 * the disk i_size.  There is no need to log the inode
+		 * at this time.
+		 */
 	}
 	err = btrfs_end_transaction(trans, root);
 out_unlock:

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-26 20:56 ` Chris Mason
@ 2009-06-27  2:26   ` Steven Pratt
  2009-06-29 12:41     ` Chris Mason
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Pratt @ 2009-06-27  2:26 UTC (permalink / raw)
  To: Chris Mason, Steven Pratt, linux-btrfs

Chris Mason wrote:
> On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
>   
>> Upgraded the btrfs tree to 6-17 and all of the stability problems went  
>> away on the single disk system, so not sure if this was a code problem  
>> or hardware, but at least stable now.
>> Performance results updated at:
>> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>>
>> The fixed to the cow path are obvious for random write, although even on  
>> single disk the CPU overhead is very noticeable as the efficiency graphs  
>> show.
>>
>> The good news is that now the only workload that Btrfs is not at or near  
>> the top in performance for single disk is MailServer.
>>     
>
> Thanks Steve, glad to hear the stability problems are gone.
>
>   
Well, maybe I spoke too soon. :-(    The run with this patch died in a 
similar way to before.  My remote service console is not responding, so 
it will probably be Monday before I can get to the lab to restart manually.


I am getting messages like:


Jun 26 18:36:13 btrfs2 kernel: [ 4200.909078] INFO: task ffsb:26362 blocked for more than 120 seconds.
Jun 26 18:36:13 btrfs2 kernel: [ 4200.915474] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 26 18:36:13 btrfs2 kernel: [ 4200.923338] ffsb          D ffffffff804e15e0     0 26362  26200
Jun 26 18:36:13 btrfs2 kernel: [ 4200.923346]  ffff8801263bdcc8 0000000000000086 0000000000000000 ffff88004519d158
Jun 26 18:36:13 btrfs2 kernel: [ 4200.930914]  0000000000000000 ffff88013b9cc710 ffff88013fbf96f0 ffff88013b9cca98
Jun 26 18:36:13 btrfs2 kernel: [ 4200.938489]  00000008263bdca8 000000010039973e ffff8801263bdca8 ffff88012c95a600
Jun 26 18:36:13 btrfs2 kernel: [ 4200.946054] Call Trace:
Jun 26 18:36:13 btrfs2 kernel: [ 4200.948545]  [<ffffffff804cbe09>] schedule+0x9/0x1d
Jun 26 18:36:13 btrfs2 kernel: [ 4200.953459]  [<ffffffff804cc09c>] io_schedule+0x5d/0x9f
Jun 26 18:36:13 btrfs2 kernel: [ 4200.958718]  [<ffffffff8027c86e>] sync_page+0x44/0x48
Jun 26 18:36:13 btrfs2 kernel: [ 4200.963800]  [<ffffffff804cc3e6>] __wait_on_bit+0x45/0x77
Jun 26 18:36:13 btrfs2 kernel: [ 4200.969235]  [<ffffffff8027c82a>] ? sync_page+0x0/0x48
Jun 26 18:36:13 btrfs2 kernel: [ 4200.974408]  [<ffffffff8027c9fa>] wait_on_page_bit+0x6f/0x76
Jun 26 18:36:13 btrfs2 kernel: [ 4200.980107]  [<ffffffff8024c498>] ? wake_bit_function+0x0/0x2a
Jun 26 18:36:13 btrfs2 kernel: [ 4200.986050]  [<ffffffffa036c123>] prepare_pages+0xbd/0x1f3 [btrfs]
Jun 26 18:36:13 btrfs2 kernel: [ 4200.992281]  [<ffffffffa036c619>] btrfs_file_write+0x3c0/0x6d2 [btrfs]
Jun 26 18:36:13 btrfs2 kernel: [ 4200.998839]  [<ffffffff802ab7b8>] vfs_write+0xae/0x137
Jun 26 18:36:13 btrfs2 kernel: [ 4201.004351]  [<ffffffff802abcfd>] sys_write+0x47/0x6f
Jun 26 18:36:13 btrfs2 kernel: [ 4201.009773]  [<ffffffff8020ba2b>] system_call_fastpath+0x16/0x1b
Jun 26 18:36:13 btrfs2 kernel: [ 4201.016160] INFO: task ffsb:26366 blocked for more than 120 seconds.
Jun 26 18:36:13 btrfs2 kernel: [ 4201.022894] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 26 18:36:13 btrfs2 kernel: [ 4201.031446] ffsb          D ffffffff804e15e0     0 26366  26200

Lots of these timeout messages, then eventually

Jun 26 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error code
Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error, dev sdb, sector 103359232

So still not sure if this is HW, but no other FS has triggered it.

Steve
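
For anyone sifting through a long syslog for the same two symptoms, here
is a minimal Python sketch that pulls out the hung-task warnings and the
block-layer I/O errors. The regular expressions are modeled only on the
lines quoted in this message, and the function name summarize is made up
for illustration. It assumes each kernel message sits on a single line,
so mail-wrapped logs have to be rejoined first.

```python
import re

# Patterns modeled on the log lines quoted in this thread (assumption:
# other kernel versions may word these messages differently).
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize(lines):
    """Return (hung_tasks, io_errors) found in an iterable of log lines.

    hung_tasks is a list of (comm, pid, seconds) tuples and io_errors is
    a list of (device, sector) tuples.
    """
    hung, errors = [], []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            hung.append((m.group(1), int(m.group(2)), int(m.group(3))))
        m = IO_ERROR.search(line)
        if m:
            errors.append((m.group(1), int(m.group(2))))
    return hung, errors


sample = [
    "Jun 26 18:36:13 btrfs2 kernel: [ 4200.909078] INFO: task ffsb:26362 "
    "blocked for more than 120 seconds.",
    "Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error, "
    "dev sdb, sector 103359232",
]
print(summarize(sample))
```

The sector number it reports is the one worth re-reading directly from
the device, outside of any filesystem.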

> Could you please try this one liner to see if our big CPU problem during
> streaming writes goes away?
>   
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 126477e..7c3cd24 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -151,7 +151,10 @@ static noinline int dirty_and_release_pages(struct btrfs_trans_handle *trans,
>  	}
>  	if (end_pos > isize) {
>  		i_size_write(inode, end_pos);
> -		btrfs_update_inode(trans, root, inode);
> +		/* we've only changed i_size in ram, and we haven't updated
> +		 * the disk i_size.  There is no need to log the inode
> +		 * at this time.
> +		 */
>  	}
>  	err = btrfs_end_transaction(trans, root);
>  out_unlock:
>   


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-27  2:26   ` Steven Pratt
@ 2009-06-29 12:41     ` Chris Mason
  2009-06-29 23:17       ` Bron Gondwana
  2009-06-30 14:38       ` Steven Pratt
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Mason @ 2009-06-29 12:41 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
> Chris Mason wrote:
>> On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
>>   
>>> Upgraded the btrfs tree to 6-17 and all of the stability problems 
>>> went  away on the single disk system, so not sure if this was a code 
>>> problem  or hardware, but at least stable now.
>>> Performance results updated at:
>>> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>>>
>>> The fixed to the cow path are obvious for random write, although even 
>>> on  single disk the CPU overhead is very noticeable as the efficiency 
>>> graphs  show.
>>>
>>> The good news is that now the only workload that Btrfs is not at or 
>>> near  the top in performance for single disk is MailServer.
>>>     
>>
>> Thanks Steve, glad to hear the stability problems are gone.
>>
>>   
> Well, maybe I spoke too soon. :-(    Run with this patch died in similar  
> way to before.  My remote service console is not responding, so will  
> probably be Monday before I can get to the lab to restart manually.
>
>
> I am getting messages like:
>
> Lots of these timeout messages, then eventually
>
> 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error 
> code
> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:  
> hostbyte=DID_ABORT driverbyte=DRIVER_OK
> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,  
> dev sdb, sector 103359232
>
> So still not sure if this is HW, but no other FS has triggered it.
>

I'm afraid Btrfs can't do this on its own.  It needs help from the HW,
the scsi drivers, or the HW or scsi drivers ;)

You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232

Hopefully that will fall over without btrfs helping.

-chris
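
The dd invocation above reads a single 512-byte sector at sector offset
103359232, which is byte offset 103359232 * 512 = 52919926784, roughly
49 GiB into the device. A minimal Python sketch of the same read, for
illustration only: read_sector is a hypothetical helper, and like plain
dd without iflag=direct it goes through the page cache.

```python
import os

SECTOR_SIZE = 512  # matches bs=512 in the dd command


def read_sector(path, sector, sector_size=SECTOR_SIZE):
    """Read one sector from a block device or regular file.

    Raises OSError (typically EIO) if the kernel reports an I/O error,
    and returns fewer bytes than sector_size at or past end of file.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread takes a byte offset, so convert from the sector number.
        return os.pread(fd, sector_size, sector * sector_size)
    finally:
        os.close(fd)


# Usage on the machine in question (needs root; /dev/sdb is the device
# named in the log above):
#     data = read_sector("/dev/sdb", 103359232)
```

If the sector is really bad, the read should fail with an OSError and a
matching end_request line should appear in the kernel log.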

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-29 12:41     ` Chris Mason
@ 2009-06-29 23:17       ` Bron Gondwana
  2009-06-30 11:02         ` Chris Mason
  2009-06-30 14:38       ` Steven Pratt
  1 sibling, 1 reply; 9+ messages in thread
From: Bron Gondwana @ 2009-06-29 23:17 UTC (permalink / raw)
  To: Chris Mason, Steven Pratt, linux-btrfs

On Mon, Jun 29, 2009 at 08:41:49AM -0400, Chris Mason wrote:
> On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
> > 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error 
> > code
> > Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:  
> > hostbyte=DID_ABORT driverbyte=DRIVER_OK
> > Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,  
> > dev sdb, sector 103359232
> >
> > So still not sure if this is HW, but no other FS has triggered it.
> 
> I'm afraid Btrfs can't do this on its own.  It needs to HW, scsi
> drivers or HW or scsi drivdes ;)
> 
> You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232
> 
> Hopefully that will fall over without btrfs helping.

Surely of=/dev/null ?  Unless you mean to write to the disk at that block
which would be if=/dev/zero of=/dev/sdb...

Bron.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-29 23:17       ` Bron Gondwana
@ 2009-06-30 11:02         ` Chris Mason
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Mason @ 2009-06-30 11:02 UTC (permalink / raw)
  To: Bron Gondwana; +Cc: Steven Pratt, linux-btrfs

On Tue, Jun 30, 2009 at 09:17:22AM +1000, Bron Gondwana wrote:
> On Mon, Jun 29, 2009 at 08:41:49AM -0400, Chris Mason wrote:
> > On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
> > > 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error 
> > > code
> > > Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:  
> > > hostbyte=DID_ABORT driverbyte=DRIVER_OK
> > > Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,  
> > > dev sdb, sector 103359232
> > >
> > > So still not sure if this is HW, but no other FS has triggered it.
> > 
> > I'm afraid Btrfs can't do this on its own.  It needs to HW, scsi
> > drivers or HW or scsi drivdes ;)
> > 
> > You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232
> > 
> > Hopefully that will fall over without btrfs helping.
> 
> Surely of=/dev/null ?  Unless you mean to write to the disk at that block
> which would be if=/dev/zero of=/dev/sdb...

It's just habit for me.  Since /dev/null won't work as if=/dev/null, I
always use /dev/zero with dd.

-chris
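
The asymmetry behind that habit can be checked directly: as an input,
/dev/null returns end-of-file immediately while /dev/zero supplies zero
bytes forever, but as an output both simply discard what is written. A
small Python sketch, assuming a Linux system where both device nodes
exist and are writable; read_some and write_some are illustrative names.

```python
import os


def read_some(path, n=512):
    """Read up to n bytes from path using an unbuffered read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, n)
    finally:
        os.close(fd)


def write_some(path, data):
    """Write data to path and return the number of bytes accepted."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


print(len(read_some("/dev/null")))          # reads hit EOF at once
print(len(read_some("/dev/zero")))          # reads return zero bytes
print(write_some("/dev/null", b"x" * 512))  # writes are discarded
print(write_some("/dev/zero", b"x" * 512))  # writes are discarded too
```

So of=/dev/zero and of=/dev/null behave the same for a read test like
the dd above; only the if= side differs.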

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-29 12:41     ` Chris Mason
  2009-06-29 23:17       ` Bron Gondwana
@ 2009-06-30 14:38       ` Steven Pratt
  2009-06-30 15:10         ` Yan Zheng
  1 sibling, 1 reply; 9+ messages in thread
From: Steven Pratt @ 2009-06-30 14:38 UTC (permalink / raw)
  To: Chris Mason, Steven Pratt, linux-btrfs

Chris Mason wrote:
> On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
>   
>> Chris Mason wrote:
>>     
>>> On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
>>>   
>>>       
>>>> Upgraded the btrfs tree to 6-17 and all of the stability problems 
>>>> went  away on the single disk system, so not sure if this was a code 
>>>> problem  or hardware, but at least stable now.
>>>> Performance results updated at:
>>>> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>>>>
>>>> The fixed to the cow path are obvious for random write, although even 
>>>> on  single disk the CPU overhead is very noticeable as the efficiency 
>>>> graphs  show.
>>>>
>>>> The good news is that now the only workload that Btrfs is not at or 
>>>> near  the top in performance for single disk is MailServer.
>>>>     
>>>>         
>>> Thanks Steve, glad to hear the stability problems are gone.
>>>
>>>   
>>>       
>> Well, maybe I spoke too soon. :-(    Run with this patch died in similar  
>> way to before.  My remote service console is not responding, so will  
>> probably be Monday before I can get to the lab to restart manually.
>>
>>
>> I am getting messages like:
>>
>> Lots of these timeout messages, then eventually
>>
>> 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error 
>> code
>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:  
>> hostbyte=DID_ABORT driverbyte=DRIVER_OK
>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,  
>> dev sdb, sector 103359232
>>
>> So still not sure if this is HW, but no other FS has triggered it.
>>
>>     
>
> I'm afraid Btrfs can't do this on its own.  It needs to HW, scsi
> drivers or HW or scsi drivdes ;)
>
> You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232
>   
Well, a dd write of the entire drive shows no errors.  I ran the btrfs 
tests again and got this; no disk or scsi errors were reported this time.


Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] kernel BUG at fs/btrfs/extent-tree.c:3865!
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] invalid opcode: 0000 [#1] SMP
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CPU 8
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Modules linked in: oprofile btrfs zlib_deflate autofs4 nfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath sbs sbshc battery ac parport_pc lp parport sg joydev serio_raw acpi_memhotplug rtc_cmos rtc_core rtc_lib button tg3 libphy i2c_piix4 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod lpfc scsi_transport_fc aic94xx libsas libata scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Pid: 21731, comm: btrfs-endio-wri Not tainted 2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RIP: 0010:[<ffffffffa0346ce4>]  [<ffffffffa0346ce4>] alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RSP: 0018:ffff88013e10bb60  EFLAGS: 00010282
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RAX: 00000000ffffffef RBX: ffff88006fbde000 RCX: 0000000000000002
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801020ac5b0
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RBP: ffff88013e10bbd0 R08: ffff88013e10b9d8 R09: ffff88013e10b9d0
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R10: 0000000000000004 R11: ffff8801020ac5b0 R12: 000000000000001d
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R13: ffff88012e1e7910 R14: 0000000000000000 R15: 0000000000000000
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] FS:  0000000000000000(0000) GS:ffff88002bac0000(0000) knlGS:0000000000000000
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CR2: 00007fffdac2efb0 CR3: 0000000138cc9000 CR4: 00000000000006e0
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Process btrfs-endio-wri (pid: 21731, threadinfo ffff88013e10a000, task ffff880138d117b0)
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Stack:
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  0000000000000000 00000000000011d5 0000000000000005 0000000000000000
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  ffff88005fcb0800 ffff88011a47f860 000000b2844a5030 000000000000008c
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  000000352e1e7910 ffff8800be095540 ffff8800be095740 0000000000000001
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Call Trace:
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034b198>] run_one_delayed_ref+0x382/0x42f [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036abbd>] ? map_extent_buffer+0xab/0xbe [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034bf75>] run_clustered_refs+0x237/0x2b4 [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa037ef71>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034c09e>] btrfs_run_delayed_refs+0xac/0x195 [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035486e>] __btrfs_end_transaction+0x59/0xfe [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035492e>] btrfs_end_transaction+0xb/0xd [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a18b>] btrfs_finish_ordered_io+0x224/0x24d [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a1c4>] btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036d585>] end_bio_extent_writepage+0xa3/0x18f [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff802cbbee>] bio_endio+0x26/0x28
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa03515d6>] end_workqueue_fn+0x111/0x11e [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374fe1>] worker_loop+0x67/0x1ee [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374f7a>] ? worker_loop+0x0/0x1ee [btrfs]
Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c324>] kthread+0x56/0x86
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9fa>] child_rip+0xa/0x20
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9f0>] ? child_rip+0x0/0x20
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] Code: 08 4c 8d 45 d4 41 8d 44 24 18 48 8b 73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45 d4 e8 df e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83 cf 01 4c 89 e7 48 6b
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] RIP  [<ffffffffa0346ce4>] alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  RSP <ffff88013e10bb60>
Jun 29 15:55:35 btrfs2 kernel: [ 8215.101864] ---[ end trace 2a2583ccd67ef43b ]---


After this error, I get a bunch of messages similar to this one:


Jun 29 15:56:39 btrfs2 kernel: [ 8279.623396] BUG: soft lockup - CPU#8 stuck for 61s! [btrfs-endio-wri:21732]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.630424] Modules linked in: oprofile btrfs zlib_deflate autofs4 nfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath sbs sbshc battery ac parport_pc lp parport sg joydev serio_raw acpi_memhotplug rtc_cmos rtc_core rtc_lib button tg3 libphy i2c_piix4 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod lpfc scsi_transport_fc aic94xx libsas libata scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.677406] CPU 8:
Jun 29 15:56:39 btrfs2 kernel: [ 8279.680414] Modules linked in: oprofile btrfs zlib_deflate autofs4 nfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath sbs sbshc battery ac parport_pc lp parport sg joydev serio_raw acpi_memhotplug rtc_cmos rtc_core rtc_lib button tg3 libphy i2c_piix4 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod lpfc scsi_transport_fc aic94xx libsas libata scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.727394] Pid: 21732, comm: btrfs-endio-wri Tainted: G      D    2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
Jun 29 15:56:39 btrfs2 kernel: [ 8279.738395] RIP: 0010:[<ffffffff804cd70d>]  [<ffffffff804cd70d>] _spin_lock+0x14/0x1a
Jun 29 15:56:39 btrfs2 kernel: [ 8279.746397] RSP: 0018:ffff88013989d8e0  EFLAGS: 00000297
Jun 29 15:56:39 btrfs2 kernel: [ 8279.752394] RAX: 0000000000000e0d RBX: ffff88013989d8e0 RCX: 0000000000000000
Jun 29 15:56:39 btrfs2 kernel: [ 8279.760392] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8800bddc5b30
Jun 29 15:56:39 btrfs2 kernel: [ 8279.767389] RBP: ffffffff8020c50e R08: 0000000000000001 R09: 0000000000000000
Jun 29 15:56:39 btrfs2 kernel: [ 8279.775385] R10: ffff88013989d7a0 R11: ffff88013989d8c0 R12: 0000000000000000
Jun 29 15:56:39 btrfs2 kernel: [ 8279.782388] R13: 0000000000000000 R14: ffff88013989d8c0 R15: ffffffffa036abbd
Jun 29 15:56:39 btrfs2 kernel: [ 8279.790387] FS:  0000000000000000(0000) GS:ffff88002bac0000(0000) knlGS:0000000000000000
Jun 29 15:56:39 btrfs2 kernel: [ 8279.799381] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jun 29 15:56:39 btrfs2 kernel: [ 8279.805384] CR2: 00007ff77fc11b80 CR3: 000000013d1f3000 CR4: 00000000000006e0
Jun 29 15:56:39 btrfs2 kernel: [ 8279.812383] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 29 15:56:39 btrfs2 kernel: [ 8279.820378] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 29 15:56:39 btrfs2 kernel: [ 8279.828345] Call Trace:
Jun 29 15:56:39 btrfs2 kernel: [ 8279.830378]  [<ffffffffa03770bb>] ? btrfs_tree_lock+0x54/0x9e [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.837373]  [<ffffffffa037700e>] ? btrfs_wake_function+0x0/0x10 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.844375]  [<ffffffffa0342294>] ? push_leaf_left+0xc1/0x155 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.851372]  [<ffffffffa03429d6>] ? split_leaf+0x63/0x64f [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.858372]  [<ffffffffa033d837>] ? leaf_space_used+0xbc/0xeb [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.865368]  [<ffffffffa0344a85>] ? btrfs_search_slot+0x687/0x73e [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.872370]  [<ffffffffa034511d>] ? btrfs_insert_empty_items+0x5e/0xa9 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.880370]  [<ffffffffa0346ce0>] ? alloc_reserved_file_extent+0x89/0x1c3 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.888367]  [<ffffffffa034b198>] ? run_one_delayed_ref+0x382/0x42f [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.895363]  [<ffffffffa036abbd>] ? map_extent_buffer+0xab/0xbe [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.902366]  [<ffffffffa034bf75>] ? run_clustered_refs+0x237/0x2b4 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.910361]  [<ffffffffa037ef71>] ? btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.917357]  [<ffffffffa034c09e>] ? btrfs_run_delayed_refs+0xac/0x195 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.925356]  [<ffffffffa035486e>] ? __btrfs_end_transaction+0x59/0xfe [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.932361]  [<ffffffffa035492e>] ? btrfs_end_transaction+0xb/0xd [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.940359]  [<ffffffffa035a18b>] ? btrfs_finish_ordered_io+0x224/0x24d [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.948362]  [<ffffffffa035a1c4>] ? btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.956352]  [<ffffffffa036d585>] ? end_bio_extent_writepage+0xa3/0x18f [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.964351]  [<ffffffff8024276e>] ? del_timer_sync+0x14/0x20
Jun 29 15:56:39 btrfs2 kernel: [ 8279.970352]  [<ffffffff802cbbee>] ? bio_endio+0x26/0x28
Jun 29 15:56:39 btrfs2 kernel: [ 8279.976349]  [<ffffffffa03515d6>] ? end_workqueue_fn+0x111/0x11e [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.983345]  [<ffffffffa0374fe1>] ? worker_loop+0x67/0x1ee [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.989345]  [<ffffffffa0374f7a>] ? worker_loop+0x0/0x1ee [btrfs]
Jun 29 15:56:39 btrfs2 kernel: [ 8279.996345]  [<ffffffff8024c324>] ? kthread+0x56/0x86
Jun 29 15:56:39 btrfs2 kernel: [ 8280.001345]  [<ffffffff8020c9fa>] ? child_rip+0xa/0x20
Jun 29 15:56:39 btrfs2 kernel: [ 8280.007343]  [<ffffffff8024c2ce>] ? kthread+0x0/0x86
Jun 29 15:56:39 btrfs2 kernel: [ 8280.012342]  [<ffffffff8020c9f0>] ? child_rip+0x0/0x20


Steve
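
One detail in the first oops may help narrow this down. In its Code:
line, the bytes just before the trapping <0f> 0b (ud2) are e8 .. (a
call) followed by 85 c0 74 04 (test %eax,%eax; je), which is the shape
a BUG_ON on a nonzero return value usually compiles to. If that reading
is right, the low 32 bits of RAX hold the failing return code. The
sketch below decodes it; that RAX still holds the return value is an
assumption, and the helper names are made up for illustration.

```python
import errno


def as_signed32(value):
    """Interpret the low 32 bits of a register value as a signed int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def describe_return(rax):
    """Return (signed_value, errno_name or None) for a kernel return code."""
    ret = as_signed32(rax)
    if ret < 0:
        return ret, errno.errorcode.get(-ret, "unknown")
    return ret, None


# RAX from the first oops in this message.
print(describe_return(0x00000000FFFFFFEF))  # (-17, 'EEXIST')
```

The soft lockup trace lists btrfs_insert_empty_items just below
alloc_reserved_file_extent+0x89, which is where that call would return
to, so the item insert apparently found the key already present. Those
frames are marked with '?', so treat this as a hint, not a diagnosis.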

> Hopefully that will fall over without btrfs helping.
>
> -chris
>   


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Single disk performance
  2009-06-30 14:38       ` Steven Pratt
@ 2009-06-30 15:10         ` Yan Zheng
  2009-06-30 15:26           ` Steven Pratt
  0 siblings, 1 reply; 9+ messages in thread
From: Yan Zheng @ 2009-06-30 15:10 UTC (permalink / raw)
  To: Steven Pratt; +Cc: Chris Mason, linux-btrfs

2009/6/30 Steven Pratt <slpratt@austin.ibm.com>:
> Chris Mason wrote:
>>
>> On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
>>
>>>
>>> Chris Mason wrote:
>>>
>>>>
>>>> On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
>>>>
>>>>>
>>>>> Upgraded the btrfs tree to 6-17 and all of the stability problems went
>>>>>  away on the single disk system, so not sure if this was a code problem  or
>>>>> hardware, but at least stable now.
>>>>> Performance results updated at:
>>>>> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>>>>>
>>>>> The fixed to the cow path are obvious for random write, although even
>>>>> on  single disk the CPU overhead is very noticeable as the efficiency graphs
>>>>>  show.
>>>>>
>>>>> The good news is that now the only workload that Btrfs is not at or
>>>>> near  the top in performance for single disk is MailServer.
>>>>>
>>>>
>>>> Thanks Steve, glad to hear the stability problems are gone.
>>>>
>>>>
>>>
>>> Well, maybe I spoke too soon. :-(    Run with this patch died in similar
>>>  way to before.  My remote service console is not responding, so will
>>>  probably be Monday before I can get to the lab to restart manually.
>>>
>>>
>>> I am getting messages like:
>>>
>>> Lots of these timeout messages, then eventually
>>>
>>> 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error
>>> code
>>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:
>>>  hostbyte=DID_ABORT driverbyte=DRIVER_OK
>>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,
>>>  dev sdb, sector 103359232
>>>
>>> So still not sure if this is HW, but no other FS has triggered it.
>>>
>>>
>>
>> I'm afraid Btrfs can't do this on its own.  It needs to HW, scsi
>> drivers or HW or scsi drivdes ;)
>>
>> You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232
>>
>
> Well, dd write of entire drive shows no errors.  Ran btrfs tests again and
> go this, no disk or scsi errors reported this time.
>
>
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] kernel BUG at
> fs/btrfs/extent-tree.c:3865!
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] invalid opcode: 0000 [#1] SMP
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] last sysfs file:
> /sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CPU 8
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Modules linked in: oprofile
> btrfs zlib_deflate autofs4 nfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath
> sbs sbshc ba
> ttery ac parport_pc lp parport sg joydev serio_raw acpi_memhotplug rtc_cmos
> rtc_core rtc_lib button tg3 libphy i2c_piix4 i2c_core pcspkr dm_snapshot
> dm_zero dm_mir
> ror dm_region_hash dm_log dm_mod lpfc scsi_transport_fc aic94xx libsas
> libata scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd
> ehci_hcd [last unloaded
> : microcode]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Pid: 21731, comm:
> btrfs-endio-wri Not tainted 2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RIP: 0010:[<ffffffffa0346ce4>]
>  [<ffffffffa0346ce4>] alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RSP: 0018:ffff88013e10bb60
>  EFLAGS: 00010282
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RAX: 00000000ffffffef RBX:
> ffff88006fbde000 RCX: 0000000000000002
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RDX: 0000000000000001 RSI:
> 0000000000000000 RDI: ffff8801020ac5b0
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RBP: ffff88013e10bbd0 R08:
> ffff88013e10b9d8 R09: ffff88013e10b9d0
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R10: 0000000000000004 R11:
> ffff8801020ac5b0 R12: 000000000000001d
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R13: ffff88012e1e7910 R14:
> 0000000000000000 R15: 0000000000000000
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] FS:  0000000000000000(0000)
> GS:ffff88002bac0000(0000) knlGS:0000000000000000
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CS:  0010 DS: 0018 ES: 0018
> CR0: 000000008005003b
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CR2: 00007fffdac2efb0 CR3:
> 0000000138cc9000 CR4: 00000000000006e0
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Process btrfs-endio-wri (pid:
> 21731, threadinfo ffff88013e10a000, task ffff880138d117b0)
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Stack:
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  0000000000000000
> 00000000000011d5 0000000000000005 0000000000000000
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  ffff88005fcb0800
> ffff88011a47f860 000000b2844a5030 000000000000008c
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  000000352e1e7910
> ffff8800be095540 ffff8800be095740 0000000000000001
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Call Trace:
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034b198>]
> run_one_delayed_ref+0x382/0x42f [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036abbd>] ?
> map_extent_buffer+0xab/0xbe [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034bf75>]
> run_clustered_refs+0x237/0x2b4 [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa037ef71>] ?
> btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034c09e>]
> btrfs_run_delayed_refs+0xac/0x195 [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035486e>]
> __btrfs_end_transaction+0x59/0xfe [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035492e>]
> btrfs_end_transaction+0xb/0xd [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a18b>]
> btrfs_finish_ordered_io+0x224/0x24d [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a1c4>]
> btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036d585>]
> end_bio_extent_writepage+0xa3/0x18f [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024276e>] ?
> del_timer_sync+0x14/0x20
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff802cbbee>]
> bio_endio+0x26/0x28
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa03515d6>]
> end_workqueue_fn+0x111/0x11e [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374fe1>]
> worker_loop+0x67/0x1ee [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374f7a>] ?
> worker_loop+0x0/0x1ee [btrfs]
> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c324>]
> kthread+0x56/0x86
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9fa>]
> child_rip+0xa/0x20
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c2ce>] ?
> kthread+0x0/0x86
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9f0>] ?
> child_rip+0x0/0x20
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] Code: 08 4c 8d 45 d4 41 8d 44
> 24 18 48 8b 73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45
> d4 e8 df e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83
> cf 01 4c 89 e7 48 6b
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] RIP  [<ffffffffa0346ce4>]
> alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  RSP <ffff88013e10bb60>
> Jun 29 15:55:35 btrfs2 kernel: [ 8215.101864] ---[ end trace
> 2a2583ccd67ef43b ]---
>

Are there any messages like "parent transid verify failed on xxx wanted xxx
found" in the log?

Thank you,
Yan Zheng
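
A minimal sketch of the check being asked about, assuming the kernel log has
been saved as plain text with one message per line (unlike the wrapped quotes
above). The function name find_transid_failures, the sample line, and the
assumption that the three "xxx" fields are decimal numbers are illustrative
and not taken from the thread.

```python
import re

# Pattern for the message quoted in the question; the numeric fields are
# assumed to be decimal (an assumption, the thread only shows "xxx").
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def find_transid_failures(lines):
    """Return (block, wanted, found) tuples for every matching log line."""
    hits = []
    for line in lines:
        match = TRANSID_RE.search(line)
        if match:
            hits.append(tuple(int(group) for group in match.groups()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the reported log.
    sample = [
        "kernel: [ 100.000000] parent transid verify failed on 4096 wanted 12 found 10",
        "kernel: [ 101.000000] end_request: I/O error, dev sdb, sector 1",
    ]
    print(find_transid_failures(sample))
```

An empty result corresponds to the "nothing like that" case: the log holds no
transid mismatch messages.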


* Re: Single disk performance
  2009-06-30 15:10         ` Yan Zheng
@ 2009-06-30 15:26           ` Steven Pratt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Pratt @ 2009-06-30 15:26 UTC (permalink / raw)
  To: Yan Zheng; +Cc: Chris Mason, linux-btrfs

Yan Zheng wrote:
> 2009/6/30 Steven Pratt <slpratt@austin.ibm.com>:
>   
>> Chris Mason wrote:
>>     
>>> On Fri, Jun 26, 2009 at 09:26:59PM -0500, Steven Pratt wrote:
>>>
>>>       
>>>> Chris Mason wrote:
>>>>
>>>>         
>>>>> On Fri, Jun 26, 2009 at 09:28:51AM -0500, Steven Pratt wrote:
>>>>>
>>>>>           
>>>>>> Upgraded the btrfs tree to 6-17 and all of the stability problems went
>>>>>>  away on the single disk system, so not sure if this was a code problem  or
>>>>>> hardware, but at least stable now.
>>>>>> Performance results updated at:
>>>>>> http://btrfs.boxacle.net/repository/single-disk/History/History.html
>>>>>>
>>>>>> The fixes to the cow path are obvious for random write, although even
>>>>>> on single disk the CPU overhead is very noticeable as the efficiency graphs
>>>>>> show.
>>>>>>
>>>>>> The good news is that now the only workload that Btrfs is not at or
>>>>>> near  the top in performance for single disk is MailServer.
>>>>>>
>>>>>>             
>>>>> Thanks Steve, glad to hear the stability problems are gone.
>>>>>
>>>>>
>>>>>           
>>>> Well, maybe I spoke too soon. :-(    Run with this patch died in similar
>>>>  way to before.  My remote service console is not responding, so will
>>>>  probably be Monday before I can get to the lab to restart manually.
>>>>
>>>>
>>>> I am getting messages like:
>>>>
>>>> Lots of these timeout messages, then eventually
>>>>
>>>> 18:40:32 btrfs2 kernel: [ 4459.870613] sd 0:0:1:0: [sdb] Unhandled error
>>>> code
>>>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870640] sd 0:0:1:0: [sdb] Result:
>>>>  hostbyte=DID_ABORT driverbyte=DRIVER_OK
>>>> Jun 26 18:40:32 btrfs2 kernel: [ 4459.870646] end_request: I/O error,
>>>>  dev sdb, sector 103359232
>>>>
>>>> So still not sure if this is HW, but no other FS has triggered it.
>>>>
>>>>
>>>>         
>>> I'm afraid Btrfs can't do this on its own.  It needs help from HW, scsi
>>> drivers or HW or scsi drivers ;)
>>>
>>> You could try dd if=/dev/sdb of=/dev/zero bs=512 count=1 skip=103359232
>>>
>>>       
>> Well, dd write of entire drive shows no errors.  Ran btrfs tests again and
>> got this, no disk or scsi errors reported this time.
>>
>>
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] kernel BUG at
>> fs/btrfs/extent-tree.c:3865!
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] invalid opcode: 0000 [#1] SMP
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] last sysfs file:
>> /sys/devices/system/cpu/cpu15/cache/index1/shared_cpu_map
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CPU 8
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Modules linked in: oprofile
>> btrfs zlib_deflate autofs4 nfs lockd nfs_acl auth_rpcgss sunrpc dm_multipath
>> sbs sbshc battery ac parport_pc lp parport sg joydev serio_raw acpi_memhotplug
>> rtc_cmos rtc_core rtc_lib button tg3 libphy i2c_piix4 i2c_core pcspkr
>> dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod lpfc
>> scsi_transport_fc aic94xx libsas libata scsi_transport_sas sd_mod scsi_mod
>> ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Pid: 21731, comm:
>> btrfs-endio-wri Not tainted 2.6.30-rc7-autokern1 #1 IBM x3950-[88726RU]-
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RIP: 0010:[<ffffffffa0346ce4>]
>>  [<ffffffffa0346ce4>] alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RSP: 0018:ffff88013e10bb60
>>  EFLAGS: 00010282
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RAX: 00000000ffffffef RBX:
>> ffff88006fbde000 RCX: 0000000000000002
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RDX: 0000000000000001 RSI:
>> 0000000000000000 RDI: ffff8801020ac5b0
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] RBP: ffff88013e10bbd0 R08:
>> ffff88013e10b9d8 R09: ffff88013e10b9d0
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R10: 0000000000000004 R11:
>> ffff8801020ac5b0 R12: 000000000000001d
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] R13: ffff88012e1e7910 R14:
>> 0000000000000000 R15: 0000000000000000
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] FS:  0000000000000000(0000)
>> GS:ffff88002bac0000(0000) knlGS:0000000000000000
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CS:  0010 DS: 0018 ES: 0018
>> CR0: 000000008005003b
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] CR2: 00007fffdac2efb0 CR3:
>> 0000000138cc9000 CR4: 00000000000006e0
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR0: 0000000000000000 DR1:
>> 0000000000000000 DR2: 0000000000000000
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] DR3: 0000000000000000 DR6:
>> 00000000ffff0ff0 DR7: 0000000000000400
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Process btrfs-endio-wri (pid:
>> 21731, threadinfo ffff88013e10a000, task ffff880138d117b0)
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Stack:
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  0000000000000000
>> 00000000000011d5 0000000000000005 0000000000000000
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  ffff88005fcb0800
>> ffff88011a47f860 000000b2844a5030 000000000000008c
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  000000352e1e7910
>> ffff8800be095540 ffff8800be095740 0000000000000001
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011] Call Trace:
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034b198>]
>> run_one_delayed_ref+0x382/0x42f [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036abbd>] ?
>> map_extent_buffer+0xab/0xbe [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034bf75>]
>> run_clustered_refs+0x237/0x2b4 [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa037ef71>] ?
>> btrfs_find_ref_cluster+0xdc/0x115 [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa034c09e>]
>> btrfs_run_delayed_refs+0xac/0x195 [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035486e>]
>> __btrfs_end_transaction+0x59/0xfe [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035492e>]
>> btrfs_end_transaction+0xb/0xd [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a18b>]
>> btrfs_finish_ordered_io+0x224/0x24d [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa035a1c4>]
>> btrfs_writepage_end_io_hook+0x10/0x12 [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa036d585>]
>> end_bio_extent_writepage+0xa3/0x18f [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024276e>] ?
>> del_timer_sync+0x14/0x20
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff802cbbee>]
>> bio_endio+0x26/0x28
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa03515d6>]
>> end_workqueue_fn+0x111/0x11e [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374fe1>]
>> worker_loop+0x67/0x1ee [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffffa0374f7a>] ?
>> worker_loop+0x0/0x1ee [btrfs]
>> Jun 29 15:55:34 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c324>]
>> kthread+0x56/0x86
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9fa>]
>> child_rip+0xa/0x20
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8024c2ce>] ?
>> kthread+0x0/0x86
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  [<ffffffff8020c9f0>] ?
>> child_rip+0x0/0x20
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] Code: 08 4c 8d 45 d4 41 8d 44
>> 24 18 48 8b 73 20 48 8b 4d 18 41 b9 01 00 00 00 48 8b 7d b8 4c 89 ea 89 45
>> d4 e8 df e3 ff ff 85 c0 74 04 <0f> 0b eb fe 49 63 75 40 4d 8b 65 00 49 83
>> cf 01 4c 89 e7 48 6b
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011] RIP  [<ffffffffa0346ce4>]
>> alloc_reserved_file_extent+0x8d/0x1c3 [btrfs]
>> Jun 29 15:55:35 btrfs2 kernel: [ 8214.725011]  RSP <ffff88013e10bb60>
>> Jun 29 15:55:35 btrfs2 kernel: [ 8215.101864] ---[ end trace
>> 2a2583ccd67ef43b ]---
>>
>>     
>
> Are there any messages like "parent transid verify failed on xxx wanted xxx
> found" in the log?
>   

No, nothing like that.

Steve
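
One hedged reading of the trace quoted above: the Code: bytes just before the
trapping instruction (e8 df e3 ff ff, 85 c0, 74 04, <0f> 0b) decode as a call,
test %eax,%eax, je, and ud2, which is consistent with a BUG_ON() on a nonzero
return value, and RAX holds 00000000ffffffef. The sketch below interprets the
low 32 bits of that register as a signed int and looks it up in the errno
table. The helper name is hypothetical, and treating RAX as the return value
of the preceding call is an assumption based on the x86-64 calling convention,
not something stated in the thread.

```python
import errno


def decode_return_register(rax_hex):
    """Interpret the low 32 bits of a register dump as a signed C int.

    Returns (value, errno_name); errno_name is None when the value is not
    a negative errno known to this platform.
    """
    low32 = int(rax_hex, 16) & 0xFFFFFFFF
    # Two's-complement conversion from unsigned to signed 32-bit.
    value = low32 - (1 << 32) if low32 & 0x80000000 else low32
    name = errno.errorcode.get(-value) if value < 0 else None
    return value, name


if __name__ == "__main__":
    # RAX value copied from the oops quoted in this thread.
    print(decode_return_register("00000000ffffffef"))
```

On Linux this gives -17, which the errno table names EEXIST. If that reading
is right, the BUG was hit because an insert found an item that already
existed, but the thread itself does not confirm this.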
> Thank you,
> Yan Zheng



Thread overview: 9+ messages
2009-06-26 14:28 Single disk performance Steven Pratt
2009-06-26 20:56 ` Chris Mason
2009-06-27  2:26   ` Steven Pratt
2009-06-29 12:41     ` Chris Mason
2009-06-29 23:17       ` Bron Gondwana
2009-06-30 11:02         ` Chris Mason
2009-06-30 14:38       ` Steven Pratt
2009-06-30 15:10         ` Yan Zheng
2009-06-30 15:26           ` Steven Pratt
