All of lore.kernel.org
 help / color / mirror / Atom feed
* bufferlist rebuild_aligned issue
@ 2016-10-14  6:37 Zhangzengran
  2016-10-14  6:44 ` Haomai Wang
  2016-10-14 20:26 ` Sage Weil
  0 siblings, 2 replies; 7+ messages in thread
From: Zhangzengran @ 2016-10-14  6:37 UTC (permalink / raw)
  To: Sage Weil (sage@newdream.net); +Cc: ceph-devel

Hi Sage,
  Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
The whole subop spend 7ms,and the pwritev spend 1.2ms, but the rebuild_aligned cost 1.1ms.the test’s object length is 1M,
Using rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0
and have tcmalloc. Is this normal?

2016-10-14 11:50:48.131103  queue_transactions existing 0x7f8900c302c0 osr(134.62s3 0x7f88ff458160)
2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234 [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
>> cost 1.1ms FileJournal::prepare_entry => bufferlist:: rebuild_aligned
2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
2016-10-14 11:50:48.133685  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
2016-10-14 11:50:48.134526  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
2016-10-14 11:50:48.134557  _omap_setkeys 134.62s3_head/3#134:46000000::::head#
2016-10-14 11:50:48.134705  _omap_setkeys 134.62s3_head/3#134:46000000::::head# = 0
2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
2016-10-14 11:50:48.134723  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
2016-10-14 11:50:48.137039  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
2016-10-14 11:50:48.137057  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
>> cost 1.2ms pwritev
2016-10-14 11:50:48.138255  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
2016-10-14 11:50:48.138291  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
2016-10-14 11:50:48.138313  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
2016-10-14 11:50:48.138337  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
2016-10-14 11:50:48.138357  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
2016-10-14 11:50:48.138386  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0, finisher 0x7f8909398680 0
2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306

Than you, best regards
-------------------------------------------------------------------------------------------------------------------------------------
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bufferlist rebuild_aligned issue
  2016-10-14  6:37 bufferlist rebuild_aligned issue Zhangzengran
@ 2016-10-14  6:44 ` Haomai Wang
  2016-10-14 20:26 ` Sage Weil
  1 sibling, 0 replies; 7+ messages in thread
From: Haomai Wang @ 2016-10-14  6:44 UTC (permalink / raw)
  To: Zhangzengran; +Cc: Sage Weil (sage@newdream.net), ceph-devel

On Fri, Oct 14, 2016 at 2:37 PM, Zhangzengran <zhangzengran@h3c.com> wrote:
> Hi Sage,
>   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
> The whole subop spend 7ms,and the pwritev spend 1.2ms, but the rebuild_aligned cost 1.1ms.the test’s object length is 1M,
> Using rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0
> and have tcmalloc. Is this normal?
>
> 2016-10-14 11:50:48.131103  queue_transactions existing 0x7f8900c302c0 osr(134.62s3 0x7f88ff458160)
> 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234 [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
>>> cost 1.1ms FileJournal::prepare_entry => bufferlist:: rebuild_aligned

why do you think 1.1ms is cost from rebuild... there exists a lot of
things. rebuild of course is a problem we need to resolve. but it
won't be problem in hdd case since it's just memory copy.

> 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
> 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
> 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
> 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
> 2016-10-14 11:50:48.133685  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
> 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
> 2016-10-14 11:50:48.134526  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
> 2016-10-14 11:50:48.134557  _omap_setkeys 134.62s3_head/3#134:46000000::::head#
> 2016-10-14 11:50:48.134705  _omap_setkeys 134.62s3_head/3#134:46000000::::head# = 0
> 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
> 2016-10-14 11:50:48.134723  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
> 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
> 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
> 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.137039  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.137057  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
>>> cost 1.2ms pwritev
> 2016-10-14 11:50:48.138255  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
> 2016-10-14 11:50:48.138291  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138313  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138337  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138357  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
> 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
> 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138386  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0, finisher 0x7f8909398680 0
> 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
>
> Than you, best regards
> -------------------------------------------------------------------------------------------------------------------------------------
> 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> 邮件!
> This e-mail and its attachments contain confidential information from H3C, which is
> intended only for the person or entity whose address is listed above. Any use of the
> information contained herein in any way (including, but not limited to, total or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bufferlist rebuild_aligned issue
  2016-10-14  6:37 bufferlist rebuild_aligned issue Zhangzengran
  2016-10-14  6:44 ` Haomai Wang
@ 2016-10-14 20:26 ` Sage Weil
  2016-10-14 20:31   ` Somnath Roy
  1 sibling, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-10-14 20:26 UTC (permalink / raw)
  To: Zhangzengran; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5742 bytes --]

On Fri, 14 Oct 2016, Zhangzengran wrote:
> Hi Sage,
>   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
> The whole subop spend 7ms,and the pwritev spend 1.2ms, but the rebuild_aligned cost 1.1ms.the test’s object length is 1M,
> Using rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0
> and have tcmalloc. Is this normal?
> 
> 2016-10-14 11:50:48.131103  queue_transactions existing 0x7f8900c302c0 osr(134.62s3 0x7f88ff458160)
> 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234 [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> >> cost 1.1ms FileJournal::prepare_entry => bufferlist:: rebuild_aligned

This looks like a bug.  In general we should never have to do a rebuild 
here.  Two things might be going wrong:

1- The messenger should be reading the write into aligned memory based on 
the offset in the ceph_msg_header.  You can compare the buffer pointer 
(just dout the bufferlist var and you'll see the buffer segments and their 
actual data pointers) and the write offset and make sure they match up.

2- The journal code has some delicate logic to adjust the head and tail 
padding so that the data buffer remains aligned to the journal blocks as 
well.  This code is in FileJournal::pre_pad.  It may also be that the 
alignment is getting clobbered by the ObjectStore::Transaction,
which tries to track the largest data buffer in the transaction and note 
it's alignment.

I suspect one of those two things is the culprit, though!

> 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
> 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
> 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
> 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
> 2016-10-14 11:50:48.133685  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
> 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
> 2016-10-14 11:50:48.134526  remove 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
> 2016-10-14 11:50:48.134557  _omap_setkeys 134.62s3_head/3#134:46000000::::head#
> 2016-10-14 11:50:48.134705  _omap_setkeys 134.62s3_head/3#134:46000000::::head# = 0
> 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
> 2016-10-14 11:50:48.134723  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
> 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
> 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
> 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.137039  _collection_move_rename 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.137057  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
> >> cost 1.2ms pwritev

This doing a buffered write, which copies data into the page cache.  It's 
not clear we can avoid that unless with do O_DIRECT.. which is tricky 
because of the way we cache file descriptors.

sage


> 2016-10-14 11:50:48.138255  write 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
> 2016-10-14 11:50:48.138291  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138313  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138337  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138357  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
> 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
> 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138386  setattrs 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0, finisher 0x7f8909398680 0
> 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
> 
> Than you, best regards
> -------------------------------------------------------------------------------------------------------------------------------------
> 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> 邮件!
> This e-mail and its attachments contain confidential information from H3C, which is
> intended only for the person or entity whose address is listed above. Any use of the
> information contained herein in any way (including, but not limited to, total or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: bufferlist rebuild_aligned issue
  2016-10-14 20:26 ` Sage Weil
@ 2016-10-14 20:31   ` Somnath Roy
  2016-10-14 20:33     ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Somnath Roy @ 2016-10-14 20:31 UTC (permalink / raw)
  To: Sage Weil, Zhangzengran; +Cc: ceph-devel

Sage,
I am seeing the rebuild bufferlist is happening from KernelDevice::aio_write  write path as well and as a result it is popping up during profiling. Isn't the buffer supposed to aligned in this case as well ?

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Sage Weil
Sent: Friday, October 14, 2016 1:26 PM
To: Zhangzengran
Cc: ceph-devel@vger.kernel.org
Subject: Re: bufferlist rebuild_aligned issue

On Fri, 14 Oct 2016, Zhangzengran wrote:
> Hi Sage,
>   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
> The whole subop spend 7ms,and the pwritev spend 1.2ms, but the
> rebuild_aligned cost 1.1ms.the test’s object length is 1M, Using rados
> put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0 and have tcmalloc. Is this normal?
>
> 2016-10-14 11:50:48.131103  queue_transactions existing 0x7f8900c302c0
> osr(134.62s3 0x7f88ff458160)
> 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234
> [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> >> cost 1.1ms FileJournal::prepare_entry => bufferlist::
> >> rebuild_aligned

This looks like a bug.  In general we should never have to do a rebuild here.  Two things might be going wrong:

1- The messenger should be reading the write into aligned memory based on the offset in the ceph_msg_header.  You can compare the buffer pointer (just dout the bufferlist var and you'll see the buffer segments and their actual data pointers) and the write offset and make sure they match up.

2- The journal code has some delicate logic to adjust the head and tail padding so that the data buffer remains aligned to the journal blocks as well.  This code is in FileJournal::pre_pad.  It may also be that the alignment is getting clobbered by the ObjectStore::Transaction, which tries to track the largest data buffer in the transaction and note it's alignment.

I suspect one of those two things is the culprit, though!

> 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
> 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
> 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234
> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
> 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
> 2016-10-14 11:50:48.133685  remove
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
> 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on
> 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
> 2016-10-14 11:50:48.134526  remove
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
> 2016-10-14 11:50:48.134557  _omap_setkeys
> 134.62s3_head/3#134:46000000::::head#
> 2016-10-14 11:50:48.134705  _omap_setkeys
> 134.62s3_head/3#134:46000000::::head# = 0
> 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
> 2016-10-14 11:50:48.134723  _collection_move_rename
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
> 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on
> 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
> 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
> 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
> 2016-10-14 11:50:48.137039  _collection_move_rename
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.137057  write
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
> >> cost 1.2ms pwritev

This doing a buffered write, which copies data into the page cache.  It's not clear we can avoid that unless with do O_DIRECT.. which is tricky because of the way we cache file descriptors.

sage


> 2016-10-14 11:50:48.138255  write
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
> 2016-10-14 11:50:48.138291  setattrs
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138313  setattrs
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138337  setattrs
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138357  setattrs
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
> 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
> 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
> 2016-10-14 11:50:48.138386  setattrs
> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0,
> finisher 0x7f8909398680 0
> 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234
> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
>
> Than you, best regards
> ----------------------------------------------------------------------
> ---------------------------------------------------------------
> 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> 邮件!
> This e-mail and its attachments contain confidential information from
> H3C, which is intended only for the person or entity whose address is
> listed above. Any use of the information contained herein in any way
> (including, but not limited to, total or partial disclosure,
> reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error,
> please notify the sender by phone or email immediately and delete it!
>
PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: bufferlist rebuild_aligned issue
  2016-10-14 20:31   ` Somnath Roy
@ 2016-10-14 20:33     ` Sage Weil
  2016-10-14 20:35       ` Somnath Roy
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-10-14 20:33 UTC (permalink / raw)
  To: Somnath Roy; +Cc: Zhangzengran, ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 7395 bytes --]

On Fri, 14 Oct 2016, Somnath Roy wrote:
> Sage,
> I am seeing the rebuild bufferlist is happening from KernelDevice::aio_write  write path as well and as a result it is popping up during profiling. Isn't the buffer supposed to aligned in this case as well ?

The bluestore rebuild is fixed as part of wip-denc, which should merge 
soon.. hopefully next week.

s
> 
> Thanks & Regards
> Somnath
> 
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Sage Weil
> Sent: Friday, October 14, 2016 1:26 PM
> To: Zhangzengran
> Cc: ceph-devel@vger.kernel.org
> Subject: Re: bufferlist rebuild_aligned issue
> 
> On Fri, 14 Oct 2016, Zhangzengran wrote:
> > Hi Sage,
> >   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
> > The whole subop spend 7ms,and the pwritev spend 1.2ms, but the
> > rebuild_aligned cost 1.1ms.the test’s object length is 1M, Using rados
> > put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0 and have tcmalloc. Is this normal?
> >
> > 2016-10-14 11:50:48.131103  queue_transactions existing 0x7f8900c302c0
> > osr(134.62s3 0x7f88ff458160)
> > 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234
> > [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> > >> cost 1.1ms FileJournal::prepare_entry => bufferlist::
> > >> rebuild_aligned
> 
> This looks like a bug.  In general we should never have to do a rebuild here.  Two things might be going wrong:
> 
> 1- The messenger should be reading the write into aligned memory based on the offset in the ceph_msg_header.  You can compare the buffer pointer (just dout the bufferlist var and you'll see the buffer segments and their actual data pointers) and the write offset and make sure they match up.
> 
> 2- The journal code has some delicate logic to adjust the head and tail padding so that the data buffer remains aligned to the journal blocks as well.  This code is in FileJournal::pre_pad.  It may also be that the alignment is getting clobbered by the ObjectStore::Transaction, which tries to track the largest data buffer in the transaction and note it's alignment.
> 
> I suspect one of those two things is the culprit, though!
> 
> > 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> > 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
> > 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
> > 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234
> > osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
> > 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
> > 2016-10-14 11:50:48.133685  remove
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
> > 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on
> > 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
> > 2016-10-14 11:50:48.134526  remove
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
> > 2016-10-14 11:50:48.134557  _omap_setkeys
> > 134.62s3_head/3#134:46000000::::head#
> > 2016-10-14 11:50:48.134705  _omap_setkeys
> > 134.62s3_head/3#134:46000000::::head# = 0
> > 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
> > 2016-10-14 11:50:48.134723  _collection_move_rename
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
> > 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
> > 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on
> > 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
> > 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
> > 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
> > 2016-10-14 11:50:48.137039  _collection_move_rename
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.137057  write
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
> > >> cost 1.2ms pwritev
> 
> This doing a buffered write, which copies data into the page cache.  It's not clear we can avoid that unless with do O_DIRECT.. which is tricky because of the way we cache file descriptors.
> 
> sage
> 
> 
> > 2016-10-14 11:50:48.138255  write
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
> > 2016-10-14 11:50:48.138291  setattrs
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138313  setattrs
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
> > 2016-10-14 11:50:48.138337  setattrs
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138357  setattrs
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
> > 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
> > 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
> > 2016-10-14 11:50:48.138386  setattrs
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0,
> > finisher 0x7f8909398680 0
> > 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234
> > osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
> >
> > Than you, best regards
> > ----------------------------------------------------------------------
> > ---------------------------------------------------------------
> > 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> > 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> > 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> > 邮件!
> > This e-mail and its attachments contain confidential information from
> > H3C, which is intended only for the person or entity whose address is
> > listed above. Any use of the information contained herein in any way
> > (including, but not limited to, total or partial disclosure,
> > reproduction, or dissemination) by persons other than the intended
> > recipient(s) is prohibited. If you receive this e-mail in error,
> > please notify the sender by phone or email immediately and delete it!
> >
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
> N?????r??y??????X??ǧv???)޺{.n?????z?]z????ay?\x1dʇڙ??j\a??f???h?????\x1e?w???\f???j:+v???w????????\a????zZ+???????j"????i

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: bufferlist rebuild_aligned issue
  2016-10-14 20:33     ` Sage Weil
@ 2016-10-14 20:35       ` Somnath Roy
  0 siblings, 0 replies; 7+ messages in thread
From: Somnath Roy @ 2016-10-14 20:35 UTC (permalink / raw)
  To: Sage Weil; +Cc: Zhangzengran, ceph-devel

Cool , thanks..
BTW, couldn't able to cherry-pick wip-denc as it needs rebase..Will profile once it is merged then..

Thanks & Regards
Somnath

-----Original Message-----
From: Sage Weil [mailto:sage@newdream.net] 
Sent: Friday, October 14, 2016 1:33 PM
To: Somnath Roy
Cc: Zhangzengran; ceph-devel@vger.kernel.org
Subject: RE: bufferlist rebuild_aligned issue

On Fri, 14 Oct 2016, Somnath Roy wrote:
> Sage,
> I am seeing the rebuild bufferlist is happening from KernelDevice::aio_write  write path as well and as a result it is popping up during profiling. Isn't the buffer supposed to aligned in this case as well ?

The bluestore rebuild is fixed as part of wip-denc, which should merge soon.. hopefully next week.

s
> 
> Thanks & Regards
> Somnath
> 
> -----Original Message-----
> From: ceph-devel-owner@vger.kernel.org 
> [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Sage Weil
> Sent: Friday, October 14, 2016 1:26 PM
> To: Zhangzengran
> Cc: ceph-devel@vger.kernel.org
> Subject: Re: bufferlist rebuild_aligned issue
> 
> On Fri, 14 Oct 2016, Zhangzengran wrote:
> > Hi Sage,
> >   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
> > The whole subop spend 7ms,and the pwritev spend 1.2ms, but the 
> > rebuild_aligned cost 1.1ms.the test’s object length is 1M, Using 
> > rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0 and have tcmalloc. Is this normal?
> >
> > 2016-10-14 11:50:48.131103  queue_transactions existing 
> > 0x7f8900c302c0
> > osr(134.62s3 0x7f88ff458160)
> > 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234 
> > [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> > >> cost 1.1ms FileJournal::prepare_entry => bufferlist::
> > >> rebuild_aligned
> 
> This looks like a bug.  In general we should never have to do a rebuild here.  Two things might be going wrong:
> 
> 1- The messenger should be reading the write into aligned memory based on the offset in the ceph_msg_header.  You can compare the buffer pointer (just dout the bufferlist var and you'll see the buffer segments and their actual data pointers) and the write offset and make sure they match up.
> 
> 2- The journal code has some delicate logic to adjust the head and tail padding so that the data buffer remains aligned to the journal blocks as well.  This code is in FileJournal::pre_pad.  It may also be that the alignment is getting clobbered by the ObjectStore::Transaction, which tries to track the largest data buffer in the transaction and note it's alignment.
> 
> I suspect one of those two things is the culprit, though!
> 
> > 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
> > 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
> > 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
> > 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234
> > osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
> > 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
> > 2016-10-14 11:50:48.133685  remove
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
> > 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on
> > 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
> > 2016-10-14 11:50:48.134526  remove
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
> > 2016-10-14 11:50:48.134557  _omap_setkeys 
> > 134.62s3_head/3#134:46000000::::head#
> > 2016-10-14 11:50:48.134705  _omap_setkeys 
> > 134.62s3_head/3#134:46000000::::head# = 0
> > 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
> > 2016-10-14 11:50:48.134723  _collection_move_rename 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
> > 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
> > 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on 
> > 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
> > 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
> > 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
> > 2016-10-14 11:50:48.137039  _collection_move_rename 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.137057  write
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
> > >> cost 1.2ms pwritev
> 
> This doing a buffered write, which copies data into the page cache.  It's not clear we can avoid that unless with do O_DIRECT.. which is tricky because of the way we cache file descriptors.
> 
> sage
> 
> 
> > 2016-10-14 11:50:48.138255  write
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
> > 2016-10-14 11:50:48.138291  setattrs 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138313  setattrs 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
> > 2016-10-14 11:50:48.138337  setattrs 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138357  setattrs 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
> > 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
> > 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
> > 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
> > 2016-10-14 11:50:48.138386  setattrs 
> > 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
> > 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0, 
> > finisher 0x7f8909398680 0
> > 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234
> > osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
> >
> > Than you, best regards
> > --------------------------------------------------------------------
> > --
> > ---------------------------------------------------------------
> > 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> > 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> > 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> > 邮件!
> > This e-mail and its attachments contain confidential information 
> > from H3C, which is intended only for the person or entity whose 
> > address is listed above. Any use of the information contained herein 
> > in any way (including, but not limited to, total or partial 
> > disclosure, reproduction, or dissemination) by persons other than 
> > the intended
> > recipient(s) is prohibited. If you receive this e-mail in error, 
> > please notify the sender by phone or email immediately and delete it!
> >
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
> N?????r??y??????X??ǧv???)޺{.n?????z?]z????ay?\x1dʇڙ??j ??f???h?????\x1e?w???
???j:+v???w???????? ????zZ+???????j"????i

^ permalink raw reply	[flat|nested] 7+ messages in thread

* re: bufferlist rebuild_aligned issue
@ 2016-10-14  7:03 Zhangzengran
  0 siblings, 0 replies; 7+ messages in thread
From: Zhangzengran @ 2016-10-14  7:03 UTC (permalink / raw)
  To: haomai; +Cc: Sage Weil (sage@newdream.net), ceph-devel

>> Hi Sage,
>>   Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer.
>> The whole subop spend 7ms,and the pwritev spend 1.2ms, but the
>> rebuild_aligned cost 1.1ms.the test’s object length is 1M, Using
>> rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0 and have tcmalloc. Is this normal?
>>
>> 2016-10-14 11:50:48.131103  queue_transactions existing
>> 0x7f8900c302c0
>> osr(134.62s3 0x7f88ff458160)
>> 2016-10-14 11:50:48.132477  queue_transactions (writeahead) 3234
>> [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
>>>> cost 1.1ms FileJournal::prepare_entry => bufferlist::
>>>> rebuild_aligned
>
>why do you think 1.1ms is cost from rebuild... there exists a lot of things. rebuild of course is a problem we need to resolve. but it won't be problem in hdd case since it's just memory copy.

i printed the timestamp on the path, and found the most time of 1.1ms spend in bufferlist::rebuild_aligned...
the 1M data of pwritev is write to page cache not on disk, maybe it let bufferlist aligned seem somewhat expensive...

>> 2016-10-14 11:50:48.133583  _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)]
>> 2016-10-14 11:50:48.133620  queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes   (queue has 1 ops and 1051330 bytes)
>> 2016-10-14 11:50:48.133630   queueing ondisk 0x7f8903ab77c0
>> 2016-10-14 11:50:48.133654  _do_op 0x7f88ff586ba0 seq 3234
>> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start
>> 2016-10-14 11:50:48.133669  _do_transaction on 0x7f8907736000
>> 2016-10-14 11:50:48.133685  remove
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298
>> 2016-10-14 11:50:48.133754  lfn_unlink: clearing omap on
>> 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head
>> 2016-10-14 11:50:48.134526  remove
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0
>> 2016-10-14 11:50:48.134557  _omap_setkeys
>> 134.62s3_head/3#134:46000000::::head#
>> 2016-10-14 11:50:48.134705  _omap_setkeys
>> 134.62s3_head/3#134:46000000::::head# = 0
>> 2016-10-14 11:50:48.134716  _do_transaction on 0x7f89077361c0
>> 2016-10-14 11:50:48.134723  _collection_move_rename
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
>> 2016-10-14 11:50:48.134736  _set_replay_guard 3234.1.0 START
>> 2016-10-14 11:50:48.135859  _set_replay_guard 3234.1.0 done
>> 2016-10-14 11:50:48.136130  lfn_unlink: clearing omap on
>> 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head
>> 2016-10-14 11:50:48.136305  _close_replay_guard 3234.1.0
>> 2016-10-14 11:50:48.137023  _close_replay_guard 3234.1.0 done
>> 2016-10-14 11:50:48.137039  _collection_move_rename
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
>> 2016-10-14 11:50:48.137057  write
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576
>>>> cost 1.2ms pwritev
>> 2016-10-14 11:50:48.138255  write
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576
>> 2016-10-14 11:50:48.138291  setattrs
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
>> 2016-10-14 11:50:48.138313  setattrs
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
>> 2016-10-14 11:50:48.138330  fgetattrs 70 getting 'hinfo_key'
>> 2016-10-14 11:50:48.138337  setattrs
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
>> 2016-10-14 11:50:48.138357  setattrs
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0
>> 2016-10-14 11:50:48.138371  fgetattrs 70 getting '_'
>> 2016-10-14 11:50:48.138376  fgetattrs 70 getting 'snapset'
>> 2016-10-14 11:50:48.138381  fgetattrs 70 getting 'hinfo_key'
>> 2016-10-14 11:50:48.138386  setattrs
>> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#
>> 2016-10-14 11:50:48.138409  _do_op 0x7f88ff586ba0 seq 3234 r = 0,
>> finisher 0x7f8909398680 0
>> 2016-10-14 11:50:48.138420  _finish_op 0x7f88ff586ba0 seq 3234
>> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306
>>
>> Than you, best regards
-------------------------------------------------------------------------------------------------------------------------------------
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-10-14 20:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-14  6:37 bufferlist rebuild_aligned issue Zhangzengran
2016-10-14  6:44 ` Haomai Wang
2016-10-14 20:26 ` Sage Weil
2016-10-14 20:31   ` Somnath Roy
2016-10-14 20:33     ` Sage Weil
2016-10-14 20:35       ` Somnath Roy
2016-10-14  7:03 Zhangzengran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.