All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Tian <aixt2006@gmail.com>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel@vger.kernel.org
Subject: Re: operate one file in multi clients with libceph
Date: Thu, 19 May 2011 14:58:42 +0800	[thread overview]
Message-ID: <BANLkTin+D66OuXUbiRkDBNiLx--8eu2hkw@mail.gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.1105180858360.1988@cobra.newdream.net>

Hi Sage,

    I've test this patch.
With 8 times test, only 1 (the first time) got a thread assert fail.As
back trace 1 showed:
==========================  back trace 1   ==================================
(gdb) bt
#0  0x000000367fa30265 in raise () from /lib64/libc.so.6
#1  0x000000367fa31d10 in abort () from /lib64/libc.so.6
#2  0x0000003682ebec44 in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib64/libstdc++.so.6
#3  0x0000003682ebcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x0000003682ebcde3 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x0000003682ebceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00007ffff7c51a54 in ceph::__ceph_assert_fail
(assertion=0x7ffff7ce31f1 "in->cap_refs[(32 << 8)] == 1",
    file=0x7ffff7ce1431 "client/Client.cc", line=2190,
func=0x7ffff7ce5fc0 "void Client::_flush(Inode*, Context*)")
    at common/assert.cc:86
#7  0x00007ffff7b39bb0 in Client::_flush (this=0x629090, in=0x630f50,
onfinish=0x7fffe8000b60) at client/Client.cc:2190
#8  0x00007ffff7b52426 in Client::handle_cap_grant (this=0x629090,
in=0x630f50, mds=0, cap=0x62ee50, m=0x631fc0)
    at client/Client.cc:2930
#9  0x00007ffff7b52d2a in Client::handle_caps (this=0x629090,
m=0x631fc0) at client/Client.cc:2711
#10 0x00007ffff7b560a0 in Client::ms_dispatch (this=0x629090,
m=0x631fc0) at client/Client.cc:1444
#11 0x00007ffff7bcaff9 in Messenger::ms_deliver_dispatch
(this=0x628350, m=0x631fc0) at msg/Messenger.h:98
#12 0x00007ffff7bb2607 in SimpleMessenger::dispatch_entry
(this=0x628350) at msg/SimpleMessenger.cc:352
#13 0x00007ffff7b22641 in SimpleMessenger::DispatchThread::entry
(this=0x6287d8) at msg/SimpleMessenger.h:533
#14 0x00007ffff7b5aa04 in Thread::_entry_func (arg=0x6287d8) at
./common/Thread.h:41
#15 0x00000036802064a7 in start_thread () from /lib64/libpthread.so.0
#16 0x000000367fad3c2d in clone () from /lib64/libc.so.6
============================================================

In addition, when writing data for a long time, the write thread will
hang on a pthread_cond_wait, as back trace 2 showed:
==========================  back trace 2  ==================================
Thread 10 (Thread 0x43f87940 (LWP 19986)):
#0  0x000000312be0ab99 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00007f147a8e763b in Cond::Wait (this=0x43f86b00,
mutex=@0x66d790) at ./common/Cond.h:46
#2  0x00007f147a86630d in Client::wait_on_list (this=0x66d430,
ls=@0x674df0) at client/Client.cc:2140
#3  0x00007f147a8771ff in Client::get_caps (this=0x66d430,
in=0x6749d0, need=4096, want=8192, got=0x43f86e8c, endoff=235470848)
    at client/Client.cc:1827
#4  0x00007f147a886003 in Client::_write (this=0x66d430, f=0x671510,
offset=235470336, size=512,
    buf=0x6452b0 '' <repeats 33 times>, ".001", '' <repeats 165
times>...) at client/Client.cc:5055
#5  0x00007f147a886d73 in Client::write (this=0x66d430, fd=10,
    buf=0x6452b0 '' <repeats 33 times>, ".001", '' <repeats 165
times>..., size=512, offset=235470336) at client/Client.cc:5007
#6  0x00007f147a85822d in ceph_write (fd=10, buf=0x6452b0 '' <repeats
33 times>, ".001", '' <repeats 165 times>..., size=512,
    offset=235470336) at libceph.cc:322
#7  0x000000000042fa27 in TdcCephImpl::AsyncWrite (this=0x647c30,
offset=235470336, length=512,
    buf=0x6452b0 '' <repeats 33 times>, ".001", '' <repeats 165
times>..., queue=@0x644b60, io=0x6456a0)
    at /opt/tsk/tdc-tapdisk/td-connector2.0/common/tdc_ceph_impl.cpp:191
#8  0x000000000042ffc5 in TdcCephImpl::AsyncProcess (this=0x647c30,
io=0x6456a0, queue=@0x644b60, netfd=5)
    at /opt/tsk/tdc-tapdisk/td-connector2.0/common/tdc_ceph_impl.cpp:268
#9  0x00000000004192be in SubmitRequest_Intra (arg=0x644b00) at
/opt/tsk/tdc-tapdisk/td-connector2.0/td_connector_server.cpp:390
#10 0x000000312be064a7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000312b6d3c2d in clone () from /lib64/libc.so.6
============================================================

Seem not very safe with the patch.

So I applied this patch:
diff --git a/src/client/Client.cc b/src/client/Client.cc
index 7f7fb08..bf0997a 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -4934,7 +4934,6 @@ public:

 void Client::sync_write_commit(Inode *in)
 {
-  client_lock.Lock();
+ int r = client_lock.Lock();
  assert(unsafe_sync_write > 0);
  unsafe_sync_write--;

@@ -4947,8 +4946,6 @@ void Client::sync_write_commit(Inode *in)
  }

  put_inode(in);
+ if (r)
  client_lock.Unlock();
 }

 int Client::write(int fd, const char *buf, loff_t size, loff_t offset)


After some test, back trace 1 didn't appear, but back trace 2 still appear.
Hmm,, not safe also.

Thx!
Simon



2011/5/19 Sage Weil <sage@newdream.net>:
> Hi Simon,
>
> On Wed, 18 May 2011, Simon Tian wrote:
>
>> Hi,
>>
>>     Could any one give me an answer?
>>
>>    My application need to open one file with two clients in different
>> hosts. One for write/read and one for read only. The client could be
>> developed based on libceph or librbd.
>>    I tried librbd, exception appear too.
>
> I opened a bug for this, http://tracker.newdream.net/issues/1097.
>
> Can you try the patch below?  I this may just be something missed in a
> locking rewrite way back when in a rare code path:
>
> Thanks!
> sage
>
>
>
> diff --git a/src/client/Client.cc b/src/client/Client.cc
> index 7f7fb08..bf0997a 100644
> --- a/src/client/Client.cc
> +++ b/src/client/Client.cc
> @@ -4934,7 +4934,6 @@ public:
>
>  void Client::sync_write_commit(Inode *in)
>  {
> -  client_lock.Lock();
>   assert(unsafe_sync_write > 0);
>   unsafe_sync_write--;
>
> @@ -4947,8 +4946,6 @@ void Client::sync_write_commit(Inode *in)
>   }
>
>   put_inode(in);
> -
> -  client_lock.Unlock();
>  }
>
>  int Client::write(int fd, const char *buf, loff_t size, loff_t offset)
>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2011-05-19  6:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-17 13:00 operate one file in multi clients with libceph Simon Tian
2011-05-18 14:40 ` Simon Tian
2011-05-18 15:26   ` Brian Chrisman
2011-05-19  6:48     ` Simon Tian
2011-05-18 16:01   ` Sage Weil
2011-05-19  6:58     ` Simon Tian [this message]
2011-05-19 17:11       ` Sage Weil
2011-05-19 17:20         ` Sage Weil
2011-05-19 21:45           ` Sage Weil
2011-05-20  8:56             ` Simon Tian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BANLkTin+D66OuXUbiRkDBNiLx--8eu2hkw@mail.gmail.com \
    --to=aixt2006@gmail.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=sage@newdream.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.