* filestore_fiemap is bad, memstore is buggy.
@ 2016-09-15 21:48 Sage Weil
  2016-09-16  3:40 ` Haomai Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-09-15 21:48 UTC (permalink / raw)
  To: ceph-devel

I was adding tests for clone_range in preparation for improving the 
bluestore implementation and got stuck on memstore and filestore bugs!

The most alarming is that filestore very quickly fails the tests if 
filestore_fiemap is enabled.  It's off by default (phew!) but the 
ceph_test_objectstore test was explicitly enabling it to get better 
coverage.  For now I've just removed that so we stick with the default (no 
fiemap == good).  I wonder if we should consider removing fiemap from the 
code entirely, though, since it is clearly buggy, even on a modern kernel 
(I'm running 4.4 and XFS).
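
(For context: "fiemap" here means asking the filesystem for the data
extents backing a range via the FS_IOC_FIEMAP ioctl, roughly like the
sketch below. This is only an illustration, not the FileStore code;
print_extents and the 64-extent cap are made up.)

#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Print the data extents backing [offset, offset+len) of fd.
int print_extents(int fd, uint64_t offset, uint64_t len)
{
  const unsigned max_extents = 64;   // arbitrary cap for the sketch
  size_t sz = sizeof(struct fiemap) +
              max_extents * sizeof(struct fiemap_extent);
  struct fiemap *fm = (struct fiemap *)calloc(1, sz);
  if (!fm)
    return -1;
  fm->fm_start = offset;
  fm->fm_length = len;
  fm->fm_flags = FIEMAP_FLAG_SYNC;   // flush dirty data so the map is current
  fm->fm_extent_count = max_extents;
  int r = ioctl(fd, FS_IOC_FIEMAP, fm);
  if (r == 0) {
    for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
      printf("logical %llu~%llu\n",
             (unsigned long long)fm->fm_extents[i].fe_logical,
             (unsigned long long)fm->fm_extents[i].fe_length);
  }
  free(fm);
  return r;
}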

I also hit bugs in memstore.  I fixed the one in the BufferlistObject 
implementation, but the PageSet one I couldn't figure out after 20 
minutes, so I changed the default config to memstore_page_set = false for 
now.  Perhaps Casey or Adam or whoever wrote that can take a look?  It's 
very easy to reproduce.

See this PR:

	https://github.com/ceph/ceph/pull/11103

Thanks!
sage



* Re: filestore_fiemap is bad, memstore is buggy.
  2016-09-15 21:48 filestore_fiemap is bad, memstore is buggy Sage Weil
@ 2016-09-16  3:40 ` Haomai Wang
  2016-09-16 13:24   ` Sage Weil
  0 siblings, 1 reply; 5+ messages in thread
From: Haomai Wang @ 2016-09-16  3:40 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
> I was adding tests for clone_range in preparation for improving the
> bluestore implementation and got stuck on memstore and filestore bugs!
>
> The most alarming is that filestore very quickly fails the tests if
> filestore_fiemap is enabled.  It's off by default (phew!) but the
> ceph_test_objectstore test was explicitly enabling it to get better
> coverage.  For now I've just removed that so we stick with the default (no
> fiemap == good).  I wonder if we should consider removing fiemap from the
> code entirely, though, since it is clearly buggy, even on a modern kernel
> (I'm running 4.4 and XFS).

I tried the new test case and found that even with fiemap disabled I
still ran into failures. So I suspected the splice code; after
disabling it, the test passed.

But fiemap and seek_data/seek_hole still fail even with splice
disabled. I'd like to dig in further and verify whether our code is
correct, since _do_seek_hole_data was copied from the fiemap path....
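
To show the pattern I mean, here is a minimal sketch of building the
extent map with lseek(SEEK_DATA)/lseek(SEEK_HOLE) on Linux. It's only
my illustration, not the FileStore code, and map_extents is a made-up
name:

#include <unistd.h>
#include <algorithm>
#include <cerrno>
#include <cstdint>
#include <map>

// Build a map of data extents (offset -> length) covering
// [offset, offset+len) using lseek(SEEK_DATA)/lseek(SEEK_HOLE).
int map_extents(int fd, uint64_t offset, uint64_t len,
                std::map<uint64_t, uint64_t>& exomap)
{
  uint64_t end = offset + len;
  off_t pos = ::lseek(fd, offset, SEEK_DATA);
  while (pos >= 0 && (uint64_t)pos < end) {
    off_t hole = ::lseek(fd, pos, SEEK_HOLE);  // data always ends at a hole or EOF
    if (hole < 0)
      return -errno;
    uint64_t ext_end = std::min<uint64_t>(hole, end);
    exomap[pos] = ext_end - pos;
    pos = ::lseek(fd, ext_end, SEEK_DATA);     // jump to the next data extent
  }
  if (pos < 0 && errno != ENXIO)  // ENXIO just means no data past offset
    return -errno;
  return 0;  // note: a range that is entirely a hole yields an EMPTY map
}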

>
> I also hit bugs in memstore.  I fixed the one in the BufferlistObject
> implementation, but the PageSet one I couldn't figure out after 20
> minutes, so I changed the default config to memstore_page_set = false for
> now.  Perhaps Casey or Adam or whoever wrote that can take a look?  It's
> very easy to reproduce.
>
> See this PR:
>
>         https://github.com/ceph/ceph/pull/11103
>
> Thanks!
> sage
>


* Re: filestore_fiemap is bad, memstore is buggy.
  2016-09-16  3:40 ` Haomai Wang
@ 2016-09-16 13:24   ` Sage Weil
  2016-09-16 16:29     ` Haomai Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-09-16 13:24 UTC (permalink / raw)
  To: Haomai Wang; +Cc: ceph-devel

On Fri, 16 Sep 2016, Haomai Wang wrote:
> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
> > I was adding tests for clone_range in preparation for improving the
> > bluestore implementation and got stuck on memstore and filestore bugs!
> >
> > The most alarming is that filestore very quickly fails the tests if
> > filestore_fiemap is enabled.  It's off by default (phew!) but the
> > ceph_test_objectstore test was explicitly enabling it to get better
> > coverage.  For now I've just removed that so we stick with the default (no
> > fiemap == good).  I wonder if we should consider removing fiemap from the
> > code entirely, though, since it is clearly buggy, even on a modern kernel
> > (I'm running 4.4 and XFS).
> 
> I tried the new test case and found that even with fiemap disabled I
> still ran into failures. So I suspected the splice code; after
> disabling it, the test passed.
>
> But fiemap and seek_data/seek_hole still fail even with splice
> disabled. I'd like to dig in further and verify whether our code is
> correct, since _do_seek_hole_data was copied from the fiemap path....

Yeah, good idea.  Thanks for looking into it!

sage


* Re: filestore_fiemap is bad, memstore is buggy.
  2016-09-16 13:24   ` Sage Weil
@ 2016-09-16 16:29     ` Haomai Wang
  2016-09-17 13:05       ` Haomai Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Haomai Wang @ 2016-09-16 16:29 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On Fri, Sep 16, 2016 at 9:24 PM, Sage Weil <sweil@redhat.com> wrote:
> On Fri, 16 Sep 2016, Haomai Wang wrote:
>> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
>> > I was adding tests for clone_range in preparation for improving the
>> > bluestore implementation and got stuck on memstore and filestore bugs!
>> >
>> > The most alarming is that filestore very quickly fails the tests if
>> > filestore_fiemap is enabled.  It's off by default (phew!) but the
>> > ceph_test_objectstore test was explicitly enabling it to get better
>> > coverage.  For now I've just removed that so we stick with the default (no
>> > fiemap == good).  I wonder if we should consider removing fiemap from the
>> > code entirely, though, since it is clearly buggy, even on a modern kernel
>> > (I'm running 4.4 and XFS).
>>
>> I tried the new test case and found that even with fiemap disabled I
>> still ran into failures. So I suspected the splice code; after
>> disabling it, the test passed.
>>
>> But fiemap and seek_data/seek_hole still fail even with splice
>> disabled. I'd like to dig in further and verify whether our code is
>> correct, since _do_seek_hole_data was copied from the fiemap path....
>
> Yeah, good idea.  Thanks for looking into it!

Oh, I think I found the problem.

If the source file has a hole within [offset, length], fiemap or
seek_data/seek_hole returns an empty data map (map<uint64_t, uint64_t>
exomap), so we write nothing to the target file. But we actually need
to write zeros to the target over [offset, length]; as it stands, the
target just keeps whatever bytes were already there....

The other case is that we may write garbage data to the target file.
This fix may help with those cases:

--- a/src/os/filestore/FileStore.cc
+++ b/src/os/filestore/FileStore.cc
@@ -3597,7 +3597,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
   int buflen = 4096 * 16; //limit by pipe max size.see fcntl

 #ifdef CEPH_HAVE_SPLICE
-  if (backend->has_splice()) {
+  if (backend->has_splice() && 0) {
     int pipefd[2];
     if (pipe(pipefd) < 0) {
       r = -errno;
@@ -3661,6 +3661,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
     char buf[buflen];
     while (pos < end) {
       int l = MIN(end-pos, buflen);
+      memset(buf, 0, l);
       r = ::read(from, buf, l);
       dout(25) << "  read from " << pos << "~" << l << " got " << r << dendl;
       if (r < 0) {
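
Going back to the hole case: a correct clone has to zero-fill the gaps
between data extents itself. Here is a rough sketch of that logic
(again only my illustration, not the FileStore code; clone_with_holes,
copy_extent and zero_range are made-up helpers):

#include <unistd.h>
#include <algorithm>
#include <cerrno>
#include <cstdint>
#include <map>
#include <vector>

// Write len zero bytes at off (a real store might punch a hole instead).
static int zero_range(int fd, uint64_t off, uint64_t len)
{
  std::vector<char> z(65536, 0);
  while (len > 0) {
    ssize_t r = ::pwrite(fd, z.data(), std::min<uint64_t>(len, z.size()), off);
    if (r < 0)
      return -errno;
    off += r;
    len -= r;
  }
  return 0;
}

// Copy [srcoff, srcoff+len) from 'from' into 'to' at dstoff.
static int copy_extent(int from, int to, uint64_t srcoff, uint64_t len,
                       uint64_t dstoff)
{
  std::vector<char> buf(65536);
  while (len > 0) {
    ssize_t r = ::pread(from, buf.data(),
                        std::min<uint64_t>(len, buf.size()), srcoff);
    if (r < 0)
      return -errno;
    if (r == 0)                      // ran past EOF: the rest reads as zeros
      return zero_range(to, dstoff, len);
    ssize_t w = ::pwrite(to, buf.data(), r, dstoff);
    if (w < 0)
      return -errno;
    srcoff += w;
    dstoff += w;
    len -= w;
  }
  return 0;
}

// Clone [srcoff, srcoff+len) given a data-extent map, zeroing the holes.
int clone_with_holes(int from, int to, uint64_t srcoff, uint64_t len,
                     uint64_t dstoff,
                     const std::map<uint64_t, uint64_t>& exomap)
{
  uint64_t pos = srcoff, end = srcoff + len;
  int r = 0;
  for (const auto& [off, elen] : exomap) {
    if (off > pos)                   // hole before this extent: zero it
      r = zero_range(to, dstoff + (pos - srcoff), off - pos);
    if (r == 0)
      r = copy_extent(from, to, off, elen, dstoff + (off - srcoff));
    if (r < 0)
      return r;
    pos = off + elen;
  }
  if (pos < end)                     // trailing hole, or an empty exomap
    return zero_range(to, dstoff + (pos - srcoff), end - pos);
  return 0;
}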

Hmm, I'm on vacation and don't have a test machine at hand. Let me
rethink this logic and verify that it's true. Anyone who wants to fix
this problem immediately, please feel free to do so... or correct me
if I'm wrong!

>
> sage


* Re: filestore_fiemap is bad, memstore is buggy.
  2016-09-16 16:29     ` Haomai Wang
@ 2016-09-17 13:05       ` Haomai Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Haomai Wang @ 2016-09-17 13:05 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

The *splice* feature is buggy in every case (fiemap/seek_data
disabled, fiemap enabled, or seek_data enabled). I'd like to disable
it by default until we have a clear answer for the test failures. On
the other hand, my previous perf tests showed that splice doesn't
bring any outstanding improvement for ceph usage (I think that's
natural: splice only eliminates a memory copy, and in the ceph context
that's really not the problem).

https://github.com/ceph/ceph/pull/11113
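
For reference, the splice path that PR disables is essentially a copy
through a pipe, as in this simplified sketch (my illustration, not the
FileStore code; splice_copy is a made-up name and the -ERANGE for a
short source is arbitrary):

#include <fcntl.h>
#include <unistd.h>
#include <algorithm>
#include <cerrno>
#include <cstdint>

// Copy [srcoff, srcoff+len) from 'from' into 'to' at dstoff via a pipe.
int splice_copy(int from, int to, uint64_t srcoff, uint64_t len,
                uint64_t dstoff)
{
  int pipefd[2];
  if (::pipe(pipefd) < 0)
    return -errno;
  loff_t in = srcoff, out = dstoff;
  int r = 0;
  while (len > 0 && r == 0) {
    // Source fd -> pipe: stays in the kernel, no userspace buffer.
    ssize_t n = ::splice(from, &in, pipefd[1], nullptr,
                         std::min<uint64_t>(len, 65536), SPLICE_F_MOVE);
    if (n < 0) {
      r = -errno;
    } else if (n == 0) {
      r = -ERANGE;                   // EOF before len bytes (arbitrary errno)
    } else {
      len -= n;
      // Pipe -> destination fd: drain everything staged in the pipe.
      for (ssize_t left = n; left > 0 && r == 0; ) {
        ssize_t w = ::splice(pipefd[0], nullptr, to, &out, left,
                             SPLICE_F_MOVE);
        if (w < 0)
          r = -errno;
        else
          left -= w;
      }
    }
  }
  ::close(pipefd[0]);
  ::close(pipefd[1]);
  return r;
}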

On Sat, Sep 17, 2016 at 12:29 AM, Haomai Wang <haomai@xsky.com> wrote:
> On Fri, Sep 16, 2016 at 9:24 PM, Sage Weil <sweil@redhat.com> wrote:
>> On Fri, 16 Sep 2016, Haomai Wang wrote:
>>> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
>>> > I was adding tests for clone_range in preparation for improving the
>>> > bluestore implementation and got stuck on memstore and filestore bugs!
>>> >
>>> > The most alarming is that filestore very quickly fails the tests if
>>> > filestore_fiemap is enabled.  It's off by default (phew!) but the
>>> > ceph_test_objectstore test was explicitly enabling it to get better
>>> > coverage.  For now I've just removed that so we stick with the default (no
>>> > fiemap == good).  I wonder if we should consider removing fiemap from the
>>> > code entirely, though, since it is clearly buggy, even on a modern kernel
>>> > (I'm running 4.4 and XFS).
>>>
>>> I tried the new test case and found that even with fiemap disabled I
>>> still ran into failures. So I suspected the splice code; after
>>> disabling it, the test passed.
>>>
>>> But fiemap and seek_data/seek_hole still fail even with splice
>>> disabled. I'd like to dig in further and verify whether our code is
>>> correct, since _do_seek_hole_data was copied from the fiemap path....
>>
>> Yeah, good idea.  Thanks for looking into it!
>
> Oh, I think I found the problem.
>
> If the source file has a hole within [offset, length], fiemap or
> seek_data/seek_hole returns an empty data map (map<uint64_t, uint64_t>
> exomap), so we write nothing to the target file. But we actually need
> to write zeros to the target over [offset, length]; as it stands, the
> target just keeps whatever bytes were already there....
>
> The other case is that we may write garbage data to the target file.
> This fix may help with those cases:
>
> --- a/src/os/filestore/FileStore.cc
> +++ b/src/os/filestore/FileStore.cc
> @@ -3597,7 +3597,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
>    int buflen = 4096 * 16; //limit by pipe max size.see fcntl
>
>  #ifdef CEPH_HAVE_SPLICE
> -  if (backend->has_splice()) {
> +  if (backend->has_splice() && 0) {
>      int pipefd[2];
>      if (pipe(pipefd) < 0) {
>        r = -errno;
> @@ -3661,6 +3661,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
>      char buf[buflen];
>      while (pos < end) {
>        int l = MIN(end-pos, buflen);
> +      memset(buf, 0, l);
>        r = ::read(from, buf, l);
>        dout(25) << "  read from " << pos << "~" << l << " got " << r << dendl;
>        if (r < 0) {
>
> Hmm, I'm on vacation and don't have a test machine at hand. Let me
> rethink this logic and verify that it's true. Anyone who wants to fix
> this problem immediately, please feel free to do so... or correct me
> if I'm wrong!
>
>>
>> sage

