* filestore_fiemap is bad, memstore is buggy.
@ 2016-09-15 21:48 Sage Weil
2016-09-16 3:40 ` Haomai Wang
0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-09-15 21:48 UTC (permalink / raw)
To: ceph-devel
I was adding tests for clone_range in preparation for improving the
bluestore implementation and got stuck on memstore and filestore bugs!
The most alarming is that filestore very quickly fails the tests if
filestore_fiemap is enabled. It's off by default (phew!) but the
ceph_test_objectstore test was explicitly enabling it to get better
coverage. For now I've just removed that so we stick with the default (no
fiemap == good). I wonder if we should consider removing fiemap from the
code entirely, though, since it is clearly buggy, even on a modern kernel
(I'm running 4.4 and XFS).
I also hit bugs in memstore. I fixed the one in the BufferlistObject
implementation, but the PageSet one I couldn't figure out after 20
minutes, so I changed the default config to memstore_page_set = false for
now. Perhaps Casey or Adam or whoever wrote that can take a look? It's
very easy to reproduce.
See this PR:
https://github.com/ceph/ceph/pull/11103
Thanks!
sage
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: filestore_fiemap is bad, memstore is buggy.
2016-09-15 21:48 filestore_fiemap is bad, memstore is buggy Sage Weil
@ 2016-09-16 3:40 ` Haomai Wang
2016-09-16 13:24 ` Sage Weil
0 siblings, 1 reply; 5+ messages in thread
From: Haomai Wang @ 2016-09-16 3:40 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
> I was adding tests for clone_range in preparation for improving the
> bluestore implementation and got stuck on memstore and filestore bugs!
>
> The most alarming is that filestore very quickly fails the tests if
> filestore_fiemap is enabled. It's off by default (phew!) but the
> ceph_test_objectstore test was explicitly enabling it to get better
> coverage. For now I've just removed that so we stick with the default (no
> fiemap == good). I wonder if we should consider removing fiemap from the
> code entirely, though, since it is clearly buggy, even on a modern kernel
> (I'm running 4.4 and XFS).
I tried the new test case and found that even with fiemap disabled, I
still ran into failures. So I suspected the splice code; with it
disabled, the test passed.
But fiemap and seek_data/seek_hole still fail even with splice
disabled, so I'd like to dig in further and verify whether our code is
correct, since _do_seek_hole_data was copied from the fiemap path....
>
> I also hit bugs in memstore. I fixed the one in the BufferlistObject
> implementation, but the PageSet one I couldn't figure out after 20
> minutes, so I changed the default config to memstore_page_set = false for
> now. Perhaps Casey or Adam or whoever wrote that can take a look? It's
> very easy to reproduce.
>
> See this PR:
>
> https://github.com/ceph/ceph/pull/11103
>
> Thanks!
> sage
>
* Re: filestore_fiemap is bad, memstore is buggy.
2016-09-16 3:40 ` Haomai Wang
@ 2016-09-16 13:24 ` Sage Weil
2016-09-16 16:29 ` Haomai Wang
0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-09-16 13:24 UTC (permalink / raw)
To: Haomai Wang; +Cc: ceph-devel
On Fri, 16 Sep 2016, Haomai Wang wrote:
> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
> > I was adding tests for clone_range in preparation for improving the
> > bluestore implementation and got stuck on memstore and filestore bugs!
> >
> > The most alarming is that filestore very quickly fails the tests if
> > filestore_fiemap is enabled. It's off by default (phew!) but the
> > ceph_test_objectstore test was explicitly enabling it to get better
> > coverage. For now I've just removed that so we stick with the default (no
> > fiemap == good). I wonder if we should consider removing fiemap from the
> > code entirely, though, since it is clearly buggy, even on a modern kernel
> > (I'm running 4.4 and XFS).
>
> I tried the new test case and found that even with fiemap disabled, I
> still ran into failures. So I suspected the splice code; with it
> disabled, the test passed.
>
> But fiemap and seek_data/seek_hole still fail even with splice
> disabled, so I'd like to dig in further and verify whether our code is
> correct, since _do_seek_hole_data was copied from the fiemap path....
Yeah, good idea. Thanks for looking into it!
sage
* Re: filestore_fiemap is bad, memstore is buggy.
2016-09-16 13:24 ` Sage Weil
@ 2016-09-16 16:29 ` Haomai Wang
2016-09-17 13:05 ` Haomai Wang
0 siblings, 1 reply; 5+ messages in thread
From: Haomai Wang @ 2016-09-16 16:29 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On Fri, Sep 16, 2016 at 9:24 PM, Sage Weil <sweil@redhat.com> wrote:
> On Fri, 16 Sep 2016, Haomai Wang wrote:
>> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
>> > I was adding tests for clone_range in preparation for improving the
>> > bluestore implementation and got stuck on memstore and filestore bugs!
>> >
>> > The most alarming is that filestore very quickly fails the tests if
>> > filestore_fiemap is enabled. It's off by default (phew!) but the
>> > ceph_test_objectstore test was explicitly enabling it to get better
>> > coverage. For now I've just removed that so we stick with the default (no
>> > fiemap == good). I wonder if we should consider removing fiemap from the
>> > code entirely, though, since it is clearly buggy, even on a modern kernel
>> > (I'm running 4.4 and XFS).
>>
>> I tried the new test case and found that even with fiemap disabled, I
>> still ran into failures. So I suspected the splice code; with it
>> disabled, the test passed.
>>
>> But fiemap and seek_data/seek_hole still fail even with splice
>> disabled, so I'd like to dig in further and verify whether our code is
>> correct, since _do_seek_hole_data was copied from the fiemap path....
>
> Yeah, good idea. Thanks for looking into it!
Oh, I think I found the problem.
If the source file has a hole in [offset, length], fiemap or
seek_data/seek_hole will return an empty data map (map<uint64_t, uint64_t>
exomap), so we won't write anything to the target file. But we actually
need to write zeros to the target file over [offset, length]....
The other case is that we may write garbage data to the target file.
This fix may help with those cases:
--- a/src/os/filestore/FileStore.cc
+++ b/src/os/filestore/FileStore.cc
@@ -3597,7 +3597,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
   int buflen = 4096 * 16; //limit by pipe max size.see fcntl
 
 #ifdef CEPH_HAVE_SPLICE
-  if (backend->has_splice()) {
+  if (backend->has_splice() && 0) {
     int pipefd[2];
     if (pipe(pipefd) < 0) {
       r = -errno;
@@ -3661,6 +3661,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
   char buf[buflen];
   while (pos < end) {
     int l = MIN(end-pos, buflen);
+    memset(buf, 0, l);
     r = ::read(from, buf, l);
     dout(25) << " read from " << pos << "~" << l << " got " << r << dendl;
     if (r < 0) {
Hmm, I'm on vacation and don't have a test machine at hand. Let me
rethink this logic and verify that it's true. Anyone who wants to fix
this problem immediately, please feel free to do so... Or correct me if
I'm wrong!
>
> sage
* Re: filestore_fiemap is bad, memstore is buggy.
2016-09-16 16:29 ` Haomai Wang
@ 2016-09-17 13:05 ` Haomai Wang
0 siblings, 0 replies; 5+ messages in thread
From: Haomai Wang @ 2016-09-17 13:05 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
The *splice* feature is buggy in every case (fiemap/seek_data disabled,
fiemap enabled, seek_data enabled). I'd like to disable it by default
until we have a clear answer for the test failures. On the other hand,
from my previous perf tests, splice doesn't show an outstanding
improvement for Ceph's usage (I think that's natural, since splice only
eliminates a memory copy, which is really not a problem in the Ceph
context).
https://github.com/ceph/ceph/pull/11113
On Sat, Sep 17, 2016 at 12:29 AM, Haomai Wang <haomai@xsky.com> wrote:
> On Fri, Sep 16, 2016 at 9:24 PM, Sage Weil <sweil@redhat.com> wrote:
>> On Fri, 16 Sep 2016, Haomai Wang wrote:
>>> On Fri, Sep 16, 2016 at 5:48 AM, Sage Weil <sweil@redhat.com> wrote:
>>> > I was adding tests for clone_range in preparation for improving the
>>> > bluestore implementation and got stuck on memstore and filestore bugs!
>>> >
>>> > The most alarming is that filestore very quickly fails the tests if
>>> > filestore_fiemap is enabled. It's off by default (phew!) but the
>>> > ceph_test_objectstore test was explicitly enabling it to get better
>>> > coverage. For now I've just removed that so we stick with the default (no
>>> > fiemap == good). I wonder if we should consider removing fiemap from the
>>> > code entirely, though, since it is clearly buggy, even on a modern kernel
>>> > (I'm running 4.4 and XFS).
>>>
>>> I tried the new test case and found that even with fiemap disabled, I
>>> still ran into failures. So I suspected the splice code; with it
>>> disabled, the test passed.
>>>
>>> But fiemap and seek_data/seek_hole still fail even with splice
>>> disabled, so I'd like to dig in further and verify whether our code is
>>> correct, since _do_seek_hole_data was copied from the fiemap path....
>>
>> Yeah, good idea. Thanks for looking into it!
>
> Oh, I think I found the problem.
>
> If the source file has a hole in [offset, length], fiemap or
> seek_data/seek_hole will return an empty data map (map<uint64_t, uint64_t>
> exomap), so we won't write anything to the target file. But we actually
> need to write zeros to the target file over [offset, length]....
>
> The other case is that we may write garbage data to the target file.
> This fix may help with those cases:
>
> --- a/src/os/filestore/FileStore.cc
> +++ b/src/os/filestore/FileStore.cc
> @@ -3597,7 +3597,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
>    int buflen = 4096 * 16; //limit by pipe max size.see fcntl
>  
>  #ifdef CEPH_HAVE_SPLICE
> -  if (backend->has_splice()) {
> +  if (backend->has_splice() && 0) {
>      int pipefd[2];
>      if (pipe(pipefd) < 0) {
>        r = -errno;
> @@ -3661,6 +3661,7 @@ int FileStore::_do_copy_range(int from, int to, uint64_t srcoff, u
>    char buf[buflen];
>    while (pos < end) {
>      int l = MIN(end-pos, buflen);
> +    memset(buf, 0, l);
>      r = ::read(from, buf, l);
>      dout(25) << " read from " << pos << "~" << l << " got " << r << dendl;
>      if (r < 0) {
>
> Hmm, I'm on vacation and don't have a test machine at hand. Let me
> rethink this logic and verify that it's true. Anyone who wants to fix
> this problem immediately, please feel free to do so... Or correct me if
> I'm wrong!
>
>>
>> sage
end of thread, other threads:[~2016-09-17 13:05 UTC | newest]
2016-09-15 21:48 filestore_fiemap is bad, memstore is buggy Sage Weil
2016-09-16 3:40 ` Haomai Wang
2016-09-16 13:24 ` Sage Weil
2016-09-16 16:29 ` Haomai Wang
2016-09-17 13:05 ` Haomai Wang