* Disable fiemap lead to Data In-balance between OSD
@ 2016-09-28 16:26 Ning Yao
  2016-09-29  2:25 ` Haomai Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Ning Yao @ 2016-09-28 16:26 UTC (permalink / raw)
  To: ceph-devel

Hi,

Because of the many fiemap issues on XFS, fiemap is now disabled by
default, notably in Hammer, which predates the addition of
seek_data/seek_hole.

But disabling the fiemap feature causes a small sparse object to become
a large full object during PushOps, which may lead to a notable data
imbalance between OSDs, especially on newly added OSDs during data
rebalancing. With those full objects, some OSDs may become full at the
same time.

Furthermore, it is currently impossible to make the full objects
sparse again, even if we enable the fiemap feature in the future.

So I wonder whether there is any way to turn a full object back into a
sparse object. One idea is to check, during deep-scrub, whether the
object's content contains runs of consecutive zeros and punch holes
over those runs. Is that possible and reasonable?
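
To make the zero-detection idea concrete, here is a minimal sketch and not
code from Ceph. The function name `zero_runs` and the 4096-byte block size
are assumptions for illustration. It only finds block-aligned runs of zeros
in a buffer; actually releasing the space would be a separate step, for
example fallocate(2) with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE on the
object file.

```python
def zero_runs(data: bytes, block_size: int = 4096):
    """Return (offset, length) pairs for block-aligned runs of all-zero blocks.

    Hypothetical helper: block_size is an assumed filesystem block size.
    """
    runs = []
    run_start = None
    zero_block = bytes(block_size)
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        # A trailing partial block is compared against an equally short zero block.
        if block == zero_block[:len(block)]:
            if run_start is None:
                run_start = off
        elif run_start is not None:
            runs.append((run_start, off - run_start))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(data) - run_start))
    return runs
```

Each returned pair is a candidate range for hole punching. Whether scanning
every object like this during deep-scrub is affordable is exactly the open
question above.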



Regards
Ning Yao

* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-28 16:26 Disable fiemap lead to Data In-balance between OSD Ning Yao
@ 2016-09-29  2:25 ` Haomai Wang
  2016-09-29  2:27   ` Haomai Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Haomai Wang @ 2016-09-29  2:25 UTC (permalink / raw)
  To: Ning Yao; +Cc: ceph-devel

On Thu, Sep 29, 2016 at 12:26 AM, Ning Yao <zay11022@gmail.com> wrote:
> Hi,
>
> As lots of fiemap issues in XFS, fiemap is default disabled now,
> especially in Hammer, before seek_data, seek_hole is added.
>
> But disabling fiemap feature will cause a small sparse object become a
> large full object during PushOps, which may lead to notably data
> in-balance between OSD, especially on the new added OSD  during data
> rebalance. With those full objects, some OSDs may simultaneously
> becomes full.

So far, I am not aware of any existing problem with fiemap enabled in
Hammer. We did find that it may be a problem when cloning onto an
existing, overlapping data range, but that won't occur in real cases.

>
> Furthermore, currently, it is impossible to make the full objects
> sparse again if we enable the fiemap feature in the future.
>
> So I think if any solutions to make a full object back to a sparse
> object again? One of the idea is to check whether the content in the
> object contains consecutive zero and punch zeros for those object
> during deep-scrub,  is that possible and reasonable?

Obviously this adds more complexity than the benefit we would get.

>
>
>
> Regards
> Ning Yao
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-29  2:25 ` Haomai Wang
@ 2016-09-29  2:27   ` Haomai Wang
  2016-09-29 13:49     ` Ning Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Haomai Wang @ 2016-09-29  2:27 UTC (permalink / raw)
  To: Ning Yao; +Cc: ceph-devel

On Thu, Sep 29, 2016 at 10:25 AM, Haomai Wang <haomai@xsky.com> wrote:
> On Thu, Sep 29, 2016 at 12:26 AM, Ning Yao <zay11022@gmail.com> wrote:
>> Hi,
>>
>> As lots of fiemap issues in XFS, fiemap is default disabled now,
>> especially in Hammer, before seek_data, seek_hole is added.
>>
>> But disabling fiemap feature will cause a small sparse object become a
>> large full object during PushOps, which may lead to notably data
>> in-balance between OSD, especially on the new added OSD  during data
>> rebalance. With those full objects, some OSDs may simultaneously
>> becomes full.
>
> Until now, I don't know existing problem with fiemap enabled in
> hammer. Although we find it maybe problem when clone to a existing
> overlap data range, but it won't exists in real case.

Hmm, I can't guarantee this... I only mean that if you want sparse
objects, you can enable it.

* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-29  2:27   ` Haomai Wang
@ 2016-09-29 13:49     ` Ning Yao
  2016-09-29 13:54       ` Haomai Wang
  2016-09-30  3:23       ` Jeff Liu
  0 siblings, 2 replies; 10+ messages in thread
From: Ning Yao @ 2016-09-29 13:49 UTC (permalink / raw)
  To: Haomai Wang; +Cc: ceph-devel

XFS limits the number of extents a fiemap call returns in the kernel,
so if we do not use seek_data/seek_hole, we can get a wrong fiemap
(with some extents missing) for a large object. It is actually not
safe before Jewel, with filestore_seek_data_hole enabled.
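
For readers unfamiliar with the seek_data/seek_hole approach, the sketch
below shows the general idea using lseek(2) with SEEK_DATA and SEEK_HOLE.
It is an illustration, not the filestore implementation, and the function
name `data_segments` is an assumption. On filesystems without hole support
the whole file is simply reported as one data segment.

```python
import errno
import os


def data_segments(path):
    """List (start, end) byte ranges that the filesystem reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # no data at or after offset: the rest is a hole
                raise
            # There is an implicit hole at end of file, so this finds an end.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments
```

Because each lseek call answers only "where is the next data" or "where is
the next hole", there is no per-call extent cap to run into.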
Regards
Ning Yao


* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-29 13:49     ` Ning Yao
@ 2016-09-29 13:54       ` Haomai Wang
  2016-09-30  3:23       ` Jeff Liu
  1 sibling, 0 replies; 10+ messages in thread
From: Haomai Wang @ 2016-09-29 13:54 UTC (permalink / raw)
  To: Ning Yao; +Cc: ceph-devel

On Thu, Sep 29, 2016 at 9:49 PM, Ning Yao <zay11022@gmail.com> wrote:
> XFS has #fiemap extent intervals limitted in kernel, so if we do not
> use seek_data, seek_hole. It will lead to getting a wrong fiemap
> (absence of some extents)  from a large object. It is actually not
> security before Jewel with enabling filestore_seek_data_hole.

I'm not sure what your problem is. There is already a known fiemap bug
in legacy OSes, and we have also fixed the fiemap usage in the
filestore layer. If possible, we'd like to test this.

* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-29 13:49     ` Ning Yao
  2016-09-29 13:54       ` Haomai Wang
@ 2016-09-30  3:23       ` Jeff Liu
  2016-10-12 15:02         ` Haomai Wang
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Liu @ 2016-09-30  3:23 UTC (permalink / raw)
  To: Ning Yao, Haomai Wang; +Cc: ceph-devel, xfs

Could you please share your test cases for the fiemap issue on XFS?
I'd like to dig into it if it still exists in the upstream code base.

-- 
Cheers,

Jeff Liu


* Re: Disable fiemap lead to Data In-balance between OSD
  2016-09-30  3:23       ` Jeff Liu
@ 2016-10-12 15:02         ` Haomai Wang
  2016-10-13 17:06           ` Ning Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Haomai Wang @ 2016-10-12 15:02 UTC (permalink / raw)
  To: Jeff Liu; +Cc: Ning Yao, ceph-devel, xfs

Thanks to Ning Yao, we have found Ceph's incorrect usage of XFS fiemap.

Actually, this reminds me that when I was looking into unaligned fiemap
lookups, we also observed this case. Refer to
http://www.spinics.net/lists/xfs/msg38001.html: if a file has more
than 1364 extents, a single fiemap call will return only 1364 of them.
We need to check whether the last returned extent has the
FIEMAP_EXTENT_LAST flag set. If not, we need to continue calling fiemap.
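
The "keep calling until FIEMAP_EXTENT_LAST" loop can be sketched as follows.
This is an illustration only, not the fix from the commits discussed in this
thread. `fetch_batch` is a hypothetical callable standing in for one
FS_IOC_FIEMAP ioctl request; the function name `collect_extents` and the
tuple layout are assumptions too.

```python
FIEMAP_EXTENT_LAST = 0x1  # flag value from <linux/fiemap.h>


def collect_extents(fetch_batch, file_size, batch_size=1364):
    """Call fetch_batch repeatedly until an extent carries FIEMAP_EXTENT_LAST.

    fetch_batch(start, length, max_extents=...) is a hypothetical wrapper
    around one fiemap request; it returns (logical, length, flags) tuples.
    """
    extents = []
    start = 0
    while start < file_size:
        batch = fetch_batch(start, file_size - start, max_extents=batch_size)
        if not batch:
            break  # nothing mapped in the remaining range
        extents.extend(batch)
        logical, length, flags = batch[-1]
        if flags & FIEMAP_EXTENT_LAST:
            break  # this was the final extent of the file
        next_start = logical + length
        if next_start <= start:
            break  # no forward progress; avoid looping forever
        start = next_start  # resume right after the last extent seen
    return extents
```

A caller that stops after the first batch silently treats everything past
extent 1364 as a hole, which is the incorrect usage described above.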

Fortunately, 1364 extents require an object of at least 8MB, while rbd's
default object size is 4MB. So if we don't change the object size,
nothing happens. But I remember that OpenStack Glance's default object
size is 64MB, so it may be a problem in that case. Since I often advise
rbd users to turn fiemap on, I hope no one hits this bug....
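
As a rough sanity check on the object-size argument, the sketch below uses a
coarse upper bound: an extent cannot be smaller than one filesystem block, so
an object can have at most object_size / block_size extents. The 4096-byte
block size and the function names are assumptions, and this bound is looser
than the 8MB figure quoted above, which it does not try to reproduce.

```python
FIEMAP_BATCH_LIMIT = 1364  # per-call extent cap quoted in this thread
BLOCK_SIZE = 4096          # assumed filesystem block size


def max_extents(object_size, block_size=BLOCK_SIZE):
    """Upper bound on extents: no extent can be smaller than one block."""
    return object_size // block_size


def may_truncate(object_size, block_size=BLOCK_SIZE):
    """True if a single capped fiemap call could miss extents."""
    return max_extents(object_size, block_size) > FIEMAP_BATCH_LIMIT
```

Under these assumptions a 4MB object has at most 1024 extents and stays
under the cap, while a 64MB object can have up to 16384 and may be cut off.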

One option is to fix the fiemap usage in GenericFilesystemBackend;
another is to abandon fiemap entirely in Hammer. Or do we not need to
do anything, since fiemap is disabled by default?

Anyway, thanks again, Ning Yao!


* Re: Disable fiemap lead to Data In-balance between OSD
  2016-10-12 15:02         ` Haomai Wang
@ 2016-10-13 17:06           ` Ning Yao
  2016-10-14  2:25             ` Haomai Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Ning Yao @ 2016-10-13 17:06 UTC (permalink / raw)
  To: Haomai Wang, sjust; +Cc: Jeff Liu, ceph-devel, xfs

Thanks to Haomai for the suggested solutions. What about this:
https://github.com/mslovy/ceph/commit/539b7998fea16f8af3f6cbbbd243f6996f292acc
https://github.com/mslovy/ceph/commit/33240080f3324a70a288c79a77846688c1f29db5

As Haomai described, fiemap is disabled by default in previous versions
and may not be used in the newest version.
So is this fix really needed, and should we backport it? Any suggestions?

Ping Sam.

Regards
Ning Yao


* Re: Disable fiemap lead to Data In-balance between OSD
  2016-10-13 17:06           ` Ning Yao
@ 2016-10-14  2:25             ` Haomai Wang
  2016-10-14  2:31               ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Haomai Wang @ 2016-10-14  2:25 UTC (permalink / raw)
  To: Ning Yao; +Cc: sjust, Jeff Liu, ceph-devel

On Fri, Oct 14, 2016 at 1:06 AM, Ning Yao <zay11022@gmail.com> wrote:
> Thanks to Haomai's suggested solutions. What about this:
> https://github.com/mslovy/ceph/commit/539b7998fea16f8af3f6cbbbd243f6996f292acc
> https://github.com/mslovy/ceph/commit/33240080f3324a70a288c79a77846688c1f29db5

Cool, the fix looks good to me.

>
> As Haomai described, fiemap is default disabled in previous version
> and may not use in newest version.
> So is it really needed or should we backport the this fix?  any suggestions?
>
> Ping Sam.

Sam is on vacation. @sage, your opinion?

* Re: Disable fiemap lead to Data In-balance between OSD
  2016-10-14  2:25             ` Haomai Wang
@ 2016-10-14  2:31               ` Sage Weil
  0 siblings, 0 replies; 10+ messages in thread
From: Sage Weil @ 2016-10-14  2:31 UTC (permalink / raw)
  To: Haomai Wang; +Cc: Ning Yao, sjust, Jeff Liu, ceph-devel

On Fri, 14 Oct 2016, Haomai Wang wrote:
> On Fri, Oct 14, 2016 at 1:06 AM, Ning Yao <zay11022@gmail.com> wrote:
> > Thanks to Haomai's suggested solutions. What about this:
> > https://github.com/mslovy/ceph/commit/539b7998fea16f8af3f6cbbbd243f6996f292acc
> > https://github.com/mslovy/ceph/commit/33240080f3324a70a288c79a77846688c1f29db5
> 
>  Cool, the fix is looks good to me..
> 
> >
> > As Haomai described, fiemap is default disabled in previous version
> > and may not use in newest version.
> > So is it really needed or should we backport the this fix?  any suggestions?
> >
> > Ping Sam.
> 
> Sam is on vacation. @sage's option?

We may as well backport the fix since someone may have turned it on.  If 
there is a tracker bug open for it we just need to set the backport field 
and it'll get done as part of the normal process!

Thanks-
sage



Thread overview: 10+ messages
2016-09-28 16:26 Disable fiemap lead to Data In-balance between OSD Ning Yao
2016-09-29  2:25 ` Haomai Wang
2016-09-29  2:27   ` Haomai Wang
2016-09-29 13:49     ` Ning Yao
2016-09-29 13:54       ` Haomai Wang
2016-09-30  3:23       ` Jeff Liu
2016-10-12 15:02         ` Haomai Wang
2016-10-13 17:06           ` Ning Yao
2016-10-14  2:25             ` Haomai Wang
2016-10-14  2:31               ` Sage Weil
