All of lore.kernel.org
 help / color / mirror / Atom feed
* wip-fancy-striping in ceph/ceph-client
@ 2018-02-13  9:58 Ilya Dryomov
  2018-02-13 15:05 ` Alex Elder
  2018-02-16 16:30 ` Maged Mokhtar
  0 siblings, 2 replies; 6+ messages in thread
From: Ilya Dryomov @ 2018-02-13  9:58 UTC (permalink / raw)
  To: Ceph Development; +Cc: Alex Elder, David Disseldorp, Maged Mokhtar

Hello,

I've pushed wip-fancy-striping with my striping v2 patches.  I didn't
want to patch bomb the list and opened a dummy PR for feedback:

  https://github.com/ceph/ceph-client/pull/19

As this is a full striping v2 implementation with adjacent extent
merging and no limitations on parent layouts, the patch set is very
invasive -- it's basically a rewrite of the entire I/O path.  Any
review comments and testing cycles would be greatly appreciated!

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: wip-fancy-striping in ceph/ceph-client
  2018-02-13  9:58 wip-fancy-striping in ceph/ceph-client Ilya Dryomov
@ 2018-02-13 15:05 ` Alex Elder
  2018-02-13 16:24   ` Ilya Dryomov
  2018-02-16 16:30 ` Maged Mokhtar
  1 sibling, 1 reply; 6+ messages in thread
From: Alex Elder @ 2018-02-13 15:05 UTC (permalink / raw)
  To: Ilya Dryomov, Ceph Development; +Cc: David Disseldorp, Maged Mokhtar

On 02/13/2018 03:58 AM, Ilya Dryomov wrote:
> Hello,
> 
> I've pushed wip-fancy-striping with my striping v2 patches.  I didn't
> want to patch bomb the list and opened a dummy PR for feedback:
> 
>   https://github.com/ceph/ceph-client/pull/19
> 
> As this is a full striping v2 implementation with adjacent extent
> merging and no limitations on parent layouts, the patch set is very
> invasive -- it's basically a rewrite of the entire I/O path.  Any
> review comments and testing cycles would be greatly appreciated!

This looks like a nice improvement.  I think the bio improvements
from Kent Overstreet shortly after I finished working on this stuff,
and some of this might have been enabled by that work.  Some of the
multiqueue work helps too.  Regardless, I won't say I've formally
reviewed it but it looks really good to me, improves things beyond
just adding the fancy striping functionality.

I'm only scanning through them right now.  Here are a few simple things:
- Very nice cleanup in  __ceph_calc_file_object_mapping()
- I don't like the ({ zero_bvec(&bv); }) in zero_bios.  It looks wrong,
  but the real problem is the way ceph_bio_iter_advance_step() is defined.
  The caller shouldn't (have to) assume the existence of a symbol having
  a particular name.  I'm not sure how this advance step macro will be
  used elsewhere but can you just supply a function that operates on
  bio_vec pointer instead (possibly null)?  I think that may be the only
  reason you need a macro for pass-by-name, so doing this might allow this
  to become an inline function instead.  (I see later that this code got
  affected; maybe the problem goes away?)
- I'm not sure why OBJ_REQUEST_PAGES was used for stat requests.
- Are CEPH_MSG_DATA_BVECS dependent on CONFIG_BLOCK?  (Though I always
  wondered what system might ever be configured without it.)
- I think you should (have) split "rbd: new request handling code" into
  several patches.  I wanted to see the non-image requests going away
  separately, or the opportunistic write in a stat, for example.
- I like the state machine, and you simplified a lot with that commit.
- It's nice to complete all image requests at once rather than having to
  do so in order.
- Generally I didn't work through the new fancy striping code; too fancy
  for a quick review.
- Nice to see more use of multiple ops in an OSD request.  As I recall
  the way it's implemented is pretty limited; I wanted to generalize it
  but never got that far.
- I think the reference counting between parent and child images was
  done safely, but I'm glad you were able to reason away the need for
  some of it.

I'd love to give this a much deeper review but it'll take a full day
at some point and I don't have that right now.  (I've already used
time I probably can't afford...)  I can probably make time though if
you think it's final.  

					-Alex

> Thanks,
> 
>                 Ilya
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: wip-fancy-striping in ceph/ceph-client
  2018-02-13 15:05 ` Alex Elder
@ 2018-02-13 16:24   ` Ilya Dryomov
  2018-02-13 16:38     ` Alex Elder
  0 siblings, 1 reply; 6+ messages in thread
From: Ilya Dryomov @ 2018-02-13 16:24 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development, David Disseldorp, Maged Mokhtar

On Tue, Feb 13, 2018 at 4:05 PM, Alex Elder <elder@linaro.org> wrote:
> On 02/13/2018 03:58 AM, Ilya Dryomov wrote:
>> Hello,
>>
>> I've pushed wip-fancy-striping with my striping v2 patches.  I didn't
>> want to patch bomb the list and opened a dummy PR for feedback:
>>
>>   https://github.com/ceph/ceph-client/pull/19
>>
>> As this is a full striping v2 implementation with adjacent extent
>> merging and no limitations on parent layouts, the patch set is very
>> invasive -- it's basically a rewrite of the entire I/O path.  Any
>> review comments and testing cycles would be greatly appreciated!
>
> This looks like a nice improvement.  I think the bio improvements
> from Kent Overstreet shortly after I finished working on this stuff,
> and some of this might have been enabled by that work.  Some of the

Yeah, all of the new iterator stuff is based on Kent's immutable
bio_vecs work.

> multiqueue work helps too.  Regardless, I won't say I've formally
> reviewed it but it looks really good to me, improves things beyond
> just adding the fancy striping functionality.
>
> I'm only scanning through them right now.  Here are a few simple things:
> - Very nice cleanup in  __ceph_calc_file_object_mapping()
> - I don't like the ({ zero_bvec(&bv); }) in zero_bios.  It looks wrong,
>   but the real problem is the way ceph_bio_iter_advance_step() is defined.
>   The caller shouldn't (have to) assume the existence of a symbol having
>   a particular name.  I'm not sure how this advance step macro will be
>   used elsewhere but can you just supply a function that operates on
>   bio_vec pointer instead (possibly null)?  I think that may be the only
>   reason you need a macro for pass-by-name, so doing this might allow this
>   to become an inline function instead.  (I see later that this code got
>   affected; maybe the problem goes away?)

These macros were private to rbd.c and weren't supposed to be used
anywhere else.  I pulled them into messenger.h during last minute
refactoring, because they ended up being not rbd specific after all.
ceph_bvec_iter in particular might be used in the filesystem, which
also needs a fully-featured striping implementation.

Here is a more advanced example, from copy_bio_bvecs():

        ceph_bio_iter_advance_step(it, bytes, ({
                obj_req->bvec_pos.bvecs[obj_req->bvec_idx++] = bv;
                obj_req->bvec_pos.iter.bi_size += bv.bv_len;
        }));

It could certainly take a (struct bio_vec *bv, void *arg) function
instead, but I'm not sure if it's that much better or how good a job
gcc would do in that case.

> - I'm not sure why OBJ_REQUEST_PAGES was used for stat requests.
> - Are CEPH_MSG_DATA_BVECS dependent on CONFIG_BLOCK?  (Though I always
>   wondered what system might ever be configured without it.)

No.  struct bio_vec is a big misnomer -- it has nothing to do with
bios.  It should be struct part_of_page or something like that.

> - I think you should (have) split "rbd: new request handling code" into
>   several patches.  I wanted to see the non-image requests going away
>   separately, or the opportunistic write in a stat, for example.

The opportunistic write is what let non-image requests go -- it's just
a side effect, not an actual change.  There is no good way to split
that commit into smaller pieces, at least not without adding a bunch of
new callback based code and then removing it in the next commit.

> - I like the state machine, and you simplified a lot with that commit.
> - It's nice to complete all image requests at once rather than having to
>   do so in order.
> - Generally I didn't work through the new fancy striping code; too fancy
>   for a quick review.

:-)

> - Nice to see more use of multiple ops in an OSD request.  As I recall
>   the way it's implemented is pretty limited; I wanted to generalize it
>   but never got that far.

It's fine for regular OSD ops, but issues crop up when you try to
combine multiple method call ops.

> - I think the reference counting between parent and child images was
>   done safely, but I'm glad you were able to reason away the need for
>   some of it.
>
> I'd love to give this a much deeper review but it'll take a full day
> at some point and I don't have that right now.  (I've already used
> time I probably can't afford...)  I can probably make time though if
> you think it's final.

It's final.  I'd definitely appreciate a deeper review.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: wip-fancy-striping in ceph/ceph-client
  2018-02-13 16:24   ` Ilya Dryomov
@ 2018-02-13 16:38     ` Alex Elder
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Elder @ 2018-02-13 16:38 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: Ceph Development, David Disseldorp, Maged Mokhtar

On 02/13/2018 10:24 AM, Ilya Dryomov wrote:
>> I'd love to give this a much deeper review but it'll take a full day
>> at some point and I don't have that right now.  (I've already used
>> time I probably can't afford...)  I can probably make time though if
>> you think it's final.
> It's final.  I'd definitely appreciate a deeper review.

OK.  Let me see if I can get to it by the end of this week.  I'm
very interested, I just have a lot on my plate right now and a
looming deadline.

					-Alex

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: wip-fancy-striping in ceph/ceph-client
  2018-02-13  9:58 wip-fancy-striping in ceph/ceph-client Ilya Dryomov
  2018-02-13 15:05 ` Alex Elder
@ 2018-02-16 16:30 ` Maged Mokhtar
  2018-02-17  8:44   ` Ilya Dryomov
  1 sibling, 1 reply; 6+ messages in thread
From: Maged Mokhtar @ 2018-02-16 16:30 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: Ceph Development, Alex Elder, David Disseldorp

On 2018-02-13 11:58, Ilya Dryomov wrote:

> Hello,
> 
> I've pushed wip-fancy-striping with my striping v2 patches.  I didn't
> want to patch bomb the list and opened a dummy PR for feedback:
> 
> https://github.com/ceph/ceph-client/pull/19
> 
> As this is a full striping v2 implementation with adjacent extent
> merging and no limitations on parent layouts, the patch set is very
> invasive -- it's basically a rewrite of the entire I/O path.  Any
> review comments and testing cycles would be greatly appreciated!
> 
> Thanks,
> 
> Ilya

Hi Ilya,

I looked at the patches, it is not a small change but is very nicely 
done. i was surprised since even though you added complex feature 
support, the new code is less (in rbd.c) and is clearer. The different 
fill from functions and different osd request creation functions is a 
very clean design.

If i may say any minor/cosmetic comments:

Out of all the old code i was happy to see removed, the only thing i 
would "miss" is the obj_requests list in the rbd_img_request struct. It 
is not needed now, but i tend to think it is logical to have a place to 
find all osd requests for a particular top block layer request. it maybe 
just a warm a and fuzzy feeling to have this control, but it could be 
useful for debugging or if in the future we think of implementing a task 
abort operation and need to find the osd inflight ops in a quick way.

Another cosmetic thing, i tend to think the new object <-> file extent 
conversion be placed in a separate new file rather than osdmap.c

I would not be able to do full test any time soon, but i do intend to do 
some manual testing of basic functionality next week. i have in mind 
writing data pattern with each value repeating the size of stripe unit 
and doing a mix of read/write interchangeably between rbd and librbd, i 
would also test copyup functionality in clones. I will let you know if i 
do find any issues.

Cheers
Maged

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: wip-fancy-striping in ceph/ceph-client
  2018-02-16 16:30 ` Maged Mokhtar
@ 2018-02-17  8:44   ` Ilya Dryomov
  0 siblings, 0 replies; 6+ messages in thread
From: Ilya Dryomov @ 2018-02-17  8:44 UTC (permalink / raw)
  To: Maged Mokhtar; +Cc: Ceph Development, Alex Elder, David Disseldorp

On Fri, Feb 16, 2018 at 5:30 PM, Maged Mokhtar <mmokhtar@petasan.org> wrote:
> On 2018-02-13 11:58, Ilya Dryomov wrote:
>
>> Hello,
>>
>> I've pushed wip-fancy-striping with my striping v2 patches.  I didn't
>> want to patch bomb the list and opened a dummy PR for feedback:
>>
>> https://github.com/ceph/ceph-client/pull/19
>>
>> As this is a full striping v2 implementation with adjacent extent
>> merging and no limitations on parent layouts, the patch set is very
>> invasive -- it's basically a rewrite of the entire I/O path.  Any
>> review comments and testing cycles would be greatly appreciated!
>>
>> Thanks,
>>
>> Ilya
>
>
> Hi Ilya,
>
> I looked at the patches, it is not a small change but is very nicely done. i
> was surprised since even though you added complex feature support, the new
> code is less (in rbd.c) and is clearer. The different fill from functions
> and different osd request creation functions is a very clean design.
>
> If i may say any minor/cosmetic comments:
>
> Out of all the old code i was happy to see removed, the only thing i would
> "miss" is the obj_requests list in the rbd_img_request struct. It is not
> needed now, but i tend to think it is logical to have a place to find all
> osd requests for a particular top block layer request. it maybe just a warm
> a and fuzzy feeling to have this control, but it could be useful for
> debugging or if in the future we think of implementing a task abort
> operation and need to find the osd inflight ops in a quick way.

It's still there -- rbd_img_request::object_extents.  It holds object
extents, which are embedded in object requests.  As before, each object
request holds an OSD request.

>
> Another cosmetic thing, i tend to think the new object <-> file extent
> conversion be placed in a separate new file rather than osdmap.c

ceph_calc_file_object_mapping() is in osdmap.c, so I just piled on.
You are right though, it has nothing to do with osdmaps.

>
> I would not be able to do full test any time soon, but i do intend to do
> some manual testing of basic functionality next week. i have in mind writing
> data pattern with each value repeating the size of stripe unit and doing a
> mix of read/write interchangeably between rbd and librbd, i would also test
> copyup functionality in clones. I will let you know if i do find any issues.

Thanks!

                Ilya

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-02-17  8:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-13  9:58 wip-fancy-striping in ceph/ceph-client Ilya Dryomov
2018-02-13 15:05 ` Alex Elder
2018-02-13 16:24   ` Ilya Dryomov
2018-02-13 16:38     ` Alex Elder
2018-02-16 16:30 ` Maged Mokhtar
2018-02-17  8:44   ` Ilya Dryomov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.