* simplifying compound operations
@ 2012-01-13  0:40 Sage Weil
  2012-01-17 23:28 ` Gregory Farnum
  0 siblings, 1 reply; 2+ messages in thread
From: Sage Weil @ 2012-01-13  0:40 UTC (permalink / raw)
  To: ceph-devel

One of the features of the Ceph OSDs is the ability to do complex compound 
operations atomically.  Things like "write object + set xattr" or "read 
xattr + read object data" or even "verify xattr foo = bar, and if so, 
write object data".
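
To make the "verify, and if so, write" case concrete, here is a toy sketch of all-or-nothing semantics. Everything in it is hypothetical (ToyObject, ToyStep, apply_atomically, the choice of -ECANCELED for a failed comparison), and staging changes on a copy is only an illustration, not a description of how the OSD implements atomicity.

```cpp
#include <cerrno>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an object: a data blob plus a set of xattrs.
struct ToyObject {
    std::string data;
    std::map<std::string, std::string> xattrs;
};

// One step of a compound operation (hypothetical, much simpler than OSDOp).
struct ToyStep {
    enum Kind { CMPXATTR, SETXATTR, WRITE } kind;
    std::string key;    // xattr name (unused for WRITE)
    std::string value;  // expected value, new xattr value, or new data
};

// Apply every step to a staged copy and commit only if all of them succeed,
// so a failed guard leaves the object untouched.
int apply_atomically(ToyObject& obj, const std::vector<ToyStep>& steps) {
    ToyObject staged = obj;
    for (const ToyStep& s : steps) {
        switch (s.kind) {
        case ToyStep::CMPXATTR: {
            auto it = staged.xattrs.find(s.key);
            if (it == staged.xattrs.end() || it->second != s.value)
                return -ECANCELED;  // assumed error code for a failed guard
            break;
        }
        case ToyStep::SETXATTR:
            staged.xattrs[s.key] = s.value;
            break;
        case ToyStep::WRITE:
            staged.data = s.value;
            break;
        }
    }
    obj = staged;  // commit
    return 0;
}
```

A compound of {CMPXATTR foo bar, WRITE ...} either applies the write or changes nothing, regardless of where in the vector the guard sits.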

Compound operations are built as a vector<OSDOp>, where each OSDOp has the 
description of the operation and an input bufferlist.  The results, 
however, accumulate in a single buffer, and the caller is responsible 
for picking through the pile to find the result of each individual 
operation.  That clearly sucks.
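
A rough illustration of the problem, using toy types rather than the real ones: std::string stands in for a bufferlist, and ToyOp, run_compound_old, and xattr_from_pile are made-up names. The point is only that the caller has to know each result's length ahead of time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins, not the real Ceph types: std::string plays the role of a
// bufferlist, and ToyOp is a much-simplified OSDOp.
struct ToyOp {
    std::string name;  // description of the operation, e.g. "getxattr"
    std::string data;  // input payload for the operation
};

// Hypothetical executor: every read-style op appends its output to one
// shared result buffer, with nothing marking where one result ends.
std::string run_compound_old(const std::vector<ToyOp>& ops) {
    std::string result;
    for (const ToyOp& op : ops) {
        if (op.name == "getxattr")
            result += "bar";          // pretend xattr value
        else if (op.name == "read")
            result += "hello world";  // pretend object data
    }
    return result;
}

// The caller can only split the pile if it already knows how long the
// first result is.
std::string xattr_from_pile(const std::string& pile, std::size_t xattr_len) {
    return pile.substr(0, xattr_len);
}
```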

Enter the wip-op-data-mux branch:
	https://github.com/NewDreamNetwork/ceph/commits/wip-op-data-mux

First, we rename OSDOp::data -> indata, and add an outdata.  We also add 
an rval so you can get the result code for individual operations.  The 
MOSDOpReply message encoding is updated to pass that over the wire to the 
client.  (This is an upward- and downward-compatible change.)
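
As a sketch of the per-op shape this gives you (toy code, not the actual OSDOp definition): the field names indata, outdata, and rval come from the description above, while ToyOp, run_compound, and the canned results are assumptions. It does not model atomicity or what happens to later ops after a failure.

```cpp
#include <cerrno>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for OSDOp; std::string replaces bufferlist.
struct ToyOp {
    std::string name;     // which operation to run
    std::string indata;   // input (what used to be called "data")
    std::string outdata;  // output of this op only
    int rval;             // result code of this op only

    ToyOp(std::string n, std::string in)
        : name(std::move(n)), indata(std::move(in)), rval(0) {}
};

// Hypothetical executor that records each op's output and result code in
// the op itself instead of appending to one shared buffer.
void run_compound(std::vector<ToyOp>& ops) {
    for (ToyOp& op : ops) {
        if (op.name == "getxattr") {
            if (op.indata == "foo") {
                op.outdata = "bar";  // pretend xattr value
                op.rval = 0;
            } else {
                op.rval = -ENODATA;  // no such xattr
            }
        } else if (op.name == "read") {
            op.outdata = "hello world";  // pretend object data
            op.rval = 0;
        } else {
            op.rval = -EOPNOTSUPP;
        }
    }
}
```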

	https://github.com/NewDreamNetwork/ceph/commit/a8558284db178cfaf07a20bc475e7366c7903ffd

The OSD is updated to actually put results in those OSDOp fields:

	https://github.com/NewDreamNetwork/ceph/commit/ff55d2f310312bb5390326dcc35961d39ccad416

Conveniently, that change doesn't actually affect the resulting encoding 
for old clients, except that the OSDOp::op::payload_len is now filled in 
properly--and nobody ever looked at that before.  Old (and new) clients 
will still see the haystack.

Then, we extend the Objecter (client-side) ObjectOperation methods for 
read operations (read(..), getxattr(..), etc.) to take pointer arguments 
for where the results go:

	https://github.com/NewDreamNetwork/ceph/commit/fe077832b915175b8ed7880c1fa285309c642563

and the Objecter will sift through the haystack and fill those guys in 
when the reply comes back.
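
Roughly, the sifting could look like the sketch below. This is not the Objecter code: ToyReplyOp, ToyPendingOp, and demux_reply are invented names, std::string stands in for bufferlist, and it assumes the haystack is simply each op's output concatenated in order, with payload_len giving the size of each piece.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// What the reply carries per op (toy version).
struct ToyReplyOp {
    std::size_t payload_len;  // bytes of the haystack belonging to this op
    int rval;                 // result code for this op
};

// What the caller registered per op when building the operation.
struct ToyPendingOp {
    std::string* out;  // where the caller wants the data; may be null
    int* prval;        // where the caller wants the result code; may be null
};

// Walk the haystack in op order and hand each op its own slice.
// Returns false if the counts or lengths are inconsistent; ops before the
// bad one may already have been filled in.
bool demux_reply(const std::string& haystack,
                 const std::vector<ToyReplyOp>& reply,
                 const std::vector<ToyPendingOp>& pending) {
    if (reply.size() != pending.size())
        return false;
    std::size_t off = 0;
    for (std::size_t i = 0; i < reply.size(); ++i) {
        std::size_t len = reply[i].payload_len;
        if (len > haystack.size() - off)
            return false;  // length overruns the buffer
        if (pending[i].out)
            *pending[i].out = haystack.substr(off, len);
        if (pending[i].prval)
            *pending[i].prval = reply[i].rval;
        off += len;
    }
    return true;
}
```

Null pointers are skipped, which matches the arguments being optional for now.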

Finally, the librados ObjectReadOperations are similarly updated:

	https://github.com/NewDreamNetwork/ceph/commit/920bd5685c195e0a3c1c3c43f124fe904a73c9b3

Currently the arguments are optional, but I'm updating the rgw callers now 
and it makes things _way_ cleaner.  It's probably best to make them 
required so that librados users don't need to have internal knowledge of 
how we encode things over the wire, e.g. for things like stat().
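
To show what "internal knowledge" means for something like stat(), here is a toy decoder. The wire layout (8-byte little-endian size followed by 8-byte little-endian mtime) is an assumption made up for the example, not the real encoding, and put_le64, get_le64, and toy_stat_decode are hypothetical helpers. With out-pointers, this decoding lives in the library instead of in every caller.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 64-bit value in little-endian byte order (assumed layout).
void put_le64(std::string& buf, std::uint64_t v) {
    for (int i = 0; i < 8; ++i)
        buf.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
}

// Read a 64-bit little-endian value at offset off; false if too short.
bool get_le64(const std::string& buf, std::size_t off, std::uint64_t* v) {
    if (buf.size() < off + 8)
        return false;
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        unsigned char byte = static_cast<unsigned char>(buf[off + i]);
        r |= static_cast<std::uint64_t>(byte) << (8 * i);
    }
    *v = r;
    return true;
}

// Decode a toy stat result into the caller's out-pointers (either may be
// null). Returns 0 on success or -EIO if the buffer is malformed.
int toy_stat_decode(const std::string& outdata,
                    std::uint64_t* psize, std::uint64_t* pmtime) {
    std::uint64_t size = 0, mtime = 0;
    if (!get_le64(outdata, 0, &size) || !get_le64(outdata, 8, &mtime))
        return -EIO;
    if (psize)
        *psize = size;
    if (pmtime)
        *pmtime = mtime;
    return 0;
}
```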

I haven't finished testing this yet.  Mainly I want to know if the 
Objecter internal API and librados external API changes make sense.  
And/or if there are any alternative ideas about how this should be 
approached...

sage


* Re: simplifying compound operations
  2012-01-13  0:40 simplifying compound operations Sage Weil
@ 2012-01-17 23:28 ` Gregory Farnum
  0 siblings, 0 replies; 2+ messages in thread
From: Gregory Farnum @ 2012-01-17 23:28 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

I made some line notes on Github about this series, but apart from
those it looks good; the design makes sense.

One other thing, though: we are going to need to rev the librados
version for this, yes? So it should probably wait until our C bindings
are complete.
-Greg

