* simplifying compound operations
From: Sage Weil @ 2012-01-13 0:40 UTC
To: ceph-devel
One of the features of the Ceph OSDs is the ability to do complex compound
operations atomically. Things like "write object + set xattr" or "read
xattr + read object data" or even "verify xattr foo = bar, and if so,
write object data".
Compound operations are built as a vector<OSDOp>, where each OSDOp has the
description of the operation and an input bufferlist. The results,
however, accumulate in a single buffer, and the caller is responsible
for picking through the pile to find the output of each individual
operation. That clearly sucks.
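
To make the problem concrete, here is a toy sketch of the old scheme. Everything in it is a simplified stand-in, not the real Ceph code: `Buffer` is just a `std::string` playing the role of a bufferlist, `Op`, `execute_all` and `pick_apart` are hypothetical names, and plain concatenation of the outputs is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a bufferlist.
using Buffer = std::string;

// Simplified op: a description plus an input payload, and nothing for output.
struct Op {
    std::string name;  // e.g. "getxattr", "read"
    Buffer data;       // input only
};

// Toy executor: every op appends its output to the same reply buffer.
// The canned outputs are made up for the example.
Buffer execute_all(const std::vector<Op>& ops) {
    Buffer reply;
    for (const Op& op : ops) {
        if (op.name == "getxattr") {
            reply += "bar";          // pretend xattr value
        } else if (op.name == "read") {
            reply += "hello world";  // pretend object data
        }
    }
    return reply;
}

// The caller has to know, out of band, how many bytes each op produced
// in order to slice the pile back into per-op results.
std::vector<Buffer> pick_apart(const Buffer& reply,
                               const std::vector<std::size_t>& lens) {
    std::vector<Buffer> out;
    std::size_t off = 0;
    for (std::size_t len : lens) {
        out.push_back(reply.substr(off, len));
        off += len;
    }
    return out;
}
```

The point is `pick_apart`: the lengths it needs are knowledge about the encoding that the caller should not have to carry around.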
Enter the wip-op-data-mux branch:
https://github.com/NewDreamNetwork/ceph/commits/wip-op-data-mux
First, we rename OSDOp::data -> indata, and add an outdata. We also add
an rval so you can get the result code for individual operations. The
MOSDOpReply message encoding is updated to pass that over the wire to the
client. (This is an upward- and downward-compatible change.)
https://github.com/NewDreamNetwork/ceph/commit/a8558284db178cfaf07a20bc475e7366c7903ffd
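
As a rough illustration of the shape this gives each op, here is a minimal sketch. Only the field names `indata`, `outdata` and `rval` come from the description above; `OSDOpSketch`, `Buffer`, the `execute` helper, the string op names and the choice of error codes are all hypothetical, and the sketch says nothing about whether a real OSD stops at the first failing op.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bufferlist.
using Buffer = std::string;

// Simplified op carrying its own input, output and result code.
struct OSDOpSketch {
    std::string opname;  // e.g. "getxattr", "read"
    Buffer indata;       // input payload (formerly "data")
    Buffer outdata;      // per-op output
    int32_t rval = 0;    // per-op result code

    OSDOpSketch(std::string n, Buffer in)
        : opname(std::move(n)), indata(std::move(in)) {}
};

// Toy executor: each op gets its own outdata and rval instead of
// appending to one shared buffer. Error codes here are illustrative.
void execute(std::vector<OSDOpSketch>& ops,
             const std::map<std::string, Buffer>& xattrs,
             const Buffer& object_data) {
    for (OSDOpSketch& op : ops) {
        op.outdata.clear();
        if (op.opname == "getxattr") {
            auto it = xattrs.find(op.indata);
            if (it == xattrs.end()) {
                op.rval = -ENOENT;  // attribute not present
            } else {
                op.outdata = it->second;
                op.rval = 0;
            }
        } else if (op.opname == "read") {
            op.outdata = object_data;
            op.rval = 0;
        } else {
            op.rval = -EINVAL;  // op not known to this toy
        }
    }
}
```

After `execute` returns, a caller can look at `ops[i].outdata` and `ops[i].rval` directly, with no slicing of a shared buffer.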
The OSD is updated to actually put results in those OSDOp fields:
https://github.com/NewDreamNetwork/ceph/commit/ff55d2f310312bb5390326dcc35961d39ccad416
Conveniently, that change doesn't actually affect the resulting encoding
for old clients, except that the OSDOp::op::payload_len is now filled in
properly--and nobody ever looked at that before. Old (and new) clients
will still see the haystack.
Then, we extend the Objecter (client-side) ObjectOperation methods for
read operations (read(..), getxattr(..), etc.) to take pointer arguments
for where the results go:
https://github.com/NewDreamNetwork/ceph/commit/fe077832b915175b8ed7880c1fa285309c642563
and the Objecter will sift through the haystack and fill those guys in
when the reply comes back.
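
Roughly, the client side could look like the sketch below. This is not the actual Objecter code: `ReadOperationSketch`, `Reply`, `ReplyOp` and `handle_reply` are hypothetical names, `Buffer` stands in for a bufferlist, and the reply is modeled as one concatenated data buffer plus a per-op `payload_len` and `rval`, which is an assumption based on the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a bufferlist.
using Buffer = std::string;

// Assumed reply shape: per-op length and result code, one shared data buffer.
struct ReplyOp {
    uint32_t payload_len;
    int32_t rval;
};
struct Reply {
    std::vector<ReplyOp> ops;
    Buffer data;  // the haystack: all op outputs back to back
};

class ReadOperationSketch {
public:
    // Each read-style method records where the caller wants the result.
    // Null pointers are allowed and mean "not interested".
    void getxattr(const std::string& name, Buffer* out, int* rval) {
        slots_.push_back(Slot{"getxattr " + name, out, rval});
    }
    void read(uint64_t off, uint64_t len, Buffer* out, int* rval) {
        slots_.push_back(Slot{"read " + std::to_string(off) + "~" +
                                  std::to_string(len),
                              out, rval});
    }

    // Walk the haystack using payload_len and fill in the caller's
    // pointers. Returns false if the reply does not match the request;
    // earlier slots may already have been filled in that case.
    bool handle_reply(const Reply& reply) const {
        if (reply.ops.size() != slots_.size()) return false;
        std::size_t off = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const std::size_t len = reply.ops[i].payload_len;
            if (len > reply.data.size() - off) return false;
            if (slots_[i].out) *slots_[i].out = reply.data.substr(off, len);
            if (slots_[i].rval) *slots_[i].rval = reply.ops[i].rval;
            off += len;
        }
        return true;
    }

private:
    struct Slot {
        std::string desc;  // human-readable description of the op
        Buffer* out;       // where the output goes, may be null
        int* rval;         // where the result code goes, may be null
    };
    std::vector<Slot> slots_;
};
```

The caller builds the operation, hands over pointers once, and never touches offsets or lengths.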
Finally, the librados ObjectReadOperations are similarly updated:
https://github.com/NewDreamNetwork/ceph/commit/920bd5685c195e0a3c1c3c43f124fe904a73c9b3
Currently the arguments are optional, but I'm updating the rgw callers now
and it makes things _way_ cleaner. It's probably best to make them
required so that librados users don't need to have internal knowledge of
how we encode things over the wire, e.g. for things like stat().
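
To show what "internal knowledge" means for something like stat(), here is a sketch of the decoding a caller would otherwise have to do by hand. The wire layout is made up for this example (an 8-byte little-endian size followed by an 8-byte little-endian mtime in seconds) and is not claimed to match what the OSD actually sends; `encode_stat`, `decode_stat` and the helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <string>

// Hypothetical stand-in for a bufferlist.
using Buffer = std::string;

// Append a 64-bit value in little-endian byte order.
void put_le64(Buffer& b, uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        b.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
    }
}

// Read a 64-bit little-endian value at offset off; false if truncated.
bool get_le64(const Buffer& b, std::size_t off, uint64_t* v) {
    if (b.size() < off + 8) return false;
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r |= static_cast<uint64_t>(static_cast<unsigned char>(b[off + i]))
             << (8 * i);
    }
    *v = r;
    return true;
}

// Made-up encoding of a stat result, for this sketch only.
Buffer encode_stat(uint64_t size, uint64_t mtime_sec) {
    Buffer b;
    put_le64(b, size);
    put_le64(b, mtime_sec);
    return b;
}

// With required out-pointers, the library can do this decoding itself
// and the caller only ever sees size and mtime. Returns 0 on success,
// -1 if the buffer is too short.
int decode_stat(const Buffer& outdata, uint64_t* psize, std::time_t* pmtime) {
    uint64_t size = 0, sec = 0;
    if (!get_le64(outdata, 0, &size) || !get_le64(outdata, 8, &sec)) {
        return -1;
    }
    if (psize) *psize = size;
    if (pmtime) *pmtime = static_cast<std::time_t>(sec);
    return 0;
}
```

If the arguments stay optional, every caller that skips them needs its own copy of something like `decode_stat`, and all of those copies break together when the encoding changes.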
I haven't finished testing this yet. Mainly I want to know if the
Objecter internal API and librados external API changes make sense.
And/or if there are any alternative ideas about how this should be
approached...
sage
* Re: simplifying compound operations
From: Gregory Farnum @ 2012-01-17 23:28 UTC
To: Sage Weil; +Cc: ceph-devel
I made some line notes on GitHub about this series, but apart from
those it looks good; the design makes sense.
One other thing, though: we are going to need to rev the librados
version for this, yes? So it should probably wait until our C bindings
are complete.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html