* rgw object expiration
@ 2014-08-19 22:51 Yehuda Sadeh
From: Yehuda Sadeh @ 2014-08-19 22:51 UTC (permalink / raw)
  To: ceph-devel; +Cc: Sage Weil, Neil Levine

We've discussed this feature briefly in the past, and it might be time
to look at the design a bit. The S3 and Swift features differ quite a
bit, so let's have a look at both:
S3:

Object expiration is part of a larger bucket lifecycle management
feature. This allows setting rules on a bucket that specify what to do
with specific objects (those with a specified prefix) after a given
amount of time. Objects can either be removed or transferred to
secondary storage. The objects can either be current and expire, or
(in the case of versioned buckets) be non-current. Bucket lifecycle
rules can be added and removed, and they affect *all* objects in the
bucket, including objects created before the rules were created. An
interesting property is that users are not billed for expired objects,
even if the (async) removal process has not removed them yet.
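
To make the rule semantics concrete, here is a minimal Python sketch of
how a prefix-based expiration rule might be evaluated against an
object. The names (LifecycleRule, is_expired) and the use of seconds as
the unit are assumptions for illustration, not part of S3 or rgw. Note
that the check depends only on the object's creation time, so a rule
also covers objects created before the rule existed.

```python
from dataclasses import dataclass


@dataclass
class LifecycleRule:
    prefix: str        # rule applies to object names with this prefix
    expire_after: int  # seconds after object creation (assumed unit)


def is_expired(rule, name, created_at, now):
    """True if the object matches the rule and is old enough to expire."""
    if not name.startswith(rule.prefix):
        return False
    return now >= created_at + rule.expire_after


# Hypothetical rule: expire everything under logs/ after 30 days.
rule = LifecycleRule(prefix="logs/", expire_after=30 * 86400)
```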

Swift:

Swift object expiration is set at the object level. A specific header
can be used to set the expiration time for the object. An async
process will then garbage collect the object. An expired object can no
longer be read (although it may still be listed, and removed by the
user).
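
As an illustration, Swift exposes this through the X-Delete-At header
(an absolute Unix timestamp) and the X-Delete-After header (a number of
seconds from now). The Python sketch below shows one way a gateway
might turn these into a single expiration timestamp. The function name
is hypothetical, the header names are assumed to be already normalized
to this exact casing, and giving X-Delete-After precedence when both
are present is an assumption that should be checked against the Swift
documentation.

```python
def expiration_from_headers(headers, now):
    """Return the absolute expiration time (Unix seconds), or None."""
    # Assumption: X-Delete-After wins if both headers are present.
    if "X-Delete-After" in headers:
        return now + int(headers["X-Delete-After"])
    if "X-Delete-At" in headers:
        return int(headers["X-Delete-At"])
    return None  # object never expires
```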

Looking at both features, it is possible to define a superset. That
is, provide both the S3 bucket-level lifecycle management and the
Swift object-level expiration scheme.

rgw implementation:

Object-level expiration, à la Swift:

 - A new maintenance thread, similar to the garbage collector, will be
created. The thread will be used to apply deferred operations.
 - A new maintenance log will be created. The log will be sharded, and
entries there will be indexed by both timestamp, and

The maintenance thread will work as follows: try to lock a shard, read
the shard, operate on its entries, unlock.
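
The lock/read/operate/unlock cycle could be sketched roughly as below.
This is only an illustration: Shard and process_shards are hypothetical
names, and an in-process threading.Lock stands in for whatever lock
would actually be shared between gateways.

```python
import threading


class Shard:
    def __init__(self):
        self.lock = threading.Lock()
        self.entries = []  # pending deferred operations


def process_shards(shards, operate):
    """One pass: try-lock each shard, apply its entries, unlock."""
    handled = 0
    for shard in shards:
        if not shard.lock.acquire(blocking=False):
            continue  # another worker holds this shard; skip it
        try:
            entries, shard.entries = shard.entries, []
            for entry in entries:
                operate(entry)
                handled += 1
        finally:
            shard.lock.release()
    return handled
```

Using a non-blocking acquire means a worker never waits on a busy
shard; it simply moves on and picks the shard up on a later pass.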

 - an object can be assigned an expiration timestamp

When an object is set to expire, we'll update the maintenance log with
its id, and the timestamp. Note that we'll also keep note of the
object instance's tag, so that if the object is overwritten, we won't
remove the new instance. When updating the maintenance log, we'll
remove any existing entry for the same object.
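
A minimal sketch of that update, assuming a single in-memory map in
place of the sharded log (MaintenanceLog and its methods are
hypothetical names): keying entries by object id means a new
expiration for the same object replaces the old one.

```python
class MaintenanceLog:
    def __init__(self):
        self.by_object = {}  # object id -> (expires_at, tag)

    def set_expiration(self, obj_id, expires_at, tag):
        # Overwrites any earlier entry for the same object.
        self.by_object[obj_id] = (expires_at, tag)

    def due(self, now):
        """Entries with a timestamp at or before now, oldest first."""
        items = [
            (ts, oid, tag)
            for oid, (ts, tag) in self.by_object.items()
            if ts <= now
        ]
        return sorted(items)
```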

 - when reading an object, we'll check to see if it's expired so that
we return a proper response
 - the maintenance thread will read log entries up to the current
timestamp, and issue object removal for each of these entries
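
The read-time check and the removal pass might look roughly like this.
It is a sketch under simplifying assumptions: the object store is a
plain dict from object id to current tag, and is_readable and
run_expiration are hypothetical names.

```python
def is_readable(expires_at, now):
    """False once the expiration time has passed, even if the object
    has not been garbage collected yet."""
    return expires_at is None or now < expires_at


def run_expiration(log_entries, store, now):
    """log_entries: iterable of (expires_at, obj_id, tag).
    store: dict mapping obj_id -> current tag.
    Returns the ids that were removed."""
    removed = []
    for expires_at, obj_id, tag in sorted(log_entries):
        if expires_at > now:
            break  # remaining entries are not due yet
        # Only remove the instance the entry was written for.
        if store.get(obj_id) == tag:
            del store[obj_id]
            removed.append(obj_id)
    return removed
```

The tag comparison is what keeps a stale log entry from deleting an
instance that was written after the entry was logged.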

The S3 object expiration is much more complicated. It will still use
the same maintenance thread. We'll need to decide whether we want to
provide strong accounting similar to S3 (objects that are due to
expire are not accounted for, even if they have not been garbage
collected yet), as this will affect the implementation.

Relaxed accounting:
 - Bucket rules list will be versioned. Each rule change will bump up
this version. Each rule will have the version in which it was created.
 - When adding a rule on a bucket, create a maintenance job that will
add the relevant objects in this bucket to the maintenance log, along
with the rule (and version) that applies to them
 - When removing objects under a specific rule, the maintenance thread
will verify that this rule+version is still active
 - Adding an object to a bucket will add an appropriate entry to the
maintenance log, if applicable
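
The rule versioning above can be sketched as follows. BucketRules and
its methods are hypothetical names, and the rule body itself (prefix,
age) is left out to keep the focus on the version check.

```python
class BucketRules:
    def __init__(self):
        self.version = 0
        self.rules = {}  # rule id -> version in which it was created

    def add(self, rule_id):
        self.version += 1  # every rule change bumps the version
        self.rules[rule_id] = self.version
        return self.version

    def remove(self, rule_id):
        self.version += 1
        self.rules.pop(rule_id, None)

    def is_active(self, rule_id, version):
        """Checked by the maintenance thread before removing an object."""
        return self.rules.get(rule_id) == version
```

With this check, log entries created for a rule that was later removed
(or removed and re-added) no longer match and can simply be dropped.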

Strict accounting:
 - Do we really want this?
 - bucket index will need to add accounting adjustments (by timestamp)
 - an object that is set to expire will be added to the adjustments
record (by its timestamp). When the object is removed, it'll be
deducted from that record
 - when getting a bucket's stats, we'll also get the adjustment
accounting (up until the relevant timestamp)
 - open question: how to update the quota
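
For discussion, here is a rough sketch of how the adjustments record
could work. BucketStats is a hypothetical stand-in for the bucket
index stats, only bytes are tracked, and quota handling is left out
since it is still an open question.

```python
from collections import defaultdict


class BucketStats:
    def __init__(self):
        self.total_bytes = 0
        self.adjustments = defaultdict(int)  # expires_at -> bytes

    def add_object(self, size, expires_at=None):
        self.total_bytes += size
        if expires_at is not None:
            self.adjustments[expires_at] += size

    def remove_object(self, size, expires_at=None):
        self.total_bytes -= size
        if expires_at is not None:
            # Deduct from the record so it is not counted twice.
            self.adjustments[expires_at] -= size

    def accounted_bytes(self, now):
        """Stored bytes minus bytes that have expired but still exist."""
        expired = sum(
            b for ts, b in self.adjustments.items() if ts <= now
        )
        return self.total_bytes - expired
```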


Let me know if this makes any sense.

Yehuda
