* rados import/export
@ 2012-04-30  3:31 Henry C Chang
  2012-04-30 17:51 ` Tommi Virtanen
  2012-04-30 20:50 ` Sage Weil
  0 siblings, 2 replies; 7+ messages in thread
From: Henry C Chang @ 2012-04-30  3:31 UTC
  To: ceph-devel

Hi all,

I found an issue while playing around with the "rados import/export"
commands. I tried:

1. Export the pool ".users.uid" to a local directory.
2. Delete the pool.
3. Import the pool ".users.uid" from the local directory.

After the pool was restored, rgw's user info was correct. However,
I could not list the user's buckets.
I then traced the code a little. It seems the problem is that we
do not export/import the objects' omaps (stored in leveldb).
Since v0.44, we have used leveldb to store the key/value maps, so I think
backup and restore need to handle the omaps as well.

Henry
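
As a rough illustration of the failure described above, here is a toy model in Python. It is not Ceph code: `RadosObject`, both export functions, and the object name `alice.buckets` are hypothetical, and the model simply assumes that an object carries a byte payload, xattrs, and an omap, and that the bucket listing lives in the omap.

```python
from dataclasses import dataclass, field


@dataclass
class RadosObject:
    """Toy stand-in for a RADOS object (hypothetical, for illustration only)."""
    data: bytes = b""
    xattrs: dict = field(default_factory=dict)
    omap: dict = field(default_factory=dict)


def export_without_omap(pool):
    """Copy only data and xattrs, which is what the thread suspects happens."""
    return {name: RadosObject(o.data, dict(o.xattrs)) for name, o in pool.items()}


def export_with_omap(pool):
    """Copy data, xattrs, and the omap key/value pairs."""
    return {
        name: RadosObject(o.data, dict(o.xattrs), dict(o.omap))
        for name, o in pool.items()
    }


# A made-up object whose payload survives the round trip but whose
# omap (standing in for the bucket listing) does not.
pool = {"alice.buckets": RadosObject(b"user-info", {"v": b"1"}, {"bucket1": b"meta"})}
restored = export_without_omap(pool)
```

In this model the restored payload and xattrs match the original while the omap comes back empty, which matches the symptom of correct user info with a bucket list that cannot be read.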


* Re: rados import/export
  2012-04-30  3:31 rados import/export Henry C Chang
@ 2012-04-30 17:51 ` Tommi Virtanen
  2012-04-30 20:50 ` Sage Weil
  1 sibling, 0 replies; 7+ messages in thread
From: Tommi Virtanen @ 2012-04-30 17:51 UTC
  To: Henry C Chang; +Cc: ceph-devel

On Sun, Apr 29, 2012 at 20:31, Henry C Chang <henry.cy.chang@gmail.com> wrote:
> I then traced the code a little. It seems the problem is that we
> do not export/import the objects' omaps (stored in leveldb).
> Since v0.44, we have used leveldb to store the key/value maps, so I think
> backup and restore need to handle the omaps as well.

Yup, it seems you are correct. I filed ticket
http://tracker.newdream.net/issues/2362 for it.


* Re: rados import/export
  2012-04-30  3:31 rados import/export Henry C Chang
  2012-04-30 17:51 ` Tommi Virtanen
@ 2012-04-30 20:50 ` Sage Weil
  2012-04-30 21:01   ` Tommi Virtanen
  1 sibling, 1 reply; 7+ messages in thread
From: Sage Weil @ 2012-04-30 20:50 UTC
  To: Henry C Chang; +Cc: ceph-devel

On Mon, 30 Apr 2012, Henry C Chang wrote:
> Hi all,
> 
> I found an issue while playing around with the "rados import/export"
> commands. I tried:
> 
> 1. Export the pool ".users.uid" to a local directory.
> 2. Delete the pool.
> 3. Import the pool ".users.uid" from the local directory.
> 
> After the pool was restored, rgw's user info was correct. However,
> I could not list the user's buckets.
> I then traced the code a little. It seems the problem is that we
> do not export/import the objects' omaps (stored in leveldb).
> Since v0.44, we have used leveldb to store the key/value maps, so I think
> backup and restore need to handle the omaps as well.

The real question here is how *should* we be storing omap values when we 
export to/import from regular files?  A specially-marked (with xattr) .db 
file?

A related problem is how to store placement key strings for objects with 
modified placement; I can't remember if we already did something for that 
yet or not (maybe specially-named xattr?).

sage


* Re: rados import/export
  2012-04-30 20:50 ` Sage Weil
@ 2012-04-30 21:01   ` Tommi Virtanen
  2012-04-30 21:08     ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Tommi Virtanen @ 2012-04-30 21:01 UTC
  To: Sage Weil; +Cc: Henry C Chang, ceph-devel

On Mon, Apr 30, 2012 at 13:50, Sage Weil <sage@newdream.net> wrote:
> The real question here is how *should* we be storing omap values when we
> export to/import from regular files?  A specially-marked (with xattr) .db
> file?

Does just adding a .db suffix make the name safe? If yes, xattr is not
needed; if no, it's not enough. Reading src/rados_export.cc and
src/rados_sync.cc, I see nothing like that. What if I have objects
"foo" and "foo.db" in a pool?
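
The "foo" versus "foo.db" concern can be made concrete with a small Python sketch. Both function names are hypothetical and nothing here is taken from src/rados_export.cc; it only checks whether a plain ".db" suffix scheme would make a sidecar file land on the name of another object in the same pool.

```python
def sidecar_name_naive(object_name: str) -> str:
    """Hypothetical naive scheme: the omap for X is exported as X.db."""
    return object_name + ".db"


def find_collisions(object_names):
    """Return the objects whose naive sidecar name equals another object's name."""
    names = set(object_names)
    return sorted(n for n in names if sidecar_name_naive(n) in names)


clashes = find_collisions(["foo", "foo.db", "bar"])
print(clashes)  # ['foo']
```

A pool holding both "foo" and "foo.db" is enough to trigger the clash, so the suffix alone does not make the name safe.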

Note to self: the only thing avoiding collisions for e.g. "foo/" and
"foo$" is the hash.

> A related problem is how to store placement key strings for objects with
> modified placement; I can't remember if we already did something for that
> yet or not (maybe specially-named xattr?).

Commit 717621f66eb7da54c0000ff52985235dc6a17843, combined with nothing
else touching src/rados_export.cc after it, makes me think we don't
preserve the locator.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: rados import/export
  2012-04-30 21:01   ` Tommi Virtanen
@ 2012-04-30 21:08     ` Sage Weil
  2012-04-30 21:20       ` Tommi Virtanen
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2012-04-30 21:08 UTC
  To: Tommi Virtanen; +Cc: Henry C Chang, ceph-devel


On Mon, 30 Apr 2012, Tommi Virtanen wrote:
> On Mon, Apr 30, 2012 at 13:50, Sage Weil <sage@newdream.net> wrote:
> > The real question here is how *should* we be storing omap values when we
> > export to/import from regular files?  A specially-marked (with xattr) .db
> > file?
> 
> Does just adding a .db suffix make the name safe? If yes, xattr is not
> needed; if no, it's not enough. Reading src/rados_export.cc and
> src/rados_sync.cc, I see nothing like that. What if I have objects
> "foo" and "foo.db" in a pool?

I'm pretty sure we need to mark the object with a magic xattr.  Which 
probably means the filename itself needs to be mangled to avoid colliding 
with other objects.  Probably an xattr on the object file referring to the 
external mangled file with the k/v content.
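
One way to read the "xattr pointing at a mangled external file" idea is sketched below in Python. This is an assumption-laden illustration, not a proposal for the actual on-disk format: the xattr name `user.rados_omap_file`, the function `plan_export`, and the numbering scheme for avoiding clashes are all made up here, and xattrs are modelled as plain dicts rather than set on real files.

```python
OMAP_XATTR = "user.rados_omap_file"  # hypothetical xattr name, not from Ceph


def plan_export(local_files, has_omap):
    """Plan xattrs for exported files.

    local_files: local file names already chosen for the exported objects.
    has_omap:    subset of local_files whose objects carry omap data.
    Returns {local_file: xattrs}; objects with an omap get a pointer to a
    sidecar file whose name does not clash with any other planned file.
    """
    taken = set(local_files)
    plan = {}
    for name in sorted(local_files):
        plan[name] = {}
        if name in has_omap:
            candidate = f"{name}.db"
            n = 0
            while candidate in taken:
                n += 1
                candidate = f"{name}.db.{n}"
            taken.add(candidate)
            plan[name][OMAP_XATTR] = candidate
    return plan


plan = plan_export(["foo", "foo.db"], {"foo"})
print(plan["foo"][OMAP_XATTR])  # foo.db.1
```

Because the importer would follow the pointer stored on the object file, the sidecar's own name carries no meaning and can be mangled freely to dodge collisions.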

> Note to self: the only thing avoiding collisions for e.g. "foo/" and
> "foo$" is the hash.

I vaguely recall Colin talking about the trailing /, but we may have 
decided to ignore the problem for now.

> > A related problem is how to store placement key strings for objects with
> > modified placement; I can't remember if we already did something for that
> > yet or not (maybe specially-named xattr?).
> 
> Commit 717621f66eb7da54c0000ff52985235dc6a17843, combined with nothing
> else touching src/rados_export.cc after it, makes me think we don't
> preserve the locator.

Ok, we should fix that at the same time, then.

sage


* Re: rados import/export
  2012-04-30 21:08     ` Sage Weil
@ 2012-04-30 21:20       ` Tommi Virtanen
  2012-05-02  2:06         ` Colin McCabe
  0 siblings, 1 reply; 7+ messages in thread
From: Tommi Virtanen @ 2012-04-30 21:20 UTC
  To: Sage Weil; +Cc: Henry C Chang, ceph-devel

On Mon, Apr 30, 2012 at 14:08, Sage Weil <sage@newdream.net> wrote:
>> Does just adding a .db suffix make the name safe? If yes, xattr is not
>> needed; if no, it's not enough. Reading src/rados_export.cc and
>> src/rados_sync.cc, I see nothing like that. What if I have objects
>> "foo" and "foo.db" in a pool?
>
> I'm pretty sure we need to mark the object with a magic xattr.  Which
> probably means the filename itself needs to be mangled to avoid colliding
> with other objects.  Probably an xattr on the object file referring to the
> external mangled file with the k/v content.

Perhaps make the db file use one of the reserved characters that will
never appear in a plain old rados object. $ is the only one that's
really safe..

>> Note to self: the only thing avoiding collisions for e.g. "foo/" and
>> "foo$" is the hash.
> I vaguely recall Colin talking about the trailing /, but we may have
> decided to ignore the problem for now.

Oh I mean anything that gets mapped to @ can cause collisions. "foo@"
and "foo$" will collide. "foo\n" and "foo\\" will collide.
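
The collision is easy to reproduce with a lossy replacement function. The Python sketch below is not the mangling code from src/rados_sync.cc; the set of characters treated as unsafe is an assumption pieced together from the examples in this thread, and `mangle_lossy` is a hypothetical name.

```python
# Assumed set of characters that get replaced; inferred from the examples
# in this thread ("/", "$", "@", backslash, newline), not from the source.
UNSAFE = set("/$@\\\n")


def mangle_lossy(name: str) -> str:
    """Replace every unsafe character with '@', discarding which one it was."""
    return "".join("@" if c in UNSAFE else c for c in name)


print(mangle_lossy("foo@") == mangle_lossy("foo$"))    # True
print(mangle_lossy("foo\n") == mangle_lossy("foo\\"))  # True
```

Since the replacement throws away the original byte, any two names that differ only in their unsafe characters end up with the same mangled form.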

Frankly, my gut instinct right now is "kill it with fire". Collisions
in object names are a miserable problem to have. This would have been
better off with e.g. an HTTP-style %20 escaping mechanism, or a \x20
style one; one that preserves the byte, and does not just replace it
with '@'.
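
A byte-preserving escape of the HTTP %XX kind can be sketched with Python's standard `urllib.parse`. The wrapper names `escape_name` and `unescape_name` are hypothetical; `quote` and `unquote_to_bytes` are the standard-library calls doing the work.

```python
from urllib.parse import quote, unquote_to_bytes


def escape_name(name: bytes) -> str:
    """Percent-escape every byte outside the unreserved set (safe='' also escapes '/')."""
    return quote(name, safe="")


def unescape_name(escaped: str) -> bytes:
    """Invert escape_name exactly, recovering the original bytes."""
    return unquote_to_bytes(escaped)


names = [b"foo/", b"foo$", b"foo@", b"foo\n", b"foo\\"]
escaped = [escape_name(n) for n in names]
print(escaped[0], escaped[1])  # foo%2F foo%24
```

Every name keeps a distinct escaped form and round-trips exactly. Two caveats apply to this sketch: escaping can make a name up to three times longer, so it does nothing for the filesystem's per-component length limit, and "." and ".." pass through unescaped and would still need special handling.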

* Re: rados import/export
  2012-04-30 21:20       ` Tommi Virtanen
@ 2012-05-02  2:06         ` Colin McCabe
  0 siblings, 0 replies; 7+ messages in thread
From: Colin McCabe @ 2012-05-02  2:06 UTC
  To: Tommi Virtanen; +Cc: Sage Weil, Henry C Chang, ceph-devel

(sorry for repost, vger was cranky about the first message)

The code you are interested in is in src/rados_sync.cc.

It would have been impossible to preserve rados object names fully,
since they can be looong (it was either 2k or 4k, I forget), and most
Linux local filesystems like ext3 can only hold 255 bytes in a single
path component.

The solution was to store the true name in an extended attribute of
the locally exported file, and make the local name simply an
approximation of that true name.  If things get hairy, it appends a
hash of the true name to the end of the mangled name.  Rados import
ignores the mangled names, and checks the true name stored in the
user.rados_full_name xattr.
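
The scheme described above might look roughly like the following Python sketch. It is a reconstruction from this description only, not the code in src/rados_sync.cc: the safe character set, the choice of SHA-1, the 16-character digest, and the function name `export_name` are all assumptions, and the xattr is modelled as a dict entry instead of being set on a real file. Only the xattr name `user.rados_full_name` comes from the thread.

```python
import hashlib
import string

NAME_MAX = 255                             # usual per-component limit in bytes
FULL_NAME_XATTR = "user.rados_full_name"   # xattr name mentioned in the thread
SAFE = set(string.ascii_letters + string.digits + "._-")  # assumed safe set


def export_name(true_name: bytes):
    """Return (local_name, xattrs) for an object with the given true name."""
    approx = "".join(chr(b) if chr(b) in SAFE else "@" for b in true_name)
    changed = any(chr(b) not in SAFE for b in true_name)
    # "Hairy" cases: something was replaced, the name is too long, or it is
    # not usable as a file name at all.
    if changed or len(approx) > NAME_MAX or approx in ("", ".", ".."):
        digest = hashlib.sha1(true_name).hexdigest()[:16]
        approx = approx[: NAME_MAX - len(digest) - 1] + "@" + digest
    return approx, {FULL_NAME_XATTR: true_name}


local, xattrs = export_name(b"foo$")
```

On import, only the xattr value would be trusted; the local name is just a readable hint. Using "@" as the separator before the digest keeps hashed names apart from untouched ones in this sketch, since "@" is never in the safe set.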

Is this a complete solution?  No.  Two objects could have names that
mangle to the same short name, and also have the same hash code.  If
you are interested in implementing the complete solution, simply use
an incrementing counter rather than a hash.
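
The counter-based variant could be sketched like this in Python. `NameAllocator` and its method are hypothetical, the "@N" suffix format is an assumption, and the sketch ignores the length limit and keeps its table in memory; a real exporter would presumably rebuild the table from the xattrs of files already in the export directory.

```python
class NameAllocator:
    """Hand out unique local names by appending an incrementing counter."""

    def __init__(self):
        self._used = {}  # local name -> true name that owns it

    def allocate(self, true_name: bytes, approx: str) -> str:
        """Return a local name for true_name, starting from its mangled form."""
        candidate, n = approx, 0
        # Skip names owned by a different object; reuse our own if present.
        while candidate in self._used and self._used[candidate] != true_name:
            n += 1
            candidate = f"{approx}@{n}"
        self._used[candidate] = true_name
        return candidate


alloc = NameAllocator()
first = alloc.allocate(b"foo/", "foo@")
second = alloc.allocate(b"foo$", "foo@")
print(first, second)  # foo@ foo@@1
```

Unlike a hash suffix, the counter cannot collide by construction, at the cost of the local name depending on the order in which objects are exported.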

Another note.  If these omaps you speak of are big, you are probably
stuck using a separate file.  Don't forget the rather short limits
that ext3/ext4 put on xattrs.  It should be straightforward enough
just to create a $FOO.omap for every $FOO.
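
A separate $FOO.omap file could be as simple as a run of length-prefixed records. The Python sketch below shows one possible layout; the record format (two little-endian 32-bit lengths followed by key and value bytes) and the function names are assumptions made for illustration, not an existing Ceph format.

```python
import os
import struct
import tempfile


def write_omap(path, omap):
    """Write {key_bytes: value_bytes} as length-prefixed records."""
    with open(path, "wb") as f:
        for key in sorted(omap):
            value = omap[key]
            f.write(struct.pack("<II", len(key), len(value)))
            f.write(key)
            f.write(value)


def read_omap(path):
    """Read the records back, failing loudly on a truncated file."""
    omap = {}
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break
            if len(header) != 8:
                raise ValueError("truncated record header")
            klen, vlen = struct.unpack("<II", header)
            key, value = f.read(klen), f.read(vlen)
            if len(key) != klen or len(value) != vlen:
                raise ValueError("truncated record body")
            omap[key] = value
    return omap


original = {b"bucket1": b"meta-1", b"bucket\x00two": b""}
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo.omap")
    write_omap(path, original)
    restored = read_omap(path)
```

Length prefixes keep arbitrary bytes in keys and values intact, and a regular file sidesteps the xattr size limits mentioned above.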

cheers,
Colin

On Mon, Apr 30, 2012 at 2:20 PM, Tommi Virtanen
<tommi.virtanen@dreamhost.com> wrote:
>
> On Mon, Apr 30, 2012 at 14:08, Sage Weil <sage@newdream.net> wrote:
> >> Does just adding a .db suffix make the name safe? If yes, xattr is not
> >> needed; if no, it's not enough. Reading src/rados_export.cc and
> >> src/rados_sync.cc, I see nothing like that. What if I have objects
> >> "foo" and "foo.db" in a pool?
> >
> > I'm pretty sure we need to mark the object with a magic xattr.  Which
> > probably means the filename itself needs to be mangled to avoid colliding
> > with other objects.  Probably an xattr on the object file referring to the
> > external mangled file with the k/v content.
>
> Perhaps make the db file use one of the reserved characters that will
> never appear in a plain old rados object. $ is the only one that's
> really safe..
>
> >> Note to self: the only thing avoiding collisions for e.g. "foo/" and
> >> "foo$" is the hash.
> > I vaguely recall Colin talking about the trailing /, but we may have
> > decided to ignore the problem for now.
>
> Oh I mean anything that gets mapped to @ can cause collisions. "foo@"
> and "foo$" will collide. "foo\n" and "foo\\" will collide.
>
> Frankly, my gut instinct right now is "kill it with fire". Collisions
> in object names are a miserable problem to have. This would have been
> better off with e.g. an HTTP-style %20 escaping mechanism, or a \x20
> style one; one that preserves the byte, and does not just replace it
> with '@'.

Thread overview: 7+ messages
2012-04-30  3:31 rados import/export Henry C Chang
2012-04-30 17:51 ` Tommi Virtanen
2012-04-30 20:50 ` Sage Weil
2012-04-30 21:01   ` Tommi Virtanen
2012-04-30 21:08     ` Sage Weil
2012-04-30 21:20       ` Tommi Virtanen
2012-05-02  2:06         ` Colin McCabe
