* Auto-striping feature
@ 2011-01-25 11:53 Tatsuya Kawano
  2011-01-26  0:15 ` Gregory Farnum
  0 siblings, 1 reply; 6+ messages in thread
From: Tatsuya Kawano @ 2011-01-25 11:53 UTC (permalink / raw)
  To: ceph-devel


Hi, 

I have some questions about the auto-striping feature in Ceph.

- What is the default striping size?
- How can I specify the striping size for a specific file (via libceph and kernel driver)? 
- How many PGs will be involved in striping one file?


I'm writing several files to Ceph and the size of each file will be about 64MB. There will be 10 to 20 OSDs in the cluster. I wonder how each file will be divided into objects and how these objects will be distributed in the cluster.

Thanks, 

--
Tatsuya Kawano (Mr.)
Tokyo, Japan



* Re: Auto-striping feature
  2011-01-25 11:53 Auto-striping feature Tatsuya Kawano
@ 2011-01-26  0:15 ` Gregory Farnum
  2011-01-26  4:32   ` Sage Weil
  2011-01-26 11:56   ` Tatsuya Kawano
  0 siblings, 2 replies; 6+ messages in thread
From: Gregory Farnum @ 2011-01-26  0:15 UTC (permalink / raw)
  To: Tatsuya Kawano; +Cc: ceph-devel

On Tue, Jan 25, 2011 at 3:53 AM, Tatsuya Kawano <tatsuya6502@gmail.com> wrote:
>
> Hi,
>
> I have some questions about the auto-striping feature in Ceph.
>
> - What is the default striping size?
The default is to stripe the file across 4MB objects, 4MB at a time.
You can also define your own striping strategy using cephfs. Make sure
that "stripe_unit" * "stripe_count" equals "object_size".

> - How can I specify the striping size for a specific file (via libceph and kernel driver)?
In the kernel, use the cephfs tool. It lets you use ioctls to specify
a single file layout or to define the default layout for newly created
files in a subtree of the fs. You can't do it in cfuse, unfortunately.
(Although you can set the default using the kernel client and cfuse
will follow that setting correctly.) If you're writing your own
application using libceph, you can also set it; use the cephfs source
as a model.

> - How many PGs will be involved in striping one file?
That depends on how large the file is, and is pseudorandom: each object
maps to a PG independently, so a file can touch at most as many PGs as
it has objects.
>
>
> I'm writing several files to Ceph and the size of each file will be about 64MB. There will be 10 to 20 OSDs in the cluster. I wonder how each file will be divided into objects and how these objects will be distributed in the cluster.
Well, the files will be divided into 4MB objects. (The last object may
be short.) The objects will be distributed pseudorandomly
into "placement groups" and those placement groups will be
pseudorandomly distributed across the OSDs in the cluster. If you're
interested in the specifics of how this works, I'd recommend reading
Sage's thesis, available on the Ceph website.
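
As a rough illustration of the split described above, here is a minimal
sketch, assuming the default layout of one 4MB object per 4MB of file
data. It is not Ceph code; the function name split_into_objects is made
up for this example, and it says nothing about which PGs or OSDs the
objects land on.

```python
MB = 1024 * 1024


def split_into_objects(file_size, object_size=4 * MB):
    """Return the size in bytes of each object a file is divided into.

    Assumes the default layout from this thread: the file is cut into
    consecutive object_size pieces and only the last piece may be short.
    """
    full_objects, tail = divmod(file_size, object_size)
    sizes = [object_size] * full_objects
    if tail:
        # The final object holds whatever is left over.
        sizes.append(tail)
    return sizes


if __name__ == "__main__":
    sizes = split_into_objects(64 * MB)
    print(len(sizes), "objects; last one is", sizes[-1] // MB, "MB")
```

So a 64MB file becomes 16 objects, and each of those is placed
independently of the others.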
-Greg


* Re: Auto-striping feature
  2011-01-26  0:15 ` Gregory Farnum
@ 2011-01-26  4:32   ` Sage Weil
  2011-01-26  4:34     ` tsuna
  2011-01-26 11:56   ` Tatsuya Kawano
  1 sibling, 1 reply; 6+ messages in thread
From: Sage Weil @ 2011-01-26  4:32 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Tatsuya Kawano, ceph-devel

On Tue, 25 Jan 2011, Gregory Farnum wrote:
> On Tue, Jan 25, 2011 at 3:53 AM, Tatsuya Kawano <tatsuya6502@gmail.com> wrote:
> >
> > Hi,
> >
> > I have some questions about the auto-striping feature in Ceph.
> >
> > - What is the default striping size?
> The default is to stripe the file across 4MB objects, 4MB at a time.
> You can also define your own striping strategy using cephfs. Make sure
> that "stripe_unit" * "stripe_count" equals "object_size".

Actually, you just need to make sure that object_size is a multiple of 
stripe_unit.  The striping strategy puts each stripe_unit on stripe_count 
consecutive objects until they reach a size of stripe_count; it then moves 
onto the next set of stripe_count objects.  The default degenerate case 
has stripe_unit == object_size and stripe_count == 1.

sage


* Re: Auto-striping feature
  2011-01-26  4:32   ` Sage Weil
@ 2011-01-26  4:34     ` tsuna
  2011-01-26  5:20       ` Sage Weil
  0 siblings, 1 reply; 6+ messages in thread
From: tsuna @ 2011-01-26  4:34 UTC (permalink / raw)
  To: Sage Weil; +Cc: Gregory Farnum, Tatsuya Kawano, ceph-devel

On Tue, Jan 25, 2011 at 8:32 PM, Sage Weil <sage@newdream.net> wrote:
> Actually, you just need to make sure that object_size is a multiple of
> stripe_unit.  The striping strategy puts each stripe_unit on stripe_count
> consecutive objects until they reach a size of stripe_count; it then moves

Did you mean "until they reach a size of object_size"?

> onto the next set of stripe_count objects.  The default degenerate case
> has stripe_unit == object_size and stripe_count == 1.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Auto-striping feature
  2011-01-26  4:34     ` tsuna
@ 2011-01-26  5:20       ` Sage Weil
  0 siblings, 0 replies; 6+ messages in thread
From: Sage Weil @ 2011-01-26  5:20 UTC (permalink / raw)
  To: tsuna; +Cc: Gregory Farnum, Tatsuya Kawano, ceph-devel

On Tue, 25 Jan 2011, tsuna wrote:
> On Tue, Jan 25, 2011 at 8:32 PM, Sage Weil <sage@newdream.net> wrote:
> > Actually, you just need to make sure that object_size is a multiple of
> > stripe_unit.  The striping strategy puts each stripe_unit on stripe_count
> > consecutive objects until they reach a size of stripe_count; it then moves
> 
> Did you mean "until they reach a size of object_size"?

Yeah :)
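
With that correction, the layout described in this thread can be
sketched as a mapping from a file offset to an object number and an
offset inside that object. This is only an illustration under the rules
stated above (object_size is a multiple of stripe_unit; stripe units go
round-robin over stripe_count objects until each reaches object_size,
then the next set of objects starts). The function name locate is
hypothetical and is not part of any Ceph API.

```python
MB = 1024 * 1024


def locate(offset, stripe_unit, stripe_count, object_size):
    """Map a byte offset in a file to (object_number, offset_in_object)."""
    if object_size % stripe_unit != 0:
        raise ValueError("object_size must be a multiple of stripe_unit")

    units_per_object = object_size // stripe_unit

    block = offset // stripe_unit        # index of the stripe unit in the file
    stripe_no = block // stripe_count    # which stripe (row) the unit is in
    stripe_pos = block % stripe_count    # which object of the set (column)

    # Each set of stripe_count objects holds units_per_object stripes.
    object_set = stripe_no // units_per_object
    object_no = object_set * stripe_count + stripe_pos

    # Units already stored in this object, plus the position inside the unit.
    object_off = (stripe_no % units_per_object) * stripe_unit
    object_off += offset % stripe_unit
    return object_no, object_off


if __name__ == "__main__":
    # Default (degenerate) case: stripe_unit == object_size, stripe_count == 1.
    print(locate(5 * MB, 4 * MB, 1, 4 * MB))
    # 1MB units striped over 4 objects of 4MB each.
    print(locate(5 * MB, 1 * MB, 4, 4 * MB))
```

In the default case byte 5MB falls 1MB into the second object. With 1MB
units over four objects it also lands in the second object of the first
set, but as the second unit stored there.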

sage



* Re: Auto-striping feature
  2011-01-26  0:15 ` Gregory Farnum
  2011-01-26  4:32   ` Sage Weil
@ 2011-01-26 11:56   ` Tatsuya Kawano
  1 sibling, 0 replies; 6+ messages in thread
From: Tatsuya Kawano @ 2011-01-26 11:56 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

Thanks, Greg and Sage, for the answers. I have also just started reading the Distributed Object Storage chapter of Sage's thesis.

--
Tatsuya Kawano (Mr.)
Tokyo, Japan



