* Ceph for email storage
@ 2012-07-04 18:29 Mitsue Acosta Murakami
  2012-07-04 19:10 ` Gregory Farnum
  2012-07-04 20:40 ` rados mailbox? (was Re: Ceph for email storage) Sage Weil
  0 siblings, 2 replies; 6+ messages in thread
From: Mitsue Acosta Murakami @ 2012-07-04 18:29 UTC (permalink / raw)
  To: ceph-devel

Hello,

We are evaluating Ceph for use as email storage. In our current system, 
several client servers running different services (IMAP, SMTP, etc.) access 
an NFS storage server. The mailboxes are stored in Maildir format, with 
many small files. We use Amazon AWS EC2 for the clients and the storage server. 
In this scenario, we have some questions about Ceph:

1. Is Ceph recommended for heavy write/read of small files?

2. Is there any problem in installing Ceph on Amazon instances?

3. Does Ceph already support quota?

4. What File System would you encourage us to use?


Thanks in advance,

-- 
Mitsue Acosta Murakami




* Re: Ceph for email storage
  2012-07-04 18:29 Ceph for email storage Mitsue Acosta Murakami
@ 2012-07-04 19:10 ` Gregory Farnum
  2012-07-04 20:40 ` rados mailbox? (was Re: Ceph for email storage) Sage Weil
  1 sibling, 0 replies; 6+ messages in thread
From: Gregory Farnum @ 2012-07-04 19:10 UTC (permalink / raw)
  To: Mitsue Acosta Murakami; +Cc: ceph-devel

On Wednesday, July 4, 2012 at 11:29 AM, Mitsue Acosta Murakami wrote:
> Hello,
> 
> We are evaluating Ceph for use as email storage. In our current system, 
> several client servers running different services (IMAP, SMTP, etc.) access 
> an NFS storage server. The mailboxes are stored in Maildir format, with 
> many small files. We use Amazon AWS EC2 for the clients and the storage server. 
> In this scenario, we have some questions about Ceph:
> 
> 1. Is Ceph recommended for heavy write/read of small files?
> 
> 2. Is there any problem in installing Ceph on Amazon instances?
> 
> 3. Does Ceph already support quota?
> 
> 4. What File System would you encourage us to use?
Are you interested in using RBD to back your mail servers, or in using the Ceph FS to provide shared storage? Ceph FS isn't considered production-ready at this time, but RBD should be, for appropriate use cases.

In general:
1) If you allow your caching layers to do their job, any Ceph system should handle small writes fine. Reads will require normal disk accesses.
2) There shouldn't be.
3) None of the Ceph systems support quotas right now, although CephFS does provide easy usage reports.
4) Assuming you mean for the OSDs, XFS seems to be your best bet right now, but we work to make Ceph perform as well as possible under btrfs and ext4 too.
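
For the usage reports mentioned in (3), one way to read them from a client is through CephFS's recursive-statistics virtual extended attributes. The following is a minimal sketch, assuming a mounted CephFS on Linux and that the directory exposes `ceph.dir.rbytes` and `ceph.dir.rfiles`; the `dir_usage` helper is hypothetical, and it only reports usage, it does not enforce any limit:

```python
import os

def dir_usage(path, getxattr=None):
    """Return (bytes, files) used recursively under a CephFS directory.

    Assumes the CephFS virtual xattrs ceph.dir.rbytes and ceph.dir.rfiles.
    getxattr is injectable so the parsing can be tried without a mount.
    """
    if getxattr is None:
        getxattr = os.getxattr  # Linux only
    rbytes = int(getxattr(path, "ceph.dir.rbytes").decode().strip("\x00"))
    rfiles = int(getxattr(path, "ceph.dir.rfiles").decode().strip("\x00"))
    return rbytes, rfiles
```

Calling `dir_usage()` on a per-user directory under the mount (whatever path layout you use) would give a number to compare against a limit enforced by your own tooling.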
-Greg




* rados mailbox? (was Re: Ceph for email storage)
  2012-07-04 18:29 Ceph for email storage Mitsue Acosta Murakami
  2012-07-04 19:10 ` Gregory Farnum
@ 2012-07-04 20:40 ` Sage Weil
  2012-07-05 14:07   ` Wido den Hollander
  2012-07-10  5:45   ` Kristofer
  1 sibling, 2 replies; 6+ messages in thread
From: Sage Weil @ 2012-07-04 20:40 UTC (permalink / raw)
  To: Mitsue Acosta Murakami; +Cc: ceph-devel

Although Ceph fs would technically work for storing mail with maildir, 
when you step back from the situation, Maildir + a distributed file system 
is a pretty terrible way to approach mail storage.  Maildir was designed 
to work around the limited consistency of NFS, and manages that, but 
performs pretty horribly on almost any file system.  Mostly this is due to 
the message-per-file approach and the fact that file systems' internal 
management of inodes and directories means lots and lots of seeks, even to 
read message headers.  Ceph's MDS will probably do better than most due to 
its embedded inodes, but it's hardly ideal.

However, an idea that has been kicking around here is building a mail 
storage system directly on top of RADOS.  In principle, it should be a 
relatively straightforward matter of implementing a library and plugging 
it into the storage backend for something like Dovecot, or any other mail 
system (delivery agent and/or IMAP/POP frontend) with a pluggable backend.  
(I think postfix has pluggable delivery agents, but that's about where my 
experience in this area runs out.)

The basic idea is this:

 - each mail message is a rados object, and immutable.
 - each mailbox is an index of messages, stored in a rados object.
   - the index consists of omap records, one for each message.
   - the key is some unique id
   - the value is a copy of (a useful subset of) the message headers
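
To make the layout concrete, here is a rough sketch of the delivery path under that scheme. It is not librados code: `FakeRados`, its `omap_set` method, the `msg.`/`mailbox.` object names, and the chosen header subset are all hypothetical, with plain dicts standing in for a pool's object data and omap records.

```python
from email import message_from_bytes

# Hypothetical "useful subset" of headers to copy into the index.
INDEXED_HEADERS = ("From", "To", "Subject", "Date", "Message-ID")

class FakeRados:
    """In-memory stand-in for a RADOS pool: object data plus per-object omap."""
    def __init__(self):
        self.data = {}   # object name -> bytes
        self.omap = {}   # object name -> {key: value}

    def write_full(self, oid, payload):
        self.data[oid] = payload

    def omap_set(self, oid, key, value):
        self.omap.setdefault(oid, {})[key] = value

def deliver(pool, mailbox, msg_id, raw):
    """Store one message object, then add its index record to the mailbox."""
    msg_oid = f"msg.{msg_id}"
    if msg_oid in pool.data:
        raise ValueError("message objects are immutable; id already used")
    pool.write_full(msg_oid, raw)
    parsed = message_from_bytes(raw)
    headers = "\n".join(
        f"{name}: {parsed[name]}" for name in INDEXED_HEADERS if parsed[name]
    )
    # One omap record per message: key = unique id, value = header copy.
    pool.omap_set(f"mailbox.{mailbox}", msg_id, headers)
    return msg_oid
```

The message is written before the index record so that a reader never finds an index entry pointing at a missing object; a crash in between would leave an orphaned message object, which a real implementation would have to clean up.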

This has a number of nice properties:

 - you can efficiently list messages in the mailbox using the omap 
   operations
 - you can (more) efficiently search messages (everything but the message 
   body) based on the index contents (since it's all stored in one object)
 - you can efficiently grab recent messages with the omap ops (e.g., list 
   keys > last_seen_msgid)
 - moving messages between folders involves updating the indices only; the
   message objects need not be copied/moved.
 - no metadata bottleneck: mailbox indices are distributed across the 
   entire cluster, just like the mail.
 - all the scaling benefits of rados for a growing mail system.
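
A small sketch of how the listing, "recent messages", header search, and folder-move operations fall out of a sorted key/value index. This is an in-memory model only: `MailboxIndex`, `move`, and `msg_key` are hypothetical names, and a sorted Python list stands in for the omap, whose keys are assumed to come back in sorted order.

```python
import bisect

class MailboxIndex:
    """In-memory model of one mailbox index object (sorted key -> headers)."""
    def __init__(self):
        self._keys = []   # kept sorted, as omap keys are assumed to be
        self._vals = {}

    def put(self, key, headers):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = headers

    def remove(self, key):
        headers = self._vals.pop(key)
        self._keys.remove(key)
        return headers

    def list_after(self, last_seen="", limit=None):
        """Keys strictly greater than last_seen, in key order."""
        start = bisect.bisect_right(self._keys, last_seen)
        stop = len(self._keys) if limit is None else start + limit
        return self._keys[start:stop]

    def search(self, needle):
        """Header search reads only the index, never the message objects."""
        needle = needle.lower()
        return [k for k in self._keys if needle in self._vals[k].lower()]

def move(src, dst, key):
    """Move a message between folders by rewriting index records only."""
    dst.put(key, src.remove(key))

def msg_key(n):
    # Zero-padded so that lexicographic key order matches numeric order.
    return f"{n:010d}"
```

One caveat the model hides: the two index updates in `move` touch two different objects, so a real implementation could not rely on them happening atomically together and would need to tolerate or repair a half-finished move.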

I don't know enough about what exactly the mail storage backends need to 
support to know what issues will come up.  Presumably there are several.  
E.g., if you delete a message, is the IMAP client expected to discover 
that efficiently?  And do the mail storage backends attempt to do it 
efficiently?

This also doesn't solve the problem of efficiently indexing/searching the 
bodies of messages, although I suspect that indexing could be efficiently 
implemented on top of this scheme.
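
As a purely illustrative guess at what body indexing on top of this could look like, here is a tiny inverted index: each term maps to the ids of the messages containing it, and those term records could in principle be stored as key/value records next to the mailbox index. `BodyIndex` and its tokenizer are hypothetical, and nothing here deals with MIME decoding, stemming, or non-ASCII text.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")

class BodyIndex:
    """Toy inverted index: term -> set of message ids containing it."""
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, msg_id, body):
        for term in set(TOKEN.findall(body.lower())):
            self.postings[term].add(msg_id)

    def search(self, query):
        """Return ids of messages containing every term in the query."""
        terms = TOKEN.findall(query.lower())
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)
```

Because messages are immutable, a message would only ever need to be indexed once, at delivery time.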

So, a non-trivial project, but probably one that can be prototyped without 
that much pain, and one that would perform and scale drastically better 
than existing solutions I'm aware of.

I'm hoping there are some motivated hackers lurking who understand the 
pain that is maildir/mail infrastructure...

sage




* Re: rados mailbox? (was Re: Ceph for email storage)
  2012-07-04 20:40 ` rados mailbox? (was Re: Ceph for email storage) Sage Weil
@ 2012-07-05 14:07   ` Wido den Hollander
  2012-07-10  5:45   ` Kristofer
  1 sibling, 0 replies; 6+ messages in thread
From: Wido den Hollander @ 2012-07-05 14:07 UTC (permalink / raw)
  To: Sage Weil; +Cc: Mitsue Acosta Murakami, ceph-devel

On 04-07-12 22:40, Sage Weil wrote:
> Although Ceph fs would technically work for storing mail with maildir,
> when you step back from the situation, Maildir + a distributed file system
> is a pretty terrible way to approach mail storage.  Maildir was designed
> to work around the limited consistency of NFS, and manages that, but
> performs pretty horribly on almost any file system.  Mostly this is due to
> the message-per-file approach and the fact that file systems' internal
> management of inodes and directories means lots and lots of seeks, even to
> read message headers.  Ceph's MDS will probably do better than most due to
> its embedded inodes, but it's hardly ideal.
>
> However, an idea that has been kicking around here is building a mail
> storage system directly on top of RADOS.  In principle, it should be a
> relatively straightforward matter of implementing a library and plugging
> it into the storage backend for something like Dovecot, or any other mail
> system (delivery agent and/or IMAP/POP frontend) with a pluggable backend.
> (I think postfix has pluggable delivery agents, but that's about where my
> experience in this area runs out.)

When you first told me about the idea a couple of months ago, I took a 
look at the Dovecot code, and it's not that trivial to implement.

It seems that mbox and Maildir are pretty hardcoded in Dovecot, but 
there is an advantage:

You can use Dovecot as your LDA/VDA (Local/Virtual Delivery Agent) for 
Postfix, so you'd only have to implement this library in Dovecot and 
you'd be able to handle IMAP, POP3, and delivery of e-mail to RADOS.

Source: http://wiki.dovecot.org/LDA/Postfix

>
> The basic idea is this:
>
>   - each mail message is a rados object, and immutable.
>   - each mailbox is an index of messages, stored in a rados object.
>     - the index consists of omap records, one for each message.
>     - the key is some unique id
>     - the value is a copy of (a useful subset of) the message headers
>
> This has a number of nice properties:
>
>   - you can efficiently list messages in the mailbox using the omap
>     operations
>   - you can (more) efficiently search messages (everything but the message
>     body) based on the index contents (since it's all stored in one object)
>   - you can efficiently grab recent messages with the omap ops (e.g., list
>     keys > last_seen_msgid)
>   - moving messages between folders involves updating the indices only; the
>     message objects need not be copied/moved.
>   - no metadata bottleneck: mailbox indices are distributed across the
>     entire cluster, just like the mail.
>   - all the scaling benefits of rados for a growing mail system.
>
> I don't know enough about what exactly the mail storage backends need to
> support to know what issues will come up.  Presumably there are several.
> E.g., if you delete a message, is the IMAP client expected to discover
> that efficiently?  And do the mail storage backends attempt to do it
> efficiently?

With IMAP, a message is only marked as deleted (the \Deleted flag) until you 
do an "EXPUNGE", which actually removes the message.

The problem with IMAP clients, however, is that they have a lot of bugs, 
especially Outlook.

But if you can somehow plug into Dovecot and only handle the calls that 
it's doing you should be fine.
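
A rough sketch of how the two-step IMAP delete could map onto an index plus message objects. The `Mailbox` class is hypothetical and uses dicts in place of RADOS; it also assumes each message object is referenced by exactly one mailbox, which would not hold if folders share message objects.

```python
class Mailbox:
    """Toy mailbox: per-message flags in the index, bodies in objects."""
    def __init__(self):
        self.index = {}     # msg id -> set of IMAP flags
        self.objects = {}   # msg id -> message bytes (stand-in for RADOS objects)

    def append(self, msg_id, raw):
        self.objects[msg_id] = raw
        self.index[msg_id] = set()

    def store_deleted(self, msg_id):
        """Like IMAP STORE +FLAGS (\\Deleted): an index update only."""
        self.index[msg_id].add("\\Deleted")

    def expunge(self):
        """Like IMAP EXPUNGE: drop flagged index entries and their objects."""
        doomed = sorted(m for m, flags in self.index.items() if "\\Deleted" in flags)
        for msg_id in doomed:
            del self.index[msg_id]
            del self.objects[msg_id]
        return doomed
```

Returning the removed ids gives the server what it needs to tell connected clients which messages went away, so clients do not have to rediscover deletions by rescanning the mailbox.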

>
> This also doesn't solve the problem of efficiently indexing/searching the
> bodies of messages, although I suspect that indexing could be efficiently
> implemented on top of this scheme.
>

Nowadays most clients keep a local cache; at least Thunderbird does, and 
it uses that for local search. Much faster!

Webmail clients like Roundcube have a local cache as well, and 
applications like Open-Xchange also have local caches.

> So, a non-trivial project, but probably one that can be prototyped without
> that much pain, and one that would perform and scale drastically better
> than existing solutions I'm aware of.

Yes, MUCH better than Maildir over CephFS or NFS.

>
> I'm hoping there are some motivated hackers lurking who understand the
> pain that is maildir/mail infrastructure...
>

Plenty of motivation, not enough time I think.

Wido


* Re: rados mailbox? (was Re: Ceph for email storage)
  2012-07-04 20:40 ` rados mailbox? (was Re: Ceph for email storage) Sage Weil
  2012-07-05 14:07   ` Wido den Hollander
@ 2012-07-10  5:45   ` Kristofer
  2012-07-10 14:35     ` Smart Weblications GmbH
  1 sibling, 1 reply; 6+ messages in thread
From: Kristofer @ 2012-07-10  5:45 UTC (permalink / raw)
  To: Sage Weil; +Cc: Mitsue Acosta Murakami, ceph-devel

Very short answer to this.

It can work if you direct all email requests for a particular mailbox to 
a single machine. You need to avoid locking between servers as much as 
possible.
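
One simple way to get that "one mailbox, one machine" behaviour is to pick the backend from a hash of the mailbox name at the proxy layer. A minimal sketch; `backend_for` and the host names in the usage note are hypothetical:

```python
import hashlib

def backend_for(mailbox, backends):
    """Pick one backend per mailbox, deterministically."""
    digest = hashlib.sha256(mailbox.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

For example, `backend_for("alice@example.com", ["imap1", "imap2", "imap3"])` always returns the same host for the same list. Plain modulo hashing remaps most mailboxes whenever the backend list changes, so a real deployment would want consistent hashing or an explicit assignment table instead.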

Messages will need to be indexed, period.  Or else your life will suck.

Dovecot has a nice writeup on this type of thing. It's not Ceph-specific but 
NFS-related; still, it can be extrapolated to Ceph or any distributed storage: 
http://wiki.dovecot.org/NFS


* Re: rados mailbox? (was Re: Ceph for email storage)
  2012-07-10  5:45   ` Kristofer
@ 2012-07-10 14:35     ` Smart Weblications GmbH
  0 siblings, 0 replies; 6+ messages in thread
From: Smart Weblications GmbH @ 2012-07-10 14:35 UTC (permalink / raw)
  To: Kristofer; +Cc: Sage Weil, Mitsue Acosta Murakami, ceph-devel

On 10.07.2012 07:45, Kristofer wrote:
> Very short answer to this.
> 
> It can work if you direct all email requests for a particular mailbox to a
> single machine. You need to avoid locking between servers as much as possible.
> 
> Messages will need to be indexed, period.  Or else your life will suck.
> 
> Dovecot has a nice writeup on this type of thing. It's not Ceph-specific but
> NFS-related; still, it can be extrapolated to Ceph or any distributed storage:
> http://wiki.dovecot.org/NFS


>>
>>   - each mail message is a rados object, and immutable.
>>   - each mailbox is an index of messages, stored in a rados object.
>>     - the index consists of omap records, one for each message.
>>     - the key is some unique id
>>     - the value is a copy of (a useful subset of) the message headers
>>
>> This has a number of nice properties:
>>
>>   - you can efficiently list messages in the mailbox using the omap
>>     operations
>>   - you can (more) efficiently search messages (everything but the message
>>     body) based on the index contents (since it's all stored in one object)
>>   - you can efficiently grab recent messages with the omap ops (e.g., list
>>     keys > last_seen_msgid)
>>   - moving messages between folders involves updating the indices only; the
>>     message objects need not be copied/moved.
>>   - no metadata bottleneck: mailbox indices are distributed across the
>>     entire cluster, just like the mail.
>>   - all the scaling benefits of rados for a growing mail system.
>>
>> I don't know enough about what exactly the mail storage backends need to
>> support to know what issues will come up.  Presumably there are several.
>> E.g., if you delete a message, is the IMAP client expected to discover
>> that efficiently?  And do the mail storage backends attempt to do it
>> efficiently?
>>
>> This also doesn't solve the problem of efficiently indexing/searching the
>> bodies of messages, although I suspect that indexing could be efficiently
>> implemented on top of this scheme.
>>
>> So, a non-trivial project, but probably one that can be prototyped without
>> that much pain, and one that would perform and scale drastically better
>> than existing solutions I'm aware of.
>>
>> I'm hoping there are some motivated hackers lurking who understand the
>> pain that is maildir/mail infrastructure...
>>

Another idea that could probably be done with little effort would be to mostly reuse
the code from dbmail and make a cephmail version out of it.


-- 

Kind regards,


Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - EUR 0.99/min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
