* BTRFS and cyrus mail server
@ 2017-02-08 18:38 Libor Klepáč
  2017-02-08 19:21 ` Austin S. Hemmelgarn
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Libor Klepáč @ 2017-02-08 18:38 UTC (permalink / raw)
  To: linux-btrfs

Hello,
Inspired by the recent discussion on BTRFS vs. databases, I wanted to ask about
the suitability of BTRFS for hosting a Cyrus IMAP server spool. I haven't found
any recent article on this topic.

I'm preparing the migration of our mail server to Debian Stretch, i.e. kernel 4.9
for now. We are using XFS for storage now. I will migrate to the new server
using imapsync. Both are virtual machines running on VMware on Dell hardware.
The disks are on battery-backed hardware RAID controllers, over VMFS.

I'm considering using BTRFS, but I'm a little concerned because of reading this
mailing list ;)

I'm interested in using:
 - compression (emails should compress well - right?)
 - maybe deduplication later (Cyrus currently does it by hardlinking messages
with the same content)
 - snapshots for history
 - send/receive for offsite backup
 - what about data inlining, should it be turned off?

Our Cyrus pool consists of ~520GB of data in ~2.5 million files, ~2000
mailboxes.
We have a message size limit of ~25MB, so emails are not bigger than that.
There are, however, bigger files: the per-mailbox cache/index files of
Cyrus (some of them are around 300MB), and these are also the files that are
modified most.
The rest of the files (messages) are usually just written once.

-----------
I started using btrfs a year ago on a backup server, as storage for 4 BackupPC
instances running in containers (backups are then sent away with btrbk).
After switching off data inlining I'm satisfied; everything works (send/receive
is sometimes slow, but I guess that's because of the SATA disks on the receive
side).


Thanks for your opinions,

Libor



* Re: BTRFS and cyrus mail server
  2017-02-08 18:38 BTRFS and cyrus mail server Libor Klepáč
@ 2017-02-08 19:21 ` Austin S. Hemmelgarn
  2017-02-09 11:49   ` Adam Borowski
  2017-02-08 19:59 ` Kai Krakow
  2017-02-08 23:19 ` Graham Cobb
  2 siblings, 1 reply; 6+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-08 19:21 UTC (permalink / raw)
  To: Libor Klepáč, linux-btrfs

On 2017-02-08 13:38, Libor Klepáč wrote:
> Hello,
> inspired by recent discussion on BTRFS vs. databases i wanted to ask on
> suitability of BTRFS for hosting a Cyrus imap server spool. I haven't found
> any recent article on this topic.
>
> I'm preparing migration of our mailserver to Debian Stretch, ie. kernel 4.9
> for now. We are using XFS for storage now. I will migrate using imapsync to
> new server. Both are virtual machines running on vmware on Dell hardware.
> Disks are on battery backed hw raid controllers over vmfs.
>
> I'm considering using BTRFS, but I'm little concerned because of reading this
> mailing list ;)
FWIW, as long as you're using a recent kernel and take the time to do 
proper maintenance on the filesystem, BTRFS is generally very stable. 
WRT mail servers specifically, before we went to a cloud service for 
e-mail where I work, we used Postfix + Dovecot on our internal server, 
and actually saw a measurable performance improvement when switching 
from XFS to BTRFS.  That was about 3.12-3.18 vintage on the kernel 
though, so YMMV.
>
> I'm interested in using:
>  - compression (emails should compress well - right?)
Yes, very well assuming you're storing the actual text form of them (I 
don't recall if Cyrus does so, but I know Postfix, Sendmail, and most 
other FOSS mail server software do).  The in-line compression will also 
help reduce fragmentation, and unless you have a really fast storage 
device, should probably improve performance in general.
>  - maybe deduplication (cyrus does it by hardlinking of same content messages
> now) later
Deduplication beyond what Cyrus does is probably not worth it.  In most 
cases about 10% of an e-mail in text form is going to be duplicated if 
it's not a copy of an existing message, and that 10% is generally spread 
throughout the file (stuff like MIME headers and such), so you would 
probably see near zero space savings for doing anything beyond what 
Cyrus does while using an insanely larger amount of resources.
>  - snapshots for history
Make sure you use a sane exponential thinning system.  Once you get past 
about 300 snapshots, you'll start seeing some serious performance 
issues, and even double digits might hurt performance at the scale 
you're talking about.
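
As a rough illustration of the exponential thinning mentioned above, here is a minimal Python sketch. The bucketing scheme (age buckets whose width doubles each time, keeping only the newest snapshot per bucket), the one-hour base interval, and the name `thin_snapshots` are all assumptions made for illustration; they are not something btrfs or any particular snapshot tool prescribes.

```python
from datetime import datetime, timedelta

def thin_snapshots(timestamps, now, base=timedelta(hours=1)):
    """Return the snapshot timestamps to keep (hypothetical policy).

    Ages are grouped into buckets [0, base), [base, 2*base),
    [2*base, 4*base), ... and only the newest snapshot in each
    bucket survives.
    """
    keep = {}
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        if age < timedelta(0):
            continue  # ignore snapshots dated in the future
        units = age // base          # whole base intervals (an int)
        bucket = units.bit_length()  # 0, 1, 2, 2, 3, 3, 3, 3, 4, ...
        keep.setdefault(bucket, ts)  # first hit is the newest in the bucket
    return sorted(keep.values())

# Hourly snapshots taken over 30 days
now = datetime(2017, 2, 9)
hourly = [now - timedelta(hours=h) for h in range(720)]
kept = thin_snapshots(hourly, now)
print(len(hourly), "snapshots thinned to", len(kept))
```

With a policy like this the number of retained snapshots grows only logarithmically with the length of the history, which keeps the count far below the range where the performance issues described above start to appear.
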
>  - send/receive for offsite backup
This is up to you, but I would probably not use send-receive for 
off-site backups.  Unless you're using reflinking, you can copy all the 
same attributes that send-receive does using almost any other backup 
tool, and other tools often have much better security built-in.  Send 
streams also don't compress very well in my experience, so using 
send-receive has a tendency to require more network resources.
>  - what about data inlining, should it be turned off?
Generally no, and especially not if you handle lots of small e-mails. 
Metadata blocks need to be looked up to open and read files anyway, so 
in-lining the data means that you don't need to read in any more blocks 
for files small enough to fit in the spare space in the metadata block 
or when you only need to read the first few kilobytes of the file (and 
if Cyrus' IMAP/POP server works anything like most others I've seen, it 
will be parsing those first few KB because that's where the headers it 
indexes are).
>
> Our Cyrus pool consist of ~520GB of data in ~2,5million files, ~2000
> mailboxes.
> We have message size limit of ~25MB, so emails are not bigger than that.
> There are however bigger files, these are per mailbox caches/index files of
> cyrus (some of them are around 300MB) - and these are also files which are
> most modified.
I would mark these files NOCOW for performance reasons (and because if 
they're just caches and indexes, they should be pretty simple to 
regenerate).
> Rest of files (messages) are usually just written once.
>
> -----------
> I started using btrfs on backup server as a storage for 4 backuppc run in
> containers (backups are then send away with btrbk), year ago.
> After switching off data inlining i'm satisfied, everything works (send/
> receive is sometime slow, but i guess it's because of sata disks on receive
> side).



* Re: BTRFS and cyrus mail server
  2017-02-08 18:38 BTRFS and cyrus mail server Libor Klepáč
  2017-02-08 19:21 ` Austin S. Hemmelgarn
@ 2017-02-08 19:59 ` Kai Krakow
  2017-02-08 23:19 ` Graham Cobb
  2 siblings, 0 replies; 6+ messages in thread
From: Kai Krakow @ 2017-02-08 19:59 UTC (permalink / raw)
  To: linux-btrfs

On Wed, 08 Feb 2017 19:38:06 +0100,
Libor Klepáč <libor.klepac@bcom.cz> wrote:

> Hello,
> inspired by recent discussion on BTRFS vs. databases i wanted to ask
> on suitability of BTRFS for hosting a Cyrus imap server spool. I
> haven't found any recent article on this topic.
> 
> I'm preparing migration of our mailserver to Debian Stretch, ie.
> kernel 4.9 for now. We are using XFS for storage now. I will migrate
> using imapsync to new server. Both are virtual machines running on
> vmware on Dell hardware. Disks are on battery backed hw raid
> controllers over vmfs.
> 
> I'm considering using BTRFS, but I'm little concerned because of
> reading this mailing list ;)
> 
> I'm interested in using:
>  - compression (emails should compress well - right?)

Not really... The small part that's compressible (headers and a few
lines of text) is already small, so a sector (maybe 4k) is still a
sector. Compression gains you no benefit here. The big parts of mails
are already compressed (images, attachments). Mail spools only compress
well if you're compressing mails to a solid archive (like 7zip or tgz).
If you're compressing each mail individually, there's almost no gain
because of file system slack.
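
The slack argument can be put into numbers with a small Python sketch. It assumes a 4 KiB allocation unit and ignores inlining of very small files into metadata; the function name `allocated_bytes` and the sample sizes are made up for illustration.

```python
def allocated_bytes(size, block=4096):
    """Bytes occupied on disk when data is stored in whole blocks."""
    if size == 0:
        return 0
    return -(-size // block) * block  # ceiling division, then scale back up

# (uncompressed size, compressed size) in bytes; hypothetical mails
samples = [(3000, 1200), (6000, 3500), (200_000, 90_000)]
for raw, compressed in samples:
    saved = allocated_bytes(raw) - allocated_bytes(compressed)
    print(f"{raw:>7} B -> {compressed:>6} B compressed: {saved} B saved on disk")
```

A mail that already fits in one block saves nothing however well it compresses, while larger text-only mails can still free whole blocks, so the real gain depends on the size distribution of the spool.
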

>  - maybe deduplication (cyrus does it by hardlinking of same content
> messages now) later

It won't work that way. I'd stick to hardlinking. Only
offline/nearline deduplication will help you. And it will have a hard
time finding the duplicates. This would only work properly if Cyrus
separated mail headers and bodies (I don't know if it does; Dovecot,
which is what I use, doesn't), because delivering to the spool usually
adds some headers like "Delivered-To". This changes the byte offsets
between similar mails so that deduplication will no longer work.
 
>  - snapshots for history

Don't do snapshots too deep. I had similar plans but instead decided it
would be better to use the following setup as a continuous backup
strategy: Deliver mails to two spools, one being the user accessible
spool, and one being the backup spool. Once per day you rename the
backup spool and let it be recreated. Then store away the old backup
spool in whatever way you want (snapshots, traditional backup with
retention, ...).

>  - send/receive for offsite backup

It's not stable enough that I'd use it in production...

>  - what about data inlining, should it be turned off?

How much data can be inlined? I'm not sure, I never thought about that.

> Our Cyrus pool consist of ~520GB of data in ~2,5million files, ~2000 
> mailboxes.

Similar numbers here, just more mailboxes and less space, because we
make sure that customers remove their mails from our servers and store
them in their own systems and backups. There are a few exceptions, and
those have really big mailboxes.

> We have message size limit of ~25MB, so emails are not bigger than
> that.

50 MB raw size here... (base64 stores 3 bytes in every 4 characters, so
after decoding this makes around 37 MB worth of attachments)

> There are however bigger files, these are per mailbox
> caches/index files of cyrus (some of them are around 300MB) - and
> these are also files which are most modified.
> Rest of files (messages) are usually just written once.

I'm still undecided whether I should try btrfs or stay with XFS. XFS has
the huge benefit of scaling very well to parallel workloads and
across multiple devices. Btrfs doesn't do exactly that very well yet
(because of write serialization etc.).

> 
> -----------
> I started using btrfs on backup server as a storage for 4 backuppc
> run in containers (backups are then send away with btrbk), year ago.
> After switching off data inlining i'm satisfied, everything works
> (send/ receive is sometime slow, but i guess it's because of sata
> disks on receive side).

I've started to love borgbackup. It's very fast, efficient, and
reliable. Not sure how well it works for VM images, but for delta
backups in general it's very efficient and fast.


-- 
Regards,
Kai

Replies to list-only preferred.




* Re: BTRFS and cyrus mail server
  2017-02-08 18:38 BTRFS and cyrus mail server Libor Klepáč
  2017-02-08 19:21 ` Austin S. Hemmelgarn
  2017-02-08 19:59 ` Kai Krakow
@ 2017-02-08 23:19 ` Graham Cobb
  2 siblings, 0 replies; 6+ messages in thread
From: Graham Cobb @ 2017-02-08 23:19 UTC (permalink / raw)
  To: linux-btrfs

On 08/02/17 18:38, Libor Klepáč wrote:
> I'm interested in using:
...
>  - send/receive for offsite backup

I don't particularly recommend that. I do use send/receive for onsite
backups (I actually use btrbk). But for offsite I use a traditional
backup tool (I use dar). For three main reasons:

1) Paranoia: I want a backup that does not use btrfs, just in case there
turns out to be some problem with btrfs which could corrupt the backup.
I can't think of anything, but I did say it was paranoia!

2) send/receive in incremental mode (the obvious way to use it for
offsite backups) relies on the target being up to date and properly
synchronised with the source. If, for any reason, it gets out of sync,
you have to start again with sending a full backup - a lot of data.
Traditional backup formats are more forgiving and having a corrupted
incremental does not normally prevent you getting access to data stored
in the other incrementals. This would particularly be a risk if you
thought about storing the actual send streams instead of doing the
receive: a single bit error in one could make all the subsequent streams
useless.
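
The difference between the two failure modes can be shown with a toy Python model. It is a deliberate simplification: each backup is reduced to a boolean "intact" flag, and the function names are invented for illustration. It is not a description of how btrfs receive or any specific backup tool is implemented.

```python
def usable_send_streams(intact):
    """Stored send streams: each one applies only on top of its parent,
    so everything after the first damaged stream is unusable."""
    usable = []
    for index, ok in enumerate(intact):
        if not ok:
            break
        usable.append(index)
    return usable

def usable_archives(intact):
    """File-based incrementals (simplified): every intact archive can still
    be opened to extract the files it contains."""
    return [index for index, ok in enumerate(intact) if ok]

# One full backup followed by four incrementals; number 2 is corrupted
chain = [True, True, False, True, True]
print("send streams usable:", usable_send_streams(chain))
print("archives usable:    ", usable_archives(chain))
```

In this model a single bad stream costs everything that follows it, whereas with independent archives only the damaged increment itself is lost.
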

3) send/receive doesn't work particularly well with encryption. I store
my offsite backups in a cloud service and I want them encrypted both in
transit and when stored. To get the same with send/receive requires
putting together your own encrypted communication channel (e.g. using
ssh) and requires that you have a remote server, with an encrypted
filesystem receiving the data (and it has to be accessible in the clear
on that server). Traditional backups can just be stored offsite as
encrypted files without ever having to be in the clear anywhere except
onsite.

Just my reasons.



* Re: BTRFS and cyrus mail server
  2017-02-08 19:21 ` Austin S. Hemmelgarn
@ 2017-02-09 11:49   ` Adam Borowski
  2017-02-09 12:53     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Borowski @ 2017-02-09 11:49 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Libor Klepáč, linux-btrfs

On Wed, Feb 08, 2017 at 02:21:13PM -0500, Austin S. Hemmelgarn wrote:
> >  - maybe deduplication (cyrus does it by hardlinking of same content messages
> > now) later
> Deduplication beyond what Cyrus does is probably not worth it.  In most
> cases about 10% of an e-mail in text form is going to be duplicated if it's
> not a copy of an existing message, and that 10% is generally spread
> throughout the file (stuff like MIME headers and such), so you would
> probably see near zero space savings for doing anything beyond what Cyrus
> does while using an insanely larger amount of resources.

The problem is: users in a company tend to send mails to a group, so a bunch
of people have plenty of identical mails... then every delivered mail has
slightly different headers prepended, usually of different length to make
sure that 20MB mail has its contents shifted by a single byte so you can't
dedupe blocks after the first.
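
The shifting effect is easy to reproduce with a short Python sketch. It assumes fixed 4 KiB blocks compared by hash, uses random bytes to stand in for a large attachment, and the addresses and helper names are made up; real deduplication tools differ in their details.

```python
import hashlib
import os

BLOCK = 4096  # assumed dedup granularity

def block_hashes(data, block=BLOCK):
    """Set of hashes of each fixed-size block, as a block-level dedup sees it."""
    return {hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)}

def shared_blocks(a, b):
    """Number of distinct blocks that occur in both files."""
    return len(block_hashes(a) & block_hashes(b))

body = os.urandom(1024 * 1024)  # stand-in for an identical 1 MiB message body
mail_alice = b"Delivered-To: alice@example.com\r\n" + body
mail_carol = b"Delivered-To: carol@example.com\r\n" + body  # same header length
mail_bob = b"Delivered-To: bob@example.com\r\n" + body      # 2 bytes shorter

print("same-length headers:     ", shared_blocks(mail_alice, mail_carol))
print("different-length headers:", shared_blocks(mail_alice, mail_bob))
```

When the prepended headers have the same length, every block except the first is shared; when the lengths differ by even a couple of bytes, the body no longer lines up on block boundaries and essentially nothing matches.
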

> >  - snapshots for history
> Make sure you use a sane exponential thinning system.  Once you get past
> about 300 snapshots, you'll start seeing some serious performance issues,
> and even double digits might hurt performance at the scale you're talking
> about.

It's not anywhere that bad in my experience.  As far as I know, regular
POSIX operations are not affected by the number of reflinks, only stuff like
balance (greatly), dedupe, and, to a lesser extent, deletion of snapshots.
You don't want to hit 100k snapshots like I once did, but even then the
filesystem keeps working in regular operation.

(Those snapshots were not deduped beyond natural reflinking from
snapshotting, every one having no more than a few hundred links.  I now
realize that it'd probably explode had I tried coalescing identical
files between them.)

> >  - send/receive for offsite backup
> This is up to you, but I would probably not use send-receive for off-site
> backups.  Unless you're using reflinking, you can copy all the same
> attributes that send-receive does using almost any other backup tool, and
> other tools often have much better security built-in.  Send streams also
> don't compress very well in my experience, so using send-receive has a
> tendency to require more network resources.

I'd heartily recommend using _both_.  You use send-receive for that 3-hour
(or 1-hour!) backup, and rsync for dailies.  You do value your mails enough
to back them up to two places, right?  Then you get to enjoy the efficiency of
send-receive (statting everything takes ages!), while rsync helps with
paranoia about send-receive cloning potential filesystem errors.

> > Our Cyrus pool consist of ~520GB of data in ~2,5million files, ~2000
> > mailboxes.
> > We have message size limit of ~25MB, so emails are not bigger than that.
> > There are however bigger files, these are per mailbox caches/index files of
> > cyrus (some of them are around 300MB) - and these are also files which are
> > most modified.
> I would mark these files NOCOW for performance reasons (and because if
> they're just caches and indexes, they should be pretty simple to
> regenerate).

Using NOCOW with snapshots gets you the worst of both worlds: all the
downsides of CoW with no btrfs goodies.  NOCOW is useful only for "I wish I
had partitioned a traditional filesystem for this file, and I don't need to
snapshot it".


Meow!
-- 
Autotools hint: to do a zx-spectrum build on a pdp11 host, type:
  ./configure --host=zx-spectrum --build=pdp11


* Re: BTRFS and cyrus mail server
  2017-02-09 11:49   ` Adam Borowski
@ 2017-02-09 12:53     ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 6+ messages in thread
From: Austin S. Hemmelgarn @ 2017-02-09 12:53 UTC (permalink / raw)
  To: Adam Borowski; +Cc: Libor Klepáč, linux-btrfs

On 2017-02-09 06:49, Adam Borowski wrote:
> On Wed, Feb 08, 2017 at 02:21:13PM -0500, Austin S. Hemmelgarn wrote:
>>>  - maybe deduplication (cyrus does it by hardlinking of same content messages
>>> now) later
>> Deduplication beyond what Cyrus does is probably not worth it.  In most
>> cases about 10% of an e-mail in text form is going to be duplicated if it's
>> not a copy of an existing message, and that 10% is generally spread
>> throughout the file (stuff like MIME headers and such), so you would
>> probably see near zero space savings for doing anything beyond what Cyrus
>> does while using an insanely larger amount of resources.
>
> The problem is: users in a company tend to send mails to a group, so a bunch
> of people have plenty of identical mails... then every delivered mail has
> slightly different headers prepended, usually of different length to make
> sure that 20MB mail has its contents shifted by a single byte so you can't
> dedupe blocks after the first.
Unless it's multiple copies of the mail or multiple BCC's, the headers 
will (with limited exception) be identical because they contain all the 
same TO: and CC: lines.
>
>>>  - snapshots for history
>> Make sure you use a sane exponential thinning system.  Once you get past
>> about 300 snapshots, you'll start seeing some serious performance issues,
>> and even double digits might hurt performance at the scale you're talking
>> about.
>
> It's not anywhere that bad in my experience.  As far as I know, regular
> POSIX operations are not affected by the number of reflinks, only stuff like
> balance (greatly), dedupe, and, to a lesser extent, deletion of snapshots.
> You don't want to hit 100k snapshots like I once did, but even then the
> filesystem keeps working in regular operation.
However, proper maintenance on a BTRFS filesystem is not just POSIX 
operations.  IOW, if you want a manageable filesystem that doesn't take 
forever to fix when something goes wrong, you want to avoid large 
numbers of snapshots.
>
> (Those snapshots were not deduped beyond natural reflinking from
> snapshotting, every one having no more than a few hundred links.  I now
> realize that it'd probably explode had I tried coalescing identical
> files between them.)
>
>>>  - send/receive for offsite backup
>> This is up to you, but I would probably not use send-receive for off-site
>> backups.  Unless you're using reflinking, you can copy all the same
>> attributes that send-receive does using almost any other backup tool, and
>> other tools often have much better security built-in.  Send streams also
>> don't compress very well in my experience, so using send-receive has a
>> tendency to require more network resources.
>
> I'd heartily recommend using _both_.  You use send-receive for that 3-hour
> (or 1-hour!) backup, and rsync for dailies.  You do value your mails enough
> to back them to two places, right?  Then you get to enjoy efficiency of
> send-receive (statting everything takes ages!), while rsync helps with
> paranoia about send-receive cloning potential filesystem errors.
>
>>> Our Cyrus pool consist of ~520GB of data in ~2,5million files, ~2000
>>> mailboxes.
>>> We have message size limit of ~25MB, so emails are not bigger than that.
>>> There are however bigger files, these are per mailbox caches/index files of
>>> cyrus (some of them are around 300MB) - and these are also files which are
>>> most modified.
>> I would mark these files NOCOW for performance reasons (and because if
>> they're just caches and indexes, they should be pretty simple to
>> regenerate).
>
> Using NOCOW with snapshots gets you the worst of both worlds: all the
> downsides of CoW with no btrfs goodies.  NOCOW is useful only for "I wish I
> had partitioned a traditional filesystem for this file, and I don't need to
> snapshot it".
However, if those really are just caches and/or indexes, then you 
shouldn't need to snapshot them because the software can just rebuild 
them if they get lost, and that's actually safer in many cases than 
restoring backup copies.

