* Problems with nodatacow/nodatasum
@ 2012-04-20  8:19 Avi Kivity
  2012-04-20  9:12 ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2012-04-20  8:19 UTC (permalink / raw)
  To: linux-btrfs

I have a btrfs filesystem mounted in two locations as two subvolumes:


  /dev/mapper/luks-blah /                       btrfs
subvol=/rootvol        1 1
  /dev/mapper/luks-blah /var/lib/libvirt/images     btrfs
nodatasum,nodatacow,subvol=/images.libvirt        1 2

However, a file under the second mount is getting seriously
fragmented.  It started out with a few dozen extents (reasonable for a
file of several gigabytes); now it has 11000 and counting, after an
application started pounding on it with a bit of threaded O_DIRECT
random I/O.

Kernel: 3.3.1-5.fc16.x86_64

Any hints?


* Re: Problems with nodatacow/nodatasum
  2012-04-20  8:19 Problems with nodatacow/nodatasum Avi Kivity
@ 2012-04-20  9:12 ` David Sterba
  2012-04-20  9:22   ` Avi Kivity
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2012-04-20  9:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: linux-btrfs

On Fri, Apr 20, 2012 at 11:19:39AM +0300, Avi Kivity wrote:
>   /dev/mapper/luks-blah /                       btrfs
> subvol=/rootvol        1 1
>   /dev/mapper/luks-blah /var/lib/libvirt/images     btrfs
> nodatasum,nodatacow,subvol=/images.libvirt        1 2

what does /proc/mounts say about the applied options? do you see
nodatasum and nodatacow there? afaik most (if not all) mount options
affect the whole filesystem including any subvolume mounts.  having
per-subvol options is possible, just not implemented.

david


* Re: Problems with nodatacow/nodatasum
  2012-04-20  9:12 ` David Sterba
@ 2012-04-20  9:22   ` Avi Kivity
  2012-04-21  0:15     ` Chris Mason
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2012-04-20  9:22 UTC (permalink / raw)
  To: dave, Avi Kivity, linux-btrfs

On Fri, Apr 20, 2012 at 12:12 PM, David Sterba <dave@jikos.cz> wrote:
> On Fri, Apr 20, 2012 at 11:19:39AM +0300, Avi Kivity wrote:
>>   /dev/mapper/luks-blah /                       btrfs
>> subvol=/rootvol        1 1
>>   /dev/mapper/luks-blah /var/lib/libvirt/images     btrfs
>> nodatasum,nodatacow,subvol=/images.libvirt        1 2
>
> what does /proc/mounts say about the applied options? do you see
> nodatasum and nodatacow there? afaik most (if not all) mount options
> affect the whole filesystem including any subvolume mounts.  having
> per-subvol options is possible, just not implemented.
>

Nothing, the options don't show up there.

Are there plans to allow per-subvolume nodatasum/nodatacow?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Problems with nodatacow/nodatasum
  2012-04-20  9:22   ` Avi Kivity
@ 2012-04-21  0:15     ` Chris Mason
  2012-05-13 17:51       ` Avi Kivity
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Mason @ 2012-04-21  0:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: dave, linux-btrfs

On Fri, Apr 20, 2012 at 12:22:11PM +0300, Avi Kivity wrote:
> On Fri, Apr 20, 2012 at 12:12 PM, David Sterba <dave@jikos.cz> wrote:
> > On Fri, Apr 20, 2012 at 11:19:39AM +0300, Avi Kivity wrote:
> >>   /dev/mapper/luks-blah /                       btrfs
> >> subvol=/rootvol        1 1
> >>   /dev/mapper/luks-blah /var/lib/libvirt/images     btrfs
> >> nodatasum,nodatacow,subvol=/images.libvirt        1 2
> >
> > what does /proc/mounts say about the applied options? do you see
> > nodatasum and nodatacow there? afaik most (if not all) mount options
> > affect the whole filesystem including any subvolume mounts.  having
> > per-subvol options is possible, just not implemented.
> >
>
> Nothing, the options don't show up there.
>
> Are there plans to allow per-subvolume nodatasum/nodatacow?

It can be set on a per file basis, let me push out a commit to btrfs
progs with ioctls to set it.

-chris

* Re: Problems with nodatacow/nodatasum
  2012-04-21  0:15     ` Chris Mason
@ 2012-05-13 17:51       ` Avi Kivity
  2012-05-15 22:12         ` David Sterba
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2012-05-13 17:51 UTC (permalink / raw)
  To: Chris Mason, Avi Kivity, dave, linux-btrfs

On Sat, Apr 21, 2012 at 3:15 AM, Chris Mason <chris.mason@oracle.com> wrote:
>>
>> Are there plans to allow per-subvolume nodatasum/nodatacow?
>
> It can be set on a per file basis, let me push out a commit to btrfs
> progs with ioctls to set it.

Did this not happen, or am I barking up the wrong btrfs-progs.git tree?


* Re: Problems with nodatacow/nodatasum
  2012-05-13 17:51       ` Avi Kivity
@ 2012-05-15 22:12         ` David Sterba
  0 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2012-05-15 22:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Chris Mason, dave, linux-btrfs

On Sun, May 13, 2012 at 08:51:53PM +0300, Avi Kivity wrote:
> On Sat, Apr 21, 2012 at 3:15 AM, Chris Mason <chris.mason@oracle.com> wrote:
> >>
> >> Are there plans to allow per-subvolume nodatasum/nodatacow?
> >
> > It can be set on a per file basis, let me push out a commit to btrfs
> > progs with ioctls to set it.
> 
> Did this not happen, or am I barking up the wrong btrfs-progs.git tree?

Taking a minimalistic approach, the following patch allows enabling the
true NOCOW feature on a file (limited to files of size 0); no specific
mount options are needed.

Tested on a ~350MB file, filled from /dev/urandom, then randomly
rewritten with zeros; filefrag output is the same before and after the
operation.
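
For anyone who wants to repeat the before/after comparison without
filefrag, here is a rough sketch that asks the kernel for the number of
mapped extents through the FIEMAP ioctl. The helper name count_extents is
invented for this example, and the raw count it returns may differ from
what filefrag prints, since filefrag post-processes the extent list.

```c
#include <errno.h>
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_FLAG_SYNC */
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: return the number of extents the kernel reports
 * for the whole file behind `fd`, or a negative errno value on failure.
 */
static long count_extents(int fd)
{
    struct fiemap fm;

    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;  /* cover the whole file */
    fm.fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
    fm.fm_extent_count = 0;            /* 0 = only count, return no records */

    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
        return -errno;

    return (long)fm.fm_mapped_extents;
}
```

Running it on the same file before and after the random rewrite should
give equal counts if the file is really NOCOW.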

Please note that due to the simplicity of the implementation, only a
zero-sized file really becomes NOCOW, but in fact this mimics what
creating a file under -o nodatacow will produce.

The change is made by setting the NOCOW attribute, with a patched chattr
http://www.spinics.net/lists/linux-btrfs/msg09605.html
(or by using the ioctl directly)
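
A minimal sketch of the "ioctl directly" route, assuming the generic
FS_IOC_GETFLAGS/FS_IOC_SETFLAGS interface from <linux/fs.h> is the one
that reaches btrfs_ioctl_setflags. The helper name set_nocow and the
choice to reject non-empty files with -EINVAL are illustrative only; the
kernel patch itself does not return an error for non-empty files.

```c
#include <errno.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_NOCOW_FL */
#include <sys/ioctl.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: set the NOCOW attribute on an open, empty file.
 * Returns 0 on success or a negative errno value.
 */
static int set_nocow(int fd)
{
    struct stat st;
    int flags;

    if (fstat(fd, &st) < 0)
        return -errno;

    /* Userspace policy of this sketch: only empty files become fully NOCOW. */
    if (st.st_size != 0)
        return -EINVAL;

    /* Read-modify-write so the other inode flags are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -errno;

    flags |= FS_NOCOW_FL;

    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
        return -errno;

    return 0;
}
```

Typical use would be to open() the image file with O_CREAT, call
set_nocow() on the descriptor, and only then write or preallocate data.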


david


---
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -234,10 +234,22 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
                ip->flags |= BTRFS_INODE_DIRSYNC;
        else
                ip->flags &= ~BTRFS_INODE_DIRSYNC;
-       if (flags & FS_NOCOW_FL)
+       if (flags & FS_NOCOW_FL) {
                ip->flags |= BTRFS_INODE_NODATACOW;
-       else
+               /*
+                * Half-workaround for a NOCOW file.
+                * It's safe to turn off csums here, no extents exist
+                */
+               if (inode->i_size == 0)
+                       ip->flags |= BTRFS_INODE_NODATASUM;
+       } else {
+               /*
+                * Revert back under same assumptions as before
+                */
+               if (inode->i_size == 0)
+                       ip->flags &= ~BTRFS_INODE_NODATASUM;
                ip->flags &= ~BTRFS_INODE_NODATACOW;
+       }

        /*
         * The COMPRESS flag can only be changed by users, while the NOCOMPRESS

