From: Amir Goldstein <amir73il@gmail.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: liubo <liubo2009@cn.fujitsu.com>,
	Andreas Dilger <adilger@dilger.ca>,
	Christoph Hellwig <hch@lst.de>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Theodore Tso <tytso@mit.edu>
Subject: Re: [PATCH 1/2] Btrfs: add datacow flag in inode flag
Date: Thu, 17 Mar 2011 16:37:34 +0200
Message-ID: <AANLkTi=eUT8tm9v8ozDOc_W9MBGr=ccVRXYpASRFT4vZ@mail.gmail.com>
In-Reply-To: <1300371584-sup-1674@think>

On Thu, Mar 17, 2011 at 4:21 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from liubo's message of 2011-03-16 22:10:09 -0400:
>> On 03/16/2011 05:06 PM, Amir Goldstein wrote:
>> > On Wed, Mar 16, 2011 at 1:35 AM, Chris Mason <chris.mason@oracle.com> wrote:
>> >> Excerpts from Andreas Dilger's message of 2011-03-15 18:06:49 -0400:
>> >>> On 2011-03-15, at 2:57 PM, Christoph Hellwig wrote:
>> >>>> On Tue, Mar 15, 2011 at 04:26:50PM -0400, Chris Mason wrote:
>> >>>>>  #define FS_EXTENT_FL         0x00080000 /* Extents */
>> >>>>>  #define FS_DIRECTIO_FL       0x00100000 /* Use direct i/o */
>> >>>>> +#define FS_NOCOW_FL          0x00800000 /* Do not cow file */
>> >>>>> +#define FS_COW_FL            0x01000000 /* Cow file */
>> >>>>>  #define FS_RESERVED_FL       0x80000000 /* reserved for ext2 lib */
>> >>>> I'm fine with it.  I'll defer the check for conflicts with extN-specific flags
>> >>>> to Ted, though.
>> >>> Looking at the upstream e2fsprogs I see in that range:
>> >>>
>> >>>> #define EXT4_EXTENTS_FL           0x00080000 /* Inode uses extents */
>> >>>> #define EXT4_EA_INODE_FL          0x00200000 /* Inode used for large EA */
>> >>>> #define EXT4_EOFBLOCKS_FL         0x00400000 /* Blocks allocated beyond EOF */
>> >>>> #define EXT4_SNAPFILE_FL          0x01000000 /* Inode is a snapshot */
>> >>>> #define EXT4_SNAPFILE_DELETED_FL  0x04000000 /* Snapshot is being deleted */
>> >>>> #define EXT4_SNAPFILE_SHRUNK_FL   0x08000000 /* Snapshot shrink has completed */
>> >>>> #define EXT2_RESERVED_FL          0x80000000 /* reserved for ext2 lib */
>> >>>>
>> >>>> #define EXT2_FL_USER_VISIBLE      0x004BDFFF /* User visible flags */
>> >>> so there is a conflict with FS_COW_FL and EXT4_SNAPFILE_FL.  I don't know
>> >>> the semantics of those two flags enough to say for sure whether it is
>> >>> reasonable that they alias to each other, but at first glance "COW" and
>> >>> "SNAPSHOT" don't seem completely unrelated.
>> >
>> > EXT4_SNAPFILE_FL indicates a special system snapshot file, so it has
>> > no equivalence relation with FS_COW_FL.
>> > Please use 0x02000000 for FS_COW_FL.
>>
>> Fine with that, but it's up to Chris. :)
>
> I'd rather not conflict unless we're critically short on space.
>
>> >
>> > EXT4_SNAPFILE_DELETED_FL is a persistent state of a snapshot file,
>> > which is no longer available as a mountable device, but cannot be
>> > unlinked because it holds changed data sets needed by older snapshots.
>> >
>> > EXT4_SNAPFILE_SHRUNK_FL is a persistent state of a (deleted) snapshot
>> > file, which has undergone a "shrink" process to free all change sets
>> > not needed by older snapshots.
>> > The persistence of the flag is needed to avoid tedious shrinking when
>> > it is not needed.
>> >
>> >
>> >> In the btrfs case FS_COW_FL means to do COW even when there are no
>> >> snapshots.  FS_NOCOW_FL means to do cow only when there are snapshots.
>> >>
>> >
>> > I am interested in FS_NOCOW_FL as well, but for my implementation it
>> > would mean do not do COW on rewrites even when there are snapshots, so
>> > a user can create a pre-allocated "island of blocks", which are pinned
>> > to a physical location, for raw VM image for example.
>
> I'm not sure how the island of blocks idea can work with snapshots?
> Wouldn't the snapshot corrupt if anything in the island were changed?
>

It would corrupt, but only to the extent that the file for which you
requested NOCOW may contain newer data. It cannot contain uninitialized
data, because truncating the file would leave its blocks referenced by
the snapshot.
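
To make the point above concrete, here is a minimal toy model in C. It is
only a sketch: every name in it (toy_disk, toy_file, write_cow, write_nocow
and so on) is made up for illustration and does not correspond to btrfs or
ext4 code. It models a snapshot as a copy of the logical-to-physical block
map, and shows that a NOCOW rewrite makes the snapshot see the newer data,
while a COW rewrite leaves the snapshot untouched. Truncation is not modeled.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS     8   /* physical blocks on the toy "disk" (assumption) */
#define FILE_BLOCKS 2   /* logical blocks in the toy file (assumption) */

struct toy_disk {
    char data[NBLOCKS]; /* one byte of payload per physical block */
    int  next_free;     /* next unallocated physical block */
};

struct toy_file {
    int map[FILE_BLOCKS]; /* logical block -> physical block */
};

/* Pre-allocate the file: logical block i lives in physical block i. */
void toy_init(struct toy_disk *d, struct toy_file *f)
{
    memset(d->data, 0, sizeof(d->data));
    for (int i = 0; i < FILE_BLOCKS; i++) {
        f->map[i] = i;
        d->data[i] = 'a'; /* initial contents */
    }
    d->next_free = FILE_BLOCKS;
}

/* A snapshot shares the physical blocks; only the map is copied. */
struct toy_file toy_snapshot(const struct toy_file *f)
{
    return *f;
}

/* COW rewrite: write to a fresh block and remap the live file only. */
void write_cow(struct toy_disk *d, struct toy_file *f, int lblk, char v)
{
    assert(d->next_free < NBLOCKS); /* the sketch has no real allocator */
    int p = d->next_free++;
    d->data[p] = v;
    f->map[lblk] = p;
}

/* NOCOW rewrite: overwrite in place; the physical location is pinned. */
void write_nocow(struct toy_disk *d, struct toy_file *f, int lblk, char v)
{
    d->data[f->map[lblk]] = v;
}

char toy_read(const struct toy_disk *d, const struct toy_file *f, int lblk)
{
    return d->data[f->map[lblk]];
}
```

In this model, after a snapshot is taken, write_cow() on block 0 leaves the
snapshot reading the old byte, while write_nocow() on block 1 makes both the
live file and the snapshot read the new byte. The snapshot never sees
anything other than data that was written to the file at some point.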

Think of a large database file, which is already replicated and hot
backed up regularly. An arbitrary snapshot of that file will give you a
copy for disaster recovery at best. Not sure this is worth the effort of
COWing it and fragmenting it beyond recognition.

Amir.
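
P.S. For reference, the flag-value question from earlier in the thread can
be checked mechanically. The following is a rough sketch in C, using only
the values quoted above. The names FS_COW_FL_POSTED and FS_COW_FL_SUGGESTED,
and the helper functions, are made up here for illustration; the list covers
only the ext4 flags Andreas quoted, not every flag that ext4 defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values proposed for the generic flags, as quoted in this thread. */
#define FS_NOCOW_FL              0x00800000u
#define FS_COW_FL_POSTED         0x01000000u /* value in the posted patch */
#define FS_COW_FL_SUGGESTED      0x02000000u /* value suggested in reply */

/* ext4 flags in that range, as listed from upstream e2fsprogs. */
#define EXT4_EXTENTS_FL          0x00080000u
#define EXT4_EA_INODE_FL         0x00200000u
#define EXT4_EOFBLOCKS_FL        0x00400000u
#define EXT4_SNAPFILE_FL         0x01000000u
#define EXT4_SNAPFILE_DELETED_FL 0x04000000u
#define EXT4_SNAPFILE_SHRUNK_FL  0x08000000u
#define EXT2_RESERVED_FL         0x80000000u

static const uint32_t ext4_listed_flags[] = {
    EXT4_EXTENTS_FL, EXT4_EA_INODE_FL, EXT4_EOFBLOCKS_FL,
    EXT4_SNAPFILE_FL, EXT4_SNAPFILE_DELETED_FL,
    EXT4_SNAPFILE_SHRUNK_FL, EXT2_RESERVED_FL,
};

/* OR together every ext4 flag listed in the thread. */
uint32_t ext4_listed_mask(void)
{
    uint32_t mask = 0;
    size_t n = sizeof(ext4_listed_flags) / sizeof(ext4_listed_flags[0]);

    for (size_t i = 0; i < n; i++)
        mask |= ext4_listed_flags[i];
    return mask;
}

/* Nonzero if `flag` shares a bit with any of the listed ext4 flags. */
int conflicts_with_ext4(uint32_t flag)
{
    return (flag & ext4_listed_mask()) != 0;
}

/* Restates the outcome of the discussion as assertions. */
void check_flag_layout(void)
{
    assert(conflicts_with_ext4(FS_COW_FL_POSTED));     /* == EXT4_SNAPFILE_FL */
    assert(!conflicts_with_ext4(FS_COW_FL_SUGGESTED));
    assert(!conflicts_with_ext4(FS_NOCOW_FL));
}
```

Calling check_flag_layout() passes with these values: 0x01000000 collides
with EXT4_SNAPFILE_FL, while 0x00800000 and 0x02000000 do not overlap any
flag in the quoted list.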
