* [PATCHSET 0/3] xfs-documentation: updates for 6.1
@ 2023-01-18  0:42 Darrick J. Wong
  2023-01-18  0:44 ` [PATCH 1/3] design: update group quota inode information for v5 filesystems Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Darrick J. Wong @ 2023-01-18  0:42 UTC (permalink / raw)
  To: djwong, darrick.wong; +Cc: linux-xfs, chandan.babu, allison.henderson

Hi all,

Here's a pile of updates detailing the changes made during 2022 for
kernel 6.1.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

xfsdocs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-documentation.git/log/?h=xfsdocs-6.2-updates
---
 .../allocation_groups.asciidoc                     |   25 +++--
 .../journaling_log.asciidoc                        |  109 ++++++++++++++++++++
 design/XFS_Filesystem_Structure/magic.asciidoc     |    2 
 .../XFS_Filesystem_Structure/ondisk_inode.asciidoc |   61 ++++++++++-
 4 files changed, 184 insertions(+), 13 deletions(-)



* [PATCH 1/3] design: update group quota inode information for v5 filesystems
  2023-01-18  0:42 [PATCHSET 0/3] xfs-documentation: updates for 6.1 Darrick J. Wong
@ 2023-01-18  0:44 ` Darrick J. Wong
  2023-01-24  5:29   ` Chandan Babu R
  2023-01-18  0:45 ` [PATCH 2/3] design: document the large extent count ondisk format changes Darrick J. Wong
  2023-01-18  0:45 ` [PATCH 3/3] design: document extended attribute log item changes Darrick J. Wong
  2 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2023-01-18  0:44 UTC (permalink / raw)
  To: djwong, darrick.wong; +Cc: linux-xfs, chandan.babu, allison.henderson

From: Darrick J. Wong <djwong@kernel.org>

Fix a few out of date statements about the group quota inode field on v5
filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 .../allocation_groups.asciidoc                     |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)


diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
index 0e48b4bf..7ee5d561 100644
--- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
+++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
@@ -262,11 +262,12 @@ maintained in the first superblock.
 *sb_uquotino*::
 Inode for user quotas. This and the following two quota fields only apply if
 +XFS_SB_VERSION_QUOTABIT+ flag is set in +sb_versionnum+. Refer to
-xref:Quota_Inodes[quota inodes] for more information
+xref:Quota_Inodes[quota inodes] for more information.
 
 *sb_gquotino*::
-Inode for group or project quotas. Group and Project quotas cannot be used at
-the same time.
+Inode for group or project quotas. Group and project quotas cannot be used at
+the same time on v4 filesystems.  On a v5 filesystem, this inode always stores
+group quota information.
 
 *sb_qflags*::
 Quota flags. It can be a combination of the following flags:



* [PATCH 2/3] design: document the large extent count ondisk format changes
  2023-01-18  0:42 [PATCHSET 0/3] xfs-documentation: updates for 6.1 Darrick J. Wong
  2023-01-18  0:44 ` [PATCH 1/3] design: update group quota inode information for v5 filesystems Darrick J. Wong
@ 2023-01-18  0:45 ` Darrick J. Wong
  2023-01-24  5:30   ` Chandan Babu R
  2023-01-18  0:45 ` [PATCH 3/3] design: document extended attribute log item changes Darrick J. Wong
  2 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2023-01-18  0:45 UTC (permalink / raw)
  To: djwong, darrick.wong; +Cc: linux-xfs, chandan.babu, allison.henderson

From: Darrick J. Wong <djwong@kernel.org>

Update the ondisk format documentation to discuss the larger maximum
extent counts that were added in 2022.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 .../allocation_groups.asciidoc                     |    4 +
 .../XFS_Filesystem_Structure/ondisk_inode.asciidoc |   61 ++++++++++++++++++--
 2 files changed, 58 insertions(+), 7 deletions(-)


diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
index 7ee5d561..c64b4fad 100644
--- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
+++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
@@ -454,6 +454,10 @@ xref:Timestamps[timestamps] for more information.
 The filesystem is not in operable condition, and must be run through
 xfs_repair before it can be mounted.
 
+| +XFS_SB_FEAT_INCOMPAT_NREXT64+ |
+Large file fork extent counts.  This greatly expands the maximum number of
+space mappings allowed in data and extended attribute file forks.
+
 |=====
 
 *sb_features_log_incompat*::
diff --git a/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc b/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
index 1922954e..34c06487 100644
--- a/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
+++ b/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
@@ -84,14 +84,41 @@ struct xfs_dinode_core {
      __uint32_t                di_nlink;
      __uint16_t                di_projid;
      __uint16_t                di_projid_hi;
-     __uint8_t                 di_pad[6];
-     __uint16_t                di_flushiter;
+     union {
+          /* Number of data fork extents if NREXT64 is set */
+          __be64               di_big_nextents;
+
+          /* Padding for V3 inodes without NREXT64 set. */
+          __be64               di_v3_pad;
+
+          /* Padding and inode flush counter for V2 inodes. */
+          struct {
+               __u8            di_v2_pad[6];
+               __be16          di_flushiter;
+          };
+     };
      xfs_timestamp_t           di_atime;
      xfs_timestamp_t           di_mtime;
      xfs_timestamp_t           di_ctime;
      xfs_fsize_t               di_size;
      xfs_rfsblock_t            di_nblocks;
      xfs_extlen_t              di_extsize;
+     union {
+          /*
+           * For V2 inodes and V3 inodes without NREXT64 set, this
+           * is the number of data and attr fork extents.
+           */
+          struct {
+               __be32          di_nextents;
+               __be16          di_anextents;
+          } __packed;
+
+          /* Number of attr fork extents if NREXT64 is set. */
+          struct {
+               __be32          di_big_anextents;
+               __be16          di_nrext64_pad;
+          } __packed;
+     } __packed;
      xfs_extnum_t              di_nextents;
      xfs_aextnum_t             di_anextents;
      __uint8_t                 di_forkoff;
@@ -162,7 +189,7 @@ When the number exceeds 65535, the inode is converted to v2 and the link count
 is stored in +di_nlink+.
 
 *di_uid*::
-Specifies the owner's UID of the inode. 
+Specifies the owner's UID of the inode.
 
 *di_gid*::
 Specifies the owner's GID of the inode.
@@ -181,10 +208,17 @@ Specifies the high 16 bits of the owner's project ID in v2 inodes, if the
 +XFS_SB_VERSION2_PROJID32BIT+ feature is set; and zero otherwise.
 
 *di_pad[6]*::
-Reserved, must be zero.
+Reserved, must be zero.  Only exists for v2 inodes.
 
 *di_flushiter*::
-Incremented on flush.
+Incremented on flush.  Only exists for v2 inodes.
+
+*di_v3_pad*::
+Must be zero for v3 inodes without the NREXT64 flag set.
+
+*di_big_nextents*::
+Specifies the number of data extents associated with this inode if the NREXT64
+flag is set.  This allows for up to 2^48^ - 1 extent mappings.
 
 *di_atime*::
 
@@ -231,10 +265,19 @@ file is written to beyond allocated space, XFS will attempt to allocate
 additional disk space based on this value.
 
 *di_nextents*::
-Specifies the number of data extents associated with this inode.
+Specifies the number of data extents associated with this inode if the NREXT64
+flag is not set.  Supports up to 2^31^ - 1 extents.
 
 *di_anextents*::
-Specifies the number of extended attribute extents associated with this inode.
+Specifies the number of extended attribute extents associated with this inode
+if the NREXT64 flag is not set.  Supports up to 2^15^ - 1 extents.
+
+*di_big_anextents*::
+Specifies the number of extended attribute extents associated with this inode
+if the NREXT64 flag is set.  Supports up to 2^32^ - 1 extents.
+
+*di_nrext64_pad*::
+Must be zero if the NREXT64 flag is set.
 
 *di_forkoff*::
 Specifies the offset into the inode's literal area where the extended attribute
@@ -336,6 +379,10 @@ This inode shares (or has shared) data blocks with another inode.
 For files, this is the extent size hint for copy on write operations; see
 +di_cowextsize+ for details.  For directories, the value in +di_cowextsize+
 will be copied to all newly created files and directories.
+| +XFS_DIFLAG2_NREXT64+		|
+Files with this flag set may have up to (2^48^ - 1) extents mapped to the data
+fork and up to (2^32^ - 1) extents mapped to the attribute fork.  This flag
+requires the +XFS_SB_FEAT_INCOMPAT_NREXT64+ feature to be enabled.
 |=====
 
 *di_cowextsize*::

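To show how the two unions above fit together, here is a minimal C sketch of
picking the data and attr fork extent counts according to NREXT64.  It is not
the kernel's implementation.  The struct demo_counts, the function
demo_fork_counts() and the DEMO_* macros are hypothetical names made up for
illustration, and the fields are assumed to have been converted from big-endian
to host order already.  The limits are the ones quoted in the patch text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits quoted in the patch text. */
#define DEMO_MAX_DATA_EXTENTS_LARGE	((UINT64_C(1) << 48) - 1)
#define DEMO_MAX_DATA_EXTENTS_SMALL	((UINT64_C(1) << 31) - 1)
#define DEMO_MAX_ATTR_EXTENTS_SMALL	((UINT64_C(1) << 15) - 1)

/*
 * Hypothetical host-endian copy of the overlapping counter fields.
 * The comments say which ondisk field each slot aliases.
 */
struct demo_counts {
	uint64_t	big_nextents;	/* di_big_nextents, or pad/flushiter */
	uint32_t	count32;	/* di_nextents or di_big_anextents */
	uint16_t	count16;	/* di_anextents or di_nrext64_pad */
};

/*
 * Pick the fork extent counts according to the NREXT64 flag.
 * Returns false if a value breaks a rule stated in the documentation.
 */
static bool
demo_fork_counts(const struct demo_counts *c, bool nrext64,
		 uint64_t *data_extents, uint64_t *attr_extents)
{
	if (nrext64) {
		/* di_nrext64_pad must be zero when NREXT64 is set. */
		if (c->count16 != 0)
			return false;
		*data_extents = c->big_nextents;
		/* A 32-bit field cannot exceed 2^32 - 1, so no range check. */
		*attr_extents = c->count32;
		return *data_extents <= DEMO_MAX_DATA_EXTENTS_LARGE;
	}

	/*
	 * Without NREXT64 the 64-bit slot is padding (v3) or padding plus
	 * di_flushiter (v2), so this sketch ignores it.
	 */
	*data_extents = c->count32;
	*attr_extents = c->count16;
	return *data_extents <= DEMO_MAX_DATA_EXTENTS_SMALL &&
	       *attr_extents <= DEMO_MAX_ATTR_EXTENTS_SMALL;
}
```

With nrext64 set, big_nextents = 5, count32 = 2 and count16 = 0, the function
reports 5 data fork extents and 2 attr fork extents.  The same call with a
nonzero count16 is rejected because di_nrext64_pad must be zero.
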


* [PATCH 3/3] design: document extended attribute log item changes
  2023-01-18  0:42 [PATCHSET 0/3] xfs-documentation: updates for 6.1 Darrick J. Wong
  2023-01-18  0:44 ` [PATCH 1/3] design: update group quota inode information for v5 filesystems Darrick J. Wong
  2023-01-18  0:45 ` [PATCH 2/3] design: document the large extent count ondisk format changes Darrick J. Wong
@ 2023-01-18  0:45 ` Darrick J. Wong
  2023-01-24  5:30   ` Chandan Babu R
  2 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2023-01-18  0:45 UTC (permalink / raw)
  To: djwong, darrick.wong; +Cc: linux-xfs, chandan.babu, allison.henderson

From: Darrick J. Wong <djwong@kernel.org>

Describe the changes to the ondisk log format that are required to
support atomic updates to extended attributes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 .../allocation_groups.asciidoc                     |   14 ++-
 .../journaling_log.asciidoc                        |  109 ++++++++++++++++++++
 design/XFS_Filesystem_Structure/magic.asciidoc     |    2 
 3 files changed, 122 insertions(+), 3 deletions(-)


diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
index c64b4fad..c0ba16a8 100644
--- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
+++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
@@ -461,9 +461,17 @@ space mappings allowed in data and extended attribute file forks.
 |=====
 
 *sb_features_log_incompat*::
-Read-write incompatible feature flags for the log.  The kernel cannot read or
-write this FS log if it doesn't understand the flag.  Currently, no flags are
-defined.
+Read-write incompatible feature flags for the log.  The kernel cannot recover
+the FS log if it doesn't understand the flag.
+
+.Extended Version 5 Superblock Log incompatibility flags
+[options="header"]
+|=====
+| Flag					| Description
+| +XFS_SB_FEAT_INCOMPAT_LOG_XATTRS+	|
+Extended attribute updates have been committed to the ondisk log.
+
+|=====
 
 *sb_crc*::
 Superblock checksum.
diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
index ddcb87f4..f36dd352 100644
--- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc
+++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
@@ -215,6 +215,8 @@ magic number to distinguish themselves.  Buffer data items only appear after
 | +XFS_LI_CUD+			| 0x1243        | xref:CUD_Log_Item[Reference Count Update Done]
 | +XFS_LI_BUI+			| 0x1244        | xref:BUI_Log_Item[File Block Mapping Update Intent]
 | +XFS_LI_BUD+			| 0x1245        | xref:BUD_Log_Item[File Block Mapping Update Done]
+| +XFS_LI_ATTRI+		| 0x1246        | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
+| +XFS_LI_ATTRD+		| 0x1247        | xref:ATTRD_Log_Item[Extended Attribute Update Done]
 |=====
 
 Note that all log items (except for transaction headers) MUST start with
@@ -712,6 +714,113 @@ Size of this log item.  Should be 1.
 *bud_bui_id*::
 A 64-bit number that binds the corresponding BUI log item to this BUD log item.
 
+[[ATTRI_Log_Item]]
+=== Extended Attribute Update Intent
+
+The next two operation types work together to handle atomic extended attribute
+updates.
+
+The lower byte of the +alfi_op_flags+ field is a type code indicating what sort
+of extended attribute update we want.
+
+.Extended attribute update log intent types
+[options="header"]
+|=====
+| Value				| Description
+| +XFS_ATTRI_OP_FLAGS_SET+	| Set a key/value pair.
+| +XFS_ATTRI_OP_FLAGS_REMOVE+	| Remove a key/value pair.
+| +XFS_ATTRI_OP_FLAGS_REPLACE+	| Replace one key/value pair with another.
+|=====
+
+The ``extended attribute update intent'' operation comes first; it tells the
+log that XFS wants to update one of a file's extended attributes.  This record
+is crucial for correct log recovery because it enables us to spread a complex
+metadata update across multiple transactions while ensuring that an update
+interrupted by a crash midway will be replayed fully during log recovery.
+
+[source, c]
+----
+struct xfs_attri_log_format {
+     uint16_t                  alfi_type;
+     uint16_t                  alfi_size;
+     uint32_t                  __pad;
+     uint64_t                  alfi_id;
+     uint64_t                  alfi_ino;
+     uint32_t                  alfi_op_flags;
+     uint32_t                  alfi_name_len;
+     uint32_t                  alfi_value_len;
+     uint32_t                  alfi_attr_filter;
+};
+----
+
+*alfi_type*::
+The signature of an ATTRI operation, 0x1246.  This value is in host-endian
+order, not big-endian like the rest of XFS.
+
+*alfi_size*::
+Size of this log item.  Should be 1.
+
+*alfi_id*::
+A 64-bit number that binds the corresponding ATTRD log item to this ATTRI log
+item.
+
+*alfi_ino*::
+Inode number of the file being updated.
+
+*alfi_op_flags*::
+The operation being performed.  The lower byte must be one of the
++XFS_ATTRI_OP_FLAGS_*+ flags defined above.  The upper bytes must be zero.
+
+*alfi_name_len*::
+Length of the name of the extended attribute.  This must not be zero.
+The attribute name itself is captured in the next log item.
+
+*alfi_value_len*::
+Length of the value of the extended attribute.  This must be zero for remove
+operations, and nonzero for set and replace operations.  The attribute value
+itself is captured in the log item immediately after the item containing the
+name.
+
+*alfi_attr_filter*::
+Attribute namespace filter flags.  This must be one of +ATTR_ROOT+,
++ATTR_SECURE+, or +ATTR_INCOMPLETE+.
+
+[[ATTRD_Log_Item]]
+=== Completion of Extended Attribute Updates
+
+The ``extended attribute update done'' operation complements the ``extended
+attribute update intent'' operation.  This second operation indicates that the
+update actually happened, so that log recovery needn't replay the update.  The
+ATTRD and the actual updates are typically found in a new transaction following
+the transaction in which the ATTRI was logged.
+
+[source, c]
+----
+struct xfs_attrd_log_format {
+      __uint16_t               alfd_type;
+      __uint16_t               alfd_size;
+      __uint32_t               __pad;
+      __uint64_t               alfd_alf_id;
+};
+----
+
+*alfd_type*::
+The signature of an ATTRD operation, 0x1247.  This value is in host-endian
+order, not big-endian like the rest of XFS.
+
+*alfd_size*::
+Size of this log item.  Should be 1.
+
+*alfd_bui_id*::
+A 64-bit number that binds the corresponding ATTRI log item to this ATTRD log
+item.
+
+=== Extended Attribute Name and Value
+
+These regions contain the name and value components of the extended attribute
+being updated, as needed.  There are no magic numbers; each region contains the
+data and nothing else.
+
 [[Inode_Log_Item]]
 === Inode Updates
 
diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc
index 9be26f82..a343271a 100644
--- a/design/XFS_Filesystem_Structure/magic.asciidoc
+++ b/design/XFS_Filesystem_Structure/magic.asciidoc
@@ -71,6 +71,8 @@ are not aligned to blocks.
 | +XFS_LI_CUD+			| 0x1243        |       | xref:CUD_Log_Item[Reference Count Update Done]
 | +XFS_LI_BUI+			| 0x1244        |       | xref:BUI_Log_Item[File Block Mapping Update Intent]
 | +XFS_LI_BUD+			| 0x1245        |       | xref:BUD_Log_Item[File Block Mapping Update Done]
+| +XFS_LI_ATTRI+		| 0x1246        |       | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
+| +XFS_LI_ATTRD+		| 0x1247        |       | xref:ATTRD_Log_Item[Extended Attribute Update Done]
 |=====
 
 = Theoretical Limits

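The field rules listed for the ATTRI log item can be collected into one check.
The following C sketch does that under stated assumptions: the struct mirrors
the field order shown in the patch, but demo_attri_log_format,
demo_attri_is_valid() and the DEMO_* macros are hypothetical names, and the
numeric values given to the three operation codes are assumed because the
patch names the flags without listing their values.  Only rules written in the
documentation text are enforced.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LI_ATTRI			0x1246	/* XFS_LI_ATTRI, from the patch */

/* Assumed numeric values; the patch names these flags but not their values. */
#define DEMO_ATTRI_OP_FLAGS_SET		1
#define DEMO_ATTRI_OP_FLAGS_REMOVE	2
#define DEMO_ATTRI_OP_FLAGS_REPLACE	3
#define DEMO_ATTRI_OP_FLAGS_TYPE_MASK	0xFFu	/* lower byte holds the type */

/* Same field order as struct xfs_attri_log_format in the patch. */
struct demo_attri_log_format {
	uint16_t	alfi_type;
	uint16_t	alfi_size;
	uint32_t	alfi_pad;	/* called __pad in the patch */
	uint64_t	alfi_id;
	uint64_t	alfi_ino;
	uint32_t	alfi_op_flags;
	uint32_t	alfi_name_len;
	uint32_t	alfi_value_len;
	uint32_t	alfi_attr_filter;
};

/* Return true if the item obeys the rules stated in the documentation. */
static bool
demo_attri_is_valid(const struct demo_attri_log_format *f)
{
	uint32_t op = f->alfi_op_flags & DEMO_ATTRI_OP_FLAGS_TYPE_MASK;

	if (f->alfi_type != DEMO_LI_ATTRI)
		return false;
	/* The upper bytes of alfi_op_flags must be zero. */
	if (f->alfi_op_flags & ~DEMO_ATTRI_OP_FLAGS_TYPE_MASK)
		return false;
	/* The attribute name length must not be zero. */
	if (f->alfi_name_len == 0)
		return false;

	switch (op) {
	case DEMO_ATTRI_OP_FLAGS_SET:
	case DEMO_ATTRI_OP_FLAGS_REPLACE:
		/* Set and replace need a value. */
		return f->alfi_value_len != 0;
	case DEMO_ATTRI_OP_FLAGS_REMOVE:
		/* Remove must not carry a value. */
		return f->alfi_value_len == 0;
	default:
		return false;
	}
}
```

A set operation with a four-byte name and an eight-byte value passes this
check, while the same item with alfi_value_len cleared fails it.
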

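The intent/done pairing described for ATTRI and ATTRD items can be illustrated
with a small C sketch.  This is a deliberate simplification and not how the
kernel's log recovery is structured: it assumes the recovered items have been
flattened into one ordered array, and struct demo_log_item and
demo_count_unfinished() are hypothetical names.  An intent whose id never
shows up in a later done item is the case that recovery would have to replay.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LI_ATTRI	0x1246	/* XFS_LI_ATTRI, from the patch */
#define DEMO_LI_ATTRD	0x1247	/* XFS_LI_ATTRD, from the patch */

/* Hypothetical flattened view of a recovered log item. */
struct demo_log_item {
	uint16_t	type;	/* alfi_type or alfd_type */
	uint64_t	id;	/* alfi_id or alfd_alf_id */
};

/*
 * Count the ATTRI items that have no later ATTRD item carrying the same
 * id.  Those are the updates that would still need to be replayed.
 */
static size_t
demo_count_unfinished(const struct demo_log_item *items, size_t nr)
{
	size_t unfinished = 0;

	for (size_t i = 0; i < nr; i++) {
		bool done = false;

		if (items[i].type != DEMO_LI_ATTRI)
			continue;

		/* Look for the done item that binds to this intent. */
		for (size_t j = i + 1; j < nr; j++) {
			if (items[j].type == DEMO_LI_ATTRD &&
			    items[j].id == items[i].id) {
				done = true;
				break;
			}
		}
		if (!done)
			unfinished++;
	}
	return unfinished;
}
```

Given an ATTRI with id 7, an ATTRD with id 7 and an ATTRI with id 9, the
function returns 1, because only the second intent lacks a done item.
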

* Re: [PATCH 1/3] design: update group quota inode information for v5 filesystems
  2023-01-18  0:44 ` [PATCH 1/3] design: update group quota inode information for v5 filesystems Darrick J. Wong
@ 2023-01-24  5:29   ` Chandan Babu R
  0 siblings, 0 replies; 8+ messages in thread
From: Chandan Babu R @ 2023-01-24  5:29 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: darrick.wong, linux-xfs, allison.henderson

On Tue, Jan 17, 2023 at 04:44:49 PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Fix a few out of date statements about the group quota inode field on v5
> filesystems.
>

Looks good to me.

Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

-- 
chandan

> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  .../allocation_groups.asciidoc                     |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
>
> diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> index 0e48b4bf..7ee5d561 100644
> --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> @@ -262,11 +262,12 @@ maintained in the first superblock.
>  *sb_uquotino*::
>  Inode for user quotas. This and the following two quota fields only apply if
>  +XFS_SB_VERSION_QUOTABIT+ flag is set in +sb_versionnum+. Refer to
> -xref:Quota_Inodes[quota inodes] for more information
> +xref:Quota_Inodes[quota inodes] for more information.
>  
>  *sb_gquotino*::
> -Inode for group or project quotas. Group and Project quotas cannot be used at
> -the same time.
> +Inode for group or project quotas. Group and project quotas cannot be used at
> +the same time on v4 filesystems.  On a v5 filesystem, this inode always stores
> +group quota information.
>  
>  *sb_qflags*::
>  Quota flags. It can be a combination of the following flags:


* Re: [PATCH 2/3] design: document the large extent count ondisk format changes
  2023-01-18  0:45 ` [PATCH 2/3] design: document the large extent count ondisk format changes Darrick J. Wong
@ 2023-01-24  5:30   ` Chandan Babu R
  0 siblings, 0 replies; 8+ messages in thread
From: Chandan Babu R @ 2023-01-24  5:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: darrick.wong, linux-xfs, allison.henderson

On Tue, Jan 17, 2023 at 04:45:05 PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Update the ondisk format documentation to discuss the larger maximum
> extent counts that were added in 2022.
>

Looks good to me.

Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

-- 
chandan

> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  .../allocation_groups.asciidoc                     |    4 +
>  .../XFS_Filesystem_Structure/ondisk_inode.asciidoc |   61 ++++++++++++++++++--
>  2 files changed, 58 insertions(+), 7 deletions(-)
>
>
> diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> index 7ee5d561..c64b4fad 100644
> --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> @@ -454,6 +454,10 @@ xref:Timestamps[timestamps] for more information.
>  The filesystem is not in operable condition, and must be run through
>  xfs_repair before it can be mounted.
>  
> +| +XFS_SB_FEAT_INCOMPAT_NREXT64+ |
> +Large file fork extent counts.  This greatly expands the maximum number of
> +space mappings allowed in data and extended attribute file forks.
> +
>  |=====
>  
>  *sb_features_log_incompat*::
> diff --git a/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc b/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
> index 1922954e..34c06487 100644
> --- a/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
> +++ b/design/XFS_Filesystem_Structure/ondisk_inode.asciidoc
> @@ -84,14 +84,41 @@ struct xfs_dinode_core {
>       __uint32_t                di_nlink;
>       __uint16_t                di_projid;
>       __uint16_t                di_projid_hi;
> -     __uint8_t                 di_pad[6];
> -     __uint16_t                di_flushiter;
> +     union {
> +          /* Number of data fork extents if NREXT64 is set */
> +          __be64               di_big_nextents;
> +
> +          /* Padding for V3 inodes without NREXT64 set. */
> +          __be64               di_v3_pad;
> +
> +          /* Padding and inode flush counter for V2 inodes. */
> +          struct {
> +               __u8            di_v2_pad[6];
> +               __be16          di_flushiter;
> +          };
> +     };
>       xfs_timestamp_t           di_atime;
>       xfs_timestamp_t           di_mtime;
>       xfs_timestamp_t           di_ctime;
>       xfs_fsize_t               di_size;
>       xfs_rfsblock_t            di_nblocks;
>       xfs_extlen_t              di_extsize;
> +     union {
> +          /*
> +           * For V2 inodes and V3 inodes without NREXT64 set, this
> +           * is the number of data and attr fork extents.
> +           */
> +          struct {
> +               __be32          di_nextents;
> +               __be16          di_anextents;
> +          } __packed;
> +
> +          /* Number of attr fork extents if NREXT64 is set. */
> +          struct {
> +               __be32          di_big_anextents;
> +               __be16          di_nrext64_pad;
> +          } __packed;
> +     } __packed;
>       xfs_extnum_t              di_nextents;
>       xfs_aextnum_t             di_anextents;
>       __uint8_t                 di_forkoff;
> @@ -162,7 +189,7 @@ When the number exceeds 65535, the inode is converted to v2 and the link count
>  is stored in +di_nlink+.
>  
>  *di_uid*::
> -Specifies the owner's UID of the inode. 
> +Specifies the owner's UID of the inode.
>  
>  *di_gid*::
>  Specifies the owner's GID of the inode.
> @@ -181,10 +208,17 @@ Specifies the high 16 bits of the owner's project ID in v2 inodes, if the
>  +XFS_SB_VERSION2_PROJID32BIT+ feature is set; and zero otherwise.
>  
>  *di_pad[6]*::
> -Reserved, must be zero.
> +Reserved, must be zero.  Only exists for v2 inodes.
>  
>  *di_flushiter*::
> -Incremented on flush.
> +Incremented on flush.  Only exists for v2 inodes.
> +
> +*di_v3_pad*::
> +Must be zero for v3 inodes without the NREXT64 flag set.
> +
> +*di_big_nextents*::
> +Specifies the number of data extents associated with this inode if the NREXT64
> +flag is set.  This allows for up to 2^48^ - 1 extent mappings.
>  
>  *di_atime*::
>  
> @@ -231,10 +265,19 @@ file is written to beyond allocated space, XFS will attempt to allocate
>  additional disk space based on this value.
>  
>  *di_nextents*::
> -Specifies the number of data extents associated with this inode.
> +Specifies the number of data extents associated with this inode if the NREXT64
> +flag is not set.  Supports up to 2^31^ - 1 extents.
>  
>  *di_anextents*::
> -Specifies the number of extended attribute extents associated with this inode.
> +Specifies the number of extended attribute extents associated with this inode
> +if the NREXT64 flag is not set.  Supports up to 2^15^ - 1 extents.
> +
> +*di_big_anextents*::
> +Specifies the number of extended attribute extents associated with this inode
> +if the NREXT64 flag is set.  Supports up to 2^32^ - 1 extents.
> +
> +*di_nrext64_pad*::
> +Must be zero if the NREXT64 flag is set.
>  
>  *di_forkoff*::
>  Specifies the offset into the inode's literal area where the extended attribute
> @@ -336,6 +379,10 @@ This inode shares (or has shared) data blocks with another inode.
>  For files, this is the extent size hint for copy on write operations; see
>  +di_cowextsize+ for details.  For directories, the value in +di_cowextsize+
>  will be copied to all newly created files and directories.
> +| +XFS_DIFLAG2_NREXT64+		|
> +Files with this flag set may have up to (2^48^ - 1) extents mapped to the data
> +fork and up to (2^32^ - 1) extents mapped to the attribute fork.  This flag
> +requires the +XFS_SB_FEAT_INCOMPAT_NREXT64+ feature to be enabled.
>  |=====
>  
>  *di_cowextsize*::


* Re: [PATCH 3/3] design: document extended attribute log item changes
  2023-01-18  0:45 ` [PATCH 3/3] design: document extended attribute log item changes Darrick J. Wong
@ 2023-01-24  5:30   ` Chandan Babu R
  2023-01-25  1:20     ` Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Chandan Babu R @ 2023-01-24  5:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: darrick.wong, linux-xfs, allison.henderson

On Tue, Jan 17, 2023 at 04:45:20 PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Describe the changes to the ondisk log format that are required to
> support atomic updates to extended attributes.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  .../allocation_groups.asciidoc                     |   14 ++-
>  .../journaling_log.asciidoc                        |  109 ++++++++++++++++++++
>  design/XFS_Filesystem_Structure/magic.asciidoc     |    2 
>  3 files changed, 122 insertions(+), 3 deletions(-)
>
>
> diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> index c64b4fad..c0ba16a8 100644
> --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> @@ -461,9 +461,17 @@ space mappings allowed in data and extended attribute file forks.
>  |=====
>  
>  *sb_features_log_incompat*::
> -Read-write incompatible feature flags for the log.  The kernel cannot read or
> -write this FS log if it doesn't understand the flag.  Currently, no flags are
> -defined.
> +Read-write incompatible feature flags for the log.  The kernel cannot recover
> +the FS log if it doesn't understand the flag.
> +
> +.Extended Version 5 Superblock Log incompatibility flags
> +[options="header"]
> +|=====
> +| Flag					| Description
> +| +XFS_SB_FEAT_INCOMPAT_LOG_XATTRS+	|
> +Extended attribute updates have been committed to the ondisk log.
> +
> +|=====
>  
>  *sb_crc*::
>  Superblock checksum.
> diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> index ddcb87f4..f36dd352 100644
> --- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> +++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> @@ -215,6 +215,8 @@ magic number to distinguish themselves.  Buffer data items only appear after
>  | +XFS_LI_CUD+			| 0x1243        | xref:CUD_Log_Item[Reference Count Update Done]
>  | +XFS_LI_BUI+			| 0x1244        | xref:BUI_Log_Item[File Block Mapping Update Intent]
>  | +XFS_LI_BUD+			| 0x1245        | xref:BUD_Log_Item[File Block Mapping Update Done]
> +| +XFS_LI_ATTRI+		| 0x1246        | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
> +| +XFS_LI_ATTRD+		| 0x1247        | xref:ATTRD_Log_Item[Extended Attribute Update Done]
>  |=====
>  
>  Note that all log items (except for transaction headers) MUST start with
> @@ -712,6 +714,113 @@ Size of this log item.  Should be 1.
>  *bud_bui_id*::
>  A 64-bit number that binds the corresponding BUI log item to this BUD log item.
>  
> +[[ATTRI_Log_Item]]
> +=== Extended Attribute Update Intent
> +
> +The next two operation types work together to handle atomic extended attribute
> +updates.
> +
> +The lower byte of the +alfi_op_flags+ field is a type code indicating what sort
> +of extended attribute update we want.
> +
> +.Extended attribute update log intent types
> +[options="header"]
> +|=====
> +| Value				| Description
> +| +XFS_ATTRI_OP_FLAGS_SET+	| Set a key/value pair.
> +| +XFS_ATTRI_OP_FLAGS_REMOVE+	| Remove a key/value pair.
> +| +XFS_ATTRI_OP_FLAGS_REPLACE+	| Replace one key/value pair with another.
> +|=====
> +
> +The ``extended attribute update intent'' operation comes first; it tells the
> +log that XFS wants to update one of a file's extended attributes.  This record
> +is crucial for correct log recovery because it enables us to spread a complex
> +metadata update across multiple transactions while ensuring that an update
> +interrupted by a crash midway will be replayed fully during log recovery.
> +
> +[source, c]
> +----
> +struct xfs_attri_log_format {
> +     uint16_t                  alfi_type;
> +     uint16_t                  alfi_size;
> +     uint32_t                  __pad;
> +     uint64_t                  alfi_id;
> +     uint64_t                  alfi_ino;
> +     uint32_t                  alfi_op_flags;
> +     uint32_t                  alfi_name_len;
> +     uint32_t                  alfi_value_len;
> +     uint32_t                  alfi_attr_filter;
> +};
> +----
> +
> +*alfi_type*::
> +The signature of an ATTRI operation, 0x1246.  This value is in host-endian
> +order, not big-endian like the rest of XFS.
> +
> +*alfi_size*::
> +Size of this log item.  Should be 1.
> +
> +*alfi_id*::
> +A 64-bit number that binds the corresponding ATTRD log item to this ATTRI log
> +item.
> +
> +*alfi_ino*::
> +Inode number of the file being updated.
> +
> +*alfi_op_flags*::
> +The operation being performed.  The lower byte must be one of the
> ++XFS_ATTRI_OP_FLAGS_*+ flags defined above.  The upper bytes must be zero.
> +
> +*alfi_name_len*::
> +Length of the name of the extended attribute.  This must not be zero.
> +The attribute name itself is captured in the next log item.
> +
> +*alfi_value_len*::
> +Length of the value of the extended attribute.  This must be zero for remove
> +operations, and nonzero for set and replace operations.  The attribute value
> +itself is captured in the log item immediately after the item containing the
> +name.
> +
> +*alfi_attr_filter*::
> +Attribute namespace filter flags.  This must be one of +ATTR_ROOT+,
> ++ATTR_SECURE+, or +ATTR_INCOMPLETE+.
> +
> +[[ATTRD_Log_Item]]
> +=== Completion of Extended Attribute Updates
> +
> +The ``extended attribute update done'' operation complements the ``extended
> +attribute update intent'' operation.  This second operation indicates that the
> +update actually happened, so that log recovery needn't replay the update.  The
> +ATTRD and the actual updates are typically found in a new transaction following
> +the transaction in which the ATTRI was logged.
> +
> +[source, c]
> +----
> +struct xfs_attrd_log_format {
> +      __uint16_t               alfd_type;
> +      __uint16_t               alfd_size;
> +      __uint32_t               __pad;
> +      __uint64_t               alfd_alf_id;
> +};
> +----
> +
> +*alfd_type*::
> +The signature of an ATTRD operation, 0x1247.  This value is in host-endian
> +order, not big-endian like the rest of XFS.
> +
> +*alfd_size*::
> +Size of this log item.  Should be 1.
> +
> +*alfd_bui_id*::

The above should be "alfd_alf_id". Apart from that, the remaining
changes appear to be correct.

Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

-- 
chandan

> +A 64-bit number that binds the corresponding ATTRI log item to this ATTRD log
> +item.
> +
> +=== Extended Attribute Name and Value
> +
> +These regions contain the name and value components of the extended attribute
> +being updated, as needed.  There are no magic numbers; each region contains the
> +data and nothing else.
> +
>  [[Inode_Log_Item]]
>  === Inode Updates
>  
> diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc
> index 9be26f82..a343271a 100644
> --- a/design/XFS_Filesystem_Structure/magic.asciidoc
> +++ b/design/XFS_Filesystem_Structure/magic.asciidoc
> @@ -71,6 +71,8 @@ are not aligned to blocks.
>  | +XFS_LI_CUD+			| 0x1243        |       | xref:CUD_Log_Item[Reference Count Update Done]
>  | +XFS_LI_BUI+			| 0x1244        |       | xref:BUI_Log_Item[File Block Mapping Update Intent]
>  | +XFS_LI_BUD+			| 0x1245        |       | xref:BUD_Log_Item[File Block Mapping Update Done]
> +| +XFS_LI_ATTRI+		| 0x1246        |       | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
> +| +XFS_LI_ATTRD+		| 0x1247        |       | xref:ATTRD_Log_Item[Extended Attribute Update Done]
>  |=====
>  
>  = Theoretical Limits
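
The pairing described above, where an ATTRD item carries the id of the ATTRI item it completes, can be sketched as a simple lookup. This is only an illustration of the idea, not the kernel's recovery code: the helper name `attri_needs_replay` and the flat array of collected done-item ids are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): report whether an ATTRI
 * intent still needs to be replayed.  An intent is finished only if some
 * ATTRD item recorded the same 64-bit id in alfd_alf_id.
 */
static bool attri_needs_replay(uint64_t alfi_id,
                               const uint64_t *alfd_ids, size_t n_done)
{
    for (size_t i = 0; i < n_done; i++) {
        if (alfd_ids[i] == alfi_id)
            return false;   /* matching ATTRD found: update completed */
    }
    return true;            /* no ATTRD: recovery must replay the update */
}
```

A linear scan keeps the sketch short; real recovery code would track intents in whatever structure the log already maintains.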


* Re: [PATCH 3/3] design: document extended attribute log item changes
  2023-01-24  5:30   ` Chandan Babu R
@ 2023-01-25  1:20     ` Darrick J. Wong
  0 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2023-01-25  1:20 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: darrick.wong, linux-xfs, allison.henderson

On Tue, Jan 24, 2023 at 11:00:56AM +0530, Chandan Babu R wrote:
> On Tue, Jan 17, 2023 at 04:45:20 PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Describe the changes to the ondisk log format that are required to
> > support atomic updates to extended attributes.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  .../allocation_groups.asciidoc                     |   14 ++-
> >  .../journaling_log.asciidoc                        |  109 ++++++++++++++++++++
> >  design/XFS_Filesystem_Structure/magic.asciidoc     |    2 
> >  3 files changed, 122 insertions(+), 3 deletions(-)
> >
> >
> > diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> > index c64b4fad..c0ba16a8 100644
> > --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> > +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
> > @@ -461,9 +461,17 @@ space mappings allowed in data and extended attribute file forks.
> >  |=====
> >  
> >  *sb_features_log_incompat*::
> > -Read-write incompatible feature flags for the log.  The kernel cannot read or
> > -write this FS log if it doesn't understand the flag.  Currently, no flags are
> > -defined.
> > +Read-write incompatible feature flags for the log.  The kernel cannot recover
> > +the FS log if it doesn't understand the flag.
> > +
> > +.Extended Version 5 Superblock Log incompatibility flags
> > +[options="header"]
> > +|=====
> > +| Flag					| Description
> > +| +XFS_SB_FEAT_INCOMPAT_LOG_XATTRS+	|
> > +Extended attribute updates have been committed to the ondisk log.
> > +
> > +|=====
> >  
> >  *sb_crc*::
> >  Superblock checksum.
> > diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> > index ddcb87f4..f36dd352 100644
> > --- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> > +++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
> > @@ -215,6 +215,8 @@ magic number to distinguish themselves.  Buffer data items only appear after
> >  | +XFS_LI_CUD+			| 0x1243        | xref:CUD_Log_Item[Reference Count Update Done]
> >  | +XFS_LI_BUI+			| 0x1244        | xref:BUI_Log_Item[File Block Mapping Update Intent]
> >  | +XFS_LI_BUD+			| 0x1245        | xref:BUD_Log_Item[File Block Mapping Update Done]
> > +| +XFS_LI_ATTRI+		| 0x1246        | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
> > +| +XFS_LI_ATTRD+		| 0x1247        | xref:ATTRD_Log_Item[Extended Attribute Update Done]
> >  |=====
> >  
> >  Note that all log items (except for transaction headers) MUST start with
> > @@ -712,6 +714,113 @@ Size of this log item.  Should be 1.
> >  *bud_bui_id*::
> >  A 64-bit number that binds the corresponding BUI log item to this BUD log item.
> >  
> > +[[ATTRI_Log_Item]]
> > +=== Extended Attribute Update Intent
> > +
> > +The next two operation types work together to handle atomic extended attribute
> > +updates.
> > +
> > +The lower byte of the +alfi_op_flags+ field is a type code indicating what sort
> > +of extended attribute operation we want.
> > +
> > +.Extended attribute update log intent types
> > +[options="header"]
> > +|=====
> > +| Value				| Description
> > +| +XFS_ATTRI_OP_FLAGS_SET+	| Set a key/value pair.
> > +| +XFS_ATTRI_OP_FLAGS_REMOVE+	| Remove a key/value pair.
> > +| +XFS_ATTRI_OP_FLAGS_REPLACE+	| Replace one key/value pair with another.
> > +|=====
> > +
> > +The ``extended attribute update intent'' operation comes first; it tells the
> > +log that XFS wants to update one of a file's extended attributes.  This record
> > +is crucial for correct log recovery because it enables us to spread a complex
> > +metadata update across multiple transactions while ensuring that a crash midway
> > +through the complex update will be replayed fully during log recovery.
> > +
> > +[source, c]
> > +----
> > +struct xfs_attri_log_format {
> > +     uint16_t                  alfi_type;
> > +     uint16_t                  alfi_size;
> > +     uint32_t                  __pad;
> > +     uint64_t                  alfi_id;
> > +     uint64_t                  alfi_ino;
> > +     uint32_t                  alfi_op_flags;
> > +     uint32_t                  alfi_name_len;
> > +     uint32_t                  alfi_value_len;
> > +     uint32_t                  alfi_attr_filter;
> > +};
> > +----
> > +
> > +*alfi_type*::
> > +The signature of an ATTRI operation, 0x1246.  This value is in host-endian
> > +order, not big-endian like the rest of XFS.
> > +
> > +*alfi_size*::
> > +Size of this log item.  Should be 1.
> > +
> > +*alfi_id*::
> > +A 64-bit number that binds the corresponding ATTRD log item to this ATTRI log
> > +item.
> > +
> > +*alfi_ino*::
> > +Inode number of the file being updated.
> > +
> > +*alfi_op_flags*::
> > +The operation being performed.  The lower byte must be one of the
> > ++XFS_ATTRI_OP_FLAGS_*+ flags defined above.  The upper bytes must be zero.
> > +
> > +*alfi_name_len*::
> > +Length of the name of the extended attribute.  This must not be zero.
> > +The attribute name itself is captured in the next log item.
> > +
> > +*alfi_value_len*::
> > +Length of the value of the extended attribute.  This must be zero for remove
> > +operations, and nonzero for set and replace operations.  The attribute value
> > +itself is captured in the log item immediately after the item containing the
> > +name.
> > +
> > +*alfi_attr_filter*::
> > +Attribute namespace filter flags.  This must be one of +ATTR_ROOT+,
> > ++ATTR_SECURE+, or +ATTR_INCOMPLETE+.
> > +
> > +[[ATTRD_Log_Item]]
> > +=== Completion of Extended Attribute Updates
> > +
> > +The ``extended attribute update done'' operation complements the ``extended
> > +attribute update intent'' operation.  This second operation indicates that the
> > +update actually happened, so that log recovery needn't replay the update.  The
> > +ATTRD and the actual updates are typically found in a new transaction following
> > +the transaction in which the ATTRI was logged.
> > +
> > +[source, c]
> > +----
> > +struct xfs_attrd_log_format {
> > +      __uint16_t               alfd_type;
> > +      __uint16_t               alfd_size;
> > +      __uint32_t               __pad;
> > +      __uint64_t               alfd_alf_id;
> > +};
> > +----
> > +
> > +*alfd_type*::
> > +The signature of an ATTRD operation, 0x1247.  This value is in host-endian
> > +order, not big-endian like the rest of XFS.
> > +
> > +*alfd_size*::
> > +Size of this log item.  Should be 1.
> > +
> > +*alfd_bui_id*::
> 
> The above should be "alfd_alf_id". Apart from that, the remaining
> changes appear to be correct.
> 
> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

I'll fix that.  Thanks for the review!

--D

> -- 
> chandan
> 
> > +A 64-bit number that binds the corresponding ATTRI log item to this ATTRD log
> > +item.
> > +
> > +=== Extended Attribute Name and Value
> > +
> > +These regions contain the name and value components of the extended attribute
> > +being updated, as needed.  There are no magic numbers; each region contains the
> > +data and nothing else.
> > +
> >  [[Inode_Log_Item]]
> >  === Inode Updates
> >  
> > diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc
> > index 9be26f82..a343271a 100644
> > --- a/design/XFS_Filesystem_Structure/magic.asciidoc
> > +++ b/design/XFS_Filesystem_Structure/magic.asciidoc
> > @@ -71,6 +71,8 @@ are not aligned to blocks.
> >  | +XFS_LI_CUD+			| 0x1243        |       | xref:CUD_Log_Item[Reference Count Update Done]
> >  | +XFS_LI_BUI+			| 0x1244        |       | xref:BUI_Log_Item[File Block Mapping Update Intent]
> >  | +XFS_LI_BUD+			| 0x1245        |       | xref:BUD_Log_Item[File Block Mapping Update Done]
> > +| +XFS_LI_ATTRI+		| 0x1246        |       | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
> > +| +XFS_LI_ATTRD+		| 0x1247        |       | xref:ATTRD_Log_Item[Extended Attribute Update Done]
> >  |=====
> >  
> >  = Theoretical Limits
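
The field constraints stated in the quoted ATTRI description (upper bytes of +alfi_op_flags+ zero, nonzero name length, value length tied to the operation type) can be expressed as a small validity check. The following is a minimal sketch of those documented rules only, not the kernel's validation routine: the function name `attri_format_is_sane`, the numeric values of the +XFS_ATTRI_OP_FLAGS_*+ constants, and the type mask are assumptions, since the patch does not list them. The +alfi_attr_filter+ check is omitted because the flag values are not given either.

```c
#include <stdbool.h>
#include <stdint.h>

#define XFS_LI_ATTRI                 0x1246  /* signature from the patch */

/* Assumed values for illustration; the patch text does not state them. */
#define XFS_ATTRI_OP_FLAGS_SET       1
#define XFS_ATTRI_OP_FLAGS_REMOVE    2
#define XFS_ATTRI_OP_FLAGS_REPLACE   3
#define XFS_ATTRI_OP_FLAGS_TYPE_MASK 0xFFu   /* lower byte holds the type */

struct xfs_attri_log_format {
    uint16_t alfi_type;
    uint16_t alfi_size;
    uint32_t __pad;
    uint64_t alfi_id;
    uint64_t alfi_ino;
    uint32_t alfi_op_flags;
    uint32_t alfi_name_len;
    uint32_t alfi_value_len;
    uint32_t alfi_attr_filter;
};

/* Hypothetical check that applies only the rules spelled out in the text. */
static bool attri_format_is_sane(const struct xfs_attri_log_format *f)
{
    if (f->alfi_type != XFS_LI_ATTRI)
        return false;
    /* The upper bytes of alfi_op_flags must be zero. */
    if (f->alfi_op_flags & ~XFS_ATTRI_OP_FLAGS_TYPE_MASK)
        return false;
    /* The attribute name length must not be zero. */
    if (f->alfi_name_len == 0)
        return false;

    switch (f->alfi_op_flags & XFS_ATTRI_OP_FLAGS_TYPE_MASK) {
    case XFS_ATTRI_OP_FLAGS_SET:
    case XFS_ATTRI_OP_FLAGS_REPLACE:
        return f->alfi_value_len != 0;  /* set/replace carry a value */
    case XFS_ATTRI_OP_FLAGS_REMOVE:
        return f->alfi_value_len == 0;  /* remove carries no value */
    default:
        return false;                   /* unknown operation type */
    }
}
```

Rejecting unknown operation types in the `default` branch follows from the statement that the lower byte must be one of the defined flags.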


end of thread, newest message 2023-01-25  1:20 UTC

Thread overview: 8+ messages
2023-01-18  0:42 [PATCHSET 0/3] xfs-documentation: updates for 6.1 Darrick J. Wong
2023-01-18  0:44 ` [PATCH 1/3] design: update group quota inode information for v5 filesystems Darrick J. Wong
2023-01-24  5:29   ` Chandan Babu R
2023-01-18  0:45 ` [PATCH 2/3] design: document the large extent count ondisk format changes Darrick J. Wong
2023-01-24  5:30   ` Chandan Babu R
2023-01-18  0:45 ` [PATCH 3/3] design: document extended attribute log item changes Darrick J. Wong
2023-01-24  5:30   ` Chandan Babu R
2023-01-25  1:20     ` Darrick J. Wong
