linux-btrfs.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>, "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-btrfs@vger.kernel.org, kilobyte@angband.pl,
	linux-fsdevel@vger.kernel.org, jack@suse.cz, david@fromorbit.com,
	willy@infradead.org, hch@lst.de, dsterba@suse.cz,
	nborisov@suse.com, linux-nvdimm@lists.01.org,
	Goldwyn Rodrigues <rgoldwyn@suse.com>
Subject: Re: [PATCH 01/18] btrfs: create a mount option for dax
Date: Tue, 21 May 2019 11:02:30 -0700
Message-ID: <20190521180230.GG5125@magnolia>
In-Reply-To: <20190429172649.8288-2-rgoldwyn@suse.de>

[add Ted to the thread]

On Mon, Apr 29, 2019 at 12:26:32PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> This sets S_DAX in inode->i_flags, which can be used with
> IS_DAX().
> 
> The dax option is restricted to non multi-device mounts.
> dax interacts with the device directly instead of using bio, so
> all bio-hooks which we use for multi-device cannot be performed
> here. While regular read/writes could be manipulated with
> RAID0/1, mmap() is still an issue.
> 
> Auto-setting free space tree, because dealing with free space
> inode (specifically readpages) is a nightmare.
> Auto-setting nodatasum because we don't get callback for writing
> checksums after mmap()s.
> Deny compression because it does not go with direct I/O.
> 
> Store the dax_device in fs_info which will be used in iomap code.
> 
> I am aware of the push to directory-based flags for dax. Until that
> code is in the kernel, we will work with mount flags.

Hmm.  This patchset was sent before LSFMM, and I've heard[1] that the
discussion there yielded some progress on how to move forward with the
user interface.  I've gotten the impression that means no new dax mount
options; a persistent flag that can be inherited by new files; and some
other means for userspace to check if writethrough worked.

However, the LWN article says Ted planned to summarize for fsdevel so
let's table this part until he does that.  Ted? :)
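
For illustration only, the "persistent flag that can be inherited by new
files" idea could be modelled in userspace roughly as below.  This is a
sketch under assumptions, not the interface discussed at LSFMM:
MODEL_FL_DAX, struct model_inode, model_create() and model_uses_dax() are
hypothetical names, and the rule that a new file or subdirectory copies the
flag from its parent directory at creation time is an assumed semantic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical persistent per-inode flag; not a real kernel or on-disk flag. */
#define MODEL_FL_DAX (1u << 0)

struct model_inode {
	bool is_dir;
	unsigned int flags;	/* persistent flags, stored with the inode */
};

/*
 * Create a child of @parent.  Assumed rule: the child copies the parent's
 * MODEL_FL_DAX bit at creation time, so setting the flag on a directory
 * affects inodes created afterwards but not ones that already exist.
 */
static struct model_inode model_create(const struct model_inode *parent,
				       bool is_dir)
{
	struct model_inode child = { .is_dir = is_dir, .flags = 0 };

	assert(parent != NULL && parent->is_dir);
	child.flags |= parent->flags & MODEL_FL_DAX;
	return child;
}

/* Only regular files use DAX for data; directories merely carry the flag. */
static bool model_uses_dax(const struct model_inode *inode)
{
	return !inode->is_dir && (inode->flags & MODEL_FL_DAX);
}
```

In this model a file created before the directory flag is set stays
non-DAX, while files created afterwards (including those under new
subdirectories) pick it up.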

--D

[1] https://lwn.net/SubscriberLink/787973/ad85537bf8747e90/

> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/btrfs/ctree.h   |  2 ++
>  fs/btrfs/disk-io.c |  4 ++++
>  fs/btrfs/ioctl.c   |  5 ++++-
>  fs/btrfs/super.c   | 30 ++++++++++++++++++++++++++++++
>  4 files changed, 40 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index b3642367a595..8ca1c0d120f4 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1067,6 +1067,7 @@ struct btrfs_fs_info {
>  	u32 metadata_ratio;
>  
>  	void *bdev_holder;
> +	struct dax_device *dax_dev;
>  
>  	/* private scrub information */
>  	struct mutex scrub_lock;
> @@ -1442,6 +1443,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
>  #define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 26)
>  #define BTRFS_MOUNT_NOLOGREPLAY		(1 << 27)
>  #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
> +#define BTRFS_MOUNT_DAX			(1 << 29)
>  
>  #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
>  #define BTRFS_DEFAULT_MAX_INLINE	(2048)
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 6fe9197f6ee4..2bbb63b2fcff 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -16,6 +16,7 @@
>  #include <linux/uuid.h>
>  #include <linux/semaphore.h>
>  #include <linux/error-injection.h>
> +#include <linux/dax.h>
>  #include <linux/crc32c.h>
>  #include <linux/sched/mm.h>
>  #include <asm/unaligned.h>
> @@ -2805,6 +2806,8 @@ int open_ctree(struct super_block *sb,
>  		goto fail_alloc;
>  	}
>  
> +	fs_info->dax_dev = fs_dax_get_by_bdev(fs_devices->latest_bdev);
> +
>  	/*
>  	 * We want to check superblock checksum, the type is stored inside.
>  	 * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k).
> @@ -4043,6 +4046,7 @@ void close_ctree(struct btrfs_fs_info *fs_info)
>  #endif
>  
>  	btrfs_close_devices(fs_info->fs_devices);
> +	fs_put_dax(fs_info->dax_dev);
>  	btrfs_mapping_tree_free(&fs_info->mapping_tree);
>  
>  	percpu_counter_destroy(&fs_info->dirty_metadata_bytes);
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index cd4e693406a0..0138119cd9a3 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -149,8 +149,11 @@ void btrfs_sync_inode_flags_to_i_flags(struct inode *inode)
>  	if (binode->flags & BTRFS_INODE_DIRSYNC)
>  		new_fl |= S_DIRSYNC;
>  
> +	if ((btrfs_test_opt(btrfs_sb(inode->i_sb), DAX)) && S_ISREG(inode->i_mode))
> +		new_fl |= S_DAX;
> +
>  	set_mask_bits(&inode->i_flags,
> -		      S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC,
> +		      S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME | S_DIRSYNC | S_DAX,
>  		      new_fl);
>  }
>  
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 120e4340792a..3b85e61e5182 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -326,6 +326,7 @@ enum {
>  	Opt_treelog, Opt_notreelog,
>  	Opt_usebackuproot,
>  	Opt_user_subvol_rm_allowed,
> +	Opt_dax,
>  
>  	/* Deprecated options */
>  	Opt_alloc_start,
> @@ -393,6 +394,7 @@ static const match_table_t tokens = {
>  	{Opt_notreelog, "notreelog"},
>  	{Opt_usebackuproot, "usebackuproot"},
>  	{Opt_user_subvol_rm_allowed, "user_subvol_rm_allowed"},
> +	{Opt_dax, "dax"},
>  
>  	/* Deprecated options */
>  	{Opt_alloc_start, "alloc_start=%s"},
> @@ -745,6 +747,32 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
>  		case Opt_user_subvol_rm_allowed:
>  			btrfs_set_opt(info->mount_opt, USER_SUBVOL_RM_ALLOWED);
>  			break;
> +		case Opt_dax:
> +#ifdef CONFIG_FS_DAX
> +			if (btrfs_super_num_devices(info->super_copy) > 1) {
> +				btrfs_info(info,
> +					   "dax not supported for multi-device btrfs partition\n");
> +				ret = -EOPNOTSUPP;
> +				goto out;
> +			}
> +			btrfs_set_opt(info->mount_opt, DAX);
> +			btrfs_warn(info, "DAX enabled. Warning: EXPERIMENTAL, use at your own risk\n");
> +			btrfs_set_and_info(info, NODATASUM,
> +					   "auto-setting nodatasum (dax)");
> +			btrfs_clear_opt(info->mount_opt, SPACE_CACHE);
> +			btrfs_set_and_info(info, FREE_SPACE_TREE,
> +					"auto-setting free space tree (dax)");
> +			if (btrfs_test_opt(info, COMPRESS)) {
> +				btrfs_info(info, "disabling compress (dax)");
> +				btrfs_clear_opt(info->mount_opt, COMPRESS);
> +			}
> +			break;
> +#else
> +			btrfs_err(info,
> +				  "DAX option not supported\n");
> +			ret = -EINVAL;
> +			goto out;
> +#endif
>  		case Opt_enospc_debug:
>  			btrfs_set_opt(info->mount_opt, ENOSPC_DEBUG);
>  			break;
> @@ -1335,6 +1363,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
>  		seq_puts(seq, ",clear_cache");
>  	if (btrfs_test_opt(info, USER_SUBVOL_RM_ALLOWED))
>  		seq_puts(seq, ",user_subvol_rm_allowed");
> +	if (btrfs_test_opt(info, DAX))
> +		seq_puts(seq, ",dax");
>  	if (btrfs_test_opt(info, ENOSPC_DEBUG))
>  		seq_puts(seq, ",enospc_debug");
>  	if (btrfs_test_opt(info, AUTO_DEFRAG))
> -- 
> 2.16.4
> 
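
To make the option interactions in the quoted super.c hunk easier to
follow, here is a small userspace model of what Opt_dax does there: refuse
multi-device filesystems, set DAX, NODATASUM and FREE_SPACE_TREE, and clear
SPACE_CACHE and COMPRESS.  It is a sketch, not kernel code: struct
model_opts, the MODEL_* bits and model_apply_dax() are hypothetical
stand-ins for info->mount_opt and the btrfs_set_opt()/btrfs_clear_opt()
macros, and it assumes the CONFIG_FS_DAX=y branch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit values; the real BTRFS_MOUNT_* bits differ. */
enum {
	MODEL_DAX             = 1 << 0,
	MODEL_NODATASUM       = 1 << 1,
	MODEL_FREE_SPACE_TREE = 1 << 2,
	MODEL_SPACE_CACHE     = 1 << 3,
	MODEL_COMPRESS        = 1 << 4,
};

struct model_opts {
	unsigned long mount_opt;	/* stand-in for info->mount_opt */
	unsigned long long num_devices;	/* stand-in for the superblock device count */
};

/*
 * Apply the "dax" option as the quoted hunk does.
 * Returns 0 on success, or -EOPNOTSUPP (leaving @opts untouched) when the
 * filesystem spans more than one device.
 */
static int model_apply_dax(struct model_opts *opts)
{
	assert(opts != NULL);

	if (opts->num_devices > 1)
		return -EOPNOTSUPP;

	/* dax implies nodatasum and the free space tree ... */
	opts->mount_opt |= MODEL_DAX | MODEL_NODATASUM | MODEL_FREE_SPACE_TREE;
	/* ... and drops the v1 space cache and compression. */
	opts->mount_opt &= ~(unsigned long)(MODEL_SPACE_CACHE | MODEL_COMPRESS);
	return 0;
}
```

Note that the outcome depends only on the device count and the options
already parsed when "dax" is seen; the model does not cover a compress
option that appears later in the same option string.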

