From: Dan Williams <dan.j.williams@intel.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	linux-block@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC PATCH 10/17] block: introduce bdev_dax_direct_access()
Date: Mon, 30 Jan 2017 10:16:29 -0800
Message-ID: <CAPcyv4iARmZQSBybJ1iJhwkndVxSe62rk4hjP1T9prBuOqVQ-A@mail.gmail.com>
In-Reply-To: <20170130123226.GD9043@lst.de>

On Mon, Jan 30, 2017 at 4:32 AM, Christoph Hellwig <hch@lst.de> wrote:
> On Sat, Jan 28, 2017 at 12:36:58AM -0800, Dan Williams wrote:
>> Provide a replacement for bdev_direct_access() that uses
>> dax_operations.direct_access() instead of
>> block_device_operations.direct_access(). Once all consumers of the old
>> api have been converted bdev_direct_access() will be deleted.
>>
>> Given that block device partitioning decisions can cause dax page
>> alignment constraints to be violated we still need to validate the
>> block_device before calling the dax ->direct_access method.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  block/Kconfig          |    1 +
>>  drivers/dax/super.c    |   33 +++++++++++++++++++++++++++++++++
>>  fs/block_dev.c         |   28 ++++++++++++++++++++++++++++
>>  include/linux/blkdev.h |    3 +++
>>  include/linux/dax.h    |    2 ++
>>  5 files changed, 67 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 8bf114a3858a..9be785173280 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -6,6 +6,7 @@ menuconfig BLOCK
>>         default y
>>         select SBITMAP
>>         select SRCU
>> +       select DAX
>>         help
>>        Provide block layer support for the kernel.
>>
>> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
>> index eb844ffea3cf..ab5b082df5dd 100644
>> --- a/drivers/dax/super.c
>> +++ b/drivers/dax/super.c
>> @@ -65,6 +65,39 @@ struct dax_inode {
>>       const struct dax_operations *ops;
>>  };
>>
>> +long dax_direct_access(struct dax_inode *dax_inode, phys_addr_t dev_addr,
>> +             void **kaddr, pfn_t *pfn, long size)
>> +{
>> +     long avail;
>> +
>> +     /*
>> +      * The device driver is allowed to sleep, in order to make the
>> +      * memory directly accessible.
>> +      */
>> +     might_sleep();
>> +
>> +     if (!dax_inode)
>> +             return -EOPNOTSUPP;
>> +
>> +     if (!dax_inode_alive(dax_inode))
>> +             return -ENXIO;
>> +
>> +     if (size < 0)
>> +             return size;
>> +
>> +     if (dev_addr % PAGE_SIZE)
>> +             return -EINVAL;
>> +
>> +     avail = dax_inode->ops->direct_access(dax_inode, dev_addr, kaddr, pfn,
>> +                     size);
>> +     if (!avail)
>> +             return -ERANGE;
>> +     if (avail > 0 && avail & ~PAGE_MASK)
>> +             return -ENXIO;
>> +     return min(avail, size);
>> +}
>> +EXPORT_SYMBOL_GPL(dax_direct_access);
>> +
>>  bool dax_inode_alive(struct dax_inode *dax_inode)
>>  {
>>       lockdep_assert_held(&dax_srcu);
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index edb1d2b16b8f..bf4b51a3a412 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -18,6 +18,7 @@
>>  #include <linux/module.h>
>>  #include <linux/blkpg.h>
>>  #include <linux/magic.h>
>> +#include <linux/dax.h>
>>  #include <linux/buffer_head.h>
>>  #include <linux/swap.h>
>>  #include <linux/pagevec.h>
>> @@ -763,6 +764,33 @@ long bdev_direct_access(struct block_device *bdev, struct blk_dax_ctl *dax)
>>  EXPORT_SYMBOL_GPL(bdev_direct_access);
>>
>>  /**
>> + * bdev_dax_direct_access() - bdev-sector to pfn_t and kernel virtual address
>> + * @bdev: host block device for @dax_inode
>> + * @dax_inode: interface data and operations for a memory device
>> + * @dax: control and output parameters for ->direct_access
>> + *
>> + * Return: negative errno if an error occurs, otherwise the number of bytes
>> + * accessible at this address.
>> + *
>> + * Locking: must be called with dax_read_lock() held
>> + */
>> +long bdev_dax_direct_access(struct block_device *bdev,
>> +             struct dax_inode *dax_inode, struct blk_dax_ctl *dax)
>> +{
>> +     sector_t sector = dax->sector;
>> +
>> +     if (!blk_queue_dax(bdev->bd_queue))
>> +             return -EOPNOTSUPP;
>
> I don't think this should take a bdev - the caller should know if
> it has a dax_inode.  Also if you touch this anyway can we kill
> the annoying struct blk_dax_ctl calling convention?  Passing the
> four arguments explicitly is just a lot more readable and understandable.

Ok, now that dax_map_atomic() is gone, it's much easier to remove
struct blk_dax_ctl.

We can also move the partition alignment checks to be a one-time check
at bdev_dax_capable() time and kill bdev_dax_direct_access() in favor
of calling dax_direct_access() directly.
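
As a rough illustration of what a one-time partition alignment check
could look like, here is a minimal userspace sketch. It is not the
kernel implementation: the helper name partition_dax_aligned(), the
4 KiB page size, and the SKETCH_* constants are assumptions made for
the example. The only idea taken from the discussion is that a
partition's start and length, expressed in 512-byte sectors, must both
land on page boundaries for dax to be usable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages and 512-byte sectors. */
#define SKETCH_PAGE_SIZE   4096ULL
#define SKETCH_SECTOR_SIZE 512ULL

/*
 * Hypothetical helper: returns true when a partition that starts at
 * start_sect and spans nr_sects sectors begins and ends on a page
 * boundary, so every page-aligned offset inside the partition is also
 * page aligned on the underlying device.
 */
static bool partition_dax_aligned(uint64_t start_sect, uint64_t nr_sects)
{
	uint64_t start_bytes = start_sect * SKETCH_SECTOR_SIZE;
	uint64_t len_bytes = nr_sects * SKETCH_SECTOR_SIZE;

	return (start_bytes % SKETCH_PAGE_SIZE) == 0 &&
	       (len_bytes % SKETCH_PAGE_SIZE) == 0;
}
```

With a check of this shape done once when dax capability is
established, the per-access path would no longer need the bdev to
re-validate the range.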

>> +     if ((sector + DIV_ROUND_UP(dax->size, 512))
>> +                     > part_nr_sects_read(bdev->bd_part))
>> +             return -ERANGE;
>> +     sector += get_start_sect(bdev);
>> +     return dax_direct_access(dax_inode, sector * 512, &dax->addr,
>> +                     &dax->pfn, dax->size);
>
> And please switch to using bytes as the granularity given that we're
> dealing with byte-addressable memory.

dax_direct_access() does take a byte aligned physical address, but it
needs to be at least page aligned since we are returning a pfn_t...

Hmm, perhaps the input should be a raw page frame number. We could
drop one of the arguments by making the current 'pfn_t *' parameter
an in/out parameter.
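
To make the in/out idea concrete, here is a small userspace model of
such a calling convention. It is only a sketch under stated
assumptions: struct sketch_dev, sketch_direct_access(), the plain
unsigned long standing in for pfn_t, and the 4 KiB page size are all
invented for the example and are not part of the patch. On input *pfn
is a device-relative page offset; on output it holds the backing page
frame number, so the address is page aligned by construction.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1L << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for a dax-capable device. */
struct sketch_dev {
	unsigned long base_pfn;	/* first backing page frame */
	unsigned long nr_pages;	/* device size in pages */
	char *vaddr;		/* start of the device mapping */
};

/*
 * Hypothetical in/out convention: *pfn carries the device-relative page
 * offset in and the backing page frame number out. Returns the number
 * of bytes accessible (capped at size) or a negative errno.
 */
static long sketch_direct_access(const struct sketch_dev *dev,
				 unsigned long *pfn, void **kaddr, long size)
{
	unsigned long pgoff = *pfn;
	long avail;

	if (size < 0)
		return -EINVAL;
	if (pgoff >= dev->nr_pages)
		return -ERANGE;

	avail = (long)(dev->nr_pages - pgoff) * SKETCH_PAGE_SIZE;
	*kaddr = dev->vaddr + pgoff * SKETCH_PAGE_SIZE;
	*pfn = dev->base_pfn + pgoff;

	return avail < size ? avail : size;
}
```

One trade-off of this shape is that the caller loses its input value
after the call, so it has to keep its own copy of the offset if it
needs it again.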