All of lore.kernel.org
 help / color / mirror / Atom feed
From: Coly Li <colyli@suse.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org,
	linux-nvdimm@lists.linux.dev, Jens Axboe <axboe@kernel.dk>,
	Hannes Reinecke <hare@suse.com>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@lst.de>,
	"Huang, Ying" <ying.huang@intel.com>,
	Jianpeng Ma <jianpeng.ma@intel.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Qiaowei Ren <qiaowei.ren@intel.com>,
	Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH v12 02/12] bcache: initialize the nvm pages allocator
Date: Thu, 12 Aug 2021 16:26:26 +0800	[thread overview]
Message-ID: <dffaf01e-9ac4-e71b-3e38-1f2b0bfc5aed@suse.de> (raw)
In-Reply-To: <CAPcyv4hhfg=mgN4AW8T2VWGVbKsQZkpPwpU5yVAVh2nFOxCBcg@mail.gmail.com>

On 8/12/21 1:43 PM, Dan Williams wrote:
> On Wed, Aug 11, 2021 at 10:04 AM Coly Li <colyli@suse.de> wrote:
>> From: Jianpeng Ma <jianpeng.ma@intel.com>
>>
>> This patch define the prototype data structures in memory and
>> initializes the nvm pages allocator.
>>
>> The nvm address space which is managed by this allocator can consist of
>> many nvm namespaces, and some namespaces can compose into one nvm set,
>> like cache set. For this initial implementation, only one set can be
>> supported.
>>
>> The users of this nvm pages allocator need to call register_namespace()
>> to register the nvdimm device (like /dev/pmemX) into this allocator as
>> the instance of struct nvm_namespace.
>>
>> Reported-by: Randy Dunlap <rdunlap@infradead.org>
>> Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
>> Co-developed-by: Qiaowei Ren <qiaowei.ren@intel.com>
>> Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> ---
>>  drivers/md/bcache/Kconfig     |  10 +
>>  drivers/md/bcache/Makefile    |   1 +
>>  drivers/md/bcache/nvm-pages.c | 339 ++++++++++++++++++++++++++++++++++
>>  drivers/md/bcache/nvm-pages.h |  96 ++++++++++
>>  drivers/md/bcache/super.c     |   3 +
>>  5 files changed, 449 insertions(+)
>>  create mode 100644 drivers/md/bcache/nvm-pages.c
>>  create mode 100644 drivers/md/bcache/nvm-pages.h
>>
[snipped]
>> +
>> +       err = -EOPNOTSUPP;
>> +       if (!bdev_dax_supported(bdev, ns->page_size)) {
>> +               pr_err("%s don't support DAX\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -EINVAL;
>> +       if (bdev_dax_pgoff(bdev, 0, ns->page_size, &pgoff)) {
>> +               pr_err("invalid offset of %s\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -ENOMEM;
>> +       ns->dax_dev = fs_dax_get_by_bdev(bdev);
>> +       if (!ns->dax_dev) {
>> +               pr_err("can't by dax device by %s\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -EINVAL;
>> +       id = dax_read_lock();
>> +       dax_ret = dax_direct_access(ns->dax_dev, pgoff, ns->pages_total,
>> +                                   &ns->base_addr, &ns->start_pfn);
>> +       if (dax_ret <= 0) {
>> +               pr_err("dax_direct_access error\n");
>> +               dax_read_unlock(id);
>> +               goto free_ns;
>> +       }
>> +
>> +       if (dax_ret < ns->pages_total) {
>> +               pr_warn("mapped range %ld is less than ns->pages_total %lu\n",
>> +                       dax_ret, ns->pages_total);

Hi Dan,

Many thanks for your information.

> This failure will become a common occurrence with CXL namespaces that
> will have discontiguous range support. It's already the case for
> dax-devices for soft-reserved memory [1]. In the CXL case the
> discontinuity will be 256MB aligned, for the soft-reserved dax-devices
> the discontinuity granularity can be as small as 4K.
>
> [1]: https://elixir.bootlin.com/linux/v5.14-rc5/source/drivers/dax/device.c#L414

Fortunately the on-media allocation list format works with multiple
ranges of the namespace. For the in-memory struct bch_nvmpg_ns currently
assumes the namespace is a flat continuous range. Yes, we need to
consider and support multiple ranges in struct bch_nvmpg_ns for buddy
allocation initialization to skip the discontinuous gap.

It will be in the to-do list for next work. Thanks for your comments and
hint.

Coly Li

  reply	other threads:[~2021-08-12  8:26 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-11 17:02 [PATCH v12 00/12] bcache: support NVDIMM for journaling Coly Li
2021-08-11 17:02 ` [PATCH v12 01/12] bcache: add initial data structures for nvm pages Coly Li
2021-08-11 17:02 ` [PATCH v12 02/12] bcache: initialize the nvm pages allocator Coly Li
2021-08-12  5:43   ` Dan Williams
2021-08-12  8:26     ` Coly Li [this message]
2021-08-11 17:02 ` [PATCH v12 03/12] bcache: initialization of the buddy Coly Li
2021-08-11 17:02 ` [PATCH v12 04/12] bcache: bch_nvmpg_alloc_pages() " Coly Li
2021-08-11 17:02 ` [PATCH v12 05/12] bcache: bch_nvmpg_free_pages() of the buddy allocator Coly Li
2021-08-11 17:02 ` [PATCH v12 06/12] bcache: get recs list head for allocated pages by specific uuid Coly Li
2021-08-11 17:02 ` [PATCH v12 07/12] bcache: use bucket index to set GC_MARK_METADATA for journal buckets in bch_btree_gc_finish() Coly Li
2021-08-11 17:02 ` [PATCH v12 08/12] bcache: add BCH_FEATURE_INCOMPAT_NVDIMM_META into incompat feature set Coly Li
2021-08-11 17:02 ` [PATCH v12 09/12] bcache: initialize bcache journal for NVDIMM meta device Coly Li
2021-08-11 17:02 ` [PATCH v12 10/12] bcache: support storing bcache journal into " Coly Li
2021-08-11 17:02 ` [PATCH v12 11/12] bcache: read jset from NVDIMM pages for journal replay Coly Li
2021-08-11 17:02 ` [PATCH v12 12/12] bcache: add sysfs interface register_nvdimm_meta to register NVDIMM meta device Coly Li
2021-08-15 16:21 ` [PATCH v12 00/12] bcache: support NVDIMM for journaling Coly Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dffaf01e-9ac4-e71b-3e38-1f2b0bfc5aed@suse.de \
    --to=colyli@suse.de \
    --cc=axboe@kernel.dk \
    --cc=dan.j.williams@intel.com \
    --cc=hare@suse.com \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jianpeng.ma@intel.com \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvdimm@lists.linux.dev \
    --cc=qiaowei.ren@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.