linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Logan Gunthorpe <logang@deltatee.com>
To: Christoph Hellwig <hch@infradead.org>,
	Stephen Bates <stephen.bates@microsemi.com>
Cc: linux-kernel@vger.kernel.org, linux-nvdimm@ml01.01.org,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org,
	linux-mm@kvack.org, dan.j.williams@intel.com,
	ross.zwisler@linux.intel.com, willy@linux.intel.com,
	jgunthorpe@obsidianresearch.com, haggaie@mellanox.com,
	axboe@fb.com, corbet@lwn.net, jim.macdonald@everspin.com,
	sbates@raithin.com
Subject: Re: [PATCH 2/3] iopmem : Add a block device driver for PCIe attached IO memory.
Date: Fri, 28 Oct 2016 13:22:16 -0600	[thread overview]
Message-ID: <f61ba5bd-b81e-d5bf-02e5-45f6b523dd4c@deltatee.com> (raw)
In-Reply-To: <20161028064556.GA3231@infradead.org>

Hi Christoph,

Thanks so much for the detailed review of the code! Even though by the
sounds of things we will be moving to device dax and most of this is
moot. Still, it's great to get some feedback and learn a few things.

I've given some responses below.

On 28/10/16 12:45 AM, Christoph Hellwig wrote:
>> + * This driver is heavily based on drivers/block/pmem.c.
>> + * Copyright (c) 2014, Intel Corporation.
>> + * Copyright (C) 2007 Nick Piggin
>> + * Copyright (C) 2007 Novell Inc.
> 
> Is there anything left of it actually?  I didn't spot anything
> obvious.  Nevermind that we don't have a file with that name anymore :)

Yes, actually there's still a lot of similarities with the current
pmem.c. Though, yes, the path was on oversight. Some of this code is
getting pretty old (it started from an out-of-tree version of pmem.c)
and we've tried our best to track as many of the changes to the pmem.c
as possible. This proved to be difficult. Note: this is now the nvdimm
pmem and not the dax pmem (drivers/nvdimm/pmem.c)

>> +  /*
>> +   * We can only access the iopmem device with full 32-bit word
>> +   * accesses which cannot be gaurantee'd by the regular memcpy
>> +   */
> 
> Odd comment formatting. 

Oops. I'm surprised check_patch didn't pick up on that.

> 
>> +static void memcpy_from_iopmem(void *dst, const void *src, size_t sz)
>> +{
>> +	u64 *wdst = dst;
>> +	const u64 *wsrc = src;
>> +	u64 tmp;
>> +
>> +	while (sz >= sizeof(*wdst)) {
>> +		*wdst++ = *wsrc++;
>> +		sz -= sizeof(*wdst);
>> +	}
>> +
>> +	if (!sz)
>> +		return;
>> +
>> +	tmp = *wsrc;
>> +	memcpy(wdst, &tmp, sz);
>> +}
> 
> And then we dod a memcpy here anyway.  And no volatile whatsover, so
> the compiler could do anything to it.  I defintively feel a bit uneasy
> about having this in the driver as well.  Can we define the exact
> semantics for this and define it by the system, possibly in an arch
> specific way?

Yeah, you're right. We should have reviewed this function a bit more.
Anyway, I'd be interested in learning a better approach to forcing a
copy from a mapped BAR with larger widths.


>> +static void iopmem_do_bvec(struct iopmem_device *iopmem, struct page *page,
>> +			   unsigned int len, unsigned int off, bool is_write,
>> +			   sector_t sector)
>> +{
>> +	phys_addr_t iopmem_off = sector * 512;
>> +	void *iopmem_addr = iopmem->virt_addr + iopmem_off;
>> +
>> +	if (!is_write) {
>> +		read_iopmem(page, off, iopmem_addr, len);
>> +		flush_dcache_page(page);
>> +	} else {
>> +		flush_dcache_page(page);
>> +		write_iopmem(iopmem_addr, page, off, len);
>> +	}
> 
> How about moving the  address and offset calculation as well as the
> cache flushing into read_iopmem/write_iopmem and removing this function?

Could do. This was copied from the existing pmem.c and once the bad_pmem
stuff was stripped out this function became relatively simple.


> 
>> +static blk_qc_t iopmem_make_request(struct request_queue *q, struct bio *bio)
>> +{
>> +	struct iopmem_device *iopmem = q->queuedata;
>> +	struct bio_vec bvec;
>> +	struct bvec_iter iter;
>> +
>> +	bio_for_each_segment(bvec, bio, iter) {
>> +		iopmem_do_bvec(iopmem, bvec.bv_page, bvec.bv_len,
>> +			    bvec.bv_offset, op_is_write(bio_op(bio)),
>> +			    iter.bi_sector);
> 
> op_is_write just checks the data direction.  I'd feel much more
> comfortable with a switch on the op, e.g.

That makes sense. This was also copied from pmem.c, so this same change
may make sense there too.


>> +static long iopmem_direct_access(struct block_device *bdev, sector_t sector,
>> +			       void **kaddr, pfn_t *pfn, long size)
>> +{
>> +	struct iopmem_device *iopmem = bdev->bd_queue->queuedata;
>> +	resource_size_t offset = sector * 512;
>> +
>> +	if (!iopmem)
>> +		return -ENODEV;
> 
> I don't think this can ever happen, can it?

Yes, I think now that's the case. This is probably a holdover from a
previous version.

> Just use ida_simple_get/ida_simple_remove instead to take care
> of the locking and preloading, and get rid of these two functions.

Thanks, noted. That would be much better. I never found a simple example
of that when I was looking, though I expected there should have been.

> 
>> +static int iopmem_attach_disk(struct iopmem_device *iopmem)
>> +{
>> +	struct gendisk *disk;
>> +	int nid = dev_to_node(iopmem->dev);
>> +	struct request_queue *q = iopmem->queue;
>> +
>> +	blk_queue_write_cache(q, true, true);
> 
> You don't handle flush commands or the fua bit in make_request, so
> this setting seems wrong.

Yup, ok. I'm afraid this is a case of copying without complete
comprehension.

> 
>> +	int err = 0;
>> +	int nid = dev_to_node(&pdev->dev);
>> +
>> +	if (pci_enable_device_mem(pdev) < 0) {
> 
> propagate the actual error code, please.

Hmm, yup. Not sure why that was missed.

Thanks,

Logan

  reply	other threads:[~2016-10-28 20:04 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-18 21:42 [PATCH 0/3] iopmem : A block device for PCIe memory Stephen Bates
2016-10-18 21:42 ` [PATCH 1/3] memremap.c : Add support for ZONE_DEVICE IO memory with struct pages Stephen Bates
2016-10-19 17:50   ` Dan Williams
2016-10-19 18:40     ` Stephen Bates
2016-10-19 20:01       ` Dan Williams
2016-10-25 11:54         ` Stephen Bates
2016-10-18 21:42 ` [PATCH 2/3] iopmem : Add a block device driver for PCIe attached IO memory Stephen Bates
2016-10-28  6:45   ` Christoph Hellwig
2016-10-28 19:22     ` Logan Gunthorpe [this message]
2016-10-18 21:42 ` [PATCH 3/3] iopmem : Add documentation for iopmem driver Stephen Bates
2016-10-28  6:46   ` Christoph Hellwig
2016-10-19  3:51 ` [PATCH 0/3] iopmem : A block device for PCIe memory Dan Williams
2016-10-19 18:48   ` Stephen Bates
2016-10-19 19:58     ` Dan Williams
2016-10-19 22:54       ` Stephen Bates
2016-10-20 23:22     ` Dave Chinner
2016-10-21  9:57       ` Christoph Hellwig
2016-10-21 11:12         ` Dave Chinner
2016-10-25 11:50           ` Stephen Bates
2016-10-25 21:19             ` Dave Chinner
2016-11-06 14:05               ` Stephen Bates
2016-10-27 10:22         ` Sagi Grimberg
2016-10-27 12:32           ` Christoph Hellwig
2016-10-26  8:24   ` Haggai Eran
2016-10-26 13:39     ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f61ba5bd-b81e-d5bf-02e5-45f6b523dd4c@deltatee.com \
    --to=logang@deltatee.com \
    --cc=axboe@fb.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=haggaie@mellanox.com \
    --cc=hch@infradead.org \
    --cc=jgunthorpe@obsidianresearch.com \
    --cc=jim.macdonald@everspin.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@ml01.01.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=ross.zwisler@linux.intel.com \
    --cc=sbates@raithin.com \
    --cc=stephen.bates@microsemi.com \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).