From: tip-bot for Ross Zwisler <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: ross.zwisler@linux.intel.com, dan.j.williams@intel.com,
	tglx@linutronix.de, keith.busch@intel.com, mingo@kernel.org,
	bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com,
	linux-kernel@vger.kernel.org, luto@amacapital.net,
	willy@linux.intel.com, boaz@plexistor.com, axboe@kernel.dk,
	hch@lst.de, torvalds@linux-foundation.org, axboe@fb.com
Subject: [tip:x86/pmem] drivers/block/pmem: Add a driver for persistent memory
Date: Thu, 2 Apr 2015 05:31:35 -0700
Message-ID: <tip-9e853f2313e5eb163cb1ea461b23c2332cf6438a@git.kernel.org>
In-Reply-To: <1427872339-6688-3-git-send-email-hch@lst.de>

Commit-ID:  9e853f2313e5eb163cb1ea461b23c2332cf6438a
Gitweb:     http://git.kernel.org/tip/9e853f2313e5eb163cb1ea461b23c2332cf6438a
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Wed, 1 Apr 2015 09:12:19 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 1 Apr 2015 17:03:56 +0200

drivers/block/pmem: Add a driver for persistent memory

PMEM is a new driver that presents a reserved range of memory as
a block device.  This is useful for developing with NV-DIMMs,
and can be used with volatile memory as a development platform.

This patch contains the initial driver from Ross Zwisler, with
various changes: converted it to use a platform_device for
discovery, fixed partition support and merged various patches
from Boaz Harrosh.

Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-nvdimm@ml01.01.org
Link: http://lkml.kernel.org/r/1427872339-6688-3-git-send-email-hch@lst.de
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
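A note on the I/O path for readers of the diff: pmem_make_request()
rejects any bio that ends past the device capacity, then hands each
segment to pmem_do_bvec(), which turns the 512-byte sector number into a
byte offset (sector << 9) and memcpy()s to or from the mapped region.
The user-space sketch below models only that arithmetic and the
end-of-region check. The names fake_pmem, fake_dir and fake_do_bvec are
hypothetical and do not appear in the patch; the kernel code also maps
the page with kmap_atomic() and calls flush_dcache_page(), which this
sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the driver's "sector << 9" */

/* Hypothetical user-space stand-in for struct pmem_device. */
struct fake_pmem {
	unsigned char *virt_addr;	/* start of the backing region */
	size_t size;			/* region size in bytes */
};

enum fake_dir { FAKE_READ, FAKE_WRITE };

/*
 * Copy len bytes between a caller buffer and the region, starting at the
 * given sector.  Returns 0 on success and -1 if the request would run past
 * the end of the region.  (The driver does the equivalent check once per
 * bio against get_capacity() and fails the bio with -EIO.)
 */
static int fake_do_bvec(struct fake_pmem *pmem, unsigned char *buf,
			size_t len, enum fake_dir dir, uint64_t sector)
{
	uint64_t off = sector << SECTOR_SHIFT;

	assert(pmem && buf);

	if (off > pmem->size || len > pmem->size - off)
		return -1;

	if (dir == FAKE_READ)
		memcpy(buf, pmem->virt_addr + off, len);
	else
		memcpy(pmem->virt_addr + off, buf, len);

	return 0;
}
```

With a 4096-byte region (8 sectors), a 512-byte write to sector 3 lands
at byte offset 1536, and any transfer starting at sector 8 is refused.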
 MAINTAINERS            |   6 ++
 drivers/block/Kconfig  |  11 +++
 drivers/block/Makefile |   1 +
 drivers/block/pmem.c   | 263 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 281 insertions(+)
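
A second note, on pmem_direct_access() and minor numbering: the function
in the diff returns a kernel virtual address and a page frame number for
a sector, plus the number of bytes left in the region from that point,
and pmem_alloc() gives each device a block of PMEM_MINORS minors starting
at PMEM_MINORS * idx.  The sketch below reproduces that arithmetic in
user space.  All fake_* names and FAKE_PAGE_SHIFT (4 KiB pages) are
assumptions made for illustration, and the range check is an addition of
this sketch: the function in the patch does not validate the offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT	9
#define FAKE_PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define FAKE_MINORS	16	/* mirrors PMEM_MINORS in the patch */

/* Hypothetical user-space stand-in for struct pmem_device. */
struct fake_pmem {
	uint64_t phys_addr;		/* physical start of the region */
	unsigned char *virt_addr;	/* where the region is mapped */
	size_t size;			/* region size in bytes */
};

/*
 * Translate a sector into a mapped address and a page frame number.
 * Returns the number of bytes remaining from that sector to the end of
 * the region, or -1 if the sector lies at or beyond the end (this check
 * is the sketch's own addition).
 */
static long fake_direct_access(const struct fake_pmem *pmem, uint64_t sector,
			       void **kaddr, unsigned long *pfn)
{
	uint64_t offset = sector << SECTOR_SHIFT;

	assert(pmem && kaddr && pfn);

	if (offset >= pmem->size)
		return -1;

	*kaddr = pmem->virt_addr + offset;
	*pfn = (unsigned long)((pmem->phys_addr + offset) >> FAKE_PAGE_SHIFT);

	return (long)(pmem->size - offset);
}

/* First minor number for the idx-th device, as computed in pmem_alloc(). */
static int fake_first_minor(int idx)
{
	return FAKE_MINORS * idx;
}
```

For an 8 KiB region at physical address 0x100000000, sector 8 maps to
byte offset 4096, PFN 0x100001, with 4096 bytes remaining; the third
device (idx 2) starts at minor 32.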

diff --git a/MAINTAINERS b/MAINTAINERS
index 1de6afa..4517613 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8071,6 +8071,12 @@ S:	Maintained
 F:	Documentation/blockdev/ramdisk.txt
 F:	drivers/block/brd.c
 
+PERSISTENT MEMORY DRIVER
+M:	Ross Zwisler <ross.zwisler@linux.intel.com>
+L:	linux-nvdimm@lists.01.org
+S:	Supported
+F:	drivers/block/pmem.c
+
 RANDOM NUMBER DRIVER
 M:	"Theodore Ts'o" <tytso@mit.edu>
 S:	Maintained
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 1b8094d..eb1fed5 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -404,6 +404,17 @@ config BLK_DEV_RAM_DAX
 	  and will prevent RAM block device backing store memory from being
 	  allocated from highmem (only a problem for highmem systems).
 
+config BLK_DEV_PMEM
+	tristate "Persistent memory block device support"
+	help
+	  Saying Y here will allow you to use a contiguous range of reserved
+	  memory as one or more persistent block devices.
+
+	  To compile this driver as a module, choose M here: the module will be
+	  called 'pmem'.
+
+	  If unsure, say N.
+
 config CDROM_PKTCDVD
 	tristate "Packet writing on CD/DVD media"
 	depends on !UML
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index 02b688d..9cc6c18 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_PS3_VRAM)		+= ps3vram.o
 obj-$(CONFIG_ATARI_FLOPPY)	+= ataflop.o
 obj-$(CONFIG_AMIGA_Z2RAM)	+= z2ram.o
 obj-$(CONFIG_BLK_DEV_RAM)	+= brd.o
+obj-$(CONFIG_BLK_DEV_PMEM)	+= pmem.o
 obj-$(CONFIG_BLK_DEV_LOOP)	+= loop.o
 obj-$(CONFIG_BLK_CPQ_DA)	+= cpqarray.o
 obj-$(CONFIG_BLK_CPQ_CISS_DA)  += cciss.o
diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
new file mode 100644
index 0000000..988f384
--- /dev/null
+++ b/drivers/block/pmem.c
@@ -0,0 +1,263 @@
+/*
+ * Persistent Memory Driver
+ *
+ * Copyright (c) 2014, Intel Corporation.
+ * Copyright (c) 2015, Christoph Hellwig <hch@lst.de>.
+ * Copyright (c) 2015, Boaz Harrosh <boaz@plexistor.com>.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <asm/cacheflush.h>
+#include <linux/blkdev.h>
+#include <linux/hdreg.h>
+#include <linux/init.h>
+#include <linux/platform_device.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/slab.h>
+
+#define PMEM_MINORS		16
+
+struct pmem_device {
+	struct request_queue	*pmem_queue;
+	struct gendisk		*pmem_disk;
+
+	/* One contiguous memory region per device */
+	phys_addr_t		phys_addr;
+	void			*virt_addr;
+	size_t			size;
+};
+
+static int pmem_major;
+static atomic_t pmem_index;
+
+static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
+			unsigned int len, unsigned int off, int rw,
+			sector_t sector)
+{
+	void *mem = kmap_atomic(page);
+	size_t pmem_off = sector << 9;
+
+	if (rw == READ) {
+		memcpy(mem + off, pmem->virt_addr + pmem_off, len);
+		flush_dcache_page(page);
+	} else {
+		flush_dcache_page(page);
+		memcpy(pmem->virt_addr + pmem_off, mem + off, len);
+	}
+
+	kunmap_atomic(mem);
+}
+
+static void pmem_make_request(struct request_queue *q, struct bio *bio)
+{
+	struct block_device *bdev = bio->bi_bdev;
+	struct pmem_device *pmem = bdev->bd_disk->private_data;
+	int rw;
+	struct bio_vec bvec;
+	sector_t sector;
+	struct bvec_iter iter;
+	int err = 0;
+
+	if (bio_end_sector(bio) > get_capacity(bdev->bd_disk)) {
+		err = -EIO;
+		goto out;
+	}
+
+	BUG_ON(bio->bi_rw & REQ_DISCARD);
+
+	rw = bio_data_dir(bio);
+	sector = bio->bi_iter.bi_sector;
+	bio_for_each_segment(bvec, bio, iter) {
+		pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len, bvec.bv_offset,
+			     rw, sector);
+		sector += bvec.bv_len >> 9;
+	}
+
+out:
+	bio_endio(bio, err);
+}
+
+static int pmem_rw_page(struct block_device *bdev, sector_t sector,
+		       struct page *page, int rw)
+{
+	struct pmem_device *pmem = bdev->bd_disk->private_data;
+
+	pmem_do_bvec(pmem, page, PAGE_CACHE_SIZE, 0, rw, sector);
+	page_endio(page, rw & WRITE, 0);
+
+	return 0;
+}
+
+static long pmem_direct_access(struct block_device *bdev, sector_t sector,
+			      void **kaddr, unsigned long *pfn, long size)
+{
+	struct pmem_device *pmem = bdev->bd_disk->private_data;
+	size_t offset = sector << 9;
+
+	if (!pmem)
+		return -ENODEV;
+
+	*kaddr = pmem->virt_addr + offset;
+	*pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
+
+	return pmem->size - offset;
+}
+
+static const struct block_device_operations pmem_fops = {
+	.owner =		THIS_MODULE,
+	.rw_page =		pmem_rw_page,
+	.direct_access =	pmem_direct_access,
+};
+
+static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
+{
+	struct pmem_device *pmem;
+	struct gendisk *disk;
+	int idx, err;
+
+	err = -ENOMEM;
+	pmem = kzalloc(sizeof(*pmem), GFP_KERNEL);
+	if (!pmem)
+		goto out;
+
+	pmem->phys_addr = res->start;
+	pmem->size = resource_size(res);
+
+	err = -EINVAL;
+	if (!request_mem_region(pmem->phys_addr, pmem->size, "pmem")) {
+		dev_warn(dev, "could not reserve region [%pa:0x%zx]\n",
+			   &pmem->phys_addr, pmem->size);
+		goto out_free_dev;
+	}
+
+	/*
+	 * Map the memory as non-cachable, as we can't write back the contents
+	 * of the CPU caches in case of a crash.
+	 */
+	err = -ENOMEM;
+	pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
+	if (!pmem->virt_addr)
+		goto out_release_region;
+
+	pmem->pmem_queue = blk_alloc_queue(GFP_KERNEL);
+	if (!pmem->pmem_queue)
+		goto out_unmap;
+
+	blk_queue_make_request(pmem->pmem_queue, pmem_make_request);
+	blk_queue_max_hw_sectors(pmem->pmem_queue, 1024);
+	blk_queue_bounce_limit(pmem->pmem_queue, BLK_BOUNCE_ANY);
+
+	disk = alloc_disk(PMEM_MINORS);
+	if (!disk)
+		goto out_free_queue;
+
+	idx = atomic_inc_return(&pmem_index) - 1;
+
+	disk->major		= pmem_major;
+	disk->first_minor	= PMEM_MINORS * idx;
+	disk->fops		= &pmem_fops;
+	disk->private_data	= pmem;
+	disk->queue		= pmem->pmem_queue;
+	disk->flags		= GENHD_FL_EXT_DEVT;
+	sprintf(disk->disk_name, "pmem%d", idx);
+	disk->driverfs_dev = dev;
+	set_capacity(disk, pmem->size >> 9);
+	pmem->pmem_disk = disk;
+
+	add_disk(disk);
+
+	return pmem;
+
+out_free_queue:
+	blk_cleanup_queue(pmem->pmem_queue);
+out_unmap:
+	iounmap(pmem->virt_addr);
+out_release_region:
+	release_mem_region(pmem->phys_addr, pmem->size);
+out_free_dev:
+	kfree(pmem);
+out:
+	return ERR_PTR(err);
+}
+
+static void pmem_free(struct pmem_device *pmem)
+{
+	del_gendisk(pmem->pmem_disk);
+	put_disk(pmem->pmem_disk);
+	blk_cleanup_queue(pmem->pmem_queue);
+	iounmap(pmem->virt_addr);
+	release_mem_region(pmem->phys_addr, pmem->size);
+	kfree(pmem);
+}
+
+static int pmem_probe(struct platform_device *pdev)
+{
+	struct pmem_device *pmem;
+	struct resource *res;
+
+	if (WARN_ON(pdev->num_resources > 1))
+		return -ENXIO;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res)
+		return -ENXIO;
+
+	pmem = pmem_alloc(&pdev->dev, res);
+	if (IS_ERR(pmem))
+		return PTR_ERR(pmem);
+
+	platform_set_drvdata(pdev, pmem);
+
+	return 0;
+}
+
+static int pmem_remove(struct platform_device *pdev)
+{
+	struct pmem_device *pmem = platform_get_drvdata(pdev);
+
+	pmem_free(pmem);
+	return 0;
+}
+
+static struct platform_driver pmem_driver = {
+	.probe		= pmem_probe,
+	.remove		= pmem_remove,
+	.driver		= {
+		.owner	= THIS_MODULE,
+		.name	= "pmem",
+	},
+};
+
+static int __init pmem_init(void)
+{
+	int error;
+
+	pmem_major = register_blkdev(0, "pmem");
+	if (pmem_major < 0)
+		return pmem_major;
+
+	error = platform_driver_register(&pmem_driver);
+	if (error)
+		unregister_blkdev(pmem_major, "pmem");
+	return error;
+}
+module_init(pmem_init);
+
+static void pmem_exit(void)
+{
+	platform_driver_unregister(&pmem_driver);
+	unregister_blkdev(pmem_major, "pmem");
+}
+module_exit(pmem_exit);
+
+MODULE_AUTHOR("Ross Zwisler <ross.zwisler@linux.intel.com>");
+MODULE_LICENSE("GPL v2");
