linux-fsdevel.vger.kernel.org archive mirror
* [PATCHSET 0/6] exofs: few patches for Linux 2.6.31
@ 2009-06-17 16:01 Boaz Harrosh
  2009-06-17 16:03 ` [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read) Boaz Harrosh
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:01 UTC (permalink / raw)
  To: Jeff Garzik, open-osd mailing-list; +Cc: linux-fsdevel, linux-kernel

For review: the few exofs patches I would like included in the current merge
window (Linux 2.6.31).

The main effort on the exofs front was to make it pnfs-exportable, which for now
lives only in the pnfs tree. So there was not much left outside of that.

Also for final review is the osdblk stacking block driver which is ready
for submission but is missing a supporting user-mode tool, and some more
testing love.

The list of patches:
  [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read)

	A small and rare bug fix

  [PATCH 2/6] exofs: Remove IBM copyrights

	As requested by the original author and IBM folks. The code does not
	belong to IBM (and never did).

  [PATCH 3/6] exofs: Avoid using file_fsync()

	This one is left as is, since the other options suggested are not good
	for exofs: it does not have a block device or any of the BH stuff.

  [PATCH 4/6] MAINTAINERS: Add osd maintained files (F:)

	Use the new MAINTAINERS F: annotation for the OSD files

  [PATCH 5/6] osdblk: a Linux block device for OSD objects
  [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits

	This is the proposed osdblk driver by Jeff Garzik. It has all the kernel
	prerequisites, but is missing a user-mode tool and more testing. So I'm
	not sure it will make it into this kernel. But it is here for review.

Thanks
Boaz


* [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read)
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
@ 2009-06-17 16:03 ` Boaz Harrosh
  2009-06-17 16:04 ` [PATCH 2/6] exofs: Remove IBM copyrights Boaz Harrosh
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:03 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel

When failing a read request in the sync path, called from
write_begin, I forgot to free the allocated bio. Fix it.
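
As a rough userspace analogy of this kind of fix (not the exofs code itself;
`struct collect`, `collect_free()`, `collect_unlock_pages()` and
`read_exec_sketch()` are hypothetical names, and the async helper is assumed
to drop the buffer as well), the error path must release the buffer on the
sync branch, because there the caller only takes care of the pages:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page collection that owns one allocation. */
struct collect {
	void *bio;		/* stands in for the allocated bio */
	int pages_unlocked;	/* set when the async path unlocks pages */
};

/* Free the owned buffer; safe to call more than once. */
static void collect_free(struct collect *c)
{
	free(c->bio);
	c->bio = NULL;
}

/* Async cleanup (assumption): unlock the pages and drop the buffer. */
static void collect_unlock_pages(struct collect *c)
{
	c->pages_unlocked = 1;
	collect_free(c);
}

/*
 * Sketch of a read submission that can fail after the buffer was
 * allocated. Returns 0 on success, -1 on failure.
 */
static int read_exec_sketch(struct collect *c, bool is_sync, bool fail)
{
	c->bio = malloc(64);
	if (!c->bio)
		return -1;

	if (fail)
		goto err;

	return 0;	/* buffer now belongs to the in-flight request */

err:
	if (!is_sync)
		collect_unlock_pages(c);
	else	/* caller unlocks the pages; only the buffer is ours */
		collect_free(c);
	return -1;
}
```

Without the `else` branch, a failed sync call would return with `c->bio`
still allocated and nobody left to free it.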

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 fs/exofs/inode.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index 77d0a29..bb5d6ed 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -295,6 +295,9 @@ static int read_exec(struct page_collect *pcol, bool is_sync)
 err:
 	if (!is_sync)
 		_unlock_pcol_pages(pcol, ret, READ);
+	else /* Pages unlocked by caller in sync mode; only free bio */
+		pcol_free(pcol);
+
 	kfree(pcol_copy);
 	if (or)
 		osd_end_request(or);
-- 
1.6.2.1



* [PATCH 2/6] exofs: Remove IBM copyrights
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
  2009-06-17 16:03 ` [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read) Boaz Harrosh
@ 2009-06-17 16:04 ` Boaz Harrosh
  2009-06-17 16:05 ` [PATCH 3/6] exofs: Avoid using file_fsync() Boaz Harrosh
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:04 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel

Boaz,
Congrats on getting all the OSD stuff into 2.6.30!
I just pulled the git, and saw that the IBM copyrights are still there.
Please remove them from all files:
 * Copyright (C) 2005, 2006
 * International Business Machines

IBM has revoked all rights on the code - they gave it to me.

Thanks!
Avishay

Signed-off-by: Avishay Traeger <avishay@gmail.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 fs/exofs/common.h  |    4 +---
 fs/exofs/dir.c     |    4 +---
 fs/exofs/exofs.h   |    4 +---
 fs/exofs/file.c    |    4 +---
 fs/exofs/inode.c   |    4 +---
 fs/exofs/namei.c   |    4 +---
 fs/exofs/osd.c     |    4 +---
 fs/exofs/super.c   |    4 +---
 fs/exofs/symlink.c |    4 +---
 9 files changed, 9 insertions(+), 27 deletions(-)

diff --git a/fs/exofs/common.h b/fs/exofs/common.h
index 24667ee..c6718e4 100644
--- a/fs/exofs/common.h
+++ b/fs/exofs/common.h
@@ -2,9 +2,7 @@
  * common.h - Common definitions for both Kernel and user-mode utilities
  *
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/dir.c b/fs/exofs/dir.c
index 65b0c8c..4cfab1c 100644
--- a/fs/exofs/dir.c
+++ b/fs/exofs/dir.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/exofs.h b/fs/exofs/exofs.h
index 0fd4c78..c413b74 100644
--- a/fs/exofs/exofs.h
+++ b/fs/exofs/exofs.h
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index 6ed7fe4..c681003 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index bb5d6ed..6c10f74 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/namei.c b/fs/exofs/namei.c
index 77fdd76..b7dd0c2 100644
--- a/fs/exofs/namei.c
+++ b/fs/exofs/namei.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/osd.c b/fs/exofs/osd.c
index b3d2ccb..4372542 100644
--- a/fs/exofs/osd.c
+++ b/fs/exofs/osd.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 8216c5b..e47b38e 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
diff --git a/fs/exofs/symlink.c b/fs/exofs/symlink.c
index 36e2d7b..4dd687c 100644
--- a/fs/exofs/symlink.c
+++ b/fs/exofs/symlink.c
@@ -1,8 +1,6 @@
 /*
  * Copyright (C) 2005, 2006
- * Avishay Traeger (avishay@gmail.com) (avishay@il.ibm.com)
- * Copyright (C) 2005, 2006
- * International Business Machines
+ * Avishay Traeger (avishay@gmail.com)
  * Copyright (C) 2008, 2009
  * Boaz Harrosh <bharrosh@panasas.com>
  *
-- 
1.6.2.1



* [PATCH 3/6] exofs: Avoid using file_fsync()
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
  2009-06-17 16:03 ` [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read) Boaz Harrosh
  2009-06-17 16:04 ` [PATCH 2/6] exofs: Remove IBM copyrights Boaz Harrosh
@ 2009-06-17 16:05 ` Boaz Harrosh
  2009-06-17 16:06 ` [PATCH 4/6] MAINTAINERS: Add osd maintained files (F:) Boaz Harrosh
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:05 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel

The use of file_fsync() in exofs_file_fsync() is not necessary since it
does some extra work not needed by exofs. Open-code just the parts that
are currently needed.

TODO: Further optimization can be done to sync the sb only on inode
update of new files. Usually the sb update is not needed in exofs.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 fs/exofs/exofs.h |    3 +++
 fs/exofs/file.c  |   17 ++++++++++++-----
 fs/exofs/super.c |    2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/exofs/exofs.h b/fs/exofs/exofs.h
index c413b74..5ec72e0 100644
--- a/fs/exofs/exofs.h
+++ b/fs/exofs/exofs.h
@@ -154,6 +154,9 @@ ino_t exofs_parent_ino(struct dentry *child);
 int exofs_set_link(struct inode *, struct exofs_dir_entry *, struct page *,
 		    struct inode *);
 
+/* super.c               */
+int exofs_sync_fs(struct super_block *sb, int wait);
+
 /*********************
  * operation vectors *
  *********************/
diff --git a/fs/exofs/file.c b/fs/exofs/file.c
index c681003..839b9dc 100644
--- a/fs/exofs/file.c
+++ b/fs/exofs/file.c
@@ -45,16 +45,23 @@ static int exofs_file_fsync(struct file *filp, struct dentry *dentry,
 {
 	int ret;
 	struct address_space *mapping = filp->f_mapping;
+	struct inode *inode = dentry->d_inode;
+	struct super_block *sb;
 
 	ret = filemap_write_and_wait(mapping);
 	if (ret)
 		return ret;
 
-	/*Note: file_fsync below also calles sync_blockdev, which is a no-op
-	 *      for exofs, but other then that it does sync_inode and
-	 *      sync_superblock which is what we need here.
-	 */
-	return file_fsync(filp, dentry, datasync);
+	/* sync the inode attributes */
+	ret = write_inode_now(inode, 1);
+
+	/* This is a good place to write the sb */
+	/* TODO: Schedule an sb-sync on create */
+	sb = inode->i_sb;
+	if (sb->s_dirt)
+		exofs_sync_fs(sb, 1);
+
+	return ret;
 }
 
 static int exofs_flush(struct file *file, fl_owner_t id)
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index e47b38e..a343b4e 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -198,7 +198,7 @@ static const struct export_operations exofs_export_ops;
 /*
  * Write the superblock to the OSD
  */
-static int exofs_sync_fs(struct super_block *sb, int wait)
+int exofs_sync_fs(struct super_block *sb, int wait)
 {
 	struct exofs_sb_info *sbi;
 	struct exofs_fscb *fscb;
-- 
1.6.2.1



* [PATCH 4/6] MAINTAINERS: Add osd maintained files (F:)
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
                   ` (2 preceding siblings ...)
  2009-06-17 16:05 ` [PATCH 3/6] exofs: Avoid using file_fsync() Boaz Harrosh
@ 2009-06-17 16:06 ` Boaz Harrosh
  2009-06-17 16:07 ` [PATCH 5/6] osdblk: a Linux block device for OSD objects Boaz Harrosh
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:06 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel

OSD files are found in three places:
drivers/scsi/osd/
include/scsi/osd_*
fs/exofs/

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 MAINTAINERS |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb94add..f42cd7a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4331,6 +4331,9 @@ L:	osd-dev@open-osd.org
 W:	http://open-osd.org
 T:	git git://git.open-osd.org/open-osd.git
 S:	Maintained
+F:	drivers/scsi/osd/
+F:	include/scsi/osd_*
+F:	fs/exofs/
 
 P54 WIRELESS DRIVER
 P:	Michael Wu
-- 
1.6.2.1



* [PATCH 5/6] osdblk: a Linux block device for OSD objects
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
                   ` (3 preceding siblings ...)
  2009-06-17 16:06 ` [PATCH 4/6] MAINTAINERS: Add osd maintained files (F:) Boaz Harrosh
@ 2009-06-17 16:07 ` Boaz Harrosh
  2009-06-17 16:07 ` [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits Boaz Harrosh
  2009-06-18 12:46 ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:07 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel, Jeff Garzik

From: Jeff Garzik <jeff@garzik.org>

Submitted driver exports a block device of the form /dev/osdblkX,
where X is a decimal number.

It does that by mounting a stacking block device on top
of an osd object. For example, if you create a 2G object
on an OSD device, you can then use this module to present
that 2G object as a Linux block device.

See inside patch for exact documentation.
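
To make the control interface concrete, here is a minimal userspace sketch of
parsing an add command of the form "<partition id> <object id> <OSD device>"
(the format shown in the in-patch instructions), plus the byte-to-sector
conversion a block device capacity needs, since Linux counts capacity in
512-byte sectors. The names `struct osdblk_mapping`, `parse_add_cmd()`,
`bytes_to_sectors()` and the `OSD_PATH_MAX` bound are assumptions made for
this illustration, not part of the driver:

```c
#include <stdio.h>

#define OSD_PATH_MAX 64	/* hypothetical bound for this sketch */

struct osdblk_mapping {
	unsigned long long partition;	/* OSD partition id */
	unsigned long long object;	/* OSD object id */
	char osd_path[OSD_PATH_MAX];	/* e.g. /dev/osd1 */
};

/* Parse "<partition> <object> <path>"; returns 0 on success, -1 otherwise. */
static int parse_add_cmd(const char *buf, struct osdblk_mapping *m)
{
	/* %63s leaves room for the terminating NUL in osd_path */
	if (sscanf(buf, "%llu %llu %63s",
		   &m->partition, &m->object, m->osd_path) != 3)
		return -1;
	return 0;
}

/* Convert an object length in bytes to 512-byte sectors (rounds down). */
static unsigned long long bytes_to_sectors(unsigned long long bytes)
{
	return bytes / 512ULL;
}
```

Bounding the `%s` conversion is a deliberate choice in this sketch: the path
comes from user input, so the destination size should limit how much is
copied.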

[Sitting in linux-next helped fix the Kconfig dependency
 for this driver, thanks to Randy Dunlap]

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 drivers/block/Kconfig  |   16 ++
 drivers/block/Makefile |    1 +
 drivers/block/osdblk.c |  663 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 680 insertions(+), 0 deletions(-)
 create mode 100644 drivers/block/osdblk.c

diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index bb72ada..1d886e0 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -298,6 +298,22 @@ config BLK_DEV_NBD
 
 	  If unsure, say N.
 
+config BLK_DEV_OSD
+	tristate "OSD object-as-blkdev support"
+	depends on SCSI_OSD_ULD
+	---help---
+	  Saying Y or M here will allow the exporting of a single SCSI
+	  OSD (object-based storage) object as a Linux block device.
+
+	  For example, if you create a 2G object on an OSD device,
+	  you can then use this module to present that 2G object as
+	  a Linux block device.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called osdblk.
+
+	  If unsure, say N.
+
 config BLK_DEV_SX8
 	tristate "Promise SATA SX8 support"
 	depends on PCI
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index 7755a5e..cdaa3f8 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_XILINX_SYSACE)	+= xsysace.o
 obj-$(CONFIG_CDROM_PKTCDVD)	+= pktcdvd.o
 obj-$(CONFIG_MG_DISK)		+= mg_disk.o
 obj-$(CONFIG_SUNVDC)		+= sunvdc.o
+obj-$(CONFIG_BLK_DEV_OSD)	+= osdblk.o
 
 obj-$(CONFIG_BLK_DEV_UMEM)	+= umem.o
 obj-$(CONFIG_BLK_DEV_NBD)	+= nbd.o
diff --git a/drivers/block/osdblk.c b/drivers/block/osdblk.c
new file mode 100644
index 0000000..e829360
--- /dev/null
+++ b/drivers/block/osdblk.c
@@ -0,0 +1,663 @@
+
+/*
+   osdblk.c -- Export a single SCSI OSD object as a Linux block device
+
+
+   Copyright 2009 Red Hat, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; see the file COPYING.  If not, write to
+   the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
+
+
+   Instructions for use
+   --------------------
+
+   1) Map a Linux block device to an existing OSD object.
+
+      In this example, we will use partition id 1234, object id 5678,
+      OSD device /dev/osd1.
+
+      $ echo "1234 5678 /dev/osd1" > /sys/class/osdblk/add
+
+
+   2) List all active blkdev<->object mappings.
+
+      In this example, we have performed step #1 twice, creating two blkdevs,
+      mapped to two separate OSD objects.
+
+      $ cat /sys/class/osdblk/list
+      0 174 1234 5678 /dev/osd1
+      1 179 1994 897123 /dev/osd0
+
+      The columns, in order, are:
+      - blkdev unique id
+      - blkdev assigned major
+      - OSD object partition id
+      - OSD object id
+      - OSD device
+
+
+   3) Remove an active blkdev<->object mapping.
+
+      In this example, we remove the mapping with blkdev unique id 1.
+
+      $ echo 1 > /sys/class/osdblk/remove
+
+
+   NOTE:  The actual creation and deletion of OSD objects is outside the scope
+   of this driver.
+
+ */
+
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <scsi/osd_initiator.h>
+#include <scsi/osd_attributes.h>
+#include <scsi/osd_sec.h>
+
+#define DRV_NAME "osdblk"
+#define PFX DRV_NAME ": "
+
+struct osdblk_device;
+
+enum {
+	OSDBLK_MINORS_PER_MAJOR	= 256,		/* max minors per blkdev */
+	OSDBLK_MAX_REQ		= 32,		/* max parallel requests */
+	OSDBLK_OP_TIMEOUT	= 4 * 60,	/* sync OSD req timeout */
+};
+
+struct osdblk_request {
+	struct request		*rq;		/* blk layer request */
+	struct bio		*bio;		/* cloned bio */
+	struct osdblk_device	*osdev;		/* associated blkdev */
+};
+
+struct osdblk_device {
+	int			id;		/* blkdev unique id */
+
+	int			major;		/* blkdev assigned major */
+	struct gendisk		*disk;		/* blkdev's gendisk and rq */
+	struct request_queue	*q;
+
+	struct osd_dev		*osd;		/* associated OSD */
+
+	char			name[32];	/* blkdev name, e.g. osdblk34 */
+
+	spinlock_t		lock;		/* queue lock */
+
+	struct osd_obj_id	obj;		/* OSD partition, obj id */
+	uint8_t			obj_cred[OSD_CAP_LEN]; /* OSD cred */
+
+	struct osdblk_request	req[OSDBLK_MAX_REQ]; /* request table */
+
+	struct list_head	node;
+
+	char			osd_path[0];	/* OSD device path */
+};
+
+static struct class *class_osdblk;		/* /sys/class/osdblk */
+static DEFINE_MUTEX(ctl_mutex);	/* Serialize open/close/setup/teardown */
+static LIST_HEAD(osdblkdev_list);
+
+static struct block_device_operations osdblk_bd_ops = {
+	.owner		= THIS_MODULE,
+};
+
+static const struct osd_attr g_attr_logical_length = ATTR_DEF(
+	OSD_APAGE_OBJECT_INFORMATION, OSD_ATTR_OI_LOGICAL_LENGTH, 8);
+
+static void osdblk_make_credential(u8 cred_a[OSD_CAP_LEN],
+				   const struct osd_obj_id *obj)
+{
+	osd_sec_init_nosec_doall_caps(cred_a, obj, false, true);
+}
+
+/* copied from exofs; move to libosd? */
+/*
+ * Perform a synchronous OSD operation.  copied from exofs; move to libosd?
+ */
+static int osd_sync_op(struct osd_request *or, int timeout, uint8_t *credential)
+{
+	int ret;
+
+	or->timeout = timeout;
+	ret = osd_finalize_request(or, 0, credential, NULL);
+	if (ret)
+		return ret;
+
+	ret = osd_execute_request(or);
+
+	/* osd_req_decode_sense(or, ret); */
+	return ret;
+}
+
+/*
+ * Perform an asynchronous OSD operation.  copied from exofs; move to libosd?
+ */
+static int osd_async_op(struct osd_request *or, osd_req_done_fn *async_done,
+		   void *caller_context, u8 *cred)
+{
+	int ret;
+
+	ret = osd_finalize_request(or, 0, cred, NULL);
+	if (ret)
+		return ret;
+
+	ret = osd_execute_request_async(or, async_done, caller_context);
+
+	return ret;
+}
+
+/* copied from exofs; move to libosd? */
+static int extract_attr_from_req(struct osd_request *or, struct osd_attr *attr)
+{
+	struct osd_attr cur_attr = {.attr_page = 0}; /* start with zeros */
+	void *iter = NULL;
+	int nelem;
+
+	do {
+		nelem = 1;
+		osd_req_decode_get_attr_list(or, &cur_attr, &nelem, &iter);
+		if ((cur_attr.attr_page == attr->attr_page) &&
+		    (cur_attr.attr_id == attr->attr_id)) {
+			attr->len = cur_attr.len;
+			attr->val_ptr = cur_attr.val_ptr;
+			return 0;
+		}
+	} while (iter);
+
+	return -EIO;
+}
+
+static int osdblk_get_obj_size(struct osdblk_device *osdev, u64 *size_out)
+{
+	struct osd_request *or;
+	struct osd_attr attr;
+	int ret;
+
+	/* start request */
+	or = osd_start_request(osdev->osd, GFP_KERNEL);
+	if (!or)
+		return -ENOMEM;
+
+	/* create a get-attributes(length) request */
+	osd_req_get_attributes(or, &osdev->obj);
+
+	osd_req_add_get_attr_list(or, &g_attr_logical_length, 1);
+
+	/* execute op synchronously */
+	ret = osd_sync_op(or, OSDBLK_OP_TIMEOUT, osdev->obj_cred);
+	if (ret)
+		goto out;
+
+	/* extract length from returned attribute info */
+	attr = g_attr_logical_length;
+	ret = extract_attr_from_req(or, &attr);
+	if (ret)
+		goto out;
+
+	*size_out = get_unaligned_be64(attr.val_ptr);
+
+out:
+	osd_end_request(or);
+	return ret;
+
+}
+
+static void osdblk_osd_complete(struct osd_request *or, void *private)
+{
+	struct osdblk_request *orq = private;
+	struct osd_sense_info osi;
+	int ret = osd_req_decode_sense(or, &osi);
+
+	if (ret)
+		ret = -EIO;
+
+	/* complete OSD request */
+	osd_end_request(or);
+
+	/* complete request passed to osdblk by block layer */
+	__blk_end_request_all(orq->rq, ret);
+}
+
+static void bio_chain_put(struct bio *chain)
+{
+	struct bio *tmp;
+
+	while (chain) {
+		tmp = chain;
+		chain = chain->bi_next;
+
+		bio_put(tmp);
+	}
+}
+
+static struct bio *bio_chain_clone(struct bio *old_chain, gfp_t gfpmask)
+{
+	struct bio *tmp, *new_chain = NULL, *tail = NULL;
+
+	while (old_chain) {
+		tmp = bio_kmalloc(gfpmask, old_chain->bi_vcnt);
+		if (!tmp)
+			goto err_out;
+
+		__bio_clone(tmp, old_chain);
+		gfpmask &= ~__GFP_WAIT;
+		tmp->bi_next = NULL;
+		if (!new_chain)
+			new_chain = tail = tmp;
+		else {
+			tail->bi_next = tmp;
+			tail = tmp;
+		}
+
+		old_chain = old_chain->bi_next;
+	}
+
+	return new_chain;
+
+err_out:
+	bio_chain_put(new_chain);
+	return NULL;
+}
+
+static void osdblk_rq_fn(struct request_queue *q)
+{
+	struct osdblk_device *osdev = q->queuedata;
+	struct request *rq;
+	struct osdblk_request *orq;
+	struct osd_request *or;
+	struct bio *bio;
+	int do_write, do_flush;
+
+	while (1) {
+		/* peek at request from block layer */
+		rq = blk_fetch_request(q);
+		if (!rq)
+			break;
+
+		/* filter out block requests we don't understand */
+		if (!blk_fs_request(rq) && !blk_barrier_rq(rq)) {
+			blk_end_request_all(rq, 0);
+			continue;
+		}
+
+		/* deduce our operation (read, write, flush) */
+		/* I wish the block layer simplified cmd_type/cmd_flags/cmd[]
+		 * into a clearly defined set of RPC commands:
+		 * read, write, flush, scsi command, power mgmt req,
+		 * driver-specific, etc.
+		 */
+
+		do_flush = (rq->special == (void *) 0xdeadbeefUL);
+		do_write = (rq_data_dir(rq) == WRITE);
+
+		if (!do_flush) { /* osd_flush does not use a bio */
+			/* a bio clone to be passed down to OSD request */
+			bio = bio_chain_clone(rq->bio, GFP_ATOMIC);
+			if (!bio)
+				break;
+		} else
+			bio = NULL;
+
+		/* alloc internal OSD request, for OSD command execution */
+		or = osd_start_request(osdev->osd, GFP_ATOMIC);
+		if (!or) {
+			bio_chain_put(bio);
+			break;
+		}
+
+		orq = &osdev->req[rq->tag];
+		orq->rq = rq;
+		orq->bio = bio;
+		orq->osdev = osdev;
+
+		/* init OSD command: flush, write or read */
+		if (do_flush)
+			osd_req_flush_object(or, &osdev->obj,
+					     OSD_CDB_FLUSH_ALL, 0, 0);
+		else if (do_write)
+			osd_req_write(or, &osdev->obj, blk_rq_pos(rq) * 512ULL,
+				      bio, blk_rq_bytes(rq));
+		else
+			osd_req_read(or, &osdev->obj, blk_rq_pos(rq) * 512ULL,
+				     bio, blk_rq_bytes(rq));
+
+		/* begin OSD command execution */
+		if (osd_async_op(or, osdblk_osd_complete, orq,
+				 osdev->obj_cred)) {
+			osd_end_request(or);
+			blk_requeue_request(q, rq);
+			bio_chain_put(bio);
+		}
+
+		/* remove the special 'flush' marker, now that the command
+		 * is executing
+		 */
+		rq->special = NULL;
+	}
+}
+
+static void osdblk_prepare_flush(struct request_queue *q, struct request *rq)
+{
+	/* add driver-specific marker, to indicate that this request
+	 * is a flush command
+	 */
+	rq->special = (void *) 0xdeadbeefUL;
+}
+
+static void osdblk_free_disk(struct osdblk_device *osdev)
+{
+	struct gendisk *disk = osdev->disk;
+
+	if (!disk)
+		return;
+
+	if (disk->flags & GENHD_FL_UP)
+		del_gendisk(disk);
+	if (disk->queue)
+		blk_cleanup_queue(disk->queue);
+	put_disk(disk);
+}
+
+static int osdblk_init_disk(struct osdblk_device *osdev)
+{
+	struct gendisk *disk;
+	struct request_queue *q;
+	int rc;
+	u64 obj_size = 0;
+
+	/* contact OSD, request size info about the object being mapped */
+	rc = osdblk_get_obj_size(osdev, &obj_size);
+	if (rc)
+		return rc;
+
+	/* create gendisk info */
+	disk = alloc_disk(OSDBLK_MINORS_PER_MAJOR);
+	if (!disk)
+		return -ENOMEM;
+
+	sprintf(disk->disk_name, DRV_NAME "/%d", osdev->id);
+	disk->major = osdev->major;
+	disk->first_minor = 0;
+	disk->fops = &osdblk_bd_ops;
+	disk->private_data = osdev;
+
+	/* init rq */
+	q = blk_init_queue(osdblk_rq_fn, &osdev->lock);
+	if (!q) {
+		put_disk(disk);
+		return -ENOMEM;
+	}
+
+	/* switch queue to TCQ mode; allocate tag map */
+	rc = blk_queue_init_tags(q, OSDBLK_MAX_REQ, NULL);
+	if (rc) {
+		blk_cleanup_queue(q);
+		put_disk(disk);
+		return rc;
+	}
+
+	blk_queue_prep_rq(q, blk_queue_start_tag);
+	blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, osdblk_prepare_flush);
+
+	disk->queue = q;
+
+	q->queuedata = osdev;
+
+	osdev->disk = disk;
+	osdev->q = q;
+
+	/* finally, announce the disk to the world */
+	set_capacity(disk, obj_size / 512ULL);
+	add_disk(disk);
+
+	return 0;
+}
+
+/********************************************************************
+ * /sys/class/osdblk/
+ *                   add	map OSD object to blkdev
+ *                   remove	unmap OSD object
+ *                   list	show mappings
+ *******************************************************************/
+
+static void class_osdblk_release(struct class *cls)
+{
+	kfree(cls);
+}
+
+static ssize_t class_osdblk_list(struct class *c, char *data)
+{
+	int n = 0;
+	struct list_head *tmp;
+
+	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
+
+	list_for_each(tmp, &osdblkdev_list) {
+		struct osdblk_device *osdev;
+
+		osdev = list_entry(tmp, struct osdblk_device, node);
+
+		n += sprintf(data+n, "%d %d %llu %llu %s\n",
+			osdev->id,
+			osdev->major,
+			osdev->obj.partition,
+			osdev->obj.id,
+			osdev->osd_path);
+	}
+
+	mutex_unlock(&ctl_mutex);
+	return n;
+}
+
+static ssize_t class_osdblk_add(struct class *c, const char *buf, size_t count)
+{
+	struct osdblk_device *osdev;
+	ssize_t rc;
+	int irc, new_id = 0;
+	struct list_head *tmp;
+
+	if (!try_module_get(THIS_MODULE))
+		return -ENODEV;
+
+	/* new osdblk_device object */
+	osdev = kzalloc(sizeof(*osdev) + strlen(buf) + 1, GFP_KERNEL);
+	if (!osdev) {
+		rc = -ENOMEM;
+		goto err_out_mod;
+	}
+
+	/* static osdblk_device initialization */
+	spin_lock_init(&osdev->lock);
+	INIT_LIST_HEAD(&osdev->node);
+
+	/* generate unique id: find highest unique id, add one */
+
+	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
+
+	list_for_each(tmp, &osdblkdev_list) {
+		struct osdblk_device *osdev;
+
+		osdev = list_entry(tmp, struct osdblk_device, node);
+		if (osdev->id >= new_id)
+			new_id = osdev->id + 1;
+	}
+
+	osdev->id = new_id;
+
+	/* add to global list */
+	list_add_tail(&osdev->node, &osdblkdev_list);
+
+	mutex_unlock(&ctl_mutex);
+
+	/* parse add command */
+	if (sscanf(buf, "%llu %llu %s", &osdev->obj.partition, &osdev->obj.id,
+		   osdev->osd_path) != 3) {
+		rc = -EINVAL;
+		goto err_out_slot;
+	}
+
+	/* initialize rest of new object */
+	sprintf(osdev->name, DRV_NAME "%d", osdev->id);
+
+	/* contact requested OSD */
+	osdev->osd = osduld_path_lookup(osdev->osd_path);
+	if (IS_ERR(osdev->osd)) {
+		rc = PTR_ERR(osdev->osd);
+		goto err_out_slot;
+	}
+
+	/* build OSD credential */
+	osdblk_make_credential(osdev->obj_cred, &osdev->obj);
+
+	/* register our block device */
+	irc = register_blkdev(0, osdev->name);
+	if (irc < 0) {
+		rc = irc;
+		goto err_out_osd;
+	}
+
+	osdev->major = irc;
+
+	/* set up and announce blkdev mapping */
+	rc = osdblk_init_disk(osdev);
+	if (rc)
+		goto err_out_blkdev;
+
+	return count;
+
+err_out_blkdev:
+	unregister_blkdev(osdev->major, osdev->name);
+err_out_osd:
+	osduld_put_device(osdev->osd);
+err_out_slot:
+	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
+	list_del_init(&osdev->node);
+	mutex_unlock(&ctl_mutex);
+
+	kfree(osdev);
+err_out_mod:
+	module_put(THIS_MODULE);
+	return rc;
+}
+
+static ssize_t class_osdblk_remove(struct class *c, const char *buf,
+					size_t count)
+{
+	struct osdblk_device *osdev = NULL;
+	int target_id, rc;
+	unsigned long ul;
+	struct list_head *tmp;
+
+	rc = strict_strtoul(buf, 10, &ul);
+	if (rc)
+		return rc;
+
+	/* convert to int; abort if we lost anything in the conversion */
+	target_id = (int) ul;
+	if (target_id != ul)
+		return -EINVAL;
+
+	/* remove object from list immediately */
+	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
+
+	list_for_each(tmp, &osdblkdev_list) {
+		osdev = list_entry(tmp, struct osdblk_device, node);
+		if (osdev->id == target_id) {
+			list_del_init(&osdev->node);
+			break;
+		}
+		osdev = NULL;
+	}
+
+	mutex_unlock(&ctl_mutex);
+
+	if (!osdev)
+		return -ENOENT;
+
+	/* clean up and free blkdev and associated OSD connection */
+	osdblk_free_disk(osdev);
+	unregister_blkdev(osdev->major, osdev->name);
+	osduld_put_device(osdev->osd);
+	kfree(osdev);
+
+	/* release module ref */
+	module_put(THIS_MODULE);
+
+	return count;
+}
+
+static struct class_attribute class_osdblk_attrs[] = {
+	__ATTR(add,	0200, NULL, class_osdblk_add),
+	__ATTR(remove,	0200, NULL, class_osdblk_remove),
+	__ATTR(list,	0444, class_osdblk_list, NULL),
+	__ATTR_NULL
+};
+
+static int osdblk_sysfs_init(void)
+{
+	int ret = 0;
+
+	/*
+	 * create control files in sysfs
+	 * /sys/class/osdblk/...
+	 */
+	class_osdblk = kzalloc(sizeof(*class_osdblk), GFP_KERNEL);
+	if (!class_osdblk)
+		return -ENOMEM;
+
+	class_osdblk->name = DRV_NAME;
+	class_osdblk->owner = THIS_MODULE;
+	class_osdblk->class_release = class_osdblk_release;
+	class_osdblk->class_attrs = class_osdblk_attrs;
+
+	ret = class_register(class_osdblk);
+	if (ret) {
+		kfree(class_osdblk);
+		class_osdblk = NULL;
+		printk(PFX "failed to create class osdblk\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static void osdblk_sysfs_cleanup(void)
+{
+	if (class_osdblk)
+		class_destroy(class_osdblk);
+	class_osdblk = NULL;
+}
+
+static int __init osdblk_init(void)
+{
+	int rc;
+
+	rc = osdblk_sysfs_init();
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
+static void __exit osdblk_exit(void)
+{
+	osdblk_sysfs_cleanup();
+}
+
+module_init(osdblk_init);
+module_exit(osdblk_exit);
+
-- 
1.6.2.1



* [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
                   ` (4 preceding siblings ...)
  2009-06-17 16:07 ` [PATCH 5/6] osdblk: a Linux block device for OSD objects Boaz Harrosh
@ 2009-06-17 16:07 ` Boaz Harrosh
  2009-06-18 12:46 ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
  6 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-17 16:07 UTC (permalink / raw)
  To: Jeff Garzik, open-osd; +Cc: linux-fsdevel, linux-kernel

Call blk_queue_stack_limits() to copy queue limits from
the underlying osd scsi_device. This is absolutely needed
because osdblk cannot sleep when allocating a lower-request and
therefore cannot bounce.

TODO: Dynamic changes of limits to the lower device queue
will not be reflected in the upper driver.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 drivers/block/osdblk.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/block/osdblk.c b/drivers/block/osdblk.c
index e829360..b07e154 100644
--- a/drivers/block/osdblk.c
+++ b/drivers/block/osdblk.c
@@ -66,6 +66,7 @@
 #include <scsi/osd_initiator.h>
 #include <scsi/osd_attributes.h>
 #include <scsi/osd_sec.h>
+#include <scsi/scsi_device.h>
 
 #define DRV_NAME "osdblk"
 #define PFX DRV_NAME ": "
@@ -410,6 +411,12 @@ static int osdblk_init_disk(struct osdblk_device *osdev)
 		return rc;
 	}
 
+	/* Set our limits to the lower device limits, because osdblk cannot
+	 * sleep when allocating a lower-request and therefore cannot be
+	 * bouncing.
+	 */
+	blk_queue_stack_limits(q, osd_request_queue(osdev->osd));
+
 	blk_queue_prep_rq(q, blk_queue_start_tag);
 	blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, osdblk_prepare_flush);
 
-- 
1.6.2.1
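
For readers unfamiliar with stacked queues, the idea behind the call above can be sketched in plain user-space C. This is only a simplified model: the struct, its three fields, and the function name are hypothetical stand-ins, and the real kernel helper handles more fields and details than shown here.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a request queue's limits. */
struct queue_limits_sketch {
	unsigned int max_sectors;      /* largest request, in sectors */
	unsigned int max_segments;     /* scatter-gather entries per request */
	unsigned int max_segment_size; /* bytes per scatter-gather entry */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Clamp the upper (stacked) queue to the lower queue's limits, so that
 * any request the upper driver accepts also fits the lower device
 * without needing to be split or bounced.
 */
static void stack_limits_sketch(struct queue_limits_sketch *upper,
				const struct queue_limits_sketch *lower)
{
	assert(upper && lower);

	upper->max_sectors = min_uint(upper->max_sectors, lower->max_sectors);
	upper->max_segments = min_uint(upper->max_segments,
				       lower->max_segments);
	upper->max_segment_size = min_uint(upper->max_segment_size,
					   lower->max_segment_size);
}
```

Because the limits are copied once at disk-init time, a later change on the lower queue would not propagate, which is the TODO noted in the commit message.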



* Re: [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31
  2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
                   ` (5 preceding siblings ...)
  2009-06-17 16:07 ` [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits Boaz Harrosh
@ 2009-06-18 12:46 ` Boaz Harrosh
  2009-06-18 12:55   ` [PATCH] open-osd: osdblk User Mode utility Boaz Harrosh
  2009-06-18 15:31   ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Jeff Garzik
  6 siblings, 2 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-18 12:46 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: open-osd mailing-list, linux-fsdevel, linux-kernel

On 06/17/2009 07:01 PM, Boaz Harrosh wrote:
> 
>   [PATCH 5/6] osdblk: a Linux block device for OSD objects
>   [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits
> 
> 	This is the proposed osdblk driver by Jeff Garzik. It as all the Kernel
> 	pre-requisites, but is missing a user-mode tool and more testing. So I'm
> 	not sure it will make it into this Kernel. But for review.
> 

Jeff hi.

I've hacked up a very quick and dirty small utility for Creating / Removing / Resizing
objects on an OSD device. [I'm sending as reply to this mail]

Please give it a fast testing. Should we now submit the osdblk driver, for 2.6.31?
Which tree? I can push it through the open-osd.org tree. The driver was included for
some weeks in linux-next through that tree.

Thanks
Boaz


* [PATCH] open-osd: osdblk User Mode utility
  2009-06-18 12:46 ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
@ 2009-06-18 12:55   ` Boaz Harrosh
  2009-06-22 12:15     ` [osd-dev] " Boaz Harrosh
  2009-06-18 15:31   ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Jeff Garzik
  1 sibling, 1 reply; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-18 12:55 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-fsdevel, open-osd mailing-list, linux-kernel


A minimal user-mode application to Create / Remove / Resize
OSD objects from a device, for use with the osdblk.ko block
device driver.

See inside patch for Usage instructions.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 usr/Makefile |   12 ++-
 usr/osdblk.c |  336 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 347 insertions(+), 1 deletions(-)
 create mode 100644 usr/osdblk.c

diff --git a/usr/Makefile b/usr/Makefile
index 2021d9b..534fba7 100755
--- a/usr/Makefile
+++ b/usr/Makefile
@@ -36,7 +36,7 @@ CFLAGS = -fPIC $(CWARN) $(INCLUDES) $(DEFINES)
 
 OSD_LIBS=-L../lib -losd
 
-ALL = osd_test mkfs.exofs
+ALL = osd_test mkfs.exofs osdblk
 all: $(DEPEND) $(ALL)
 
 clean: $(ALL:=_clean)
@@ -78,6 +78,16 @@ mkfs.exofs_clean:
 
 $(DEPEND): $(mkfs_COMMON_OBJ:.o=.c) $(mkfs_OBJ:.o=.c)
 
+# =============== osdblk ======================================================
+osdblk_OBJ=osdblk.o
+
+osdblk:  $(osdblk_OBJ)
+	$(CC) -o $@ $^ $(OSD_LIBS)
+
+osdblk_clean:
+
+$(DEPEND): $(osdblk_OBJ:.o=.c)
+
 # =============== common rules =================================================
 # every thing should compile if Makefile changed
 %.o: %.c Makefile
diff --git a/usr/osdblk.c b/usr/osdblk.c
new file mode 100644
index 0000000..f7df257
--- /dev/null
+++ b/usr/osdblk.c
@@ -0,0 +1,336 @@
+/*
+ * osdblk.c - A user-mode program that calls into the osd ULD
+ *
+ * Copyright (C) 2009 Panasas Inc.  All rights reserved.
+ *
+ * Authors:
+ *   Boaz Harrosh <bharrosh@panasas.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  1. Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *  2. Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in the
+ *     documentation and/or other materials provided with the distribution.
+ *  3. Neither the name of the Panasas company nor the names of its
+ *     contributors may be used to endorse or promote products derived
+ *     from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+ * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <open-osd/libosd.h>
+
+#include <getopt.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#define OSDBLK_ERR(fmt, a...) fprintf(stderr, "osdblk: " fmt, ##a)
+#define OSDBLK_INFO(fmt, a...) printf("osdblk: " fmt, ##a)
+
+#ifdef CONFIG_OSDBLK_DEBUG
+#define OSDBLK_DBGMSG(fmt, a...) \
+	printf("osdblk @%s:%d: " fmt, __func__, __LINE__, ##a)
+#else
+#define OSDBLK_DBGMSG(fmt, a...) \
+	do { if (0) printf(fmt, ##a); } while (0)
+#endif
+
+static void usage(void)
+{
+	static char msg[] = {
+	"usage: osdblk COMMAND --pid=pid_no --obj=obj_no --length=ob_size /dev/osdX\n"
+	"\n"
+	"COMMAND is one of: --create | --remove | --resize\n"
+	"--create | -c\n"
+	"        Create a new object. If object exist returns error\n"
+	"        --length can be used to denote an initial size\n"
+	"\n"
+	"--remove\n"
+	"        remove an existing object. If does not exist does nothing\n"
+	"        --length is ignored\n"
+	"\n"
+	"--resize | -s\n"
+	"        Resize an existing object. If does not exist errors\n"
+	"        If --length=0 then does nothing (Only check for existence)\n"
+	"\n"
+	"--pid=pid_no | -p pid_no\n"
+	"       pid_no is the partition 64bit number of the object in question\n"
+	"       Both 0xabc hex or decimal anotation can be used\n"
+	"\n"
+	"--obj=obj_no | -o obj_no\n"
+	"       obj_no is the object 64bit number of the object in question\n"
+	"       Both 0xabc hex or decimal anotation can be used\n"
+	"\n"
+	"--length=size | -l size\n"
+	"       \"size\" is the new size of the object to be set\n"
+	"       0xhex or decimal can be used. G, M, K can be appended to the\n"
+	"       number to denote base-two Giga Mega or Kilo\n"
+	"\n"
+	"/dev/osdX is the osd LUN (char-dev) to use containing the object\n"
+	"\n"
+	"Description: Create Remove or Resize an OSD object on an OSD LUN\n"
+	"             The object can later be used, for example, by the\n"
+	"             osdblk device driver\n"
+	};
+
+	printf("%s", msg);
+}
+
+#define _LLU(x) ((unsigned long long)x)
+
+static u64 ullwithGMK(char *optarg)
+{
+	char *pGMK;
+	u64 mul;
+	u64 val = strtoull(optarg, &pGMK, 0);
+
+	switch (*pGMK) {
+	case 'K':
+	case 'k':
+		mul = 1024LLU;
+		break;
+	case 'M':
+		mul = 1024LLU * 1024LLU;
+		break;
+	case 'G':
+		mul = 1024LLU * 1024LLU * 1024LLU;
+		break;
+	default:
+		mul = 1;
+	}
+
+	return val * mul;
+}
+
+static void osdblk_make_credential(u8 *creds, struct osd_obj_id *obj,
+				   bool is_v1)
+{
+	osd_sec_init_nosec_doall_caps(creds, obj, false, is_v1);
+}
+
+static int osdblk_exec(struct osd_request *or, u8 *cred)
+{
+	struct osd_sense_info osi;
+	int ret;
+
+	ret = osd_finalize_request(or, 0, cred, NULL);
+	if (ret) {
+		OSDBLK_ERR("Error: Failed to osd_finalize_request() => %d\n",
+			   ret);
+		return ret;
+	}
+
+	osd_execute_request(or);
+	ret = osd_req_decode_sense(or, &osi);
+
+	if (ret) { /* translate to Linux codes */
+		if (osi.additional_code == scsi_invalid_field_in_cdb) {
+			if (osi.cdb_field_offset == OSD_CFO_STARTING_BYTE)
+				ret = 0; /*this is OK*/
+			else if (osi.cdb_field_offset == OSD_CFO_OBJECT_ID)
+				ret = -ENOENT;
+			else
+				ret = -EINVAL;
+		} else if (osi.additional_code == osd_quota_error)
+			ret = -ENOSPC;
+		else
+			ret = -EIO;
+	}
+
+	return ret;
+}
+
+static int do_resize(struct osd_dev *od, struct osd_obj_id *obj, u64 size)
+{
+	struct osd_request *or = osd_start_request(od, GFP_KERNEL);
+	__be64 be_size = cpu_to_be64(size);
+	u8 creds[OSD_CAP_LEN];
+	struct osd_attr attr_logical_length = ATTR_SET(
+		OSD_APAGE_OBJECT_INFORMATION, OSD_ATTR_OI_LOGICAL_LENGTH,
+		sizeof(be_size), &be_size);
+	int ret;
+
+	if (unlikely(!or))
+		return -ENOMEM;
+
+	osdblk_make_credential(creds, obj, osd_req_is_ver1(or));
+
+	osd_req_set_attributes(or, obj);
+	osd_req_add_set_attr_list(or, &attr_logical_length, 1);
+
+	ret = osdblk_exec(or, creds);
+	osd_end_request(or);
+
+	if (ret)
+		return ret;
+
+	OSDBLK_INFO("Resized: pid=0x%llx oid=0x%llx length=0x%llx\n",
+		_LLU(obj->partition), _LLU(obj->id),
+		_LLU(size));
+
+	return 0;
+}
+
+static int do_create(struct osd_dev *od, struct osd_obj_id *obj, u64 size)
+{
+	struct osd_request *or = osd_start_request(od, GFP_KERNEL);
+	u8 creds[OSD_CAP_LEN];
+	int ret;
+
+	if (unlikely(!or))
+		return -ENOMEM;
+
+	osdblk_make_credential(creds, obj, osd_req_is_ver1(or));
+	osd_req_create_object(or, obj);
+	ret = osdblk_exec(or, creds);
+	osd_end_request(or);
+
+	if (ret)
+		return ret;
+
+	OSDBLK_INFO("Created: pid=0x%llx oid=0x%llx\n",
+		_LLU(obj->partition), _LLU(obj->id));
+
+	return do_resize(od, obj, size);
+}
+
+static int do_remove(struct osd_dev *od, struct osd_obj_id *obj)
+{
+	struct osd_request *or = osd_start_request(od, GFP_KERNEL);
+	u8 creds[OSD_CAP_LEN];
+	int ret;
+
+	if (unlikely(!or))
+		return -ENOMEM;
+
+	osdblk_make_credential(creds, obj, osd_req_is_ver1(or));
+	osd_req_remove_object(or, obj);
+	ret = osdblk_exec(or, creds);
+	osd_end_request(or);
+
+	if (ret)
+		return ret;
+
+	OSDBLK_INFO("Removed: pid=0x%llx oid=0x%llx\n",
+		_LLU(obj->partition), _LLU(obj->id));
+
+	return 0;
+}
+
+enum osd_todo {
+	osd_none = 0,
+	osd_create,
+	osd_remove,
+	osd_resize,
+};
+
+static int _do(char *path, struct osd_obj_id *obj, u64 size,
+	       enum osd_todo todo)
+{
+	struct osd_dev *od;
+	int ret;
+
+	ret = osd_open(path, &od);
+	if (ret)
+		return ret;
+
+	switch (todo) {
+	case osd_create:
+		ret = do_create(od, obj, size);
+		break;
+	case osd_remove:
+		ret = do_remove(od, obj);
+		break;
+	case osd_resize:
+		ret = do_resize(od, obj, size);
+		break;
+	default:
+		usage();
+		return 1;
+	}
+
+	osd_close(od);
+
+	/* osd lib has Kernel API which return negative errors */
+	return -ret;
+}
+
+int main(int argc, char *argv[])
+{
+	struct option opt[] = {
+		{.name = "create", .has_arg = 0, .flag = NULL, .val = 'c'} ,
+		{.name = "remove", .has_arg = 0, .flag = NULL, .val = 'r'} ,
+		{.name = "resize", .has_arg = 0, .flag = NULL, .val = 's'} ,
+		{.name = "pid", .has_arg = 1, .flag = NULL, .val =  'p'} ,
+		{.name = "oid", .has_arg = 1, .flag = NULL, .val =  'o'} ,
+		{.name = "size", .has_arg = 1, .flag = NULL, .val = 'l'} ,
+
+		{.name = 0, .has_arg = 0, .flag = 0, .val = 0} ,
+	};
+	struct osd_obj_id obj = {.id = 0};
+	enum osd_todo todo = osd_none;
+	u64 size = 0;
+	int op;
+	int err;
+
+	while ((op = getopt_long(argc, argv, "csp:o:l:", opt, NULL)) != -1) {
+		switch (op) {
+		case 'c':
+			todo = osd_create;
+			break;
+		case 'r':
+			todo = osd_remove;
+			break;
+		case 's':
+			todo = osd_resize;
+			break;
+
+		case 'p':
+			obj.partition = strtoull(optarg, NULL, 0);
+			break;
+		case 'o':
+			obj.id = strtoull(optarg, NULL, 0);
+			break;
+
+		case 'l':
+			size = ullwithGMK(optarg);
+			break;
+		}
+	}
+
+	argc -= optind;
+	argv += optind;
+
+	if (argc <= 0) {
+		usage();
+		return 1;
+	}
+
+	if ((todo == osd_none) || !obj.partition || !obj.id) {
+		usage();
+		return 1;
+	}
+
+	err = _do(argv[0], &obj, size, todo);
+	if (err)
+		OSDBLK_ERR("Error: %s\n", strerror(err));
+
+	return err;
+}
-- 
1.6.2.1
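
As a side note on the --length parsing described in the usage text, here is a stand-alone sketch of suffix handling. It is not the code in the patch: parse_size() is a hypothetical name, and the choices to accept lower-case m and g, to reject trailing garbage, and to check for overflow are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "123", "0x10", "4K", "2m" or "1G" into a byte count.
 * Returns 0 and stores the result in *out on success, -EINVAL on
 * malformed input, -ERANGE if the value does not fit in 64 bits.
 */
static int parse_size(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long val;
	uint64_t mul = 1;

	assert(s && out);

	errno = 0;
	val = strtoull(s, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;

	switch (*end) {
	case 'K': case 'k': mul = 1ULL << 10; end++; break;
	case 'M': case 'm': mul = 1ULL << 20; end++; break;
	case 'G': case 'g': mul = 1ULL << 30; end++; break;
	case '\0': break;
	default: return -EINVAL;	/* unknown suffix */
	}
	if (*end != '\0')
		return -EINVAL;		/* garbage after the suffix */
	if (val > UINT64_MAX / mul)
		return -ERANGE;		/* multiplication would overflow */

	*out = (uint64_t)val * mul;
	return 0;
}
```

Returning the status separately from the value lets the caller tell a genuine size of 0 apart from a parse failure.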

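The sense-to-errno translation done in osdblk_exec() can also be modelled on its own. In the sketch below the enums and the struct are hypothetical stand-ins for what libosd reports in struct osd_sense_info; only the mapping itself follows the patch. The three field cases are mutually exclusive, so they are written as a single chain.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the codes reported by the OSD library. */
enum sketch_additional_code {
	SK_INVALID_FIELD_IN_CDB,
	SK_QUOTA_ERROR,
	SK_OTHER_CODE,
};

enum sketch_cdb_field {
	SK_CFO_STARTING_BYTE,
	SK_CFO_OBJECT_ID,
	SK_CFO_OTHER,
};

struct sense_sketch {
	enum sketch_additional_code additional_code;
	enum sketch_cdb_field cdb_field_offset;
};

/* Translate a decoded sense into a negative Linux errno (0 means OK). */
static int sense_to_errno(int failed, const struct sense_sketch *osi)
{
	assert(osi);

	if (!failed)
		return 0;

	if (osi->additional_code == SK_INVALID_FIELD_IN_CDB) {
		if (osi->cdb_field_offset == SK_CFO_STARTING_BYTE)
			return 0;		/* this is OK */
		else if (osi->cdb_field_offset == SK_CFO_OBJECT_ID)
			return -ENOENT;		/* no such object */
		return -EINVAL;
	}
	if (osi->additional_code == SK_QUOTA_ERROR)
		return -ENOSPC;

	return -EIO;
}
```

Using early returns keeps the "this is OK" case from being overwritten by a later branch.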

* Re: [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31
  2009-06-18 12:46 ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
  2009-06-18 12:55   ` [PATCH] open-osd: osdblk User Mode utility Boaz Harrosh
@ 2009-06-18 15:31   ` Jeff Garzik
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Garzik @ 2009-06-18 15:31 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: open-osd mailing-list, linux-fsdevel, linux-kernel

Boaz Harrosh wrote:
> On 06/17/2009 07:01 PM, Boaz Harrosh wrote:
>>   [PATCH 5/6] osdblk: a Linux block device for OSD objects
>>   [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits
>>
>> 	This is the proposed osdblk driver by Jeff Garzik. It as all the Kernel
>> 	pre-requisites, but is missing a user-mode tool and more testing. So I'm
>> 	not sure it will make it into this Kernel. But for review.
>>
> 
> Jeff hi.
> 
> I've hacked up a very quick and dirty small utility for Creating / Removing / Resizing
> objects on an OSD device. [I'm sending as reply to this mail]

Looks good to me!


> Please give it a fast testing. Should we now submit the osdblk driver, for 2.6.31?
> Which tree? I can push it through the open-osd.org tree. The driver was included for
> some weeks in linux-next through that tree.

I'll try to give it a test early next week, as my OSD simulator setup is 
completely disassembled at the moment.

Yes, please go ahead and push the driver.  In general, Linus accepts new 
drivers to the kernel at any time, even during a -rc, once the pre-reqs 
are upstream.

	Jeff





* Re: [osd-dev] [PATCH] open-osd: osdblk User Mode utility
  2009-06-18 12:55   ` [PATCH] open-osd: osdblk User Mode utility Boaz Harrosh
@ 2009-06-22 12:15     ` Boaz Harrosh
  0 siblings, 0 replies; 11+ messages in thread
From: Boaz Harrosh @ 2009-06-22 12:15 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-fsdevel, open-osd mailing-list, linux-kernel

On 06/18/2009 03:55 PM, Boaz Harrosh wrote:
> A minimal user-mode application to Create / Remove / Resize
> OSD objects from a device, for use with the osdblk.ko block
> device driver.
> 
> See inside patch for Usage instructions.
> 
> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
> ---

This version had a bug: if the partition did not exist,
it would fail to create the object.

I have squashed the patch below into this one, on the open-osd
git tree.

---
Subject: [PATCH] {SQUASHME} open-osd: usr/osdblk: Need to also create the partition

Fallout from testing the usr/osdblk application

Boaz
---
 usr/osdblk.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/usr/osdblk.c b/usr/osdblk.c
index f7df257..b9982c0 100644
--- a/usr/osdblk.c
+++ b/usr/osdblk.c
@@ -74,7 +74,7 @@ static void usage(void)
 	"       pid_no is the partition 64bit number of the object in question\n"
 	"       Both 0xabc hex or decimal anotation can be used\n"
 	"\n"
-	"--obj=obj_no | -o obj_no\n"
+	"--oid=obj_no | -o obj_no\n"
 	"       obj_no is the object 64bit number of the object in question\n"
 	"       Both 0xabc hex or decimal anotation can be used\n"
 	"\n"
@@ -198,6 +198,19 @@ static int do_create(struct osd_dev *od, struct osd_obj_id *obj, u64 size)
 		return -ENOMEM;
 
 	osdblk_make_credential(creds, obj, osd_req_is_ver1(or));
+
+	/* Create the partition; OK to fail (it may already exist) */
+	osd_req_create_partition(or, obj->partition);
+	ret = osdblk_exec(or, creds);
+	osd_end_request(or);
+
+	if (ret)
+		OSDBLK_INFO("pid=0x%llx exists\n", _LLU(obj->partition));
+
+	or = osd_start_request(od, GFP_KERNEL);
+	if (unlikely(!or))
+		return -ENOMEM;
+
 	osd_req_create_object(or, obj);
 	ret = osdblk_exec(or, creds);
 	osd_end_request(or);
@@ -280,7 +293,7 @@ int main(int argc, char *argv[])
 		{.name = "resize", .has_arg = 0, .flag = NULL, .val = 's'} ,
 		{.name = "pid", .has_arg = 1, .flag = NULL, .val =  'p'} ,
 		{.name = "oid", .has_arg = 1, .flag = NULL, .val =  'o'} ,
-		{.name = "size", .has_arg = 1, .flag = NULL, .val = 'l'} ,
+		{.name = "length", .has_arg = 1, .flag = NULL, .val = 'l'} ,
 
 		{.name = 0, .has_arg = 0, .flag = 0, .val = 0} ,
 	};
-- 
1.6.2.1
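
For reference, the option handling with the long names used in the usage text (--oid, --length) can be sketched as a small stand-alone function. The struct and function names are hypothetical and the length is kept as raw text for brevity; getopt_long() and struct option are the standard <getopt.h> interface. Note that the return value is kept in an int, because getopt_long() signals the end of the options with -1, which a plain char cannot represent on platforms where char is unsigned.

```c
#include <assert.h>
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

enum todo_sketch { TODO_NONE, TODO_CREATE, TODO_REMOVE, TODO_RESIZE };

/* Hypothetical container for the parsed command line. */
struct args_sketch {
	enum todo_sketch todo;
	uint64_t pid;
	uint64_t oid;
	const char *length;	/* raw text, e.g. "4G" */
	const char *dev;	/* e.g. /dev/osdX */
};

/* Returns 0 on success, -1 if the command line is unusable. */
static int parse_args_sketch(int argc, char *argv[], struct args_sketch *a)
{
	static const struct option opt[] = {
		{ "create", no_argument,       NULL, 'c' },
		{ "remove", no_argument,       NULL, 'r' },
		{ "resize", no_argument,       NULL, 's' },
		{ "pid",    required_argument, NULL, 'p' },
		{ "oid",    required_argument, NULL, 'o' },
		{ "length", required_argument, NULL, 'l' },
		{ NULL, 0, NULL, 0 },
	};
	int op;	/* int, not char: must be able to hold -1 */

	assert(a);

	while ((op = getopt_long(argc, argv, "csp:o:l:", opt, NULL)) != -1) {
		switch (op) {
		case 'c': a->todo = TODO_CREATE; break;
		case 'r': a->todo = TODO_REMOVE; break;
		case 's': a->todo = TODO_RESIZE; break;
		case 'p': a->pid = strtoull(optarg, NULL, 0); break;
		case 'o': a->oid = strtoull(optarg, NULL, 0); break;
		case 'l': a->length = optarg; break;
		default:  return -1;	/* unknown option */
		}
	}

	/* A command and a device path are both required. */
	if (optind >= argc || a->todo == TODO_NONE)
		return -1;

	a->dev = argv[optind];
	return 0;
}
```

A caller would typically print the usage text and exit non-zero when this returns -1.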



Thread overview: 11+ messages
2009-06-17 16:01 [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
2009-06-17 16:03 ` [PATCH 1/6] exofs: Fix bio leak in error handling path (sync read) Boaz Harrosh
2009-06-17 16:04 ` [PATCH 2/6] exofs: Remove IBM copyrights Boaz Harrosh
2009-06-17 16:05 ` [PATCH 3/6] exofs: Avoid using file_fsync() Boaz Harrosh
2009-06-17 16:06 ` [PATCH 4/6] MAINTAINERS: Add osd maintained files (F:) Boaz Harrosh
2009-06-17 16:07 ` [PATCH 5/6] osdblk: a Linux block device for OSD objects Boaz Harrosh
2009-06-17 16:07 ` [PATCH 6/6] osdblk: Adjust queue limits to lower device's limits Boaz Harrosh
2009-06-18 12:46 ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Boaz Harrosh
2009-06-18 12:55   ` [PATCH] open-osd: osdblk User Mode utility Boaz Harrosh
2009-06-22 12:15     ` [osd-dev] " Boaz Harrosh
2009-06-18 15:31   ` [osd-dev] [PATCHSET 0/6] exofs: few patches for Linux 2.6.31 Jeff Garzik
