linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs
@ 2012-09-25 15:30 Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 1/3] block: add back queue-private command filter Paolo Bonzini
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-09-25 15:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

The set of use cases for SG_IO is quite variable that no single filter can
accomodate all of them.  The current filter is tailored very much to
CD burning, and includes many MMC-specific commands that may have
other meanings in different standards.  Someone may want to remove
those commands; at the same time, people that trust their users may
want to add persistent reservations, trim/discard, or even access to
vendor-specific commands.

Filters used to be mutable via sysfs, but the implementation was
never enabled.  Add it back, and let the admin set this up per device.

A simple bitmap does not let you do things like enabling command A with
option B on a specific device for a certain block range.  However, the
question is really whether this is needed---in fact, neither of the known
uses for the filtering need it.

In one use case, the administrator then needs the ability to configure
devices easily, for example to be much more restrictive on non-MMC
devices.  It must be done with the same tools it uses for other aspects
of the policy---which will be a combination of DAC (Unix permissions and
ACLs) and sysfs.  Different SCSI standards may give different meanings
for the same byte value, but a simple bitmap is enough for this.

In the virtualization case, the problem is really that you want to
pass through everything or almost everything, while still running as
confined as possible (i.e. CAP_SYS_RAWIO is not a choice).  But in this
case a more complex filtering can be done just as easily in userspace,
in the virtual machine monitor.  While the userspace filter can be
subverted if the guest can escape the QEMU jail, the bitmap still lets
you block some commands at the kernel level if really necessary.

One alternative is a ioctl to disable the filter altogether, to be
used together with SCM_RIGHTS file descriptor passing.  This works in
the virtualization case but not for policy decisions.  So this patch
series provides the sysfs knob.  It is a tweaked revert of commit 018e044
(block: get rid of queue-private command filter, 2009-06-26).

Please review!

Paolo

v1->v2: add OOM and capability checks

Paolo Bonzini (3):
  block: add back queue-private command filter
  scsi: create an all-zero filter for scanners
  block: add back command filter modification via sysfs

 Documentation/block/queue-sysfs.txt |   16 +++++
 block/Kconfig                       |   10 +++
 block/blk-sysfs.c                   |   43 +++++++++++++
 block/bsg.c                         |    2 +-
 block/scsi_ioctl.c                  |  117 +++++++++++++++++++++++++++++++----
 drivers/scsi/scsi_scan.c            |    6 ++-
 drivers/scsi/sg.c                   |    7 +-
 include/linux/blkdev.h              |   31 +++++++++-
 8 files changed, 213 insertions(+), 19 deletions(-)


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 1/3] block: add back queue-private command filter
  2012-09-25 15:30 [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs Paolo Bonzini
@ 2012-09-25 15:30 ` Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 2/3] scsi: create an all-zero filter for scanners Paolo Bonzini
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-09-25 15:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

The command filter used to be mutable via sysfs, but this was broken
and backed out.  Let's add it back.  This patch adds the infrastructure
for filtering, but unlike the old code this one just adds a pointer to
request_queue, so as to make it cheaper in the majority of cases where
no special filtering is desired.

This is a partial (and massaged) revert of commit 018e044 (block: get
rid of queue-private command filter, 2009-06-26).

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/blk-sysfs.c      |    2 ++
 block/bsg.c            |    2 +-
 block/scsi_ioctl.c     |   17 +++++------------
 drivers/scsi/sg.c      |    4 +++-
 include/linux/blkdev.h |   10 +++++++++-
 5 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9628b29..5a0de07 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -487,6 +487,8 @@ static void blk_release_queue(struct kobject *kobj)
 
 	blkcg_exit_queue(q);
 
+	kfree(q->cmd_filter);
+
 	if (q->elevator) {
 		spin_lock_irq(q->queue_lock);
 		ioc_clear_queue(q);
diff --git a/block/bsg.c b/block/bsg.c
index ff64ae3..09956da 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -187,7 +187,7 @@ static int blk_fill_sgv4_hdr_rq(struct request_queue *q, struct request *rq,
 		return -EFAULT;
 
 	if (hdr->subprotocol == BSG_SUB_PROTOCOL_SCSI_CMD) {
-		if (blk_verify_command(rq->cmd, has_write_perm))
+		if (blk_verify_command(q->cmd_filter, rq->cmd, has_write_perm))
 			return -EPERM;
 	} else if (!capable(CAP_SYS_RAWIO))
 		return -EPERM;
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 9a87daa..c8862e9 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -33,11 +33,6 @@
 #include <scsi/scsi_ioctl.h>
 #include <scsi/scsi_cmnd.h>
 
-struct blk_cmd_filter {
-	unsigned long read_ok[BLK_SCSI_CMD_PER_LONG];
-	unsigned long write_ok[BLK_SCSI_CMD_PER_LONG];
-};
-
 static struct blk_cmd_filter blk_default_cmd_filter;
 
 /* Command group 3 is reserved and should never be used.  */
@@ -196,17 +191,15 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	__set_bit(GPCMD_SET_READ_AHEAD, filter->write_ok);
 }
 
-int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm)
+int blk_verify_command(struct blk_cmd_filter *filter,
+		       unsigned char *cmd, fmode_t has_write_perm)
 {
-	struct blk_cmd_filter *filter = &blk_default_cmd_filter;
-
 	/* root can do any command. */
 	if (capable(CAP_SYS_RAWIO))
 		return 0;
 
-	/* if there's no filter set, assume we're filtering everything out */
 	if (!filter)
-		return -EPERM;
+		filter = &blk_default_cmd_filter;
 
 	/* Anybody who can open the device can do a read-safe command */
 	if (test_bit(cmd[0], filter->read_ok))
@@ -225,7 +218,7 @@ static int blk_fill_sghdr_rq(struct request_queue *q, struct request *rq,
 {
 	if (copy_from_user(rq->cmd, hdr->cmdp, hdr->cmd_len))
 		return -EFAULT;
-	if (blk_verify_command(rq->cmd, mode & FMODE_WRITE))
+	if (blk_verify_command(q->cmd_filter, rq->cmd, mode & FMODE_WRITE))
 		return -EPERM;
 
 	/*
@@ -472,7 +465,7 @@ int sg_scsi_ioctl(struct request_queue *q, struct gendisk *disk, fmode_t mode,
 	if (in_len && copy_from_user(buffer, sic->data + cmdlen, in_len))
 		goto error;
 
-	err = blk_verify_command(rq->cmd, mode & FMODE_WRITE);
+	err = blk_verify_command(q->cmd_filter, rq->cmd, mode & FMODE_WRITE);
 	if (err)
 		goto error;
 
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 9c5c5f2..2ba7c82 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -217,11 +217,13 @@ static void sg_put_dev(Sg_device *sdp);
 static int sg_allow_access(struct file *filp, unsigned char *cmd)
 {
 	struct sg_fd *sfp = filp->private_data;
+	struct request_queue *q = sfp->parentdp->device->request_queue;
 
 	if (sfp->parentdp->device->type == TYPE_SCANNER)
 		return 0;
 
-	return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
+	return blk_verify_command(q->cmd_filter,
+				  cmd, filp->f_mode & FMODE_WRITE);
 }
 
 static int get_exclude(Sg_device *sdp)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4a2ab7c..b5c5f8a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -258,6 +258,11 @@ struct blk_queue_tag {
 #define BLK_SCSI_MAX_CMDS	(256)
 #define BLK_SCSI_CMD_PER_LONG	(BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
 
+struct blk_cmd_filter {
+	unsigned long read_ok[BLK_SCSI_CMD_PER_LONG];
+	unsigned long write_ok[BLK_SCSI_CMD_PER_LONG];
+};
+
 struct queue_limits {
 	unsigned long		bounce_pfn;
 	unsigned long		seg_boundary_mask;
@@ -423,6 +428,8 @@ struct request_queue {
 	struct bsg_class_device bsg_dev;
 #endif
 
+	struct blk_cmd_filter	*cmd_filter;
+
 #ifdef CONFIG_BLK_CGROUP
 	struct list_head	all_q_node;
 #endif
@@ -1005,7 +1012,8 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
 				    gfp_mask);
 }
 
-extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
+extern int blk_verify_command(struct blk_cmd_filter *filter,
+			      unsigned char *cmd, fmode_t has_write_perm);
 
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 2/3] scsi: create an all-zero filter for scanners
  2012-09-25 15:30 [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 1/3] block: add back queue-private command filter Paolo Bonzini
@ 2012-09-25 15:30 ` Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 3/3] block: add back command filter modification via sysfs Paolo Bonzini
  2012-10-04 10:12 ` [PATCH v2 0/3] block: add queue-private command filter, editable " Paolo Bonzini
  3 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-09-25 15:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

Using /dev/sg for scanners is blocked from unprivileged users.  Reimplement
this using customizable command filters, so that the sysfs knobs will work
in this case too.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	v1->v2: OOM check [Alan Cox]
		use GFP_ATOMIC, not GFP_KERNEL

 drivers/scsi/scsi_scan.c |    8 +++++++-
 drivers/scsi/sg.c        |    3 ---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 56a9379..81b1579 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -773,13 +773,19 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 	}
 
 	switch (sdev->type) {
+	case TYPE_SCANNER:
+		sdev->request_queue->cmd_filter =
+			kzalloc(sizeof(struct blk_cmd_filter), GFP_ATOMIC);
+		if (sdev->request_queue->cmd_filter == NULL)
+			return SCSI_SCAN_NO_RESPONSE;
+		/* fallthrough */
+
 	case TYPE_RBC:
 	case TYPE_TAPE:
 	case TYPE_DISK:
 	case TYPE_PRINTER:
 	case TYPE_MOD:
 	case TYPE_PROCESSOR:
-	case TYPE_SCANNER:
 	case TYPE_MEDIUM_CHANGER:
 	case TYPE_ENCLOSURE:
 	case TYPE_COMM:
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 2ba7c82..c7474f5 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -219,9 +219,6 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd)
 	struct sg_fd *sfp = filp->private_data;
 	struct request_queue *q = sfp->parentdp->device->request_queue;
 
-	if (sfp->parentdp->device->type == TYPE_SCANNER)
-		return 0;
-
 	return blk_verify_command(q->cmd_filter,
 				  cmd, filp->f_mode & FMODE_WRITE);
 }
-- 
1.7.1




^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 3/3] block: add back command filter modification via sysfs
  2012-09-25 15:30 [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 1/3] block: add back queue-private command filter Paolo Bonzini
  2012-09-25 15:30 ` [PATCH v2 2/3] scsi: create an all-zero filter for scanners Paolo Bonzini
@ 2012-09-25 15:30 ` Paolo Bonzini
  2012-10-04 10:12 ` [PATCH v2 0/3] block: add queue-private command filter, editable " Paolo Bonzini
  3 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-09-25 15:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

This adds two new sysfs attributes to the queue kobject.  The attributes
allow reading and writing the whitelist of unprivileged commands.

This is again a bit different from what was removed in commit 018e044
(block: get rid of queue-private command filter, 2009-06-26), but the idea
is the same.  One difference is that it does not use a separate kobject.
Also, the supported sysfs syntax is a bit more expressive: it includes
ranges, the ability to replace all of the filter with a single command,
and does not force usage of hexadecimal.

Since the names are different, and the old ones were anyway never enabled
(they were always #if 0-ed), the different API is not a problem.

Cc: linux-scsi@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	v2->v3: OOM and capability check [Alan Cox]

 Documentation/block/queue-sysfs.txt |   16 ++++++
 block/Kconfig                       |   10 ++++
 block/blk-sysfs.c                   |   41 ++++++++++++++
 block/scsi_ioctl.c                  |  103 +++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h              |   21 +++++++
 5 files changed, 191 insertions(+), 0 deletions(-)

diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt
index e54ac1d..1152b38 100644
--- a/Documentation/block/queue-sysfs.txt
+++ b/Documentation/block/queue-sysfs.txt
@@ -133,6 +133,22 @@ control of this block device to that new IO scheduler. Note that writing
 an IO scheduler name to this file will attempt to load that IO scheduler
 module, if it isn't already present in the system.
 
+sgio_read_filter (RW)
+---------------------
+When read, this file will display a list of SCSI commands (i.e. values of
+the first byte of a CDB) that are always available for unprivileged users
+(via /dev/bsg, /dev/sgNN, or ioctls such as SG_IO and CDROM_SEND_PACKET).
+When written, the list of commands will be modified.  By default it
+will be completely replaced; writing a string that begins with '+' will
+add new commands, and writing a string that begins with '-' will remove
+some commands.  Ranges of commands are supported, for example '0x00-0xff'.
+
+sgio_write_filter (RW)
+----------------------
+When read, this file will display a list of SCSI commands (i.e. values of
+the first byte of a CDB) that are available for unprivileged users
+when the block device is open for writing.  Writing to this file behaves
+as for sgio_read_filter.
 
 
 Jens Axboe <jens.axboe@oracle.com>, February 2009
diff --git a/block/Kconfig b/block/Kconfig
index 09acf1b..7c369d9 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -65,6 +65,16 @@ config BLK_DEV_BSG
 
 	  If unsure, say Y.
 
+config BLK_DEV_SG_FILTER_SYSFS
+	bool "Customizable SG_IO filters in sysfs"
+	default y
+	help
+	  Saying Y here will let you use sysfs to customize the list
+	  of SCSI commands that are available (via /dev/sg, /dev/bsg or
+	  ioctls such as SG_IO) to unprivileged users.
+
+	  If unsure, say Y.
+
 config BLK_DEV_BSGLIB
 	bool "Block layer SG support v4 helper lib"
 	default n
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 5a0de07..b287d62 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -394,6 +394,43 @@ static struct queue_sysfs_entry queue_random_entry = {
 	.store = queue_store_random,
 };
 
+#ifdef CONFIG_BLK_DEV_SG_FILTER_SYSFS
+static ssize_t queue_sgio_filter_read_show(struct request_queue *q, char *page)
+{
+	return blk_filter_show(q, page, READ);
+}
+
+static ssize_t queue_sgio_filter_write_show(struct request_queue *q,
+				 char *page)
+{
+	return blk_filter_show(q, page, WRITE);
+}
+
+static ssize_t queue_sgio_filter_read_store(struct request_queue *q,
+				    const char *page, size_t count)
+{
+	return blk_filter_store(q, page, count, READ);
+}
+
+static ssize_t queue_sgio_filter_write_store(struct request_queue *q,
+				     const char *page, size_t count)
+{
+	return blk_filter_store(q, page, count, WRITE);
+}
+
+static struct queue_sysfs_entry queue_sgio_filter_read_entry = {
+	.attr = { .name = "sgio_filter_read", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_sgio_filter_read_show,
+	.store = queue_sgio_filter_read_store,
+};
+
+static struct queue_sysfs_entry queue_sgio_filter_write_entry = {
+	.attr = {.name = "sgio_filter_write", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_sgio_filter_write_show,
+	.store = queue_sgio_filter_write_store,
+};
+#endif
+
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
@@ -416,6 +453,10 @@ static struct attribute *default_attrs[] = {
 	&queue_rq_affinity_entry.attr,
 	&queue_iostats_entry.attr,
 	&queue_random_entry.attr,
+#ifdef CONFIG_BLK_DEV_SG_FILTER_SYSFS
+	&queue_sgio_filter_read_entry.attr,
+	&queue_sgio_filter_write_entry.attr,
+#endif
 	NULL,
 };
 
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index c8862e9..d945b2a 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -18,6 +18,7 @@
  */
 #include <linux/kernel.h>
 #include <linux/errno.h>
+#include <linux/ctype.h>
 #include <linux/string.h>
 #include <linux/module.h>
 #include <linux/blkdev.h>
@@ -110,6 +111,8 @@ static int sg_emulated_host(struct request_queue *q, int __user *p)
 
 static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 {
+	memset(filter, 0, sizeof(*filter));
+
 	/* Basic read-only commands */
 	__set_bit(TEST_UNIT_READY, filter->read_ok);
 	__set_bit(REQUEST_SENSE, filter->read_ok);
@@ -738,6 +741,109 @@ int scsi_cmd_blk_ioctl(struct block_device *bd, fmode_t mode,
 }
 EXPORT_SYMBOL(scsi_cmd_blk_ioctl);
 
+#ifdef CONFIG_BLK_DEV_SG_FILTER_SYSFS
+ssize_t blk_filter_show(struct request_queue *q, char *page, int rw)
+{
+	struct blk_cmd_filter *filter;
+	char *p = page;
+	unsigned long *okbits;
+	int i;
+
+	filter = q->cmd_filter;
+	if (!filter)
+		filter = &blk_default_cmd_filter;
+
+	if (rw == READ)
+		okbits = filter->read_ok;
+	else
+		okbits = filter->write_ok;
+
+	for (i = 0; i < BLK_SCSI_MAX_CMDS; i++)
+		if (test_bit(i, okbits))
+			p += sprintf(p, "0x%02x ", i);
+
+	if (p > page)
+		p[-1] = '\n';
+
+	return p - page;
+}
+EXPORT_SYMBOL_GPL(blk_filter_show);
+
+ssize_t blk_filter_store(struct request_queue *q,
+			 const char *page, size_t count, int rw)
+{
+	unsigned long okbits[BLK_SCSI_CMD_PER_LONG], *target_okbits;
+	bool set;
+	const char *p = page;
+	char *endp;
+	int start = -1, cmd;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!q->cmd_filter) {
+		q->cmd_filter = kmalloc(sizeof(struct blk_cmd_filter),
+					GFP_KERNEL);
+		if (!q->cmd_filter)
+			return -ENOMEM;
+
+		blk_set_cmd_filter_defaults(q->cmd_filter);
+	}
+
+	if (rw == READ)
+		target_okbits = q->cmd_filter->read_ok;
+	else
+		target_okbits = q->cmd_filter->write_ok;
+
+	set = (p[0] != '-');
+	if (p[0] != '-' && p[0] != '+')
+		memset(okbits, 0, sizeof(okbits));
+	else {
+		memcpy(okbits, target_okbits, sizeof(okbits));
+		p++;
+	}
+
+	while (p < page + count) {
+		if (start == -1 && isspace(*p)) {
+			p++;
+			continue;
+		}
+
+		cmd = simple_strtol(p, &endp, 0);
+
+		/* all of these cases means invalid input, so do nothing. */
+		if (endp == p || cmd < 0 || cmd >= BLK_SCSI_MAX_CMDS ||
+		    (start != -1 && endp < page + count && !isspace(*endp)))
+			return -EINVAL;
+
+		p = endp;
+		if (p < page + count && *p == '-') {
+			BUG_ON(start != -1);
+			start = cmd;
+			p++;
+			continue;
+		}
+
+		if (start == -1)
+			start = cmd;
+
+		for (; start <= cmd; start++)
+			if (set)
+				__set_bit(start, okbits);
+			else
+				__clear_bit(start, okbits);
+		start = -1;
+	}
+
+	if (start != -1)
+		return -EINVAL;
+
+	memcpy(target_okbits, okbits, sizeof(okbits));
+	return count;
+}
+EXPORT_SYMBOL_GPL(blk_filter_store);
+#endif
+
 static int __init blk_scsi_ioctl_init(void)
 {
 	blk_set_cmd_filter_defaults(&blk_default_cmd_filter);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b5c5f8a..18e7c25 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1012,9 +1012,30 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
 				    gfp_mask);
 }
 
+/*
+ * command filter functions
+ */
 extern int blk_verify_command(struct blk_cmd_filter *filter,
 			      unsigned char *cmd, fmode_t has_write_perm);
 
+#ifdef CONFIG_BLK_DEV_SG_FILTER_SYSFS
+ssize_t blk_filter_show(struct request_queue *q, char *page, int rw);
+ssize_t blk_filter_store(struct request_queue *q,
+			 const char *page, size_t count, int rw);
+#else
+static inline ssize_t blk_filter_show(struct request_queue *q, char *page, int rw)
+{
+	return -EINVAL;
+}
+
+static inline ssize_t blk_filter_store(struct request_queue *q,
+				       const char *page, size_t count, int rw);
+{
+	return -EINVAL;
+}
+#endif /* CONFIG_BLK_DEV_SG_FILTER_SYSFS */
+
+
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
 	BLK_SAFE_MAX_SECTORS	= 255,
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs
  2012-09-25 15:30 [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs Paolo Bonzini
                   ` (2 preceding siblings ...)
  2012-09-25 15:30 ` [PATCH v2 3/3] block: add back command filter modification via sysfs Paolo Bonzini
@ 2012-10-04 10:12 ` Paolo Bonzini
  2012-10-19  0:22   ` Tejun Heo
  3 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2012-10-04 10:12 UTC (permalink / raw)
  Cc: linux-kernel, Tejun Heo, James Bottomley, Jens Axboe,
	Ric Wheeler, Alan Cox, linux-scsi

Il 25/09/2012 17:30, Paolo Bonzini ha scritto:
> The set of use cases for SG_IO is quite variable that no single filter can
> accomodate all of them.  The current filter is tailored very much to
> CD burning, and includes many MMC-specific commands that may have
> other meanings in different standards.  Someone may want to remove
> those commands; at the same time, people that trust their users may
> want to add persistent reservations, trim/discard, or even access to
> vendor-specific commands.
> 
> Filters used to be mutable via sysfs, but the implementation was
> never enabled.  Add it back, and let the admin set this up per device.
> 
> A simple bitmap does not let you do things like enabling command A with
> option B on a specific device for a certain block range.  However, the
> question is really whether this is needed---in fact, neither of the known
> uses for the filtering need it.
> 
> In one use case, the administrator then needs the ability to configure
> devices easily, for example to be much more restrictive on non-MMC
> devices.  It must be done with the same tools it uses for other aspects
> of the policy---which will be a combination of DAC (Unix permissions and
> ACLs) and sysfs.  Different SCSI standards may give different meanings
> for the same byte value, but a simple bitmap is enough for this.
> 
> In the virtualization case, the problem is really that you want to
> pass through everything or almost everything, while still running as
> confined as possible (i.e. CAP_SYS_RAWIO is not a choice).  But in this
> case a more complex filtering can be done just as easily in userspace,
> in the virtual machine monitor.  While the userspace filter can be
> subverted if the guest can escape the QEMU jail, the bitmap still lets
> you block some commands at the kernel level if really necessary.
> 
> One alternative is a ioctl to disable the filter altogether, to be
> used together with SCM_RIGHTS file descriptor passing.  This works in
> the virtualization case but not for policy decisions.  So this patch
> series provides the sysfs knob.  It is a tweaked revert of commit 018e044
> (block: get rid of queue-private command filter, 2009-06-26).
> 
> Please review!
> 
> Paolo
> 
> v1->v2: add OOM and capability checks
> 
> Paolo Bonzini (3):
>   block: add back queue-private command filter
>   scsi: create an all-zero filter for scanners
>   block: add back command filter modification via sysfs
> 
>  Documentation/block/queue-sysfs.txt |   16 +++++
>  block/Kconfig                       |   10 +++
>  block/blk-sysfs.c                   |   43 +++++++++++++
>  block/bsg.c                         |    2 +-
>  block/scsi_ioctl.c                  |  117 +++++++++++++++++++++++++++++++----
>  drivers/scsi/scsi_scan.c            |    6 ++-
>  drivers/scsi/sg.c                   |    7 +-
>  include/linux/blkdev.h              |   31 +++++++++-
>  8 files changed, 213 insertions(+), 19 deletions(-)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Ping?

Tejun, Jens, anyone else, any hope that this gets in 3.7?

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs
  2012-10-04 10:12 ` [PATCH v2 0/3] block: add queue-private command filter, editable " Paolo Bonzini
@ 2012-10-19  0:22   ` Tejun Heo
  2012-10-19  9:07     ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-10-19  0:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

Hello, Paolo.

Sorry about the delay.  -EWASTRAVELING and it seems Jens was too.

On Thu, Oct 04, 2012 at 12:12:12PM +0200, Paolo Bonzini wrote:
> Il 25/09/2012 17:30, Paolo Bonzini ha scritto:
> > The set of use cases for SG_IO is quite variable that no single filter can
> > accomodate all of them.  The current filter is tailored very much to
> > CD burning, and includes many MMC-specific commands that may have
> > other meanings in different standards.  Someone may want to remove
> > those commands; at the same time, people that trust their users may
> > want to add persistent reservations, trim/discard, or even access to
> > vendor-specific commands.
> > 
> > Filters used to be mutable via sysfs, but the implementation was
> > never enabled.  Add it back, and let the admin set this up per device.
> > 
> > A simple bitmap does not let you do things like enabling command A with
> > option B on a specific device for a certain block range.  However, the
> > question is really whether this is needed---in fact, neither of the known
> > uses for the filtering need it.
> > 
> > In one use case, the administrator then needs the ability to configure
> > devices easily, for example to be much more restrictive on non-MMC
> > devices.  It must be done with the same tools it uses for other aspects
> > of the policy---which will be a combination of DAC (Unix permissions and
> > ACLs) and sysfs.  Different SCSI standards may give different meanings
> > for the same byte value, but a simple bitmap is enough for this.
> > 
> > In the virtualization case, the problem is really that you want to
> > pass through everything or almost everything, while still running as
> > confined as possible (i.e. CAP_SYS_RAWIO is not a choice).  But in this
> > case a more complex filtering can be done just as easily in userspace,
> > in the virtual machine monitor.  While the userspace filter can be
> > subverted if the guest can escape the QEMU jail, the bitmap still lets
> > you block some commands at the kernel level if really necessary.
> > 
> > One alternative is a ioctl to disable the filter altogether, to be
> > used together with SCM_RIGHTS file descriptor passing.  This works in
> > the virtualization case but not for policy decisions.  So this patch
> > series provides the sysfs knob.  It is a tweaked revert of commit 018e044
> > (block: get rid of queue-private command filter, 2009-06-26).

If your use case at hand is mostly virtualization, I personally would
prefer the latter solution instead of extending in-kernel SG_IO
filter.  It could be that I'm just not aware but I don't think there
have been too many complaints regarding other use cases.  This type of
deep inspection and filtering has fairly strong tendency to be
unpopular.  Userland configurable one is even scarier and I don't
think too many would know and make use of it in the end.

Jens, what are your thoughts?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs
  2012-10-19  0:22   ` Tejun Heo
@ 2012-10-19  9:07     ` Paolo Bonzini
       [not found]       ` <2007908429.13363375.1350637872646.JavaMail.root@redhat.com>
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2012-10-19  9:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, James Bottomley, Jens Axboe, Ric Wheeler, Alan Cox,
	linux-scsi

> > In one use case, the administrator then needs the ability to configure
> > devices easily, for example to be much more restrictive on non-MMC
> > devices.  It must be done with the same tools it uses for other
> > aspects of the policy---which will be a combination of DAC (Unix
> > permissions and ACLs) and sysfs.  Different SCSI standards may give different
> > meanings for the same byte value, but a simple bitmap is enough for this.
> > 
> > In the virtualization case, the problem is really that you want to
> > pass through everything or almost everything, while still running
> > as confined as possible (i.e. CAP_SYS_RAWIO is not a choice).  But
> > in this case a more complex filtering can be done just as easily in
> > userspace, in the virtual machine monitor.
> > 
> > One alternative is a ioctl to disable the filter altogether, to be
> > used together with SCM_RIGHTS file descriptor passing.  This works in
> > the virtualization case but not for policy decisions.  So this patch
> > series provides the sysfs knob.  It is a tweaked revert of commit
> > 018e044 (block: get rid of queue-private command filter, 2009-06-26).
> 
> If your use case at hand is mostly virtualization, I personally would
> prefer the latter solution instead of extending in-kernel SG_IO
> filter.  It could be that I'm just not aware but I don't think there
> have been too many complaints regarding other use cases.  This type
> of deep inspection and filtering has fairly strong tendency to be
> unpopular.  Userland configurable one is even scarier and I don't
> think too many would know and make use of it in the end.

Actually I plan to set up the filtering in udev, so that we need not
pass through many MMC commands (some of which have completely different
meanings) to SBC devices.  So it would really be useful.

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
       [not found]               ` <5088EC43.2010600@redhat.com>
@ 2012-10-25 18:00                 ` Tejun Heo
  2012-10-25 18:35                   ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-10-25 18:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

(restoring cc lists)

Hey, Paolo.

On Thu, Oct 25, 2012 at 09:37:39AM +0200, Paolo Bonzini wrote:
> Il 24/10/2012 18:47, Tejun Heo ha scritto:
> > So, I'm still not convinced we need to go forward with full
> > configurability. All use cases you described can be covered with
> > per-class static filters + simple override switch to disable all,
> > which would result in a lot simpler implementation w/ much smaller
> > userland interface.
> 
> I'm not sure the userland interface would be smaller, and it would be
> more complex to get right:
> 
> 1) how do you override the default?  ioctl+SCM_RIGHTS or sysfs?

Disabling filters if opened by root and tranfering via SCM_RIGHTS
would be the simplest interface-wise (there's no new interface at
all).  Would that be too dangerous security-wise?

> 2) do you need to override the default to "no access", "full access" and
> "default access", or is a binary knob (default access/full access)
> sufficient?

Default / full should be enough, no?

> 3) what capabilities control the setting?

CAP_SYS_RAWIO seems to be a pretty good fit.

> > What's the rationale for full configurability?
> 
> Depending on the level of trust you have in your users, there are
> different policies that are applicable.  Even virtualization could have
> a range of choices like "permit only standard operations", "also permit
> UNMAP", "also permit persistent reservations", "permit everything
> including vendor specific commands"

I guess I just feel quite reluctant to expose another rather obscure
userland configurable in-kernel filter and at the same time I'm not
sure whether this is flexible enough.  What if a device is shared by
multiple virtual machines which are trusted at different levels?  What
if we end up actually having to filter cdb contents?

I'm not trying to block it at all cost but let's make sure we looked
into most possibilities before (re)adding this userland visible
interface.

Jens, James, what do you guys think?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-10-25 18:00                 ` setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs) Tejun Heo
@ 2012-10-25 18:35                   ` Paolo Bonzini
  2012-10-31 12:52                     ` Paolo Bonzini
  2012-10-31 21:22                     ` Tejun Heo
  0 siblings, 2 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-10-25 18:35 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley


> On Thu, Oct 25, 2012 at 09:37:39AM +0200, Paolo Bonzini wrote:
> > Il 24/10/2012 18:47, Tejun Heo ha scritto:
> > > So, I'm still not convinced we need to go forward with full
> > > configurability. All use cases you described can be covered with
> > > per-class static filters + simple override switch to disable all,
> > > which would result in a lot simpler implementation w/ much
> > > smaller userland interface.
> > 
> > I'm not sure the userland interface would be smaller, and it would
> > be more complex to get right:
> > 
> > 1) how do you override the default?  ioctl+SCM_RIGHTS or sysfs?
> 
> Disabling filters if opened by root and tranfering via SCM_RIGHTS
> would be the simplest interface-wise (there's no new interface at
> all).  Would that be too dangerous security-wise?

That would be a change with respect to what we have now.  After
transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
descriptor to an unprivileged process your SG_IO commands get
filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.

> > 2) do you need to override the default to "no access", "full
> > access" and "default access", or is a binary knob (default
> > access/full access) sufficient?
> 
> Default / full should be enough, no?

If a ioctl has to be added, I'd rather have at least none/full/default.

> > 3) what capabilities control the setting?
> 
> CAP_SYS_RAWIO seems to be a pretty good fit.

Yes, for a ioctl it is (for sysfs CAP_SYS_ADMIN is better IMHO).

> I guess I just feel quite reluctant to expose another rather obscure
> userland configurable in-kernel filter and at the same time I'm not
> sure whether this is flexible enough.  What if a device is shared by
> multiple virtual machines which are trusted at different levels?

No, you just don't do that.  If a device is passed through to virtual
machines, it is between similar virtual machines (for some definition
of similar).  The only case where you have this sharing is in practice
if either the device is read-only (my patch does give you a basic
two-level filtering, with two separate bitmaps for RO and RW) or if you
allow persistent reservations (which is as close to full trust as you
can get).

> I'm not trying to block it at all cost but let's make sure we looked
> into most possibilities before (re)adding this userland visible
> interface.

Sure, understood. :)

> Jens, James, what do you guys think?

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-10-25 18:35                   ` Paolo Bonzini
@ 2012-10-31 12:52                     ` Paolo Bonzini
  2012-10-31 21:22                     ` Tejun Heo
  1 sibling, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-10-31 12:52 UTC (permalink / raw)
  Cc: Tejun Heo, Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Ping?

Paolo

Il 25/10/2012 20:35, Paolo Bonzini ha scritto:
>> On Thu, Oct 25, 2012 at 09:37:39AM +0200, Paolo Bonzini wrote:
>>> Il 24/10/2012 18:47, Tejun Heo ha scritto:
>>>> So, I'm still not convinced we need to go forward with full
>>>> configurability. All use cases you described can be covered with
>>>> per-class static filters + simple override switch to disable all,
>>>> which would result in a lot simpler implementation w/ much
>>>> smaller userland interface.
>>>
>>> I'm not sure the userland interface would be smaller, and it would
>>> be more complex to get right:
>>>
>>> 1) how do you override the default?  ioctl+SCM_RIGHTS or sysfs?
>>
>> Disabling filters if opened by root and tranfering via SCM_RIGHTS
>> would be the simplest interface-wise (there's no new interface at
>> all).  Would that be too dangerous security-wise?
> 
> That would be a change with respect to what we have now.  After
> transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
> descriptor to an unprivileged process your SG_IO commands get
> filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.
> 
>>> 2) do you need to override the default to "no access", "full
>>> access" and "default access", or is a binary knob (default
>>> access/full access) sufficient?
>>
>> Default / full should be enough, no?
> 
> If a ioctl has to be added, I'd rather have at least none/full/default.
> 
>>> 3) what capabilities control the setting?
>>
>> CAP_SYS_RAWIO seems to be a pretty good fit.
> 
> Yes, for a ioctl it is (for sysfs CAP_SYS_ADMIN is better IMHO).
> 
>> I guess I just feel quite reluctant to expose another rather obscure
>> userland configurable in-kernel filter and at the same time I'm not
>> sure whether this is flexible enough.  What if a device is shared by
>> multiple virtual machines which are trusted at different levels?
> 
> No, you just don't do that.  If a device is passed through to virtual
> machines, it is between similar virtual machines (for some definition
> of similar).  The only case where you have this sharing is in practice
> if either the device is read-only (my patch does give you a basic
> two-level filtering, with two separate bitmaps for RO and RW) or if you
> allow persistent reservations (which is as close to full trust as you
> can get).
> 
>> I'm not trying to block it at all cost but let's make sure we looked
>> into most possibilities before (re)adding this userland visible
>> interface.
> 
> Sure, understood. :)
> 
>> Jens, James, what do you guys think?
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-10-25 18:35                   ` Paolo Bonzini
  2012-10-31 12:52                     ` Paolo Bonzini
@ 2012-10-31 21:22                     ` Tejun Heo
  2012-11-02 14:49                       ` Paolo Bonzini
  1 sibling, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-10-31 21:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Hello, Paolo.

On Thu, Oct 25, 2012 at 02:35:20PM -0400, Paolo Bonzini wrote:
> > Disabling filters if opened by root and tranfering via SCM_RIGHTS
> > would be the simplest interface-wise (there's no new interface at
> > all).  Would that be too dangerous security-wise?
> 
> That would be a change with respect to what we have now.  After
> transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
> descriptor to an unprivileged process your SG_IO commands get
> filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.

Yeah, I get that it's a behavior change, but would that be a problem?

> > I guess I just feel quite reluctant to expose another rather obscure
> > userland configurable in-kernel filter and at the same time I'm not
> > sure whether this is flexible enough.  What if a device is shared by
> > multiple virtual machines which are trusted at different levels?
> 
> No, you just don't do that.  If a device is passed through to virtual
> machines, it is between similar virtual machines (for some definition
> of similar).  The only case where you have this sharing is in practice
> if either the device is read-only (my patch does give you a basic
> two-level filtering, with two separate bitmaps for RO and RW) or if you
> allow persistent reservations (which is as close to full trust as you
> can get).

What disturbs me is that it's a completely new interface to userland
and at the same a very limited one at that.  So, yeah, it's
bothersome.  I personally would prefer SCM_RIGHTS behavior change +
hard coded filters per device class.

But, I'd really like to hear what other guys are thinking.  Jens?
Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? :P

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-10-31 21:22                     ` Tejun Heo
@ 2012-11-02 14:49                       ` Paolo Bonzini
  2012-11-02 15:35                         ` Alan Cox
  2012-11-02 16:51                         ` Tejun Heo
  0 siblings, 2 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-11-02 14:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Il 31/10/2012 22:22, Tejun Heo ha scritto:
> Hello, Paolo.
> 
> On Thu, Oct 25, 2012 at 02:35:20PM -0400, Paolo Bonzini wrote:
>>> Disabling filters if opened by root and tranfering via SCM_RIGHTS
>>> would be the simplest interface-wise (there's no new interface at
>>> all).  Would that be too dangerous security-wise?
>>
>> That would be a change with respect to what we have now.  After
>> transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
>> descriptor to an unprivileged process your SG_IO commands get
>> filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.
> 
> Yeah, I get that it's a behavior change, but would that be a problem?

Worse, it's a potential security hole because previously you'd get
filtering and now you wouldn't.

Considering that SCM_RIGHTS is usually used to transfer a file
descriptor from a privileged process to an unprivileged one, I'd be very
worried of that.

>>> I guess I just feel quite reluctant to expose another rather obscure
>>> userland configurable in-kernel filter and at the same time I'm not
>>> sure whether this is flexible enough.  What if a device is shared by
>>> multiple virtual machines which are trusted at different levels?
>>
>> No, you just don't do that.  If a device is passed through to virtual
>> machines, it is between similar virtual machines (for some definition
>> of similar).  The only case where you have this sharing is in practice
>> if either the device is read-only (my patch does give you a basic
>> two-level filtering, with two separate bitmaps for RO and RW) or if you
>> allow persistent reservations (which is as close to full trust as you
>> can get).
> 
> What disturbs me is that it's a completely new interface to userland
> and at the same a very limited one at that.  So, yeah, it's
> bothersome.  I personally would prefer SCM_RIGHTS behavior change +
> hard coded filters per device class.

I think hard-coded filters are bad (I prefer to move policy to
userspace), and SCM_RIGHTS without a ioctl is out of question, really.

> But, I'd really like to hear what other guys are thinking.  Jens?
> Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? Jens? :P

:P

Paolo


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 14:49                       ` Paolo Bonzini
@ 2012-11-02 15:35                         ` Alan Cox
  2012-11-02 16:48                           ` Tejun Heo
  2012-11-02 16:51                         ` Tejun Heo
  1 sibling, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-02 15:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Tejun Heo, Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

On Fri, 02 Nov 2012 15:49:02 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 31/10/2012 22:22, Tejun Heo ha scritto:
> > Hello, Paolo.
> > 
> > On Thu, Oct 25, 2012 at 02:35:20PM -0400, Paolo Bonzini wrote:
> >>> Disabling filters if opened by root and tranfering via SCM_RIGHTS
> >>> would be the simplest interface-wise (there's no new interface at
> >>> all).  Would that be too dangerous security-wise?
> >>
> >> That would be a change with respect to what we have now.  After
> >> transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
> >> descriptor to an unprivileged process your SG_IO commands get
> >> filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.
> > 
> > Yeah, I get that it's a behavior change, but would that be a problem?
> 
> Worse, it's a potential security hole because previously you'd get
> filtering and now you wouldn't.
> 
> Considering that SCM_RIGHTS is usually used to transfer a file
> descriptor from a privileged process to an unprivileged one, I'd be very
> worried of that.

In other contexts you inherit file handles via exec and having a "root
opened so its special" model is bad. Historically it led to things like
the rlogin/rsh hacks on SunOS and friends where a program run by the rsh
daemon got a root opened socket as its stdin/out and could issue ifconfig
ioctls on it at will.

Not a good model. Any removal of filters and passing them to a task
should be explicit. The behaviour really ought to be to permit the
intentional setting of explicit filters then passing them, not touch the
default behaviour.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 15:35                         ` Alan Cox
@ 2012-11-02 16:48                           ` Tejun Heo
  2012-11-02 17:21                             ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 16:48 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hey, Alan, Paolo.

On Fri, Nov 02, 2012 at 03:35:30PM +0000, Alan Cox wrote:
> > >> That would be a change with respect to what we have now.  After
> > >> transferring a root-opened (better: CAP_SYS_RAWIO-opened) file
> > >> descriptor to an unprivileged process your SG_IO commands get
> > >> filtered.  So a ioctl is needed if you want to rely on SCM_RIGHTS.
> > > 
> > > Yeah, I get that it's a behavior change, but would that be a problem?
> > 
> > Worse, it's a potential security hole because previously you'd get
> > filtering and now you wouldn't.
> > 
> > Considering that SCM_RIGHTS is usually used to transfer a file
> > descriptor from a privileged process to an unprivileged one, I'd be very
> > worried of that.
> 
> In other contexts you inherit file handles via exec and having a "root
> opened so its special" model is bad. Historically it led to things like
> the rlogin/rsh hacks on SunOS and friends where a program run by the rsh
> daemon got a root opened socket as its stdin/out and could issue ifconfig
> ioctls on it at will.
> 
> Not a good model. Any removal of filters and passing them to a task
> should be explicit. The behaviour really ought to be to permit the
> intentional setting of explicit filters then passing them, not touch the
> default behaviour.

Yeah, well, then I guess it'll have to be a separate ioctl to switch
SG_IO for !root users.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 14:49                       ` Paolo Bonzini
  2012-11-02 15:35                         ` Alan Cox
@ 2012-11-02 16:51                         ` Tejun Heo
  2012-11-02 17:49                           ` Paolo Bonzini
  1 sibling, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 16:51 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Hey, Paolo.

On Fri, Nov 02, 2012 at 03:49:02PM +0100, Paolo Bonzini wrote:
> > Yeah, I get that it's a behavior change, but would that be a problem?
> 
> Worse, it's a potential security hole because previously you'd get
> filtering and now you wouldn't.
> 
> Considering that SCM_RIGHTS is usually used to transfer a file
> descriptor from a privileged process to an unprivileged one, I'd be very
> worried of that.

Yeah, I know it's a security thing, was wondering how bad it was.  So,
if we choose this, I guess we'll need an ioctl to switch userland
SG_IO filtering.

> > What disturbs me is that it's a completely new interface to userland
> > and at the same a very limited one at that.  So, yeah, it's
> > bothersome.  I personally would prefer SCM_RIGHTS behavior change +
> > hard coded filters per device class.
> 
> I think hard-coded filters are bad (I prefer to move policy to
> userspace), and SCM_RIGHTS without a ioctl is out of question, really.

No rule is really absolute.  To me, it seems the suggested in-kernel
per-device command code filter is both too big for the given problem
while being too limited for much beyond that.  So, if we can get away
with adding an ioctl, I personally think that would be a better
approach.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 16:48                           ` Tejun Heo
@ 2012-11-02 17:21                             ` Alan Cox
  2012-11-02 17:30                               ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-02 17:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

> > Not a good model. Any removal of filters and passing them to a task
> > should be explicit. The behaviour really ought to be to permit the
> > intentional setting of explicit filters then passing them, not touch the
> > default behaviour.
> 
> Yeah, well, then I guess it'll have to be a separate ioctl to switch
> SG_IO for !root users.


My first thought would be to have the basic behaviour as


allowed 	IFF	passes user filter
		&&	CAP_SYS_RAWIO || passes 'root' filter


that allows untrusted to also push unprivileged filters for their own
purposes (consider things like exokernel experiments or just trying to
ensure a raw disk emulation doesn't go wrong). The default user feature
would be 'allow anything'.

then add a way to 'set' the root filter only if you have CAP_SYS_RAWIO
with the default 'root' filter being the current hardcoded filter.

That also means that a normal app running as superuser for some reason
would set its user filter and any accidentally inherited descriptors will
be less dangerous as the are today. It also means a CAP_SYS_RAWIO capable
app can still use filters itself as good programming practise.

It effectively means you have to deliberately and intentionally set up an
'inherited' extra rights case.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 17:21                             ` Alan Cox
@ 2012-11-02 17:30                               ` Tejun Heo
  2012-11-02 20:18                                 ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 17:30 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hey, Alan.

On Fri, Nov 02, 2012 at 05:21:45PM +0000, Alan Cox wrote:
> That also means that a normal app running as superuser for some reason
> would set its user filter and any accidentally inherited descriptors will
> be less dangerous as the are today. It also means a CAP_SYS_RAWIO capable
> app can still use filters itself as good programming practise.
> 
> It effectively means you have to deliberately and intentionally set up an
> 'inherited' extra rights case.

The last part, I agree, but in general I think what you're describing
is way too elaborate for the problem at hand.  It's like adding
arbitrary range-filter for /dev/sdX which can be overridden by
userland.  You sure can find use case for such thing if you try hard
enough, but it's way over-engineered nonetheless.  I don't think we're
addressing huge range and number of use cases here and would much
prefer to keep it as simple as possible.

 * Devices are given standard filter matching the device class.  Any
   !CAP_SYS_RAWIO user can only issue commands allowed by the filter.

 * CAP_SYS_RAWIO can issue an ioctl to disable the filter all
   accessors of the fd and transfer it.

That should be enough, no?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 16:51                         ` Tejun Heo
@ 2012-11-02 17:49                           ` Paolo Bonzini
  2012-11-02 17:53                             ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2012-11-02 17:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Il 02/11/2012 17:51, Tejun Heo ha scritto:
>>> > > What disturbs me is that it's a completely new interface to userland
>>> > > and at the same a very limited one at that.  So, yeah, it's
>>> > > bothersome.  I personally would prefer SCM_RIGHTS behavior change +
>>> > > hard coded filters per device class.
>> > 
>> > I think hard-coded filters are bad (I prefer to move policy to
>> > userspace), and SCM_RIGHTS without a ioctl is out of question, really.
> No rule is really absolute.  To me, it seems the suggested in-kernel
> per-device command code filter is both too big for the given problem

Is it?  150 lines of code?  The per-class filters would share the first
two patches with this series, add a long list of commands to filter, and
the ioctl would be on top of that.

Long lists are better kept in configuration files than in kernel
sources; not to mention the higher cost of getting the API wrong for a
ioctl vs. sysfs.

> while being too limited for much beyond that.

What are the use cases beyond these?  AFAIK these were the first two in
ten years or so...

> So, if we can get away
> with adding an ioctl, I personally think that would be a better
> approach.

I would really prefer to get a green light from Jens/James for per-class
filters in the kernel (which are worth a few hundred lines of data)
before implementing that.

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 17:49                           ` Paolo Bonzini
@ 2012-11-02 17:53                             ` Tejun Heo
  2012-11-03 13:20                               ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 17:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Hello, Paolo.

On Fri, Nov 02, 2012 at 06:49:43PM +0100, Paolo Bonzini wrote:
> > No rule is really absolute.  To me, it seems the suggested in-kernel
> > per-device command code filter is both too big for the given problem
> 
> Is it?  150 lines of code?  The per-class filters would share the first
> two patches with this series, add a long list of commands to filter, and
> the ioctl would be on top of that.

It's not really about the lines of code.  It adds a new userland
visible interface.  As for the "long" list of commands, it depends on
how you write it but even if it's textually long it's still very
simple in terms of actual complexity.

> > while being too limited for much beyond that.
> 
> What are the use cases beyond these?  AFAIK these were the first two in
> ten years or so...

If this is such a cold area, why do we want do anything other than the
simplest possible?

> > So, if we can get away
> > with adding an ioctl, I personally think that would be a better
> > approach.
> 
> I would really prefer to get a green light from Jens/James for per-class
> filters in the kernel (which are worth a few hundred lines of data)
> before implementing that.

Sure, Jens, James?  Guys, come on.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 17:30                               ` Tejun Heo
@ 2012-11-02 20:18                                 ` Alan Cox
  2012-11-02 20:21                                   ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-02 20:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

>  * Devices are given standard filter matching the device class.  Any
>    !CAP_SYS_RAWIO user can only issue commands allowed by the filter.
> 
>  * CAP_SYS_RAWIO can issue an ioctl to disable the filter all
>    accessors of the fd and transfer it.
> 
> That should be enough, no?

No 

a - there are lots of cases you want to allow only a subset of commands.

b - if you are using a BPF filter which is the obvious way to do it then
the flexibility comes for free without any extra complexity as the kernel
provides a generic implementation, and even a JIT for complex cases.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 20:18                                 ` Alan Cox
@ 2012-11-02 20:21                                   ` Tejun Heo
  2012-11-02 20:48                                     ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 20:21 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hey,

On Fri, Nov 02, 2012 at 08:18:24PM +0000, Alan Cox wrote:
> a - there are lots of cases you want to allow only a subset of commands.

Care to spell them out.  At least the cases Paolo listed should be
served by what's described.

> b - if you are using a BPF filter which is the obvious way to do it then
> the flexibility comes for free without any extra complexity as the kernel
> provides a generic implementation, and even a JIT for complex cases.

Yeah, sure, but it's all about what tool to use where and maybe it's
my ignorance about the problem space but it's difficult for me to
believe that we need full-blown BPF filter here when this is the only
activity we've got in a decade.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 20:21                                   ` Tejun Heo
@ 2012-11-02 20:48                                     ` Alan Cox
  2012-11-02 22:59                                       ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-02 20:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

On Fri, 2 Nov 2012 13:21:31 -0700
Tejun Heo <tj@kernel.org> wrote:

> Hey,
> 
> On Fri, Nov 02, 2012 at 08:18:24PM +0000, Alan Cox wrote:
> > a - there are lots of cases you want to allow only a subset of commands.
> 
> Care to spell them out.  At least the cases Paolo listed should be
> served by what's described.
> 
> > b - if you are using a BPF filter which is the obvious way to do it then
> > the flexibility comes for free without any extra complexity as the kernel
> > provides a generic implementation, and even a JIT for complex cases.
> 
> Yeah, sure, but it's all about what tool to use where and maybe it's
> my ignorance about the problem space but it's difficult for me to
> believe that we need full-blown BPF filter here when this is the only
> activity we've got in a decade.

If you look back through the archive you'll find people have been
spending a good decade bitching about the lack of filter configurability
and trying to get someone else to fix it.

The BPF filter is simpler than just about any other implementation
because the tools exist and are used for lots of other things and it has
an API that is precisely defined as well as kernel calls to run the
filter.

Some reasons for it

- giving people access to parts of disks
- allowing access to specific vendor specific commands on certain
  non-standard CD and DVD drives so they can be used for burning but you
  can't trash them
- giving end users minimal access to things like SMART but only on drives
  where it is safe
- giving a user a SCSI disk or partition to play with but preventing them
  issuing weird ass commands that can disrupt other devices in the fabric
  (like drive to drive transfers, some kinds of resets, management
  commands)
- minimising the ability of a VM to do damage if compromised while
  maximising its raw access to a device



Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 20:48                                     ` Alan Cox
@ 2012-11-02 22:59                                       ` Tejun Heo
  2012-11-02 23:52                                         ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 22:59 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hello, Alan.

On Fri, Nov 02, 2012 at 08:48:25PM +0000, Alan Cox wrote:
> If you look back through the archive you'll find people have been
> spending a good decade bitching about the lack of filter configurability
> and trying to get someone else to fix it.
> 
> The BPF filter is simpler than just about any other implementation
> because the tools exist and are used for lots of other things and it has
> an API that is precisely defined as well as kernel calls to run the
> filter.

Sure, if we *have* to go for flexible filtering, BPF would be the
right way to do it.  I'm just not convinced such flexibility is
justified.

> Some reasons for it
> 
> - giving people access to parts of disks

Are we gonna implement random range restriction inside a partition
too?  If we want to check against partition ranges for allowed SG_IO
commands, by all means, but it can easily be implemented as part of
the fixed filter.

> - allowing access to specific vendor specific commands on certain
>   non-standard CD and DVD drives so they can be used for burning but you
>   can't trash them

At this point, most burning commands are standardized, and optical
drives are generally on the way out.  If you absolutely have to use
some vendor specific commands, be root.

> - giving end users minimal access to things like SMART but only on drives
>   where it is safe

End users already have pretty good access to SMART data via udisks and
it's way more flexible and intelligent than some in-kernel filter.

> - giving a user a SCSI disk or partition to play with but preventing them
>   issuing weird ass commands that can disrupt other devices in the fabric
>   (like drive to drive transfers, some kinds of resets, management
>   commands)
> - minimising the ability of a VM to do damage if compromised while
>   maximising its raw access to a device

Maybe, I don't know.  It all sounds highly marginal to me.

For complex/intelligent access policies, kernel isn't the right place
to do it anyway - you want strong integration with the rest of the
system especially for desktop.  Ones which can justify kernel-side
filtering would be things which are pretty low-level and require low
overhead.  I think VMs fall in this category, but then again being
able to bypass standard filters seems workable enough for that.  Sure,
an extra layer of filtering on top could be nice but again I just
don't think that's a strong enough justification.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 22:59                                       ` Tejun Heo
@ 2012-11-02 23:52                                         ` Alan Cox
  2012-11-02 23:58                                           ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-02 23:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

> > - giving people access to parts of disks
> 
> Are we gonna implement random range restriction inside a partition
> too?  If we want to check against partition ranges for allowed SG_IO
> commands, by all means, but it can easily be implemented as part of
> the fixed filter.

Really. Can your filter implement it only for certain commands, and only
for certain vendor specific commands ? Not really because your filter is
fixed - it has policy in kernel which is the wrong place for device
specific stuff.

And if you add it to the "fixed" policy how much code and ioctls is that
to specify ranges by command and pass partitions when they are device
mapper user space created mappings ? More code than the BPF interface and
more new APIs.

> > - allowing access to specific vendor specific commands on certain
> >   non-standard CD and DVD drives so they can be used for burning but you
> >   can't trash them
> 
> At this point, most burning commands are standardized, and optical
> drives are generally on the way out.  If you absolutely have to use
> some vendor specific commands, be root.

So that translates to me as "There is a good reason, but if your drive is
one of the awkward ones then f**k you go use root". Again policy in the
kernel just creates inflexibility and is the wrong place for it.

> > - giving end users minimal access to things like SMART but only on drives
> >   where it is safe
> 
> End users already have pretty good access to SMART data via udisks and
> it's way more flexible and intelligent than some in-kernel filter.

Thats a bit Gnome developer - "Our way or the highway". The point of
having a filter is that you put policy in user space. If you don't care
the default filter carries on just working, if you do care you can change
stuff with BPF.

> > - giving a user a SCSI disk or partition to play with but preventing them
> >   issuing weird ass commands that can disrupt other devices in the fabric
> >   (like drive to drive transfers, some kinds of resets, management
> >   commands)
> > - minimising the ability of a VM to do damage if compromised while
> >   maximising its raw access to a device
> 
> Maybe, I don't know.  It all sounds highly marginal to me.

If you are doing virtual machines it is far from marginal.

> For complex/intelligent access policies, kernel isn't the right place
> to do it anyway

So why are you arguing for kernel policies which is exactly what the
fixed ones are ? The BPF ones moves the policy to user space !

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 23:52                                         ` Alan Cox
@ 2012-11-02 23:58                                           ` Tejun Heo
  2012-11-03  0:19                                             ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-02 23:58 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hello, Alan.

On Fri, Nov 02, 2012 at 11:52:43PM +0000, Alan Cox wrote:
> Really. Can your filter implement it only for certain commands, and only
> for certain vendor specific commands ? Not really because your filter is
> fixed - it has policy in kernel which is the wrong place for device
> specific stuff.
> 
> And if you add it to the "fixed" policy how much code and ioctls is that
> to specify ranges by command and pass partitions when they are device
> mapper user space created mappings ? More code than the BPF interface and
> more new APIs.

Hmmm?  You know which commands you're allowing.  You can definitely
filter those commands for their ranges.  What ioctls?

> > At this point, most burning commands are standardized, and optical
> > drives are generally on the way out.  If you absolutely have to use
> > some vendor specific commands, be root.
> 
> So that translates to me as "There is a good reason, but if your drive is
> one of the awkward ones then f**k you go use root". Again policy in the
> kernel just creates inflexibility and is the wrong place for it.

Yes, pretty much.  I don't think it's unreasonable for this one.  It's
not like we aim for ultimate flexibility all the time.  Everything is
a trade-off.  BPF filter for vendor-specific CD/DVD burning commands
seems way off to me, especially these days.

> > > - giving end users minimal access to things like SMART but only on drives
> > >   where it is safe
> > 
> > End users already have pretty good access to SMART data via udisks and
> > it's way more flexible and intelligent than some in-kernel filter.
> 
> Thats a bit Gnome developer - "Our way or the highway". The point of
> having a filter is that you put policy in user space. If you don't care
> the default filter carries on just working, if you do care you can change
> stuff with BPF.

We already have it implemented in userspace and it works well.
Actually it works better than anything we can do in kernel (e.g. how
is the kernel gonna know who's logged in front of the machine has
physical access to the hardware?).  What's the justification
duplicating it worse in kernel?

> > Maybe, I don't know.  It all sounds highly marginal to me.
> 
> If you are doing virtual machines it is far from marginal.

Yeah, I agree VMs are the only one which isn't marginal, but then
again, VMs can work reasonably well with far simpler mechanism.

> > For complex/intelligent access policies, kernel isn't the right place
> > to do it anyway
> 
> So why are you arguing for kernel policies which is exactly what the
> fixed ones are ? The BPF ones moves the policy to user space !

No, it's doing something half-way inbetween in unnecessarily complex
way when you can get by with much simpler mechanism.  We are stuck
with the in-kernel whitelist for historical reasons (the proper way
would be implementing burning service in userspace with proper
integration with the rest of userland).  The in-kernel whitelist is
broken for different classes of devices, so let's fix that in simple
way and give userland a simple mechanism to solve the VM case.

It's unfortunate that we have this SG_IO filtering in kernel at all.
Let's please not make it larger.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 23:58                                           ` Tejun Heo
@ 2012-11-03  0:19                                             ` Alan Cox
  2012-11-03  0:23                                               ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-03  0:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

On Fri, 2 Nov 2012 16:58:11 -0700
Tejun Heo <tj@kernel.org> wrote:

> Hello, Alan.
> 
> On Fri, Nov 02, 2012 at 11:52:43PM +0000, Alan Cox wrote:
> > Really. Can your filter implement it only for certain commands, and only
> > for certain vendor specific commands ? Not really because your filter is
> > fixed - it has policy in kernel which is the wrong place for device
> > specific stuff.
> > 
> > And if you add it to the "fixed" policy how much code and ioctls is that
> > to specify ranges by command and pass partitions when they are device
> > mapper user space created mappings ? More code than the BPF interface and
> > more new APIs.
> 
> Hmmm?  You know which commands you're allowing.  You can definitely
> filter those commands for their ranges.  What ioctls?

How do you know what the rules are in kernel. If I'm locking you to fixed
mappings you have no idea in kernel what my user policy model is.

> > So that translates to me as "There is a good reason, but if your drive is
> > one of the awkward ones then f**k you go use root". Again policy in the
> > kernel just creates inflexibility and is the wrong place for it.
> 
> Yes, pretty much. 

Unfortunate and guaranteed to end up with problems not getting fixed
again - or having to redo the work a second time later on. Plus this is
but one example and you are blocking all the ones that haven't been
considered.

> > If you are doing virtual machines it is far from marginal.
> 
> Yeah, I agree VMs are the only one which isn't marginal, but then
> again, VMs can work reasonably well with far simpler mechanism.

I'm dying to see your "simpler mechanism" - I bet by the time you've
written the code it isn't simpler than calling the existing BPF logic.
Your kernel already has all the BPF stuff in it unless you are building
with no networking support so its essentially free.

The BPF execution is a call to sk_run_filter(). Is your mechanism simpler
than that ? It's also been closely audited.

> > > For complex/intelligent access policies, kernel isn't the right place
> > > to do it anyway
> > 
> > So why are you arguing for kernel policies which is exactly what the
> > fixed ones are ? The BPF ones moves the policy to user space !
> 
> No, it's doing something half-way inbetween in unnecessarily complex
> way when you can get by with much simpler mechanism.  We are stuck
> with the in-kernel whitelist for historical reasons (the proper way
> would be implementing burning service in userspace with proper
> integration with the rest of userland).  The in-kernel whitelist is

The burning service is one tiny example.

> broken for different classes of devices, so let's fix that in simple
> way and give userland a simple mechanism to solve the VM case.

Which you haven't done, because you've not provided a filter mechanism
that is flexible and takes policy from user space.
 
> It's unfortunate that we have this SG_IO filtering in kernel at all.
> Let's please not make it larger.

We have filtering because it is necessary. All you are doing is putting
off the inevitable by adding more kernel hack "one true kernel enforced
religion" policy and putting off the inevitable while adding APIs you'll
then have to maintain until the job is done right.

Ultimately policy has to be user space driven, adding more temporary
hacks is just a waste.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03  0:19                                             ` Alan Cox
@ 2012-11-03  0:23                                               ` Tejun Heo
  2012-11-03  0:52                                                 ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-03  0:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hello,

On Sat, Nov 03, 2012 at 12:19:05AM +0000, Alan Cox wrote:
> > Hmmm?  You know which commands you're allowing.  You can definitely
> > filter those commands for their ranges.  What ioctls?
> 
> How do you know what the rules are in kernel. If I'm locking you to fixed
> mappings you have no idea in kernel what my user policy model is.

So, my first response was whether you mean to add arbitrary range
filtering for standard read/writes too.  If you're not gonna do that
and use the existing partition based access model, it's natural to
apply the same partition ranges to the allowed SG_IO commands, right?
There's no new access model which should be configured here.  It just
applied the same block device access model to SG_IO commands too.

> > > So that translates to me as "There is a good reason, but if your drive is
> > > one of the awkward ones then f**k you go use root". Again policy in the
> > > kernel just creates inflexibility and is the wrong place for it.
> > 
> > Yes, pretty much. 
> 
> Unfortunate and guaranteed to end up with problems not getting fixed
> again - or having to redo the work a second time later on. Plus this is
> but one example and you are blocking all the ones that haven't been
> considered.

But we should't add features for the ones which haven't been
considered.  Unless you can actually justify with actual use cases,
it's just hand waving.

> > > If you are doing virtual machines it is far from marginal.
> > 
> > Yeah, I agree VMs are the only one which isn't marginal, but then
> > again, VMs can work reasonably well with far simpler mechanism.
> 
> I'm dying to see your "simpler mechanism" - I bet by the time you've
> written the code it isn't simpler than calling the existing BPF logic.
> Your kernel already has all the BPF stuff in it unless you are building
> with no networking support so its essentially free.

The suggested mechanism is just having a switch to allow all SG_IO
commands and pass it to the hypervisor.  There's no filtering in
kernel at all.

> We have filtering because it is necessary. All you are doing is putting
> off the inevitable by adding more kernel hack "one true kernel enforced
> religion" policy and putting off the inevitable while adding APIs you'll
> then have to maintain until the job is done right.
> 
> Ultimately policy has to be user space driven, adding more temporary
> hacks is just a waste.

Exactly, let's provide a turn-off switch for in-kernel filtering and
let userland drive any appropriate access policy on it.  Let's please
stay away from doing deep packet inspecting on SG_IO commands.

I don't think we're disagreeing on the principle.  It's just that
we're drawing different where the line lies between mechanisms and
policies.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03  0:23                                               ` Tejun Heo
@ 2012-11-03  0:52                                                 ` Alan Cox
  0 siblings, 0 replies; 37+ messages in thread
From: Alan Cox @ 2012-11-03  0:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

> So, my first response was whether you mean to add arbitrary range
> filtering for standard read/writes too.  If you're not gonna do that

Thats relevant for /dev/sd but not /dev/sg. Now it might be the sane
thing to do is to support BPF filters on /dev/sg only ?

> But we should't add features for the ones which haven't been
> considered.  Unless you can actually justify with actual use cases,
> it's just hand waving.

The CD writer one isn't but you tried to dismiss that too. 

> The suggested mechanism is just having a switch to allow all SG_IO
> commands and pass it to the hypervisor.  There's no filtering in
> kernel at all.

That requires you have a very trusted hypervisor, that to me is a
weaker model because it increases the probability that a hack on a
guest of a cloud becomes a hack of the cloud itself at least in terms of
quiet data theft from other guests. It's still a minor risk because
you've got to break the guest then break the hypervisor from guest
virtual ring 0, but the hypervisor is a large target,

For a lot of cases being able to do the filtering in the hypervisor makes
sense and I don't have a problem with that except that you do need to
enforce CAP_SYS_RAWIO on the process setting the flags or you break the
kernel capability model. For a hypervisor guest flipping flags at boot
thats not a big deal.

> > We have filtering because it is necessary. All you are doing is putting
> > off the inevitable by adding more kernel hack "one true kernel enforced
> > religion" policy and putting off the inevitable while adding APIs you'll
> > then have to maintain until the job is done right.
> > 
> > Ultimately policy has to be user space driven, adding more temporary
> > hacks is just a waste.
> 
> Exactly, let's provide a turn-off switch for in-kernel filtering and
> let userland drive any appropriate access policy on it.  Let's please
> stay away from doing deep packet inspecting on SG_IO commands.

And you've left stuff like CD burning broken still despite people pointing
out for years that sorting out the filtering would fix it.

> I don't think we're disagreeing on the principle.  It's just that
> we're drawing different where the line lies between mechanisms and
> policies.

On the CD burning case and on generality I think we are disagreeing in
principle. Which isn't to say there is a right answer and one of us is an
idiot, just that there are things to consider carefully here.

IMHO you are hacking kernel policy by a magic flag to support a specific
problem case, not fixing the problem. The question is whether the problem
needs fixing 8)

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-02 17:53                             ` Tejun Heo
@ 2012-11-03 13:20                               ` Paolo Bonzini
  2012-11-03 14:50                                 ` Alan Cox
  2012-11-05 18:26                                 ` Tejun Heo
  0 siblings, 2 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-11-03 13:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Il 02/11/2012 18:53, Tejun Heo ha scritto:
> Hello, Paolo.
> 
> On Fri, Nov 02, 2012 at 06:49:43PM +0100, Paolo Bonzini wrote:
>>> No rule is really absolute.  To me, it seems the suggested in-kernel
>>> per-device command code filter is both too big for the given problem
>>
>> Is it?  150 lines of code?  The per-class filters would share the first
>> two patches with this series, add a long list of commands to filter, and
>> the ioctl would be on top of that.
> 
> It's not really about the lines of code.  It adds a new userland
> visible interface.  As for the "long" list of commands, it depends on
> how you write it but even if it's textually long it's still very
> simple in terms of actual complexity.

Sure, but its place is not the kernel.

As to implementing the ioctl, it's all but trivial.  For one thing, you
have to make the block device ioctl op take a "struct file".  I have
been asking Al Viro about it for 6 months and I haven't got any answer yet.

Second, getting a security-sensitive ioctl right is hard, as you
demonstrated yourself in this thread by proposing a gapingly insecure
approach.  Adding a little bit of customization to the current solution
may be but a local optimum, but you cannot really get it wrong.

>>> while being too limited for much beyond that.
>>
>> What are the use cases beyond these?  AFAIK these were the first two in
>> ten years or so...
> 
> If this is such a cold area, why do we want do anything other than the
> simplest possible?

Because _this_ is the simplest possible.

I proposed a way to implement the ultimately flexible solution (BPF) and
you shot it down because it was too complex.  Alan is showing you with
multiple examples of why the flexibility would be useful (perhaps nobody
would use it, but the use cases _are_ there), and you are mostly
ignoring them.

James suggested the sysfs knob, which is not as flexible but is the
simplest thing that can work, and was even part of the original design.
 You are still shooting it down because it is too complex, yet you're
proposing to replace one simple mechanism with two; one of which is
absolutely inflexible (unlike MMC which only has "ripping" and
"burning", other device classes have many use cases), while the other is
hard to both implement and get right.

Sounds great...

Paolo

>>> So, if we can get away
>>> with adding an ioctl, I personally think that would be a better
>>> approach.
>>
>> I would really prefer to get a green light from Jens/James for per-class
>> filters in the kernel (which are worth a few hundred lines of data)
>> before implementing that.
> 
> Sure, Jens, James?  Guys, come on.
> 
> Thanks.
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03 13:20                               ` Paolo Bonzini
@ 2012-11-03 14:50                                 ` Alan Cox
  2012-11-05 11:08                                   ` Paolo Bonzini
  2012-11-05 18:18                                   ` Tejun Heo
  2012-11-05 18:26                                 ` Tejun Heo
  1 sibling, 2 replies; 37+ messages in thread
From: Alan Cox @ 2012-11-03 14:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Tejun Heo, Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

> > It's not really about the lines of code.  It adds a new userland
> > visible interface.  As for the "long" list of commands, it depends on
> > how you write it but even if it's textually long it's still very
> > simple in terms of actual complexity.
> 
> Sure, but its place is not the kernel.
> 
> As to implementing the ioctl, it's all but trivial.  For one thing, you
> have to make the block device ioctl op take a "struct file".  I have
> been asking Al Viro about it for 6 months and I haven't got any answer yet.

Just do it - if Al cared he'd have replied about it.

> I proposed a way to implement the ultimately flexible solution (BPF) and
> you shot it down because it was too complex.  Alan is showing you with
> multiple examples of why the flexibility would be useful (perhaps nobody
> would use it, but the use cases _are_ there), and you are mostly
> ignoring them.

My feeling too - It feels to me like Tejun is trying to railroad a broken
non-solution into the system without regards for anyone else and by
simply dismissing any other input.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03 14:50                                 ` Alan Cox
@ 2012-11-05 11:08                                   ` Paolo Bonzini
  2012-11-05 18:18                                   ` Tejun Heo
  1 sibling, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2012-11-05 11:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tejun Heo, Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Il 03/11/2012 15:50, Alan Cox ha scritto:
>>> It's not really about the lines of code.  It adds a new userland
>>> visible interface.  As for the "long" list of commands, it depends on
>>> how you write it but even if it's textually long it's still very
>>> simple in terms of actual complexity.
>>
>> Sure, but its place is not the kernel.
>>
>> As to implementing the ioctl, it's all but trivial.  For one thing, you
>> have to make the block device ioctl op take a "struct file".  I have
>> been asking Al Viro about it for 6 months and I haven't got any answer yet.
> 
> Just do it - if Al cared he'd have replied about it.

Yeah, so far I chickened out because---for the usecases I know, at
least---a sysfs knob would be simpler to use.

>> I proposed a way to implement the ultimately flexible solution (BPF) and
>> you shot it down because it was too complex.  Alan is showing you with
>> multiple examples of why the flexibility would be useful (perhaps nobody
>> would use it, but the use cases _are_ there), and you are mostly
>> ignoring them.
> 
> My feeling too - It feels to me like Tejun is trying to railroad a broken
> non-solution into the system

Thanks for spelling it out. :)

> without regards for anyone else and by
> simply dismissing any other input.

My main worry about the BPF implementation is that I need to do CDB
inspection, which is similar to packet inspection but doesn't have an
sk_buff.  The seccomp guys worked around it by using an ANC opcode to
access syscall arguments, but here I really need LD.  This means for
example that it wouldn't be so easy to use the JIT.

These are the usecases that really require BPF:

> - giving people access to parts of disks

This is quite obvious, but I'm not sure why you would that (other than
just because).  It feels a bit strange to give access to part of a disk
with "absolute" LBAs rather than relative.

For people needing access to partitions of LVs, what we really need is
better access to block device features.  We just got BLKZEROOUT, but
access to block devices via fallocate() or lseek(SEEK_HOLE/SEEK_DATA)
would be a more interesting and useful thing to do.

> - giving end users minimal access to things like SMART but only on drives
>   where it is safe

I'm actually in agreement with Tejun that this is handled more than
decently by udisks.  And the reason this needs BPF, is because the ATA
command set is incredibly complicated.  If there were request, SMART
access could be moved to the kernel/libata and use sysfs+uevents.  This
is similar to how media change is in the kernel, and SCSI disks allow
limited access to mode pages via sysfs.

These uses instead are fine with a bitmap:

> - giving a user a SCSI disk or partition to play with but preventing them
>   issuing weird ass commands that can disrupt other devices in the fabric
>   (like drive to drive transfers, some kinds of resets, management
>   commands)
> - allowing access to specific vendor specific commands on certain
>   non-standard CD and DVD drives so they can be used for burning but you
>   can't trash them
> - minimising the ability of a VM to do damage if compromised while
>   maximising its raw access to a device

(I'm obviously trying to champion my own patches, but I'll try to be
open minded about it). :)

Paolo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03 14:50                                 ` Alan Cox
  2012-11-05 11:08                                   ` Paolo Bonzini
@ 2012-11-05 18:18                                   ` Tejun Heo
  2012-11-05 20:12                                     ` Alan Cox
  1 sibling, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-05 18:18 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hello, Alan.

On Sat, Nov 03, 2012 at 02:50:52PM +0000, Alan Cox wrote:
> > I proposed a way to implement the ultimately flexible solution (BPF) and
> > you shot it down because it was too complex.  Alan is showing you with
> > multiple examples of why the flexibility would be useful (perhaps nobody
> > would use it, but the use cases _are_ there), and you are mostly
> > ignoring them.
> 
> My feeling too - It feels to me like Tejun is trying to railroad a broken
> non-solution into the system without regards for anyone else and by
> simply dismissing any other input.

The only other use case brought up is allowing use of vendor-specific
commands while burning CDs.  Given that the usual burning has been
working well enough for years now, I don't really think that's a
strong enough reason to add full BPF filtering to SG_IO.  It's just
highly unusual thing to do and there isn't strong enough use case for
it.

To me, it feels like you guys are pushing a feature without strong
enough use case.  So, I'm still pretty strongly against it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-03 13:20                               ` Paolo Bonzini
  2012-11-03 14:50                                 ` Alan Cox
@ 2012-11-05 18:26                                 ` Tejun Heo
  1 sibling, 0 replies; 37+ messages in thread
From: Tejun Heo @ 2012-11-05 18:26 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ric Wheeler, Petr Matousek, Kay Sievers, Jens Axboe,
	linux-kernel, James E.J. Bottomley

Hey, Paolo.

On Sat, Nov 03, 2012 at 02:20:14PM +0100, Paolo Bonzini wrote:
> As to implementing the ioctl, it's all but trivial.  For one thing, you
> have to make the block device ioctl op take a "struct file".  I have
> been asking Al Viro about it for 6 months and I haven't got any answer yet.

I don't think that really matters.  If necessary, just change it.

> Second, getting a security-sensitive ioctl right is hard, as you
> demonstrated yourself in this thread by proposing a gapingly insecure
> approach.  Adding a little bit of customization to the current solution
> may be but a local optimum, but you cannot really get it wrong.

Eh?  Sure, I was in "who cares about SG_IO?" area a bit but that has
nothing to do with the interface being an ioctl.  It's a binary switch
ioctl, it's not too difficult to get right.

> Because _this_ is the simplest possible.

I guess we'll have to agree to disagree.  This is way more complex.
You need to review what's going on with the userland tools and inspect
weird sysfs files to actually know what the system policy is.

> I proposed a way to implement the ultimately flexible solution (BPF) and
> you shot it down because it was too complex.  Alan is showing you with
> multiple examples of why the flexibility would be useful (perhaps nobody
> would use it, but the use cases _are_ there), and you are mostly
> ignoring them.

The only valid use case was using proprietary commands during CD
burning.  At this point, that's a really really minor use case IMHO.
I think adding full BPF filtering on SG_IO for that is rather silly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-05 20:12                                     ` Alan Cox
@ 2012-11-05 20:09                                       ` Tejun Heo
  2012-11-05 20:17                                         ` Alan Cox
  0 siblings, 1 reply; 37+ messages in thread
From: Tejun Heo @ 2012-11-05 20:09 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

Hey, Alan.

On Mon, Nov 05, 2012 at 08:12:08PM +0000, Alan Cox wrote:
> There are two sensible choices here IMHO
> 
> - The simple sysctl
> 
> - Doing the job right
> 
> A half way solution such as that you are proposing seems to me to achieve
> nothing other than guaranteeing we'll have another pile of useless crap
> to maintain forever as ABI.

I'm all for simple.  Just throw in a sysfs binary switch to allow all
SG_IO for a given block device for users w/ write access.  I'd be
completely happy with that.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-05 18:18                                   ` Tejun Heo
@ 2012-11-05 20:12                                     ` Alan Cox
  2012-11-05 20:09                                       ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-05 20:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

> To me, it feels like you guys are pushing a feature without strong
> enough use case.  So, I'm still pretty strongly against it.

A feature people have been asking for repeatedly for ten years. Go scan
the archive. I'm also pushing for a *generic* do it once solution so we
never have to have the debate again regardless of what media pops up in
future.

There are two sensible choices here IMHO

- The simple sysctl

- Doing the job right

A half way solution such as that you are proposing seems to me to achieve
nothing other than guaranteeing we'll have another pile of useless crap
to maintain forever as ABI.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-05 20:17                                         ` Alan Cox
@ 2012-11-05 20:15                                           ` Tejun Heo
  0 siblings, 0 replies; 37+ messages in thread
From: Tejun Heo @ 2012-11-05 20:15 UTC (permalink / raw)
  To: Alan Cox
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

On Mon, Nov 05, 2012 at 08:17:54PM +0000, Alan Cox wrote:
> IMHO that's the best option for now - if it fixes the specific case of
> concern right now. It's a trivial interface, it's trivial security (need
> to check CAP_SYS_RAWIO to flip) and it means that the job can be done
> properly when there is consensus without a whole extra legacy API stuck
> behind.

Oh yeah, fully agreed.  We could just set the sysfs file only
writeable by root (basically just setting perm to 0644).  Would the
CAP_SYS_RAWIO distinction matter?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs)
  2012-11-05 20:09                                       ` Tejun Heo
@ 2012-11-05 20:17                                         ` Alan Cox
  2012-11-05 20:15                                           ` Tejun Heo
  0 siblings, 1 reply; 37+ messages in thread
From: Alan Cox @ 2012-11-05 20:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paolo Bonzini, Ric Wheeler, Petr Matousek, Kay Sievers,
	Jens Axboe, linux-kernel, James E.J. Bottomley

On Mon, 5 Nov 2012 12:09:55 -0800
Tejun Heo <tj@kernel.org> wrote:

> Hey, Alan.
> 
> On Mon, Nov 05, 2012 at 08:12:08PM +0000, Alan Cox wrote:
> > There are two sensible choices here IMHO
> > 
> > - The simple sysctl
> > 
> > - Doing the job right
> > 
> > A half way solution such as that you are proposing seems to me to achieve
> > nothing other than guaranteeing we'll have another pile of useless crap
> > to maintain forever as ABI.
> 
> I'm all for simple.  Just throw in a sysfs binary switch to allow all
> SG_IO for a given block device for users w/ write access.  I'd be
> completely happy with that.

IMHO that's the best option for now - if it fixes the specific case of
concern right now. It's a trivial interface, it's trivial security (need
to check CAP_SYS_RAWIO to flip) and it means that the job can be done
properly when there is consensus without a whole extra legacy API stuck
behind.

Alan

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2012-11-05 20:15 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-25 15:30 [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs Paolo Bonzini
2012-09-25 15:30 ` [PATCH v2 1/3] block: add back queue-private command filter Paolo Bonzini
2012-09-25 15:30 ` [PATCH v2 2/3] scsi: create an all-zero filter for scanners Paolo Bonzini
2012-09-25 15:30 ` [PATCH v2 3/3] block: add back command filter modification via sysfs Paolo Bonzini
2012-10-04 10:12 ` [PATCH v2 0/3] block: add queue-private command filter, editable " Paolo Bonzini
2012-10-19  0:22   ` Tejun Heo
2012-10-19  9:07     ` Paolo Bonzini
     [not found]       ` <2007908429.13363375.1350637872646.JavaMail.root@redhat.com>
     [not found]         ` <20121019201058.GP13370@google.com>
     [not found]           ` <5087E093.50700@redhat.com>
     [not found]             ` <CAOS58YM5ZO9h0XUCNxV+6U3UzpeUen5ZuyqsNEUaJ81ux=QKvw@mail.gmail.com>
     [not found]               ` <5088EC43.2010600@redhat.com>
2012-10-25 18:00                 ` setting up CDB filters in udev (was Re: [PATCH v2 0/3] block: add queue-private command filter, editable via sysfs) Tejun Heo
2012-10-25 18:35                   ` Paolo Bonzini
2012-10-31 12:52                     ` Paolo Bonzini
2012-10-31 21:22                     ` Tejun Heo
2012-11-02 14:49                       ` Paolo Bonzini
2012-11-02 15:35                         ` Alan Cox
2012-11-02 16:48                           ` Tejun Heo
2012-11-02 17:21                             ` Alan Cox
2012-11-02 17:30                               ` Tejun Heo
2012-11-02 20:18                                 ` Alan Cox
2012-11-02 20:21                                   ` Tejun Heo
2012-11-02 20:48                                     ` Alan Cox
2012-11-02 22:59                                       ` Tejun Heo
2012-11-02 23:52                                         ` Alan Cox
2012-11-02 23:58                                           ` Tejun Heo
2012-11-03  0:19                                             ` Alan Cox
2012-11-03  0:23                                               ` Tejun Heo
2012-11-03  0:52                                                 ` Alan Cox
2012-11-02 16:51                         ` Tejun Heo
2012-11-02 17:49                           ` Paolo Bonzini
2012-11-02 17:53                             ` Tejun Heo
2012-11-03 13:20                               ` Paolo Bonzini
2012-11-03 14:50                                 ` Alan Cox
2012-11-05 11:08                                   ` Paolo Bonzini
2012-11-05 18:18                                   ` Tejun Heo
2012-11-05 20:12                                     ` Alan Cox
2012-11-05 20:09                                       ` Tejun Heo
2012-11-05 20:17                                         ` Alan Cox
2012-11-05 20:15                                           ` Tejun Heo
2012-11-05 18:26                                 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).