linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
@ 2013-02-06 15:15 Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 01/14] sg_io: pass request_queue to blk_verify_command Paolo Bonzini
                   ` (16 more replies)
  0 siblings, 17 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

This series regards the whitelist that is used for the SG_IO ioctl.  This
whitelist has three problems:

* the bitmap of allowed commands is designed for MMC devices (roughly,
  "play/burn CDs without requiring root") but some opcodes overlap across SCSI
  device classes and have different meanings for different classes.

* also because the bitmap of allowed commands is designed for MMC devices
  only, some commands are missing even though they are generally useful and
  not insecure.  At least not more insecure than anything else you can
  do if you have access to /dev/sdX or /dev/stX nodes.

* the whitelist can be disabled per-process but not per-disk.  In addition,
  the required capability (CAP_SYS_RAWIO) gives access to a range of other 
  resources, enough to make it insecure.

The series corrects these problems.  Patches 1-4 solve the first problem,
which also has an assigned CVE, by using different bitmaps for the various
device classes.  Patches 5-11 solve the second by adding more commands
to the bitmaps.  Patches 12 and 13 solve the third, and were already
posted but ignored by the maintainers despite multiple pings.

Note: checkpatch hates the formatting of the command table.  I know about this,
and ensured that there are no errors in the rest of the code.  The current
formatting is IMHO quite handy, and roughly based on the files available
from the SCSI standard body.

Ok for the next merge window?

Paolo

v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
        for details).  Added patch 14 and added a few more scanner
        commands based on SANE (scanners are not whitelisted by default,
        also were not in v1, but this makes it possible to opt into the
        whitelist out of paranoia).  Removed C++ comments.  Removed the
        large #if 0'd list of commands that the kernel does not pass
        though.  Marked blk_set_cmd_filter_defaults as __init.


Paolo Bonzini (14):
  sg_io: pass request_queue to blk_verify_command
  sg_io: reorganize list of allowed commands
  sg_io: use different default filters for each device class
  sg_io: resolve conflicts between commands assigned to multiple
    classes (CVE-2012-4542)
  sg_io: whitelist a few more commands for rare & obsolete device types
  sg_io: whitelist another command for multimedia devices
  sg_io: whitelist a few more commands for media changers
  sg_io: whitelist a few more commands for tapes
  sg_io: whitelist a few more commands for disks
  sg_io: whitelist a few obsolete commands
  sg_io: mark blk_set_cmd_filter_defaults as __init
  sg_io: remove remnants of sysfs SG_IO filters
  sg_io: introduce unpriv_sgio queue flag
  sg_io: use unpriv_sgio to disable whitelisting for scanners

 Documentation/block/queue-sysfs.txt |    8 +
 block/blk-sysfs.c                   |   33 +++
 block/bsg.c                         |    2 +-
 block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
 drivers/scsi/scsi_scan.c            |   14 ++-
 drivers/scsi/sg.c                   |    6 +-
 include/linux/blkdev.h              |    8 +-
 include/linux/genhd.h               |    9 -
 include/scsi/scsi.h                 |    3 +
 9 files changed, 344 insertions(+), 108 deletions(-)


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [PATCH v2 01/14] sg_io: pass request_queue to blk_verify_command
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 02/14] sg_io: reorganize list of allowed commands Paolo Bonzini
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, FUJITA Tomonori, Doug Gilbert, James E.J. Bottomley,
	linux-scsi, Jens Axboe

Adjust the blk_verify_command function to let it look at per-queue
data.  This will be done in the next patch.

Acked-by: Tejun Heo <tj@kernel.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/bsg.c            |    2 +-
 block/scsi_ioctl.c     |    7 ++++---
 drivers/scsi/sg.c      |    3 ++-
 include/linux/blkdev.h |    3 ++-
 4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index ff64ae3..eb5ad59 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -187,7 +187,7 @@ static int blk_fill_sgv4_hdr_rq(struct request_queue *q, struct request *rq,
 		return -EFAULT;
 
 	if (hdr->subprotocol == BSG_SUB_PROTOCOL_SCSI_CMD) {
-		if (blk_verify_command(rq->cmd, has_write_perm))
+		if (blk_verify_command(q, rq->cmd, has_write_perm))
 			return -EPERM;
 	} else if (!capable(CAP_SYS_RAWIO))
 		return -EPERM;
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 9a87daa..a737562 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -196,7 +196,8 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	__set_bit(GPCMD_SET_READ_AHEAD, filter->write_ok);
 }
 
-int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm)
+int blk_verify_command(struct request_queue *q,
+		       unsigned char *cmd, fmode_t has_write_perm)
 {
 	struct blk_cmd_filter *filter = &blk_default_cmd_filter;
 
@@ -225,7 +226,7 @@ static int blk_fill_sghdr_rq(struct request_queue *q, struct request *rq,
 {
 	if (copy_from_user(rq->cmd, hdr->cmdp, hdr->cmd_len))
 		return -EFAULT;
-	if (blk_verify_command(rq->cmd, mode & FMODE_WRITE))
+	if (blk_verify_command(q, rq->cmd, mode & FMODE_WRITE))
 		return -EPERM;
 
 	/*
@@ -472,7 +473,7 @@ int sg_scsi_ioctl(struct request_queue *q, struct gendisk *disk, fmode_t mode,
 	if (in_len && copy_from_user(buffer, sic->data + cmdlen, in_len))
 		goto error;
 
-	err = blk_verify_command(rq->cmd, mode & FMODE_WRITE);
+	err = blk_verify_command(q, rq->cmd, mode & FMODE_WRITE);
 	if (err)
 		goto error;
 
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index be2c9a6..cab816f 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -217,11 +217,12 @@ static void sg_put_dev(Sg_device *sdp);
 static int sg_allow_access(struct file *filp, unsigned char *cmd)
 {
 	struct sg_fd *sfp = filp->private_data;
+	struct request_queue *q = sfp->parentdp->device->request_queue;
 
 	if (sfp->parentdp->device->type == TYPE_SCANNER)
 		return 0;
 
-	return blk_verify_command(cmd, filp->f_mode & FMODE_WRITE);
+	return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
 static int get_exclude(Sg_device *sdp)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f94bc83..0782336 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1061,7 +1061,8 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
 				    gfp_mask);
 }
 
-extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
+extern int blk_verify_command(struct request_queue *q,
+			      unsigned char *cmd, fmode_t has_write_perm);
 
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 02/14] sg_io: reorganize list of allowed commands
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 01/14] sg_io: pass request_queue to blk_verify_command Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 03/14] sg_io: use different default filters for each device class Paolo Bonzini
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

To prepare for the next patches, reorganize the list of commands into
a two-way table of command numbers and device types.

One command (READ CAPACITY) was listed twice in the old table, hence
the new table has one entry less than the old one.

Right now, there is still just one bitmap and the mask is ignored,
so there is no semantic change yet.

Of course, checkpatch hates this table.  It has long lines and
non-standard spacing.  IMO the improved readability trumps the problems
reported by checkpatch.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        v1->v2: do not use C++ commands, add definitions for missing types
                in include/scsi/scsi.h

 block/scsi_ioctl.c  |  210 ++++++++++++++++++++++++++++++++-------------------
 include/scsi/scsi.h |    2 +
 2 files changed, 133 insertions(+), 79 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index a737562..9e15784 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -115,85 +115,137 @@ static int sg_emulated_host(struct request_queue *q, int __user *p)
 
 static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 {
-	/* Basic read-only commands */
-	__set_bit(TEST_UNIT_READY, filter->read_ok);
-	__set_bit(REQUEST_SENSE, filter->read_ok);
-	__set_bit(READ_6, filter->read_ok);
-	__set_bit(READ_10, filter->read_ok);
-	__set_bit(READ_12, filter->read_ok);
-	__set_bit(READ_16, filter->read_ok);
-	__set_bit(READ_BUFFER, filter->read_ok);
-	__set_bit(READ_DEFECT_DATA, filter->read_ok);
-	__set_bit(READ_CAPACITY, filter->read_ok);
-	__set_bit(READ_LONG, filter->read_ok);
-	__set_bit(INQUIRY, filter->read_ok);
-	__set_bit(MODE_SENSE, filter->read_ok);
-	__set_bit(MODE_SENSE_10, filter->read_ok);
-	__set_bit(LOG_SENSE, filter->read_ok);
-	__set_bit(START_STOP, filter->read_ok);
-	__set_bit(GPCMD_VERIFY_10, filter->read_ok);
-	__set_bit(VERIFY_16, filter->read_ok);
-	__set_bit(REPORT_LUNS, filter->read_ok);
-	__set_bit(SERVICE_ACTION_IN, filter->read_ok);
-	__set_bit(RECEIVE_DIAGNOSTIC, filter->read_ok);
-	__set_bit(MAINTENANCE_IN, filter->read_ok);
-	__set_bit(GPCMD_READ_BUFFER_CAPACITY, filter->read_ok);
-
-	/* Audio CD commands */
-	__set_bit(GPCMD_PLAY_CD, filter->read_ok);
-	__set_bit(GPCMD_PLAY_AUDIO_10, filter->read_ok);
-	__set_bit(GPCMD_PLAY_AUDIO_MSF, filter->read_ok);
-	__set_bit(GPCMD_PLAY_AUDIO_TI, filter->read_ok);
-	__set_bit(GPCMD_PAUSE_RESUME, filter->read_ok);
-
-	/* CD/DVD data reading */
-	__set_bit(GPCMD_READ_CD, filter->read_ok);
-	__set_bit(GPCMD_READ_CD_MSF, filter->read_ok);
-	__set_bit(GPCMD_READ_DISC_INFO, filter->read_ok);
-	__set_bit(GPCMD_READ_CDVD_CAPACITY, filter->read_ok);
-	__set_bit(GPCMD_READ_DVD_STRUCTURE, filter->read_ok);
-	__set_bit(GPCMD_READ_HEADER, filter->read_ok);
-	__set_bit(GPCMD_READ_TRACK_RZONE_INFO, filter->read_ok);
-	__set_bit(GPCMD_READ_SUBCHANNEL, filter->read_ok);
-	__set_bit(GPCMD_READ_TOC_PMA_ATIP, filter->read_ok);
-	__set_bit(GPCMD_REPORT_KEY, filter->read_ok);
-	__set_bit(GPCMD_SCAN, filter->read_ok);
-	__set_bit(GPCMD_GET_CONFIGURATION, filter->read_ok);
-	__set_bit(GPCMD_READ_FORMAT_CAPACITIES, filter->read_ok);
-	__set_bit(GPCMD_GET_EVENT_STATUS_NOTIFICATION, filter->read_ok);
-	__set_bit(GPCMD_GET_PERFORMANCE, filter->read_ok);
-	__set_bit(GPCMD_SEEK, filter->read_ok);
-	__set_bit(GPCMD_STOP_PLAY_SCAN, filter->read_ok);
-
-	/* Basic writing commands */
-	__set_bit(WRITE_6, filter->write_ok);
-	__set_bit(WRITE_10, filter->write_ok);
-	__set_bit(WRITE_VERIFY, filter->write_ok);
-	__set_bit(WRITE_12, filter->write_ok);
-	__set_bit(WRITE_VERIFY_12, filter->write_ok);
-	__set_bit(WRITE_16, filter->write_ok);
-	__set_bit(WRITE_LONG, filter->write_ok);
-	__set_bit(WRITE_LONG_2, filter->write_ok);
-	__set_bit(ERASE, filter->write_ok);
-	__set_bit(GPCMD_MODE_SELECT_10, filter->write_ok);
-	__set_bit(MODE_SELECT, filter->write_ok);
-	__set_bit(LOG_SELECT, filter->write_ok);
-	__set_bit(GPCMD_BLANK, filter->write_ok);
-	__set_bit(GPCMD_CLOSE_TRACK, filter->write_ok);
-	__set_bit(GPCMD_FLUSH_CACHE, filter->write_ok);
-	__set_bit(GPCMD_FORMAT_UNIT, filter->write_ok);
-	__set_bit(GPCMD_REPAIR_RZONE_TRACK, filter->write_ok);
-	__set_bit(GPCMD_RESERVE_RZONE_TRACK, filter->write_ok);
-	__set_bit(GPCMD_SEND_DVD_STRUCTURE, filter->write_ok);
-	__set_bit(GPCMD_SEND_EVENT, filter->write_ok);
-	__set_bit(GPCMD_SEND_KEY, filter->write_ok);
-	__set_bit(GPCMD_SEND_OPC, filter->write_ok);
-	__set_bit(GPCMD_SEND_CUE_SHEET, filter->write_ok);
-	__set_bit(GPCMD_SET_SPEED, filter->write_ok);
-	__set_bit(GPCMD_PREVENT_ALLOW_MEDIUM_REMOVAL, filter->write_ok);
-	__set_bit(GPCMD_LOAD_UNLOAD, filter->write_ok);
-	__set_bit(GPCMD_SET_STREAMING, filter->write_ok);
-	__set_bit(GPCMD_SET_READ_AHEAD, filter->write_ok);
+#define sgio_bitmap_set(cmd, mask, rw) \
+	if ((mask) != 0) __set_bit((cmd), filter->rw##_ok)
+
+#define D (1u << TYPE_DISK)           /* Direct Access Block Device (SBC-3) */
+#define T (1u << TYPE_TAPE)           /* Sequential Access Device (SSC-3) */
+#define L (1u << TYPE_PRINTER)        /* Printer Device (SSC) */
+#define P (1u << TYPE_PROCESSOR)      /* Processor Device (SPC-2) */
+#define W (1u << TYPE_WORM)           /* Write Once Block Device (SBC) */
+#define R (1u << TYPE_ROM)            /* C/DVD Device (MMC-6) */
+#define O (1u << TYPE_MOD)            /* Optical Memory Block Device (SBC) */
+#define M (1u << TYPE_MEDIUM_CHANGER) /* Media Changer Device (SMC-3) */
+#define A (1u << TYPE_RAID)           /* Storage Array Device (SCC-2) */
+#define E (1u << TYPE_ENCLOSURE)      /* SCSI Enclosure Services device (SES-2) */
+#define B (1u << TYPE_RBC)            /* Simplified Direct-Access (Reduced Block) device (RBC) */
+#define K (1u << TYPE_OCRW)           /* Optical Card Reader/Writer device (OCRW) */
+#define V (1u << TYPE_ADC)            /* Automation/Device Interface device (ADC-2) */
+#define F (1u << TYPE_OSD)            /* Object-based Storage Device (OSD-2) */
+
+#define C (1u << TYPE_COMM)           /* Communication devices (obsolete) */
+#define S (1u << TYPE_SCANNER)        /* Scanner device (obsolete) */
+
+	/* control, universal except possibly RBC, read */
+
+	sgio_bitmap_set(0x00, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* TEST UNIT READY */
+	sgio_bitmap_set(0x03, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* REQUEST SENSE */
+	sgio_bitmap_set(0x12, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* INQUIRY */
+	sgio_bitmap_set(0x1A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(6) */
+	sgio_bitmap_set(0x1B, D|T|L|  W|R|O|M|A|  B|K|V|F|  S, read);  /* START STOP UNIT */
+	sgio_bitmap_set(0x1C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* RECEIVE DIAGNOSTIC RESULTS */
+	sgio_bitmap_set(0x2B, D|T|    W|R|O|M|      K        , read);  /* SEEK(10) */
+	sgio_bitmap_set(0x3C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* READ BUFFER */
+	sgio_bitmap_set(0x4D, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* LOG SENSE */
+	sgio_bitmap_set(0x5A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(10) */
+	sgio_bitmap_set(0x9E, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* SERVICE ACTION IN(16) */
+	sgio_bitmap_set(0xA0, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* REPORT LUNS */
+	sgio_bitmap_set(0xA3, D|T|L|  W|  O|M|A|E|B|K|V      , read);  /* MAINTENANCE IN */
+
+	/* control, universal, write */
+
+	sgio_bitmap_set(0x15, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, write); /* MODE SELECT(6) */
+	sgio_bitmap_set(0x4C, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, write); /* LOG SELECT */
+	sgio_bitmap_set(0x55, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, write); /* MODE SELECT(10) */
+
+	/* control, write */
+
+	sgio_bitmap_set(0x1E, D|T|    W|R|O|M|      K|  F    , write); /* PREVENT ALLOW MEDIUM REMOVAL */
+
+	/* input */
+
+	sgio_bitmap_set(0x08, D|T|  P|W|  O|              C|S, read);  /* READ(6) */
+	sgio_bitmap_set(0x25, D|      W|R|O|      B|K|      S, read);  /* READ CAPACITY(10) */
+	sgio_bitmap_set(0x28, D|      W|R|O|      B|K|    C|S, read);  /* READ(10) */
+	sgio_bitmap_set(0x2F, D|      W|R|O                  , read);  /* VERIFY(10) */
+	sgio_bitmap_set(0x37, D|          O|M                , read);  /* READ DEFECT DATA(10) */
+	sgio_bitmap_set(0x3E, D|      W|  O                  , read);  /* READ LONG(10) */
+	sgio_bitmap_set(0x88, D|T|    W|  O|      B          , read);  /* READ(16) */
+	sgio_bitmap_set(0x8F, D|T|    W|  O|      B          , read);  /* VERIFY(16) */
+	sgio_bitmap_set(0xA8, D|      W|R|O|              C  , read);  /* READ(12) */
+
+	/* write */
+
+	sgio_bitmap_set(0x04, D|T|L|    R|O                  , write); /* FORMAT UNIT */
+	sgio_bitmap_set(0x0A, D|T|L|P|W|  O|              C|S, write); /* WRITE(6) */
+	sgio_bitmap_set(0x2A, D|      W|R|O|      B|K|    C|S, write); /* WRITE(10) */
+	sgio_bitmap_set(0x2E, D|      W|R|O|      B|K        , write); /* WRITE AND VERIFY(10) */
+	sgio_bitmap_set(0x35, D|      W|R|O|      B|K        , write); /* SYNCHRONIZE CACHE(10) */
+	sgio_bitmap_set(0x3F, D|      W|  O                  , write); /* WRITE LONG(10) */
+	sgio_bitmap_set(0x8A, D|T|    W|  O|      B          , write); /* WRITE(16) */
+	sgio_bitmap_set(0xAA, D|      W|R|O|              C  , write); /* WRITE(12) */
+	sgio_bitmap_set(0xAE, D|      W|  O                  , write); /* WRITE AND VERIFY(12) */
+	sgio_bitmap_set(0xEA, D|      W|  O                  , write); /* WRITE_LONG_2 ?? */
+
+	/* (mostly) MMC */
+
+	sgio_bitmap_set(0x23,           R                    , read);  /* READ FORMAT CAPACITIES */
+	sgio_bitmap_set(0x42, D|        R                    , read);  /* READ SUB-CHANNEL / UNMAP !! */
+	sgio_bitmap_set(0x43,           R                    , read);  /* READ TOC/PMA/ATIP */
+	sgio_bitmap_set(0x44,   T|      R|            V      , read);  /* READ HEADER */
+	sgio_bitmap_set(0x45,           R                    , read);  /* PLAY AUDIO(10) */
+	sgio_bitmap_set(0x46,           R                    , read);  /* GET CONFIGURATION */
+	sgio_bitmap_set(0x47,           R                    , read);  /* PLAY AUDIO MSF */
+	sgio_bitmap_set(0x48, D|        R|        B          , read);  /* PLAY AUDIO TI / SANITIZE !! */
+	sgio_bitmap_set(0x4A,           R                    , read);  /* GET EVENT STATUS NOTIFICATION */
+	sgio_bitmap_set(0x4B,           R                    , read);  /* PAUSE/RESUME */
+	sgio_bitmap_set(0x4E,           R                    , read);  /* STOP PLAY/SCAN */
+	sgio_bitmap_set(0x51, D|        R                    , read);  /* READ DISC INFORMATION / XPWRITE(10) !! */
+	sgio_bitmap_set(0x52,           R                    , read);  /* READ TRACK INFORMATION */
+	sgio_bitmap_set(0x5C,           R                    , read);  /* READ BUFFER CAPACITY */
+	sgio_bitmap_set(0xA4,           R                    , read);  /* REPORT KEY */
+	sgio_bitmap_set(0xAC,           R|O                  , read);  /* GET PERFORMANCE / ERASE !! */
+	sgio_bitmap_set(0xAD,           R                    , read);  /* READ DVD STRUCTURE */
+	sgio_bitmap_set(0xB9,           R                    , read);  /* READ CD MSF */
+	sgio_bitmap_set(0xBA,           R                    , read);  /* SCAN */
+	sgio_bitmap_set(0xBC,           R                    , read);  /* PLAY CD */
+	sgio_bitmap_set(0xBE,           R                    , read);  /* READ CD */
+
+	sgio_bitmap_set(0x53, D|        R                    , write); /* RESERVE TRACK / XDWRITEREAD(10) */
+	sgio_bitmap_set(0x54,           R                    , write); /* SEND OPC INFORMATION */
+	sgio_bitmap_set(0x58,           R                    , write); /* REPAIR TRACK */
+	sgio_bitmap_set(0x5B,           R                    , write); /* CLOSE TRACK/SESSION */
+	sgio_bitmap_set(0x5D,           R                    , write); /* SEND CUE SHEET */
+	sgio_bitmap_set(0xA1, D|        R|        B          , write); /* BLANK / ATA PASS-THROUGH(12) */
+	sgio_bitmap_set(0xA2,           R                    , write); /* SEND EVENT */
+	sgio_bitmap_set(0xA3,           R                    , write); /* SEND KEY */
+	sgio_bitmap_set(0xA6,           R|  M                , write); /* LOAD/UNLOAD C/DVD */
+	sgio_bitmap_set(0xA7,           R                    , write); /* SET READ AHEAD */
+	sgio_bitmap_set(0xB6,           R|  M                , write); /* SET STREAMING */
+	sgio_bitmap_set(0xBB,           R                    , write); /* SET CD SPEED */
+	sgio_bitmap_set(0xBF,           R                    , write); /* SEND DVD STRUCTURE */
+
+	/* (mostly) tape */
+
+	sgio_bitmap_set(0x19,   T                            , write); /* ERASE(6) */
+
+#undef D
+#undef T
+#undef L
+#undef P
+#undef W
+#undef R
+#undef S
+#undef O
+#undef M
+#undef C
+#undef A
+#undef E
+#undef B
+#undef K
+#undef V
+#undef F
+#undef sgio_bitmap_set
 }
 
 int blk_verify_command(struct request_queue *q,
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 66216c1..b67553f 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -324,6 +324,8 @@ static inline int scsi_status_is_good(int status)
 #define TYPE_RAID           0x0c
 #define TYPE_ENCLOSURE      0x0d    /* Enclosure Services Device */
 #define TYPE_RBC	    0x0e
+#define TYPE_OCRW           0x0f
+#define TYPE_ADC            0x10
 #define TYPE_OSD            0x11
 #define TYPE_NO_LUN         0x7f
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 03/14] sg_io: use different default filters for each device class
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 01/14] sg_io: pass request_queue to blk_verify_command Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 02/14] sg_io: reorganize list of allowed commands Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 04/14] sg_io: resolve conflicts between commands assigned to multiple classes (CVE-2012-4542) Paolo Bonzini
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Store the filters in a 256-entry array, and pick an appropriate filter
for SCSI devices.  Apart from SCSI disks, SG_IO is supported for CCISS,
ide-floppy and virtio-blk devices; TYPE_DISK (which is zero, i.e. the
default) is more appropriate for these devices than TYPE_ROM.

This patch already introduces some semantic change, albeit very limited;
in addition to the above change for CCISS/ide-floppy/virtio-blk, a few
commands are now forbidden for devices of type other than TYPE_ROM,
where they are reserved or vendor-specific.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c       |   14 +++++---------
 drivers/scsi/scsi_scan.c |    2 ++
 include/linux/blkdev.h   |    2 +-
 include/scsi/scsi.h      |    1 +
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 9e15784..c4c42dd 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -34,8 +34,8 @@
 #include <scsi/scsi_cmnd.h>
 
 struct blk_cmd_filter {
-	unsigned long read_ok[BLK_SCSI_CMD_PER_LONG];
-	unsigned long write_ok[BLK_SCSI_CMD_PER_LONG];
+	u32 read_ok[BLK_SCSI_MAX_CMDS];
+	u32 write_ok[BLK_SCSI_MAX_CMDS];
 };
 
 static struct blk_cmd_filter blk_default_cmd_filter;
@@ -116,7 +116,7 @@ static int sg_emulated_host(struct request_queue *q, int __user *p)
 static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 {
 #define sgio_bitmap_set(cmd, mask, rw) \
-	if ((mask) != 0) __set_bit((cmd), filter->rw##_ok)
+	filter->rw##_ok[(cmd)] |= (mask);
 
 #define D (1u << TYPE_DISK)           /* Direct Access Block Device (SBC-3) */
 #define T (1u << TYPE_TAPE)           /* Sequential Access Device (SSC-3) */
@@ -257,16 +257,12 @@ int blk_verify_command(struct request_queue *q,
 	if (capable(CAP_SYS_RAWIO))
 		return 0;
 
-	/* if there's no filter set, assume we're filtering everything out */
-	if (!filter)
-		return -EPERM;
-
 	/* Anybody who can open the device can do a read-safe command */
-	if (test_bit(cmd[0], filter->read_ok))
+	if (filter->read_ok[cmd[0]] & (1 << q->sgio_type))
 		return 0;
 
 	/* Write-safe commands require a writable open */
-	if (test_bit(cmd[0], filter->write_ok) && has_write_perm)
+	if (has_write_perm && filter->write_ok[cmd[0]] & (1 << q->sgio_type))
 		return 0;
 
 	return -EPERM;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3e58b22..86940f3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -782,6 +782,8 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 		sdev->removable = (inq_result[1] & 0x80) >> 7;
 	}
 
+	sdev->request_queue->sgio_type = sdev->type;
+
 	switch (sdev->type) {
 	case TYPE_RBC:
 	case TYPE_TAPE:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0782336..b376d37 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -256,7 +256,6 @@ struct blk_queue_tag {
 };
 
 #define BLK_SCSI_MAX_CMDS	(256)
-#define BLK_SCSI_CMD_PER_LONG	(BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
 
 struct queue_limits {
 	unsigned long		bounce_pfn;
@@ -403,6 +402,7 @@ struct request_queue {
 	 */
 	unsigned int		sg_timeout;
 	unsigned int		sg_reserved_size;
+	unsigned char		sgio_type;
 	int			node;
 #ifdef CONFIG_BLK_DEV_IO_TRACE
 	struct blk_trace	*blk_trace;
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index b67553f..06cc93e 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -327,6 +327,7 @@ static inline int scsi_status_is_good(int status)
 #define TYPE_OCRW           0x0f
 #define TYPE_ADC            0x10
 #define TYPE_OSD            0x11
+#define TYPE_MAX            0x1f
 #define TYPE_NO_LUN         0x7f
 
 /* SCSI protocols; these are taken from SPC-3 section 7.5 */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 04/14] sg_io: resolve conflicts between commands assigned to multiple classes (CVE-2012-4542)
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 03/14] sg_io: use different default filters for each device class Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 05/14] sg_io: whitelist a few more commands for rare & obsolete device types Paolo Bonzini
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Some SCSI commands can be sent to disks via SG_IO even by unprivileged
users.  Unfortunately, some opcodes overlap across SCSI device classes
and have different meanings for different classes.  Four of them can
be used for read-only file descriptors on MMC, but should be limited to
descriptors opened for read-write on SBC:

The current bitmap of allowed commands is designed for MMC devices
(roughly, "play/burn CDs without requiring root").

- READ SUBCHANNEL <-> UNMAP (destructive, but no control on written
  data)

- GET PERFORMANCE <-> ERASE (not really a problem, no one supports
  ERASE anyway)

- READ DISC INFORMATION <-> XPWRITE (not commonly implemented but
  most dangerous)

- PLAY AUDIO TI <-> SANITIZE (a very new command)

To fix this, the series splits the bitmap entries for these four
commands into two entries, one read-only for MMC and one read-write
for the other device classes.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index c4c42dd..7ea3428 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -182,29 +182,33 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x2E, D|      W|R|O|      B|K        , write); /* WRITE AND VERIFY(10) */
 	sgio_bitmap_set(0x35, D|      W|R|O|      B|K        , write); /* SYNCHRONIZE CACHE(10) */
 	sgio_bitmap_set(0x3F, D|      W|  O                  , write); /* WRITE LONG(10) */
+	sgio_bitmap_set(0x42, D                              , write); /* UNMAP */
+	sgio_bitmap_set(0x48, D|                  B          , write); /* SANITIZE */
+	sgio_bitmap_set(0x51, D                              , write); /* XPWRITE(10) */
 	sgio_bitmap_set(0x8A, D|T|    W|  O|      B          , write); /* WRITE(16) */
 	sgio_bitmap_set(0xAA, D|      W|R|O|              C  , write); /* WRITE(12) */
+	sgio_bitmap_set(0xAC,             O                  , write); /* ERASE(12) */
 	sgio_bitmap_set(0xAE, D|      W|  O                  , write); /* WRITE AND VERIFY(12) */
 	sgio_bitmap_set(0xEA, D|      W|  O                  , write); /* WRITE_LONG_2 ?? */
 
 	/* (mostly) MMC */
 
 	sgio_bitmap_set(0x23,           R                    , read);  /* READ FORMAT CAPACITIES */
-	sgio_bitmap_set(0x42, D|        R                    , read);  /* READ SUB-CHANNEL / UNMAP !! */
+	sgio_bitmap_set(0x42,           R                    , read);  /* READ SUB-CHANNEL */
 	sgio_bitmap_set(0x43,           R                    , read);  /* READ TOC/PMA/ATIP */
 	sgio_bitmap_set(0x44,   T|      R|            V      , read);  /* READ HEADER */
 	sgio_bitmap_set(0x45,           R                    , read);  /* PLAY AUDIO(10) */
 	sgio_bitmap_set(0x46,           R                    , read);  /* GET CONFIGURATION */
 	sgio_bitmap_set(0x47,           R                    , read);  /* PLAY AUDIO MSF */
-	sgio_bitmap_set(0x48, D|        R|        B          , read);  /* PLAY AUDIO TI / SANITIZE !! */
+	sgio_bitmap_set(0x48,           R                    , read);  /* PLAY AUDIO TI */
 	sgio_bitmap_set(0x4A,           R                    , read);  /* GET EVENT STATUS NOTIFICATION */
 	sgio_bitmap_set(0x4B,           R                    , read);  /* PAUSE/RESUME */
 	sgio_bitmap_set(0x4E,           R                    , read);  /* STOP PLAY/SCAN */
-	sgio_bitmap_set(0x51, D|        R                    , read);  /* READ DISC INFORMATION / XPWRITE(10) !! */
+	sgio_bitmap_set(0x51,           R                    , read);  /* READ DISC INFORMATION */
 	sgio_bitmap_set(0x52,           R                    , read);  /* READ TRACK INFORMATION */
 	sgio_bitmap_set(0x5C,           R                    , read);  /* READ BUFFER CAPACITY */
 	sgio_bitmap_set(0xA4,           R                    , read);  /* REPORT KEY */
-	sgio_bitmap_set(0xAC,           R|O                  , read);  /* GET PERFORMANCE / ERASE !! */
+	sgio_bitmap_set(0xAC,           R                    , read);  /* GET PERFORMANCE */
 	sgio_bitmap_set(0xAD,           R                    , read);  /* READ DVD STRUCTURE */
 	sgio_bitmap_set(0xB9,           R                    , read);  /* READ CD MSF */
 	sgio_bitmap_set(0xBA,           R                    , read);  /* SCAN */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 05/14] sg_io: whitelist a few more commands for rare & obsolete device types
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 04/14] sg_io: resolve conflicts between commands assigned to multiple classes (CVE-2012-4542) Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 06/14] sg_io: whitelist another command for multimedia devices Paolo Bonzini
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Start cleaning up the table, moving out of the way four rare & obsolete
device types: printers, communication devices (network cards) and scanners
(both TYPE_PROCESSOR and TYPE_SCANNER).  Add missing commands for these
four device types.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	v1->v2: added more scanner commands, list taken from SANE

 block/scsi_ioctl.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 75 insertions(+), 8 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 7ea3428..e71cd42 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -142,7 +142,7 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x03, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* REQUEST SENSE */
 	sgio_bitmap_set(0x12, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* INQUIRY */
 	sgio_bitmap_set(0x1A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(6) */
-	sgio_bitmap_set(0x1B, D|T|L|  W|R|O|M|A|  B|K|V|F|  S, read);  /* START STOP UNIT */
+	sgio_bitmap_set(0x1B, D|T|    W|R|O|M|A|  B|K|V|F    , read);  /* START STOP UNIT */
 	sgio_bitmap_set(0x1C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* RECEIVE DIAGNOSTIC RESULTS */
 	sgio_bitmap_set(0x2B, D|T|    W|R|O|M|      K        , read);  /* SEEK(10) */
 	sgio_bitmap_set(0x3C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* READ BUFFER */
@@ -164,21 +164,21 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 
 	/* input */
 
-	sgio_bitmap_set(0x08, D|T|  P|W|  O|              C|S, read);  /* READ(6) */
-	sgio_bitmap_set(0x25, D|      W|R|O|      B|K|      S, read);  /* READ CAPACITY(10) */
-	sgio_bitmap_set(0x28, D|      W|R|O|      B|K|    C|S, read);  /* READ(10) */
+	sgio_bitmap_set(0x08, D|T|    W|  O                  , read);  /* READ(6) */
+	sgio_bitmap_set(0x25, D|      W|R|O|      B|K        , read);  /* READ CAPACITY(10) */
+	sgio_bitmap_set(0x28, D|      W|R|O|      B|K        , read);  /* READ(10) */
 	sgio_bitmap_set(0x2F, D|      W|R|O                  , read);  /* VERIFY(10) */
 	sgio_bitmap_set(0x37, D|          O|M                , read);  /* READ DEFECT DATA(10) */
 	sgio_bitmap_set(0x3E, D|      W|  O                  , read);  /* READ LONG(10) */
 	sgio_bitmap_set(0x88, D|T|    W|  O|      B          , read);  /* READ(16) */
 	sgio_bitmap_set(0x8F, D|T|    W|  O|      B          , read);  /* VERIFY(16) */
-	sgio_bitmap_set(0xA8, D|      W|R|O|              C  , read);  /* READ(12) */
+	sgio_bitmap_set(0xA8, D|      W|R|O                  , read);  /* READ(12) */
 
 	/* write */
 
 	sgio_bitmap_set(0x04, D|T|L|    R|O                  , write); /* FORMAT UNIT */
-	sgio_bitmap_set(0x0A, D|T|L|P|W|  O|              C|S, write); /* WRITE(6) */
-	sgio_bitmap_set(0x2A, D|      W|R|O|      B|K|    C|S, write); /* WRITE(10) */
+	sgio_bitmap_set(0x0A, D|T|L|  W|  O                  , write); /* WRITE(6) */
+	sgio_bitmap_set(0x2A, D|      W|R|O|      B|K        , write); /* WRITE(10) */
 	sgio_bitmap_set(0x2E, D|      W|R|O|      B|K        , write); /* WRITE AND VERIFY(10) */
 	sgio_bitmap_set(0x35, D|      W|R|O|      B|K        , write); /* SYNCHRONIZE CACHE(10) */
 	sgio_bitmap_set(0x3F, D|      W|  O                  , write); /* WRITE LONG(10) */
@@ -186,11 +186,19 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x48, D|                  B          , write); /* SANITIZE */
 	sgio_bitmap_set(0x51, D                              , write); /* XPWRITE(10) */
 	sgio_bitmap_set(0x8A, D|T|    W|  O|      B          , write); /* WRITE(16) */
-	sgio_bitmap_set(0xAA, D|      W|R|O|              C  , write); /* WRITE(12) */
+	sgio_bitmap_set(0xAA, D|      W|R|O                  , write); /* WRITE(12) */
 	sgio_bitmap_set(0xAC,             O                  , write); /* ERASE(12) */
 	sgio_bitmap_set(0xAE, D|      W|  O                  , write); /* WRITE AND VERIFY(12) */
 	sgio_bitmap_set(0xEA, D|      W|  O                  , write); /* WRITE_LONG_2 ?? */
 
+	/* printer */
+
+	sgio_bitmap_set(0x04,     L                          , write); /* FORMAT */
+	sgio_bitmap_set(0x0A,     L                          , write); /* PRINT */
+	sgio_bitmap_set(0x0B,     L                          , write); /* SLEW AND PRINT */
+	sgio_bitmap_set(0x10,     L                          , write); /* SYNCHRONIZE BUFFER */
+	sgio_bitmap_set(0x1B,     L                          , write); /* STOP PRINT */
+
 	/* (mostly) MMC */
 
 	sgio_bitmap_set(0x23,           R                    , read);  /* READ FORMAT CAPACITIES */
@@ -233,6 +241,65 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 
 	sgio_bitmap_set(0x19,   T                            , write); /* ERASE(6) */
 
+	/* communication devices (obsolete) */
+
+	sgio_bitmap_set(0x08,                             C  , write); /* GET MESSAGE(6) */
+	sgio_bitmap_set(0x0A,                             C  , write); /* SEND MESSAGE(6) */
+	sgio_bitmap_set(0x28,                             C  , write); /* GET MESSAGE(10) */
+	sgio_bitmap_set(0x2A,                             C  , write); /* SEND MESSAGE(10) */
+	sgio_bitmap_set(0xA8,                             C  , write); /* GET MESSAGE(12) */
+	sgio_bitmap_set(0xAA,                             C  , write); /* SEND MESSAGE(12) */
+
+	/*
+	 * scanners (obsolete).  Some scanners use TYPE_PROCESSOR,
+	 * allow both flags for all commands.
+	 */
+
+	sgio_bitmap_set(0x08,       P|                      S, read);  /* RECEIVE */
+	sgio_bitmap_set(0x0A,       P|                      S, write); /* SEND(6) */
+	sgio_bitmap_set(0x16,       P|                      S, write); /* RESERVE UNIT */
+	sgio_bitmap_set(0x17,       P|                      S, write); /* RELEASE UNIT */
+	sgio_bitmap_set(0x1B,       P|                      S, write); /* SCAN */
+	sgio_bitmap_set(0x1D,       P|                      S, write); /* SEND DIAGNOSTIC */
+	sgio_bitmap_set(0x24,       P|                      S, write); /* SET WINDOW */
+	sgio_bitmap_set(0x25,       P|                      S, write); /* GET WINDOW */
+	sgio_bitmap_set(0x08,       P|                      S, read);  /* RECEIVE */
+	sgio_bitmap_set(0x0A,       P|                      S, write); /* SEND(6) */
+	sgio_bitmap_set(0x31,       P|                      S, write); /* OBJECT POSITION */
+	sgio_bitmap_set(0x34,       P|                      S, write); /* GET DATA BUFFER STATUS */
+
+	/*
+	 * scanner vendor-specific commands, taken from SANE.
+	 * SG_IO for TYPE_SCANNER actually is not subject to
+	 * the whitelist, but let's be complete.  TYPE_PROCESSOR
+	 * has always been subject to the whitelist, so do not
+	 * include the P flag here.
+	 */
+	sgio_bitmap_set(0x04,                               S, write); /* Mustek AREA AND WINDOWS */
+	sgio_bitmap_set(0x06,                               S, write); /* Xerox ABORT */
+	sgio_bitmap_set(0x09,                               S, write); /* Vendor specific ??? */
+	sgio_bitmap_set(0x0C,                               S, write); /* Vendor specific ??? */
+	sgio_bitmap_set(0x0E,                               S, write); /* Vendor specific ??? */
+	sgio_bitmap_set(0x0F,                               S, write); /* Mustek GET IMAGE STATUS */
+	sgio_bitmap_set(0x10,                               S, write); /* Mustek ADF AND BACKTRACK */
+	sgio_bitmap_set(0x11,                               S, write); /* Mustek CCD DISTANCE */
+	sgio_bitmap_set(0x1C,                               S, write); /* Vendor specific ??? */
+	sgio_bitmap_set(0x29,                               S, write); /* Xerox READ IMAGE */
+	sgio_bitmap_set(0x55,                               S, write); /* Mustek LOOKUP TABLE */
+	sgio_bitmap_set(0x5E,                               S, write); /* Umax GET LAMP STATUS */
+	sgio_bitmap_set(0x5F,                               S, write); /* Umax SET LAMP STATUS */
+	sgio_bitmap_set(0xA2,                               S, write); /* Coolscan AUTO FOCUS */
+	sgio_bitmap_set(0xC0,                               S, write); /* Fujitsu SET SUBWINDOW */
+	sgio_bitmap_set(0xC1,                               S, write); /* Fujitsu ENDORSER */
+	sgio_bitmap_set(0xC2,                               S, write); /* Fujitsu GET HW STATUS */
+	sgio_bitmap_set(0xD5,                               S, write); /* Canon GET SCAN MODE */
+	sgio_bitmap_set(0xD6,                               S, write); /* Canon SET SCAN MODE */
+	sgio_bitmap_set(0xD8,                               S, write); /* Canon CANCEL */
+	sgio_bitmap_set(0xE0,                               S, write); /* Panasonic GET ADJUST DATA */
+	sgio_bitmap_set(0xE1,                               S, write); /* Panasonic SET TIMEOUT */
+	sgio_bitmap_set(0xE5,                               S, write); /* Canon SET SCAN MODE */
+	sgio_bitmap_set(0xF1,                               S, write); /* Fujitsu SCANNER CONTROL */
+
 #undef D
 #undef T
 #undef L
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 06/14] sg_io: whitelist another command for multimedia devices
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 05/14] sg_io: whitelist a few more commands for rare & obsolete device types Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 07/14] sg_io: whitelist a few more commands for media changers Paolo Bonzini
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Three MMC commands were never included: PLAY AUDIO(12), SERVICE ACTION
IN(12), MECHANISM STATUS.  Add MECHANISM STATUS, the only one that has
not been obsoleted in recent versions of the standard.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	v1->v2: leave out PLAY AUDIO(12), SERVICE ACTION IN(12)

 block/scsi_ioctl.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e71cd42..fa2a1fc 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -221,6 +221,7 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0xB9,           R                    , read);  /* READ CD MSF */
 	sgio_bitmap_set(0xBA,           R                    , read);  /* SCAN */
 	sgio_bitmap_set(0xBC,           R                    , read);  /* PLAY CD */
+	sgio_bitmap_set(0xBD,           R                    , read);  /* MECHANISM STATUS */
 	sgio_bitmap_set(0xBE,           R                    , read);  /* READ CD */
 
 	sgio_bitmap_set(0x53, D|        R                    , write); /* RESERVE TRACK / XDWRITEREAD(10) */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 07/14] sg_io: whitelist a few more commands for media changers
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (5 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 06/14] sg_io: whitelist another command for multimedia devices Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 08/14] sg_io: whitelist a few more commands for tapes Paolo Bonzini
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Besides CD-ROMs, three more device types are interesting for SG_IO:
media changers, tapes and of course disks.

Starting with this patch, we will whitelist a few more commands for
these devices.  For media changers, enable "INITIALIZE ELEMENT STATUS"
and "REQUEST VOLUME ELEMENT ADDRESS".  A few changer-specific commands
were already enabled by chance because they overlapped commands that
are valid for other classes: "EXCHANGE MEDIUM" and "SEND VOLUME TAG",
"INITIALIZE ELEMENT STATUS WITH RANGE".

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c |   21 ++++++++++++++++-----
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index fa2a1fc..8cda426 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -142,9 +142,9 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x03, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* REQUEST SENSE */
 	sgio_bitmap_set(0x12, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* INQUIRY */
 	sgio_bitmap_set(0x1A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(6) */
-	sgio_bitmap_set(0x1B, D|T|    W|R|O|M|A|  B|K|V|F    , read);  /* START STOP UNIT */
+	sgio_bitmap_set(0x1B, D|T|    W|R|O|  A|  B|K|V|F    , read);  /* START STOP UNIT */
 	sgio_bitmap_set(0x1C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* RECEIVE DIAGNOSTIC RESULTS */
-	sgio_bitmap_set(0x2B, D|T|    W|R|O|M|      K        , read);  /* SEEK(10) */
+	sgio_bitmap_set(0x2B, D|T|    W|R|O|        K        , read);  /* SEEK(10) */
 	sgio_bitmap_set(0x3C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* READ BUFFER */
 	sgio_bitmap_set(0x4D, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* LOG SENSE */
 	sgio_bitmap_set(0x5A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(10) */
@@ -168,7 +168,7 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x25, D|      W|R|O|      B|K        , read);  /* READ CAPACITY(10) */
 	sgio_bitmap_set(0x28, D|      W|R|O|      B|K        , read);  /* READ(10) */
 	sgio_bitmap_set(0x2F, D|      W|R|O                  , read);  /* VERIFY(10) */
-	sgio_bitmap_set(0x37, D|          O|M                , read);  /* READ DEFECT DATA(10) */
+	sgio_bitmap_set(0x37, D|          O                  , read);  /* READ DEFECT DATA(10) */
 	sgio_bitmap_set(0x3E, D|      W|  O                  , read);  /* READ LONG(10) */
 	sgio_bitmap_set(0x88, D|T|    W|  O|      B          , read);  /* READ(16) */
 	sgio_bitmap_set(0x8F, D|T|    W|  O|      B          , read);  /* VERIFY(16) */
@@ -199,6 +199,17 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x10,     L                          , write); /* SYNCHRONIZE BUFFER */
 	sgio_bitmap_set(0x1B,     L                          , write); /* STOP PRINT */
 
+	/* media changer */
+
+	sgio_bitmap_set(0x07,               M                , read);  /* INITIALIZE ELEMENT STATUS */
+	sgio_bitmap_set(0x1B,               M                , read);  /* OPEN/CLOSE IMPORT/EXPORT ELEMENT */
+	sgio_bitmap_set(0x2B,               M                , read);  /* POSITION TO ELEMENT */
+	sgio_bitmap_set(0x37,               M                , read);  /* INITIALIZE ELEMENT STATUS WITH RANGE */
+	sgio_bitmap_set(0xB5,               M                , read);  /* REQUEST VOLUME ELEMENT ADDRESS */
+
+	sgio_bitmap_set(0xA6,               M                , write); /* EXCHANGE MEDIUM */
+	sgio_bitmap_set(0xB6,               M                , write); /* SEND VOLUME TAG */
+
 	/* (mostly) MMC */
 
 	sgio_bitmap_set(0x23,           R                    , read);  /* READ FORMAT CAPACITIES */
@@ -232,9 +243,9 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0xA1, D|        R|        B          , write); /* BLANK / ATA PASS-THROUGH(12) */
 	sgio_bitmap_set(0xA2,           R                    , write); /* SEND EVENT */
 	sgio_bitmap_set(0xA3,           R                    , write); /* SEND KEY */
-	sgio_bitmap_set(0xA6,           R|  M                , write); /* LOAD/UNLOAD C/DVD */
+	sgio_bitmap_set(0xA6,           R                    , write); /* LOAD/UNLOAD C/DVD */
 	sgio_bitmap_set(0xA7,           R                    , write); /* SET READ AHEAD */
-	sgio_bitmap_set(0xB6,           R|  M                , write); /* SET STREAMING */
+	sgio_bitmap_set(0xB6,           R                    , write); /* SET STREAMING */
 	sgio_bitmap_set(0xBB,           R                    , write); /* SET CD SPEED */
 	sgio_bitmap_set(0xBF,           R                    , write); /* SEND DVD STRUCTURE */
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 08/14] sg_io: whitelist a few more commands for tapes
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (6 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 07/14] sg_io: whitelist a few more commands for media changers Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 09/14] sg_io: whitelist a few more commands for disks Paolo Bonzini
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Tapes have no problematic overlap, but quite a few commands
are missing that are useful when operating tapes with /dev/sg.
This patch adds them.

START STOP UNIT, FORMAT UNIT and SEEK(10) have similar meanings
but different names for tapes, so move them to the tape section
of the whitelist.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 8cda426..a9c2caf 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -142,9 +142,9 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x03, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* REQUEST SENSE */
 	sgio_bitmap_set(0x12, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* INQUIRY */
 	sgio_bitmap_set(0x1A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(6) */
-	sgio_bitmap_set(0x1B, D|T|    W|R|O|  A|  B|K|V|F    , read);  /* START STOP UNIT */
+	sgio_bitmap_set(0x1B, D|      W|R|O|  A|  B|K|  F    , read);  /* START STOP UNIT */
 	sgio_bitmap_set(0x1C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* RECEIVE DIAGNOSTIC RESULTS */
-	sgio_bitmap_set(0x2B, D|T|    W|R|O|        K        , read);  /* SEEK(10) */
+	sgio_bitmap_set(0x2B, D|      W|R|O|        K        , read);  /* SEEK(10) */
 	sgio_bitmap_set(0x3C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* READ BUFFER */
 	sgio_bitmap_set(0x4D, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* LOG SENSE */
 	sgio_bitmap_set(0x5A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(10) */
@@ -176,8 +176,8 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 
 	/* write */
 
-	sgio_bitmap_set(0x04, D|T|L|    R|O                  , write); /* FORMAT UNIT */
-	sgio_bitmap_set(0x0A, D|T|L|  W|  O                  , write); /* WRITE(6) */
+	sgio_bitmap_set(0x04, D|        R|O                  , write); /* FORMAT UNIT */
+	sgio_bitmap_set(0x0A, D|T|    W|  O                  , write); /* WRITE(6) */
 	sgio_bitmap_set(0x2A, D|      W|R|O|      B|K        , write); /* WRITE(10) */
 	sgio_bitmap_set(0x2E, D|      W|R|O|      B|K        , write); /* WRITE AND VERIFY(10) */
 	sgio_bitmap_set(0x35, D|      W|R|O|      B|K        , write); /* SYNCHRONIZE CACHE(10) */
@@ -215,7 +215,7 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x23,           R                    , read);  /* READ FORMAT CAPACITIES */
 	sgio_bitmap_set(0x42,           R                    , read);  /* READ SUB-CHANNEL */
 	sgio_bitmap_set(0x43,           R                    , read);  /* READ TOC/PMA/ATIP */
-	sgio_bitmap_set(0x44,   T|      R|            V      , read);  /* READ HEADER */
+	sgio_bitmap_set(0x44,           R                    , read);  /* READ HEADER */
 	sgio_bitmap_set(0x45,           R                    , read);  /* PLAY AUDIO(10) */
 	sgio_bitmap_set(0x46,           R                    , read);  /* GET CONFIGURATION */
 	sgio_bitmap_set(0x47,           R                    , read);  /* PLAY AUDIO MSF */
@@ -251,7 +251,27 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 
 	/* (mostly) tape */
 
+	sgio_bitmap_set(0x01,   T                            , read);  /* REWIND */
+	sgio_bitmap_set(0x05,   T                            , read);  /* READ BLOCK LIMITS */
+	sgio_bitmap_set(0x0F,   T                            , read);  /* READ REVERSE(6) */
+	sgio_bitmap_set(0x13,   T                            , read);  /* VERIFY(6) */
+	sgio_bitmap_set(0x1B,   T|                    V      , read);  /* LOAD UNLOAD */
+	sgio_bitmap_set(0x2B,   T                            , read);  /* LOCATE(10) */
+	sgio_bitmap_set(0x34,   T                            , read);  /* READ POSITION */
+	sgio_bitmap_set(0x44,   T|                    V      , read);  /* REPORT DENSITY SUPPORT */
+	sgio_bitmap_set(0x81,   T                            , read);  /* READ REVERSE(16) */
+	sgio_bitmap_set(0x92,   T                            , read);  /* LOCATE(16) */
+
+	sgio_bitmap_set(0x04,   T                            , write); /* FORMAT MEDIUM */
+	sgio_bitmap_set(0x0B,   T                            , write); /* SET CAPACITY */
+	sgio_bitmap_set(0x10,   T                            , write); /* WRITE FILEMARKS(6) */
+	sgio_bitmap_set(0x11,   T                            , write); /* SPACE(6) */
+	sgio_bitmap_set(0x14,   T|L                          , write); /* RECOVER BUFFERED DATA */
 	sgio_bitmap_set(0x19,   T                            , write); /* ERASE(6) */
+	sgio_bitmap_set(0x80,   T                            , write); /* WRITE FILEMARKS(16) */
+	sgio_bitmap_set(0x82,   T                            , write); /* ALLOW OVERWRITE */
+	sgio_bitmap_set(0x91,   T                            , write); /* SPACE(16) */
+	sgio_bitmap_set(0x93,   T                            , write); /* ERASE(16) */
 
 	/* communication devices (obsolete) */
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 09/14] sg_io: whitelist a few more commands for disks
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (7 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 08/14] sg_io: whitelist a few more commands for tapes Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 10/14] sg_io: whitelist a few obsolete commands Paolo Bonzini
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

This adds missing commands to the table from SBC and related standards.
Only commands that affect the medium are added.  Commands that affect
other state of the LUN are all privileged, with the sole exception of START
STOP UNIT (which has always been allowed for all file descriptors.  I do not
really agree with that and it's probably an artifact of when /dev/cdrom had
r--r--r-- permissions, but I'm not trying to change that.

I left these out after discussion with Tejun who prefers not to whitelist
obsolete or really rarely used commands:

    sgio_bitmap_set(0x29, D|      W|R|O                  , read);  /* READ GENERATION */
    sgio_bitmap_set(0x34, D|      W|  O|        K        , read);  /* PRE-FETCH(10) */
    sgio_bitmap_set(0x90, D|      W|  O|      B          , read);  /* PRE-FETCH(16) */

    sgio_bitmap_set(0x07, D|      W|  O                  , write); /* REASSIGN BLOCKS */
    sgio_bitmap_set(0x2C, D|        R|O                  , write); /* ERASE(10) */
    sgio_bitmap_set(0x8B, D                              , write); /* ORWRITE */

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	v1->v2: leave out the commands listed in the commit message

 block/scsi_ioctl.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index a9c2caf..e100ee3 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -167,12 +167,15 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x08, D|T|    W|  O                  , read);  /* READ(6) */
 	sgio_bitmap_set(0x25, D|      W|R|O|      B|K        , read);  /* READ CAPACITY(10) */
 	sgio_bitmap_set(0x28, D|      W|R|O|      B|K        , read);  /* READ(10) */
+	sgio_bitmap_set(0x2D,             O                  , read);  /* READ UPDATED BLOCK */
 	sgio_bitmap_set(0x2F, D|      W|R|O                  , read);  /* VERIFY(10) */
 	sgio_bitmap_set(0x37, D|          O                  , read);  /* READ DEFECT DATA(10) */
 	sgio_bitmap_set(0x3E, D|      W|  O                  , read);  /* READ LONG(10) */
 	sgio_bitmap_set(0x88, D|T|    W|  O|      B          , read);  /* READ(16) */
 	sgio_bitmap_set(0x8F, D|T|    W|  O|      B          , read);  /* VERIFY(16) */
 	sgio_bitmap_set(0xA8, D|      W|R|O                  , read);  /* READ(12) */
+	sgio_bitmap_set(0xAF, D|      W|  O                  , read);  /* VERIFY(12) */
+	sgio_bitmap_set(0xB7, D|          O                  , read);  /* READ DEFECT DATA(12) */
 
 	/* write */
 
@@ -181,11 +184,21 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x2A, D|      W|R|O|      B|K        , write); /* WRITE(10) */
 	sgio_bitmap_set(0x2E, D|      W|R|O|      B|K        , write); /* WRITE AND VERIFY(10) */
 	sgio_bitmap_set(0x35, D|      W|R|O|      B|K        , write); /* SYNCHRONIZE CACHE(10) */
+	sgio_bitmap_set(0x38,         W|  O|        K        , write); /* MEDIUM SCAN */
+	sgio_bitmap_set(0x3D,             O                  , write); /* UPDATE BLOCK */
 	sgio_bitmap_set(0x3F, D|      W|  O                  , write); /* WRITE LONG(10) */
+	sgio_bitmap_set(0x41, D                              , write); /* WRITE SAME(10) */
 	sgio_bitmap_set(0x42, D                              , write); /* UNMAP */
 	sgio_bitmap_set(0x48, D|                  B          , write); /* SANITIZE */
 	sgio_bitmap_set(0x51, D                              , write); /* XPWRITE(10) */
+	sgio_bitmap_set(0x53, D                              , write); /* XDWRITEREAD(10) */
+	sgio_bitmap_set(0x85, D|                  B          , write); /* ATA PASS-THROUGH(16) */
+	sgio_bitmap_set(0x89, D                              , write); /* COMPARE AND WRITE */
 	sgio_bitmap_set(0x8A, D|T|    W|  O|      B          , write); /* WRITE(16) */
+	sgio_bitmap_set(0x8E, D|      W|  O|      B          , write); /* WRITE AND VERIFY(16) */
+	sgio_bitmap_set(0x91, D|      W|  O|      B          , write); /* SYNCHRONIZE CACHE(16) */
+	sgio_bitmap_set(0x93, D                              , write); /* WRITE SAME(16) */
+	sgio_bitmap_set(0xA1, D|                  B          , write); /* ATA PASS-THROUGH(12) */
 	sgio_bitmap_set(0xAA, D|      W|R|O                  , write); /* WRITE(12) */
 	sgio_bitmap_set(0xAC,             O                  , write); /* ERASE(12) */
 	sgio_bitmap_set(0xAE, D|      W|  O                  , write); /* WRITE AND VERIFY(12) */
@@ -235,12 +248,12 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0xBD,           R                    , read);  /* MECHANISM STATUS */
 	sgio_bitmap_set(0xBE,           R                    , read);  /* READ CD */
 
-	sgio_bitmap_set(0x53, D|        R                    , write); /* RESERVE TRACK / XDWRITEREAD(10) */
+	sgio_bitmap_set(0x53,           R                    , write); /* RESERVE TRACK */
 	sgio_bitmap_set(0x54,           R                    , write); /* SEND OPC INFORMATION */
 	sgio_bitmap_set(0x58,           R                    , write); /* REPAIR TRACK */
 	sgio_bitmap_set(0x5B,           R                    , write); /* CLOSE TRACK/SESSION */
 	sgio_bitmap_set(0x5D,           R                    , write); /* SEND CUE SHEET */
-	sgio_bitmap_set(0xA1, D|        R|        B          , write); /* BLANK / ATA PASS-THROUGH(12) */
+	sgio_bitmap_set(0xA1,           R                    , write); /* BLANK */
 	sgio_bitmap_set(0xA2,           R                    , write); /* SEND EVENT */
 	sgio_bitmap_set(0xA3,           R                    , write); /* SEND KEY */
 	sgio_bitmap_set(0xA6,           R                    , write); /* LOAD/UNLOAD C/DVD */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 10/14] sg_io: whitelist a few obsolete commands
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (8 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 09/14] sg_io: whitelist a few more commands for disks Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 11/14] sg_io: mark blk_set_cmd_filter_defaults as __init Paolo Bonzini
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

These are added to their own section of the table, together with SEEK(10)
which has always been permitted.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c |   22 +++++++++++++++++++++-
 1 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index e100ee3..4942314 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -144,7 +144,6 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x1A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(6) */
 	sgio_bitmap_set(0x1B, D|      W|R|O|  A|  B|K|  F    , read);  /* START STOP UNIT */
 	sgio_bitmap_set(0x1C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* RECEIVE DIAGNOSTIC RESULTS */
-	sgio_bitmap_set(0x2B, D|      W|R|O|        K        , read);  /* SEEK(10) */
 	sgio_bitmap_set(0x3C, D|T|B|O|W|R|O|M|A|E|  K|V|F|C|S, read);  /* READ BUFFER */
 	sgio_bitmap_set(0x4D, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* LOG SENSE */
 	sgio_bitmap_set(0x5A, D|T|L|P|W|R|O|M|A|E|B|K|V|F|C|S, read);  /* MODE SENSE(10) */
@@ -286,6 +285,27 @@ static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 	sgio_bitmap_set(0x91,   T                            , write); /* SPACE(16) */
 	sgio_bitmap_set(0x93,   T                            , write); /* ERASE(16) */
 
+	/* various obsolete */
+
+	sgio_bitmap_set(0x0B, D|      W|R|O                  , read);  /* SEEK(6) */
+	sgio_bitmap_set(0x2B, D|      W|R|O|        K        , read);  /* SEEK(10) */
+	sgio_bitmap_set(0x30, D|      W|R|O                  , read);  /* SEARCH DATA HIGH(10) */
+	sgio_bitmap_set(0x31, D|      W|R|O                  , read);  /* SEARCH DATA EQUAL(10) */
+	sgio_bitmap_set(0x32, D|      W|R|O                  , read);  /* SEARCH DATA LOW(10) */
+	sgio_bitmap_set(0x39, D|T|L|P|W|R|O|        K        , read);  /* COMPARE */
+	sgio_bitmap_set(0x52, D                              , read);  /* XDREAD(10) */
+	sgio_bitmap_set(0xB0,         W|R|O                  , read);  /* SEARCH DATA HIGH(12) */
+	sgio_bitmap_set(0xB1,         W|R|O                  , read);  /* SEARCH DATA EQUAL(12) */
+	sgio_bitmap_set(0xB2,         W|R|O                  , read);  /* SEARCH DATA LOW(12) */
+	sgio_bitmap_set(0xB4, D|T|    W|R|O                  , read);  /* READ ELEMENT STATUS ATTACHED */
+	sgio_bitmap_set(0xB8,   T|    W|R|O|M                , read);  /* READ ELEMENT STATUS */
+
+	sgio_bitmap_set(0x01, D|      W|R|O|M                , write); /* REZERO UNIT */
+	sgio_bitmap_set(0x18, D|T|L|P|W|R|O|        K        , write); /* COPY */
+	sgio_bitmap_set(0x3A, D|T|L|P|W|R|O|        K        , write); /* COPY AND VERIFY */
+	sgio_bitmap_set(0x50, D                              , write); /* XDWRITE(10) */
+	sgio_bitmap_set(0x80, D                              , write); /* XDWRITE EXTENDED(16) */
+
 	/* communication devices (obsolete) */
 
 	sgio_bitmap_set(0x08,                             C  , write); /* GET MESSAGE(6) */
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 11/14] sg_io: mark blk_set_cmd_filter_defaults as __init
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (9 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 10/14] sg_io: whitelist a few obsolete commands Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:15 ` [PATCH v2 12/14] sg_io: remove remnants of sysfs SG_IO filters Paolo Bonzini
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/scsi_ioctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 4942314..3830faf 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -113,7 +113,7 @@ static int sg_emulated_host(struct request_queue *q, int __user *p)
 	return put_user(1, p);
 }
 
-static void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
+static void __init blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter)
 {
 #define sgio_bitmap_set(cmd, mask, rw) \
 	filter->rw##_ok[(cmd)] |= (mask);
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 12/14] sg_io: remove remnants of sysfs SG_IO filters
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (10 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 11/14] sg_io: mark blk_set_cmd_filter_defaults as __init Paolo Bonzini
@ 2013-02-06 15:15 ` Paolo Bonzini
  2013-02-06 15:16 ` [PATCH v2 13/14] sg_io: introduce unpriv_sgio queue flag Paolo Bonzini
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Some defines and structs remained when support was removed for SG_IO
filters in sysfs.  Remove them.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/linux/genhd.h |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 79b8bba..d3bc249 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -144,15 +144,6 @@ enum {
 	DISK_EVENT_EJECT_REQUEST		= 1 << 1, /* eject requested */
 };
 
-#define BLK_SCSI_MAX_CMDS	(256)
-#define BLK_SCSI_CMD_PER_LONG	(BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
-
-struct blk_scsi_cmd_filter {
-	unsigned long read_ok[BLK_SCSI_CMD_PER_LONG];
-	unsigned long write_ok[BLK_SCSI_CMD_PER_LONG];
-	struct kobject kobj;
-};
-
 struct disk_part_tbl {
 	struct rcu_head rcu_head;
 	int len;
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 13/14] sg_io: introduce unpriv_sgio queue flag
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (11 preceding siblings ...)
  2013-02-06 15:15 ` [PATCH v2 12/14] sg_io: remove remnants of sysfs SG_IO filters Paolo Bonzini
@ 2013-02-06 15:16 ` Paolo Bonzini
  2013-02-06 15:16 ` [PATCH v2 14/14] sg_io: use unpriv_sgio to disable whitelisting for scanners Paolo Bonzini
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

This queue flag will let unprivileged users send any SG_IO command to
the device, without any filtering.

This is useful for virtualization, where some trusted guests would like
to send commands such as persistent reservations, but still the virtual
machine monitor should run with restricted permissions.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/block/queue-sysfs.txt |    8 ++++++++
 block/blk-sysfs.c                   |   33 +++++++++++++++++++++++++++++++++
 block/scsi_ioctl.c                  |    4 +++-
 include/linux/blkdev.h              |    3 +++
 4 files changed, 47 insertions(+), 1 deletions(-)

diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt
index e54ac1d..341e781 100644
--- a/Documentation/block/queue-sysfs.txt
+++ b/Documentation/block/queue-sysfs.txt
@@ -133,6 +133,14 @@ control of this block device to that new IO scheduler. Note that writing
 an IO scheduler name to this file will attempt to load that IO scheduler
 module, if it isn't already present in the system.
 
+unpriv_sgio (RW)
+----------------
+When a process runs without CAP_SYS_RAWIO, access to some SCSI commands
+with the SG_IO ioctl is restricted.  If this option is '0', the whitelist
+is applied for all file descriptors belonging to unprivileged processes.
+If this option is '1', the whitelist is only applied for file descriptors
+that are opened read-only; other file descriptors can send all SCSI commands,
+and no restrictions are applied by the kernel.
 
 
 Jens Axboe <jens.axboe@oracle.com>, February 2009
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 7881477..ab2947d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -215,6 +215,32 @@ static ssize_t queue_max_hw_sectors_show(struct request_queue *q, char *page)
 	return queue_var_show(max_hw_sectors_kb, (page));
 }
 
+static ssize_t
+queue_show_unpriv_sgio(struct request_queue *q, char *page)
+{
+	int bit;
+	bit = test_bit(QUEUE_FLAG_UNPRIV_SGIO, &q->queue_flags);
+	return queue_var_show(bit, page);
+}
+static ssize_t
+queue_store_unpriv_sgio(struct request_queue *q, const char *page, size_t count)
+{
+	unsigned long val;
+	ssize_t ret;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	ret = queue_var_store(&val, page, count);
+	spin_lock_irq(q->queue_lock);
+	if (val)
+		queue_flag_set(QUEUE_FLAG_UNPRIV_SGIO, q);
+	else
+		queue_flag_clear(QUEUE_FLAG_UNPRIV_SGIO, q);
+	spin_unlock_irq(q->queue_lock);
+	return ret;
+}
+
 #define QUEUE_SYSFS_BIT_FNS(name, flag, neg)				\
 static ssize_t								\
 queue_show_##name(struct request_queue *q, char *page)			\
@@ -403,6 +429,12 @@ static struct queue_sysfs_entry queue_nonrot_entry = {
 	.store = queue_store_nonrot,
 };
 
+static struct queue_sysfs_entry queue_unpriv_sgio_entry = {
+	.attr = {.name = "unpriv_sgio", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_show_unpriv_sgio,
+	.store = queue_store_unpriv_sgio,
+};
+
 static struct queue_sysfs_entry queue_nomerges_entry = {
 	.attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_nomerges_show,
@@ -445,6 +477,7 @@ static struct attribute *default_attrs[] = {
 	&queue_discard_max_entry.attr,
 	&queue_discard_zeroes_data_entry.attr,
 	&queue_write_same_max_entry.attr,
+	&queue_unpriv_sgio_entry.attr,
 	&queue_nonrot_entry.attr,
 	&queue_nomerges_entry.attr,
 	&queue_rq_affinity_entry.attr,
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 3830faf..b46564b 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -398,7 +398,9 @@ int blk_verify_command(struct request_queue *q,
 		return 0;
 
 	/* Write-safe commands require a writable open */
-	if (has_write_perm && filter->write_ok[cmd[0]] & (1 << q->sgio_type))
+	if (has_write_perm &&
+	    (blk_queue_unpriv_sgio(q) ||
+	     (filter->write_ok[cmd[0]] & (1 << q->sgio_type))))
 		return 0;
 
 	return -EPERM;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b376d37..71312ec 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -459,6 +459,7 @@ struct request_queue {
 #define QUEUE_FLAG_SECDISCARD  17	/* supports SECDISCARD */
 #define QUEUE_FLAG_SAME_FORCE  18	/* force complete on same CPU */
 #define QUEUE_FLAG_DEAD        19	/* queue tear-down finished */
+#define QUEUE_FLAG_UNPRIV_SGIO 20	/* SG_IO free for unprivileged users */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -534,6 +535,8 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
 	test_bit(QUEUE_FLAG_NOXMERGES, &(q)->queue_flags)
+#define blk_queue_unpriv_sgio(q) \
+	test_bit(QUEUE_FLAG_UNPRIV_SGIO, &(q)->queue_flags)
 #define blk_queue_nonrot(q)	test_bit(QUEUE_FLAG_NONROT, &(q)->queue_flags)
 #define blk_queue_io_stat(q)	test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags)
 #define blk_queue_add_random(q)	test_bit(QUEUE_FLAG_ADD_RANDOM, &(q)->queue_flags)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 73+ messages in thread

* [PATCH v2 14/14] sg_io: use unpriv_sgio to disable whitelisting for scanners
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (12 preceding siblings ...)
  2013-02-06 15:16 ` [PATCH v2 13/14] sg_io: introduce unpriv_sgio queue flag Paolo Bonzini
@ 2013-02-06 15:16 ` Paolo Bonzini
  2013-02-13  8:32 ` [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-06 15:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Scanners allow all commands because vendor-specific commands are common.
The queue flag we just added lets us keep this behavior by default,
while making it possible to disable it.

Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 drivers/scsi/scsi_scan.c |   12 +++++++++++-
 drivers/scsi/sg.c        |    3 ---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 86940f3..702b0ef 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -785,13 +785,23 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 	sdev->request_queue->sgio_type = sdev->type;
 
 	switch (sdev->type) {
+	case TYPE_SCANNER:
+		/*
+		 * Scanners often use(d) vendor-specific commands.  A bunch
+		 * of them is whitelisted, but just allow everything by
+		 * default for maximum compatibility.
+		 */
+		__set_bit(QUEUE_FLAG_UNPRIV_SGIO,
+			  &sdev->request_queue->queue_flags);
+		sdev->writeable = 1;
+		break;
+
 	case TYPE_RBC:
 	case TYPE_TAPE:
 	case TYPE_DISK:
 	case TYPE_PRINTER:
 	case TYPE_MOD:
 	case TYPE_PROCESSOR:
-	case TYPE_SCANNER:
 	case TYPE_MEDIUM_CHANGER:
 	case TYPE_ENCLOSURE:
 	case TYPE_COMM:
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index cab816f..1c35628 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -219,9 +219,6 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd)
 	struct sg_fd *sfp = filp->private_data;
 	struct request_queue *q = sfp->parentdp->device->request_queue;
 
-	if (sfp->parentdp->device->type == TYPE_SCANNER)
-		return 0;
-
 	return blk_verify_command(q, cmd, filp->f_mode & FMODE_WRITE);
 }
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (13 preceding siblings ...)
  2013-02-06 15:16 ` [PATCH v2 14/14] sg_io: use unpriv_sgio to disable whitelisting for scanners Paolo Bonzini
@ 2013-02-13  8:32 ` Paolo Bonzini
  2013-02-13 15:35   ` Douglas Gilbert
  2013-02-20 16:12 ` Paolo Bonzini
  2013-05-22  6:35 ` PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)) Paolo Bonzini
  16 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-13  8:32 UTC (permalink / raw)
  Cc: linux-kernel, Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
> This series regards the whitelist that is used for the SG_IO ioctl.  This
> whitelist has three problems:
> 
> * the bitmap of allowed commands is designed for MMC devices (roughly,
>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>   device classes and have different meanings for different classes.
> 
> * also because the bitmap of allowed commands is designed for MMC devices
>   only, some commands are missing even though they are generally useful and
>   not insecure.  At least not more insecure than anything else you can
>   do if you have access to /dev/sdX or /dev/stX nodes.
> 
> * the whitelist can be disabled per-process but not per-disk.  In addition,
>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>   resources, enough to make it insecure.
> 
> The series corrects these problems.  Patches 1-4 solve the first problem,
> which also has an assigned CVE, by using different bitmaps for the various
> device classes.  Patches 5-11 solve the second by adding more commands
> to the bitmaps.  Patches 12 and 13 solve the third, and were already
> posted but ignored by the maintainers despite multiple pings.
> 
> Note: checkpatch hates the formatting of the command table.  I know about this,
> and ensured that there are no errors in the rest of the code.  The current
> formatting is IMHO quite handy, and roughly based on the files available
> from the SCSI standard body.
> 
> Ok for the next merge window?
> 
> Paolo
> 
> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>         for details).  Added patch 14 and added a few more scanner
>         commands based on SANE (scanners are not whitelisted by default,
>         also were not in v1, but this makes it possible to opt into the
>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>         large #if 0'd list of commands that the kernel does not pass
>         though.  Marked blk_set_cmd_filter_defaults as __init.
> 
> 
> Paolo Bonzini (14):
>   sg_io: pass request_queue to blk_verify_command
>   sg_io: reorganize list of allowed commands
>   sg_io: use different default filters for each device class
>   sg_io: resolve conflicts between commands assigned to multiple
>     classes (CVE-2012-4542)
>   sg_io: whitelist a few more commands for rare & obsolete device types
>   sg_io: whitelist another command for multimedia devices
>   sg_io: whitelist a few more commands for media changers
>   sg_io: whitelist a few more commands for tapes
>   sg_io: whitelist a few more commands for disks
>   sg_io: whitelist a few obsolete commands
>   sg_io: mark blk_set_cmd_filter_defaults as __init
>   sg_io: remove remnants of sysfs SG_IO filters
>   sg_io: introduce unpriv_sgio queue flag
>   sg_io: use unpriv_sgio to disable whitelisting for scanners
> 
>  Documentation/block/queue-sysfs.txt |    8 +
>  block/blk-sysfs.c                   |   33 +++
>  block/bsg.c                         |    2 +-
>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>  drivers/scsi/scsi_scan.c            |   14 ++-
>  drivers/scsi/sg.c                   |    6 +-
>  include/linux/blkdev.h              |    8 +-
>  include/linux/genhd.h               |    9 -
>  include/scsi/scsi.h                 |    3 +
>  9 files changed, 344 insertions(+), 108 deletions(-)
> 

Ping? I'm not even sure what tree this should host these patches...

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-13  8:32 ` [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
@ 2013-02-13 15:35   ` Douglas Gilbert
  2013-02-13 15:48     ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Douglas Gilbert @ 2013-02-13 15:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

On 13-02-13 03:32 AM, Paolo Bonzini wrote:
> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>> whitelist has three problems:
>>
>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>    "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>    device classes and have different meanings for different classes.
>>
>> * also because the bitmap of allowed commands is designed for MMC devices
>>    only, some commands are missing even though they are generally useful and
>>    not insecure.  At least not more insecure than anything else you can
>>    do if you have access to /dev/sdX or /dev/stX nodes.
>>
>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>    the required capability (CAP_SYS_RAWIO) gives access to a range of other
>>    resources, enough to make it insecure.
>>
>> The series corrects these problems.  Patches 1-4 solve the first problem,
>> which also has an assigned CVE, by using different bitmaps for the various
>> device classes.  Patches 5-11 solve the second by adding more commands
>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>> posted but ignored by the maintainers despite multiple pings.
>>
>> Note: checkpatch hates the formatting of the command table.  I know about this,
>> and ensured that there are no errors in the rest of the code.  The current
>> formatting is IMHO quite handy, and roughly based on the files available
>> from the SCSI standard body.
>>
>> Ok for the next merge window?
>>
>> Paolo
>>
>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>          for details).  Added patch 14 and added a few more scanner
>>          commands based on SANE (scanners are not whitelisted by default,
>>          also were not in v1, but this makes it possible to opt into the
>>          whitelist out of paranoia).  Removed C++ comments.  Removed the
>>          large #if 0'd list of commands that the kernel does not pass
>>          though.  Marked blk_set_cmd_filter_defaults as __init.
>>
>>
>> Paolo Bonzini (14):
>>    sg_io: pass request_queue to blk_verify_command
>>    sg_io: reorganize list of allowed commands
>>    sg_io: use different default filters for each device class
>>    sg_io: resolve conflicts between commands assigned to multiple
>>      classes (CVE-2012-4542)
>>    sg_io: whitelist a few more commands for rare & obsolete device types
>>    sg_io: whitelist another command for multimedia devices
>>    sg_io: whitelist a few more commands for media changers
>>    sg_io: whitelist a few more commands for tapes
>>    sg_io: whitelist a few more commands for disks
>>    sg_io: whitelist a few obsolete commands
>>    sg_io: mark blk_set_cmd_filter_defaults as __init
>>    sg_io: remove remnants of sysfs SG_IO filters
>>    sg_io: introduce unpriv_sgio queue flag
>>    sg_io: use unpriv_sgio to disable whitelisting for scanners
>>
>>   Documentation/block/queue-sysfs.txt |    8 +
>>   block/blk-sysfs.c                   |   33 +++
>>   block/bsg.c                         |    2 +-
>>   block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>   drivers/scsi/scsi_scan.c            |   14 ++-
>>   drivers/scsi/sg.c                   |    6 +-
>>   include/linux/blkdev.h              |    8 +-
>>   include/linux/genhd.h               |    9 -
>>   include/scsi/scsi.h                 |    3 +
>>   9 files changed, 344 insertions(+), 108 deletions(-)
>>
>
> Ping? I'm not even sure what tree this should host these patches...

You are whitelisting SCSI commands so obviously the SCSI tree
and the patch spills over into the block tree.
Can't see much point in ack-ing the sg changes since most
of the action is at higher levels.

The question I have is what existing code will this change
break (and will I being getting emails from peeved
developers)?

Is 8 lines of documentation changes enough? My guess is
that SG_IO ioctl pass-through users will be tripped up
and it won't be obvious to them to look at
     Documentation/block/queue-sysfs.txt
for enlightenment; especially if they are using a char
device node from the bsg, sg or st drivers to issue SG_IO.

Doug Gilbert


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-13 15:35   ` Douglas Gilbert
@ 2013-02-13 15:48     ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-13 15:48 UTC (permalink / raw)
  To: dgilbert
  Cc: linux-kernel, Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Il 13/02/2013 16:35, Douglas Gilbert ha scritto:
>>
>> Ping? I'm not even sure what tree this should host these patches...
> 
> You are whitelisting SCSI commands so obviously the SCSI tree
> and the patch spills over into the block tree.

Yeah, an Acked-by is in order but it's not clear from whom and for whom.

> Can't see much point in ack-ing the sg changes since most
> of the action is at higher levels.
> 
> The question I have is what existing code will this change
> break (and will I being getting emails from peeved
> developers)?

An unlikely situation is that a vendor-specific command in the "low"
range (i.e. not 0xc0..0xff) conflicted with an MMC command, so it
happened to be enabled.  That will now break, but only if executed
without CAP_SYS_RAWIO.

Nothing will change for programs executed with CAP_SYS_RAWIO.

I have not disabled any standards-defined command that used to be
enabled, and on the contrary I enabled a few of them, so this could
potentially lead to less emails from peeved developers, too.

> Is 8 lines of documentation changes enough? My guess is
> that SG_IO ioctl pass-through users will be tripped up
> and it won't be obvious to them to look at
>     Documentation/block/queue-sysfs.txt
> for enlightenment; especially if they are using a char
> device node from the bsg, sg or st drivers to issue SG_IO.

The command whitelist was not documented before.  It's quite likely that
any documentation except the code itself would not be updated the next
time the whitelist is touched.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (14 preceding siblings ...)
  2013-02-13  8:32 ` [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
@ 2013-02-20 16:12 ` Paolo Bonzini
  2013-03-22 22:30   ` PING^2 " Paolo Bonzini
  2013-05-06 20:43   ` PING^6 " Paolo Bonzini
  2013-05-22  6:35 ` PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)) Paolo Bonzini
  16 siblings, 2 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-02-20 16:12 UTC (permalink / raw)
  Cc: linux-kernel, Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
> This series regards the whitelist that is used for the SG_IO ioctl.  This
> whitelist has three problems:
> 
> * the bitmap of allowed commands is designed for MMC devices (roughly,
>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>   device classes and have different meanings for different classes.
> 
> * also because the bitmap of allowed commands is designed for MMC devices
>   only, some commands are missing even though they are generally useful and
>   not insecure.  At least not more insecure than anything else you can
>   do if you have access to /dev/sdX or /dev/stX nodes.
> 
> * the whitelist can be disabled per-process but not per-disk.  In addition,
>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>   resources, enough to make it insecure.
> 
> The series corrects these problems.  Patches 1-4 solve the first problem,
> which also has an assigned CVE, by using different bitmaps for the various
> device classes.  Patches 5-11 solve the second by adding more commands
> to the bitmaps.  Patches 12 and 13 solve the third, and were already
> posted but ignored by the maintainers despite multiple pings.
> 
> Note: checkpatch hates the formatting of the command table.  I know about this,
> and ensured that there are no errors in the rest of the code.  The current
> formatting is IMHO quite handy, and roughly based on the files available
> from the SCSI standard body.
> 
> Ok for the next merge window?
> 
> Paolo
> 
> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>         for details).  Added patch 14 and added a few more scanner
>         commands based on SANE (scanners are not whitelisted by default,
>         also were not in v1, but this makes it possible to opt into the
>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>         large #if 0'd list of commands that the kernel does not pass
>         though.  Marked blk_set_cmd_filter_defaults as __init.

Ping...

Jens/James, is anyone going to pick this up for 3.9?

Paolo

> 
> Paolo Bonzini (14):
>   sg_io: pass request_queue to blk_verify_command
>   sg_io: reorganize list of allowed commands
>   sg_io: use different default filters for each device class
>   sg_io: resolve conflicts between commands assigned to multiple
>     classes (CVE-2012-4542)
>   sg_io: whitelist a few more commands for rare & obsolete device types
>   sg_io: whitelist another command for multimedia devices
>   sg_io: whitelist a few more commands for media changers
>   sg_io: whitelist a few more commands for tapes
>   sg_io: whitelist a few more commands for disks
>   sg_io: whitelist a few obsolete commands
>   sg_io: mark blk_set_cmd_filter_defaults as __init
>   sg_io: remove remnants of sysfs SG_IO filters
>   sg_io: introduce unpriv_sgio queue flag
>   sg_io: use unpriv_sgio to disable whitelisting for scanners
> 
>  Documentation/block/queue-sysfs.txt |    8 +
>  block/blk-sysfs.c                   |   33 +++
>  block/bsg.c                         |    2 +-
>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>  drivers/scsi/scsi_scan.c            |   14 ++-
>  drivers/scsi/sg.c                   |    6 +-
>  include/linux/blkdev.h              |    8 +-
>  include/linux/genhd.h               |    9 -
>  include/scsi/scsi.h                 |    3 +
>  9 files changed, 344 insertions(+), 108 deletions(-)
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^2 Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-20 16:12 ` Paolo Bonzini
@ 2013-03-22 22:30   ` Paolo Bonzini
  2013-04-04 18:18     ` PING^3 " Paolo Bonzini
  2013-05-06 20:43   ` PING^6 " Paolo Bonzini
  1 sibling, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-03-22 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Il 20/02/2013 17:12, Paolo Bonzini ha scritto:
> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>> whitelist has three problems:
>>
>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>   device classes and have different meanings for different classes.
>>
>> * also because the bitmap of allowed commands is designed for MMC devices
>>   only, some commands are missing even though they are generally useful and
>>   not insecure.  At least not more insecure than anything else you can
>>   do if you have access to /dev/sdX or /dev/stX nodes.
>>
>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>>   resources, enough to make it insecure.
>>
>> The series corrects these problems.  Patches 1-4 solve the first problem,
>> which also has an assigned CVE, by using different bitmaps for the various
>> device classes.  Patches 5-11 solve the second by adding more commands
>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>> posted but ignored by the maintainers despite multiple pings.
>>
>> Note: checkpatch hates the formatting of the command table.  I know about this,
>> and ensured that there are no errors in the rest of the code.  The current
>> formatting is IMHO quite handy, and roughly based on the files available
>> from the SCSI standard body.
>>
>> Ok for the next merge window?
>>
>> Paolo
>>
>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>         for details).  Added patch 14 and added a few more scanner
>>         commands based on SANE (scanners are not whitelisted by default,
>>         also were not in v1, but this makes it possible to opt into the
>>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>>         large #if 0'd list of commands that the kernel does not pass
>>         though.  Marked blk_set_cmd_filter_defaults as __init.
> 
> Ping...
> 
> Jens/James, is anyone going to pick this up for 3.9?

Another month has passed, Ping^2...

Paolo

> Paolo
> 
>>
>> Paolo Bonzini (14):
>>   sg_io: pass request_queue to blk_verify_command
>>   sg_io: reorganize list of allowed commands
>>   sg_io: use different default filters for each device class
>>   sg_io: resolve conflicts between commands assigned to multiple
>>     classes (CVE-2012-4542)
>>   sg_io: whitelist a few more commands for rare & obsolete device types
>>   sg_io: whitelist another command for multimedia devices
>>   sg_io: whitelist a few more commands for media changers
>>   sg_io: whitelist a few more commands for tapes
>>   sg_io: whitelist a few more commands for disks
>>   sg_io: whitelist a few obsolete commands
>>   sg_io: mark blk_set_cmd_filter_defaults as __init
>>   sg_io: remove remnants of sysfs SG_IO filters
>>   sg_io: introduce unpriv_sgio queue flag
>>   sg_io: use unpriv_sgio to disable whitelisting for scanners
>>
>>  Documentation/block/queue-sysfs.txt |    8 +
>>  block/blk-sysfs.c                   |   33 +++
>>  block/bsg.c                         |    2 +-
>>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>  drivers/scsi/scsi_scan.c            |   14 ++-
>>  drivers/scsi/sg.c                   |    6 +-
>>  include/linux/blkdev.h              |    8 +-
>>  include/linux/genhd.h               |    9 -
>>  include/scsi/scsi.h                 |    3 +
>>  9 files changed, 344 insertions(+), 108 deletions(-)
>>
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^3 Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-03-22 22:30   ` PING^2 " Paolo Bonzini
@ 2013-04-04 18:18     ` Paolo Bonzini
  2013-04-17 12:26       ` PING^4 aka The Jon Corbet Effect " Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-04-04 18:18 UTC (permalink / raw)
  Cc: linux-kernel, Tejun Heo, James E.J. Bottomley, linux-scsi, Jens Axboe

Il 22/03/2013 23:30, Paolo Bonzini ha scritto:
> Il 20/02/2013 17:12, Paolo Bonzini ha scritto:
>> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>>> whitelist has three problems:
>>>
>>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>>   device classes and have different meanings for different classes.
>>>
>>> * also because the bitmap of allowed commands is designed for MMC devices
>>>   only, some commands are missing even though they are generally useful and
>>>   not insecure.  At least not more insecure than anything else you can
>>>   do if you have access to /dev/sdX or /dev/stX nodes.
>>>
>>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>>>   resources, enough to make it insecure.
>>>
>>> The series corrects these problems.  Patches 1-4 solve the first problem,
>>> which also has an assigned CVE, by using different bitmaps for the various
>>> device classes.  Patches 5-11 solve the second by adding more commands
>>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>>> posted but ignored by the maintainers despite multiple pings.
>>>
>>> Note: checkpatch hates the formatting of the command table.  I know about this,
>>> and ensured that there are no errors in the rest of the code.  The current
>>> formatting is IMHO quite handy, and roughly based on the files available
>>> from the SCSI standard body.
>>>
>>> Ok for the next merge window?
>>>
>>> Paolo
>>>
>>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>>         for details).  Added patch 14 and added a few more scanner
>>>         commands based on SANE (scanners are not whitelisted by default,
>>>         also were not in v1, but this makes it possible to opt into the
>>>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>>>         large #if 0'd list of commands that the kernel does not pass
>>>         though.  Marked blk_set_cmd_filter_defaults as __init.
>>
>> Ping...
>>
>> Jens/James, is anyone going to pick this up for 3.9?
> 
> Another month has passed, Ping^2...

Ping^3, any hope for 3.10?

Paolo

> 
> Paolo
> 
>> Paolo
>>
>>>
>>> Paolo Bonzini (14):
>>>   sg_io: pass request_queue to blk_verify_command
>>>   sg_io: reorganize list of allowed commands
>>>   sg_io: use different default filters for each device class
>>>   sg_io: resolve conflicts between commands assigned to multiple
>>>     classes (CVE-2012-4542)
>>>   sg_io: whitelist a few more commands for rare & obsolete device types
>>>   sg_io: whitelist another command for multimedia devices
>>>   sg_io: whitelist a few more commands for media changers
>>>   sg_io: whitelist a few more commands for tapes
>>>   sg_io: whitelist a few more commands for disks
>>>   sg_io: whitelist a few obsolete commands
>>>   sg_io: mark blk_set_cmd_filter_defaults as __init
>>>   sg_io: remove remnants of sysfs SG_IO filters
>>>   sg_io: introduce unpriv_sgio queue flag
>>>   sg_io: use unpriv_sgio to disable whitelisting for scanners
>>>
>>>  Documentation/block/queue-sysfs.txt |    8 +
>>>  block/blk-sysfs.c                   |   33 +++
>>>  block/bsg.c                         |    2 +-
>>>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>>  drivers/scsi/scsi_scan.c            |   14 ++-
>>>  drivers/scsi/sg.c                   |    6 +-
>>>  include/linux/blkdev.h              |    8 +-
>>>  include/linux/genhd.h               |    9 -
>>>  include/scsi/scsi.h                 |    3 +
>>>  9 files changed, 344 insertions(+), 108 deletions(-)
>>>
>>
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^4 aka The Jon Corbet Effect Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-04-04 18:18     ` PING^3 " Paolo Bonzini
@ 2013-04-17 12:26       ` Paolo Bonzini
  2013-04-27 13:31         ` PING^5 aka New ways to attract attentions " Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-04-17 12:26 UTC (permalink / raw)
  To: linux-kernel, linux-scsi, corbet
  Cc: Tejun Heo, James E.J. Bottomley, Jens Axboe

And a fourth ping comes...

Jon, the next time I read "it seems likely to be picked up fairly soon"
(http://lwn.net/Articles/535075/), I'll picture the author of the patch
attempting open-heart surgery on a long-red-haired voodoo doll!

Paolo

Il 04/04/2013 20:18, Paolo Bonzini ha scritto:
> Il 22/03/2013 23:30, Paolo Bonzini ha scritto:
>> Il 20/02/2013 17:12, Paolo Bonzini ha scritto:
>>> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>>>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>>>> whitelist has three problems:
>>>>
>>>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>>>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>>>   device classes and have different meanings for different classes.
>>>>
>>>> * also because the bitmap of allowed commands is designed for MMC devices
>>>>   only, some commands are missing even though they are generally useful and
>>>>   not insecure.  At least not more insecure than anything else you can
>>>>   do if you have access to /dev/sdX or /dev/stX nodes.
>>>>
>>>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>>>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>>>>   resources, enough to make it insecure.
>>>>
>>>> The series corrects these problems.  Patches 1-4 solve the first problem,
>>>> which also has an assigned CVE, by using different bitmaps for the various
>>>> device classes.  Patches 5-11 solve the second by adding more commands
>>>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>>>> posted but ignored by the maintainers despite multiple pings.
>>>>
>>>> Note: checkpatch hates the formatting of the command table.  I know about this,
>>>> and ensured that there are no errors in the rest of the code.  The current
>>>> formatting is IMHO quite handy, and roughly based on the files available
>>>> from the SCSI standard body.
>>>>
>>>> Ok for the next merge window?
>>>>
>>>> Paolo
>>>>
>>>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>>>         for details).  Added patch 14 and added a few more scanner
>>>>         commands based on SANE (scanners are not whitelisted by default,
>>>>         also were not in v1, but this makes it possible to opt into the
>>>>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>>>>         large #if 0'd list of commands that the kernel does not pass
>>>>         though.  Marked blk_set_cmd_filter_defaults as __init.
>>>
>>> Ping...
>>>
>>> Jens/James, is anyone going to pick this up for 3.9?
>>
>> Another month has passed, Ping^2...
> 
> Ping^3, any hope for 3.10?
> 
> Paolo
> 
>>
>> Paolo
>>
>>> Paolo
>>>
>>>>
>>>> Paolo Bonzini (14):
>>>>   sg_io: pass request_queue to blk_verify_command
>>>>   sg_io: reorganize list of allowed commands
>>>>   sg_io: use different default filters for each device class
>>>>   sg_io: resolve conflicts between commands assigned to multiple
>>>>     classes (CVE-2012-4542)
>>>>   sg_io: whitelist a few more commands for rare & obsolete device types
>>>>   sg_io: whitelist another command for multimedia devices
>>>>   sg_io: whitelist a few more commands for media changers
>>>>   sg_io: whitelist a few more commands for tapes
>>>>   sg_io: whitelist a few more commands for disks
>>>>   sg_io: whitelist a few obsolete commands
>>>>   sg_io: mark blk_set_cmd_filter_defaults as __init
>>>>   sg_io: remove remnants of sysfs SG_IO filters
>>>>   sg_io: introduce unpriv_sgio queue flag
>>>>   sg_io: use unpriv_sgio to disable whitelisting for scanners
>>>>
>>>>  Documentation/block/queue-sysfs.txt |    8 +
>>>>  block/blk-sysfs.c                   |   33 +++
>>>>  block/bsg.c                         |    2 +-
>>>>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>>>  drivers/scsi/scsi_scan.c            |   14 ++-
>>>>  drivers/scsi/sg.c                   |    6 +-
>>>>  include/linux/blkdev.h              |    8 +-
>>>>  include/linux/genhd.h               |    9 -
>>>>  include/scsi/scsi.h                 |    3 +
>>>>  9 files changed, 344 insertions(+), 108 deletions(-)
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^5 aka New ways to attract attentions Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-04-17 12:26       ` PING^4 aka The Jon Corbet Effect " Paolo Bonzini
@ 2013-04-27 13:31         ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-04-27 13:31 UTC (permalink / raw)
  Cc: linux-kernel, linux-scsi, corbet, Tejun Heo,
	James E.J. Bottomley, Jens Axboe

PING^5

So a blatant attempt at getting attention from LWN didn't work.  I'm
desperate, so...

... look!  I'm top posting!

Paolo

Il 17/04/2013 14:26, Paolo Bonzini ha scritto:
> And a fourth ping comes...
> 
> Jon, the next time I read "it seems likely to be picked up fairly soon"
> (http://lwn.net/Articles/535075/), I'll picture the author of the patch
> attempting open-heart surgery on a long-red-haired voodoo doll!
> 
> Paolo
> 
> Il 04/04/2013 20:18, Paolo Bonzini ha scritto:
>> Il 22/03/2013 23:30, Paolo Bonzini ha scritto:
>>> Il 20/02/2013 17:12, Paolo Bonzini ha scritto:
>>>> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>>>>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>>>>> whitelist has three problems:
>>>>>
>>>>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>>>>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>>>>   device classes and have different meanings for different classes.
>>>>>
>>>>> * also because the bitmap of allowed commands is designed for MMC devices
>>>>>   only, some commands are missing even though they are generally useful and
>>>>>   not insecure.  At least not more insecure than anything else you can
>>>>>   do if you have access to /dev/sdX or /dev/stX nodes.
>>>>>
>>>>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>>>>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>>>>>   resources, enough to make it insecure.
>>>>>
>>>>> The series corrects these problems.  Patches 1-4 solve the first problem,
>>>>> which also has an assigned CVE, by using different bitmaps for the various
>>>>> device classes.  Patches 5-11 solve the second by adding more commands
>>>>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>>>>> posted but ignored by the maintainers despite multiple pings.
>>>>>
>>>>> Note: checkpatch hates the formatting of the command table.  I know about this,
>>>>> and ensured that there are no errors in the rest of the code.  The current
>>>>> formatting is IMHO quite handy, and roughly based on the files available
>>>>> from the SCSI standard body.
>>>>>
>>>>> Ok for the next merge window?
>>>>>
>>>>> Paolo
>>>>>
>>>>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>>>>         for details).  Added patch 14 and added a few more scanner
>>>>>         commands based on SANE (scanners are not whitelisted by default,
>>>>>         also were not in v1, but this makes it possible to opt into the
>>>>>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>>>>>         large #if 0'd list of commands that the kernel does not pass
>>>>>         though.  Marked blk_set_cmd_filter_defaults as __init.
>>>>
>>>> Ping...
>>>>
>>>> Jens/James, is anyone going to pick this up for 3.9?
>>>
>>> Another month has passed, Ping^2...
>>
>> Ping^3, any hope for 3.10?
>>
>> Paolo
>>
>>>
>>> Paolo
>>>
>>>> Paolo
>>>>
>>>>>
>>>>> Paolo Bonzini (14):
>>>>>   sg_io: pass request_queue to blk_verify_command
>>>>>   sg_io: reorganize list of allowed commands
>>>>>   sg_io: use different default filters for each device class
>>>>>   sg_io: resolve conflicts between commands assigned to multiple
>>>>>     classes (CVE-2012-4542)
>>>>>   sg_io: whitelist a few more commands for rare & obsolete device types
>>>>>   sg_io: whitelist another command for multimedia devices
>>>>>   sg_io: whitelist a few more commands for media changers
>>>>>   sg_io: whitelist a few more commands for tapes
>>>>>   sg_io: whitelist a few more commands for disks
>>>>>   sg_io: whitelist a few obsolete commands
>>>>>   sg_io: mark blk_set_cmd_filter_defaults as __init
>>>>>   sg_io: remove remnants of sysfs SG_IO filters
>>>>>   sg_io: introduce unpriv_sgio queue flag
>>>>>   sg_io: use unpriv_sgio to disable whitelisting for scanners
>>>>>
>>>>>  Documentation/block/queue-sysfs.txt |    8 +
>>>>>  block/blk-sysfs.c                   |   33 +++
>>>>>  block/bsg.c                         |    2 +-
>>>>>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>>>>  drivers/scsi/scsi_scan.c            |   14 ++-
>>>>>  drivers/scsi/sg.c                   |    6 +-
>>>>>  include/linux/blkdev.h              |    8 +-
>>>>>  include/linux/genhd.h               |    9 -
>>>>>  include/scsi/scsi.h                 |    3 +
>>>>>  9 files changed, 344 insertions(+), 108 deletions(-)
>>>>>
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^6 Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)
  2013-02-20 16:12 ` Paolo Bonzini
  2013-03-22 22:30   ` PING^2 " Paolo Bonzini
@ 2013-05-06 20:43   ` Paolo Bonzini
  1 sibling, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-06 20:43 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo, linux-scsi; +Cc: James E.J. Bottomley, Jens Axboe

Il 20/02/2013 17:12, Paolo Bonzini ha scritto:
> Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
>> This series regards the whitelist that is used for the SG_IO ioctl.  This
>> whitelist has three problems:
>>
>> * the bitmap of allowed commands is designed for MMC devices (roughly,
>>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>>   device classes and have different meanings for different classes.
>>
>> * also because the bitmap of allowed commands is designed for MMC devices
>>   only, some commands are missing even though they are generally useful and
>>   not insecure.  At least not more insecure than anything else you can
>>   do if you have access to /dev/sdX or /dev/stX nodes.
>>
>> * the whitelist can be disabled per-process but not per-disk.  In addition,
>>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>>   resources, enough to make it insecure.
>>
>> The series corrects these problems.  Patches 1-4 solve the first problem,
>> which also has an assigned CVE, by using different bitmaps for the various
>> device classes.  Patches 5-11 solve the second by adding more commands
>> to the bitmaps.  Patches 12 and 13 solve the third, and were already
>> posted but ignored by the maintainers despite multiple pings.
>>
>> Note: checkpatch hates the formatting of the command table.  I know about this,
>> and ensured that there are no errors in the rest of the code.  The current
>> formatting is IMHO quite handy, and roughly based on the files available
>> from the SCSI standard body.
>>
>> Ok for the next merge window?
>>
>> Paolo
>>
>> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>>         for details).  Added patch 14 and added a few more scanner
>>         commands based on SANE (scanners are not whitelisted by default,
>>         also were not in v1, but this makes it possible to opt into the
>>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>>         large #if 0'd list of commands that the kernel does not pass
>>         though.  Marked blk_set_cmd_filter_defaults as __init.
> 
> Ping...
> 
> Jens/James, is anyone going to pick this up for 3.9?
> 
> Paolo
> 
>>
>> Paolo Bonzini (14):
>>   sg_io: pass request_queue to blk_verify_command
>>   sg_io: reorganize list of allowed commands
>>   sg_io: use different default filters for each device class
>>   sg_io: resolve conflicts between commands assigned to multiple
>>     classes (CVE-2012-4542)
>>   sg_io: whitelist a few more commands for rare & obsolete device types
>>   sg_io: whitelist another command for multimedia devices
>>   sg_io: whitelist a few more commands for media changers
>>   sg_io: whitelist a few more commands for tapes
>>   sg_io: whitelist a few more commands for disks
>>   sg_io: whitelist a few obsolete commands
>>   sg_io: mark blk_set_cmd_filter_defaults as __init
>>   sg_io: remove remnants of sysfs SG_IO filters
>>   sg_io: introduce unpriv_sgio queue flag
>>   sg_io: use unpriv_sgio to disable whitelisting for scanners
>>
>>  Documentation/block/queue-sysfs.txt |    8 +
>>  block/blk-sysfs.c                   |   33 +++
>>  block/bsg.c                         |    2 +-
>>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>>  drivers/scsi/scsi_scan.c            |   14 ++-
>>  drivers/scsi/sg.c                   |    6 +-
>>  include/linux/blkdev.h              |    8 +-
>>  include/linux/genhd.h               |    9 -
>>  include/scsi/scsi.h                 |    3 +
>>  9 files changed, 344 insertions(+), 108 deletions(-)
>>
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
                   ` (15 preceding siblings ...)
  2013-02-20 16:12 ` Paolo Bonzini
@ 2013-05-22  6:35 ` Paolo Bonzini
  2013-05-22  9:32   ` Tejun Heo
  16 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22  6:35 UTC (permalink / raw)
  To: Tejun Heo, James E.J. Bottomley, Jens Axboe; +Cc: linux-kernel, linux-scsi

I'm not sure what is more ridiculous, whether the seven pings or the
lack of review...

Paolo

Il 06/02/2013 16:15, Paolo Bonzini ha scritto:
> This series regards the whitelist that is used for the SG_IO ioctl.  This
> whitelist has three problems:
> 
> * the bitmap of allowed commands is designed for MMC devices (roughly,
>   "play/burn CDs without requiring root") but some opcodes overlap across SCSI
>   device classes and have different meanings for different classes.
> 
> * also because the bitmap of allowed commands is designed for MMC devices
>   only, some commands are missing even though they are generally useful and
>   not insecure.  At least not more insecure than anything else you can
>   do if you have access to /dev/sdX or /dev/stX nodes.
> 
> * the whitelist can be disabled per-process but not per-disk.  In addition,
>   the required capability (CAP_SYS_RAWIO) gives access to a range of other 
>   resources, enough to make it insecure.
> 
> The series corrects these problems.  Patches 1-4 solve the first problem,
> which also has an assigned CVE, by using different bitmaps for the various
> device classes.  Patches 5-11 solve the second by adding more commands
> to the bitmaps.  Patches 12 and 13 solve the third, and were already
> posted but ignored by the maintainers despite multiple pings.
> 
> Note: checkpatch hates the formatting of the command table.  I know about this,
> and ensured that there are no errors in the rest of the code.  The current
> formatting is IMHO quite handy, and roughly based on the files available
> from the SCSI standard body.
> 
> Ok for the next merge window?
> 
> Paolo
> 
> v1->v2: remove 2 MMC commands and 6 SBC commands (see patches 6 and 9
>         for details).  Added patch 14 and added a few more scanner
>         commands based on SANE (scanners are not whitelisted by default,
>         also were not in v1, but this makes it possible to opt into the
>         whitelist out of paranoia).  Removed C++ comments.  Removed the
>         large #if 0'd list of commands that the kernel does not pass
>         though.  Marked blk_set_cmd_filter_defaults as __init.
> 
> 
> Paolo Bonzini (14):
>   sg_io: pass request_queue to blk_verify_command
>   sg_io: reorganize list of allowed commands
>   sg_io: use different default filters for each device class
>   sg_io: resolve conflicts between commands assigned to multiple
>     classes (CVE-2012-4542)
>   sg_io: whitelist a few more commands for rare & obsolete device types
>   sg_io: whitelist another command for multimedia devices
>   sg_io: whitelist a few more commands for media changers
>   sg_io: whitelist a few more commands for tapes
>   sg_io: whitelist a few more commands for disks
>   sg_io: whitelist a few obsolete commands
>   sg_io: mark blk_set_cmd_filter_defaults as __init
>   sg_io: remove remnants of sysfs SG_IO filters
>   sg_io: introduce unpriv_sgio queue flag
>   sg_io: use unpriv_sgio to disable whitelisting for scanners
> 
>  Documentation/block/queue-sysfs.txt |    8 +
>  block/blk-sysfs.c                   |   33 +++
>  block/bsg.c                         |    2 +-
>  block/scsi_ioctl.c                  |  369 ++++++++++++++++++++++++++---------
>  drivers/scsi/scsi_scan.c            |   14 ++-
>  drivers/scsi/sg.c                   |    6 +-
>  include/linux/blkdev.h              |    8 +-
>  include/linux/genhd.h               |    9 -
>  include/scsi/scsi.h                 |    3 +
>  9 files changed, 344 insertions(+), 108 deletions(-)
> 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22  6:35 ` PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)) Paolo Bonzini
@ 2013-05-22  9:32   ` Tejun Heo
  2013-05-22  9:53     ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22  9:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 08:35:54AM +0200, Paolo Bonzini wrote:
> I'm not sure what is more ridiculous, whether the seven pings or the
> lack of review...

So, ummm, I don't know what Jens is thinking but at this point I'm
basically waiting for someone else to pick it up as review to return
ratio is too low to continue.  It doesn't seem like I can get the
series into a shape I can ack with reasonable amount of effort.

My memory is kinda hazy now but here are two review points that came
to my mind before giving up.

* The response that I got after asking for justification basically
  boiled down to "it has to".  Whatever that means.

* In the patch series, fixes and feature changes are still mixed in
  order.  I gave up after this.

Thanks and good luck.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22  9:32   ` Tejun Heo
@ 2013-05-22  9:53     ` Paolo Bonzini
  2013-05-22 10:02       ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22  9:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 11:32, Tejun Heo ha scritto:
> On Wed, May 22, 2013 at 08:35:54AM +0200, Paolo Bonzini wrote:
>> I'm not sure what is more ridiculous, whether the seven pings or the
>> lack of review...
> 
> So, ummm, I don't know what Jens is thinking but at this point I'm
> basically waiting for someone else to pick it up as review to return
> ratio is too low to continue.  It doesn't seem like I can get the
> series into a shape I can ack with reasonable amount of effort.

Then please say so.  I didn't find any comment in your review that I missed.

> My memory is kinda hazy now but here are two review points that came
> to my mind before giving up.
> 
> * The response that I got after asking for justification basically
>   boiled down to "it has to".  Whatever that means.

For patches 1-4, it means that you're allowed to write to media when a
file is opened for reading.  The patches fix this.

For patches 5-12, it means that you currently need root-equivalent
privileges (CAP_SYS_RAWIO) to do "regular business" on any SCSI device
that is not a CD-ROM or a tape or a disk.

For patches 13-14, it means that you currently need root-equivalent
privileges (CAP_SYS_RAWIO) to do operations on SCSI devices that require
some level of trust, hence there is no way to confine this to a single
device.

But all this is in the cover letter, I'm just paraphrasing.

> * In the patch series, fixes and feature changes are still mixed in
>   order.  I gave up after this.

Bugfixes are in patch 1-4.  The patches first introduce the new table
format without any semantic change, then they introduce per-class
filters while still leaving the conflicting commands accessible with
O_RDONLY, and finally fix the bug.  If you have any better ideas, please
tell me.  I did try to optimize for reviewability and bisectability, if
I screwed up I'd like to hear why.

Whitelisting of extra commands is in patch 5-10.  Additional related
changes are in patches 11-14.

Again, all this is in the cover letter.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22  9:53     ` Paolo Bonzini
@ 2013-05-22 10:02       ` Tejun Heo
  2013-05-22 10:23         ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 10:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 11:53:30AM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 11:32, Tejun Heo ha scritto:
> > On Wed, May 22, 2013 at 08:35:54AM +0200, Paolo Bonzini wrote:
> >> I'm not sure what is more ridiculous, whether the seven pings or the
> >> lack of review...
> > 
> > So, ummm, I don't know what Jens is thinking but at this point I'm
> > basically waiting for someone else to pick it up as review to return
> > ratio is too low to continue.  It doesn't seem like I can get the
> > series into a shape I can ack with reasonable amount of effort.
> 
> Then please say so.  I didn't find any comment in your review that I missed.

Well, I've tried that multiple times and didn't get the results that I
was expecting each time, so doing it all over again felt pointless.
Even now, you just repeat what you've been saying and I'd have to
fight through each and every point.  It just doesn't feel worth the
effort.  It'd be far less effort to just slurp the patches and
regurgitate them myself.  I don't care that much about the changes
right now, so I'm just waiting for either someone else picking it up
or my yield with you somehow magically improving and the next refresh
addressing most of the issues.

So, I suppose this is bye bye at least on this thread.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 10:02       ` Tejun Heo
@ 2013-05-22 10:23         ` Paolo Bonzini
  2013-05-22 12:07           ` James Bottomley
  2013-05-22 13:41           ` Tejun Heo
  0 siblings, 2 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 10:23 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 12:02, Tejun Heo ha scritto:
> On Wed, May 22, 2013 at 11:53:30AM +0200, Paolo Bonzini wrote:
>> Il 22/05/2013 11:32, Tejun Heo ha scritto:
>>> On Wed, May 22, 2013 at 08:35:54AM +0200, Paolo Bonzini wrote:
>>>> I'm not sure what is more ridiculous, whether the seven pings or the
>>>> lack of review...
>>>
>>> So, ummm, I don't know what Jens is thinking but at this point I'm
>>> basically waiting for someone else to pick it up as review to return
>>> ratio is too low to continue.  It doesn't seem like I can get the
>>> series into a shape I can ack with reasonable amount of effort.
>>
>> Then please say so.  I didn't find any comment in your review that I missed.
> 
> Well, I've tried that multiple times and didn't get the results that I
> was expecting each time, so doing it all over again felt pointless.
> Even now, you just repeat what you've been saying and I'd have to
> fight through each and every point.

Yes, because I have no idea what _your_ point is.

Let's look at the first submission.

Patch 1 - acked by you.

Patch 2 - discussions on the formatting.  Every comment of yours has
been accounted for, except for one.  I wrote:

     > If you want opcodes visible, you can make them the comments, right?

     Yes, like "/* 0x00 */ CONSTANT, MASK".  I still have a slight
preference
     for the opcodes because if the constant ends up wrong, the
     head-scratching would be higher than if the opcode is wrong (the opcode
     is what you see in the dumps).

You didn't answer; v2 was posted 15 days after the end of the v1 thread,
so you had enough time to post more comments and have them addressed in
v2.  Yo me that means "fair enough".

Patch 3-5 - no comment.

Patch 6 - long discussion, ending with "The vast majority of the
          commands are added because Linux itself is using them", and
          with me removing some commands from the list according to
          your request.

Patch 7-13 - no comment.

Cover letter has no comment either.


So you haven't commented on most patches or on the cover letter, and now
you ask about clarification generically rather than about specific
points of the commit messages or the cover letter.

There is only sensible conclusion I can make.  Namely, that you haven't
even read those commit messages.  So, I'm sorry if you did, but I don't
have a crystal ball to understand what you found wrong, and there's
nothing I can do about it.

> It just doesn't feel worth the
> effort.  It'd be far less effort to just slurp the patches and
> regurgitate them myself.

If there is a fundamental misunderstanding, that wouldn't help anyway.
We would have the same discussion in reverse when I review your patches.

> I don't care that much about the changes
> right now, so I'm just waiting for either someone else picking it up
> or my yield with you somehow magically improving and the next refresh
> addressing most of the issues.

Well, so far there was just one pass, and not even a full one.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 10:23         ` Paolo Bonzini
@ 2013-05-22 12:07           ` James Bottomley
  2013-05-22 14:07             ` Paolo Bonzini
  2013-05-22 13:41           ` Tejun Heo
  1 sibling, 1 reply; 73+ messages in thread
From: James Bottomley @ 2013-05-22 12:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Tejun Heo, Jens Axboe, linux-kernel, linux-scsi

On Wed, 2013-05-22 at 12:23 +0200, Paolo Bonzini wrote:
> Il 22/05/2013 12:02, Tejun Heo ha scritto:
> > On Wed, May 22, 2013 at 11:53:30AM +0200, Paolo Bonzini wrote:
> >> Il 22/05/2013 11:32, Tejun Heo ha scritto:
> >>> On Wed, May 22, 2013 at 08:35:54AM +0200, Paolo Bonzini wrote:
> >>>> I'm not sure what is more ridiculous, whether the seven pings or the
> >>>> lack of review...
> >>>
> >>> So, ummm, I don't know what Jens is thinking but at this point I'm
> >>> basically waiting for someone else to pick it up as review to return
> >>> ratio is too low to continue.  It doesn't seem like I can get the
> >>> series into a shape I can ack with reasonable amount of effort.
> >>
> >> Then please say so.  I didn't find any comment in your review that I missed.
> > 
> > Well, I've tried that multiple times and didn't get the results that I
> > was expecting each time, so doing it all over again felt pointless.
> > Even now, you just repeat what you've been saying and I'd have to
> > fight through each and every point.
> 
> Yes, because I have no idea what _your_ point is.

OK, let me try.  I did draw straws with Jens at LSF to see who would
look at this and he lost, but the complexity of the patch set probably
makes it hard for him to find the time.

The first problem, which Tejun already pointed out is that you've
combined a "bug fix" with a large feature set in such a way that they
can't be separated, so saying the first four patches fix a bug isn't
helpful.

The second problem is that it's not at all clear what the bug actually
is.  You have to wade through tons of red hat bugzillas before you come
up with the fact that there's one command which we allow users to send
which is ambiguous: GPCMD_READ_SUBCHANNEL which has the same opcode as
UNMAP on a disk.  Once you finally work this out, you wonder what all
the fuss is about because UNMAP is advisory ... even if an unprivileged
user can now send the command, it can't be used to damage any data or
even get access to any data, so there doesn't seem to be an actual bug
to fix at all.

The various committees do try hard to ensure there's no opcode
overlap ... although they don't always succeed as you see with the UNMAP
above, so I'm not at all sure we need the huge complexity of per scsi
device type command filter lists, which is what the rest of the feature
additions is about.

The third problem is that in order to verify that all the code motion
doesn't actually introduce a bug, you have to wade through about seven
patches ... the patch split really isn't at all conducive to reviewing
this critical piece.

Finally, the patch for the feature I think you actually want, which is
13/14, could have been implemented fairly simply as a single patch and
doesn't have to be part of this series.

James



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 10:23         ` Paolo Bonzini
  2013-05-22 12:07           ` James Bottomley
@ 2013-05-22 13:41           ` Tejun Heo
  2013-05-22 14:12             ` Paolo Bonzini
  1 sibling, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 13:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 12:23:56PM +0200, Paolo Bonzini wrote:
> Yes, because I have no idea what _your_ point is.

Isolate the actual fixes and just submit them as it seems impossible
for you to provide proper justifications for the things you want to
add.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 12:07           ` James Bottomley
@ 2013-05-22 14:07             ` Paolo Bonzini
  2013-05-22 16:31               ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 14:07 UTC (permalink / raw)
  To: James Bottomley; +Cc: Tejun Heo, Jens Axboe, linux-kernel, linux-scsi


> OK, let me try.  I did draw straws with Jens at LSF to see who would
> look at this and he lost, but the complexity of the patch set probably
> makes it hard for him to find the time.

Thanks.

> The first problem, which Tejun already pointed out is that you've
> combined a "bug fix" with a large feature set in such a way that they
> can't be separated, so saying the first four patches fix a bug isn't
> helpful.

Fair enough.  Though to be precise, this is a combined patchset also
because previous attempts to get similar patches reviewed proceeded with
glacial pace.

For example, the unpriv_sgio patch had been sent in November 2012
(https://patchwork.kernel.org/patch/1735871/).  It got two acks from
Tejun, and was never applied despite two pings.

My attempt to include UNMAP and WRITE SAME into the whitelist (even before
I found out by chance the overlap) dates back to July 2012, and was only
reviewed two months after.

> The second problem is that it's not at all clear what the bug actually
> is.  You have to wade through tons of red hat bugzillas before you come
> up with the fact that there's one command which we allow users to send
> which is ambiguous: GPCMD_READ_SUBCHANNEL which has the same opcode as
> UNMAP on a disk.

No, you don't need to.  Just read the commit message of patch 4.

> Once you finally work this out, you wonder what all
> the fuss is about because UNMAP is advisory ... even if an unprivileged
> user can now send the command, it can't be used to damage any data or
> even get access to any data, so there doesn't seem to be an actual bug
> to fix at all.

This doesn't look advisory to me:

$ sudo chmod 444 /dev/sdb
$ sha1sum /dev/sdb
e15720aa0d26856f0486867634b6737c30ea7346  /dev/sdb
$ echo -n $'%\x16%\x10%%%%%%%%%%%%%%\x40%%%%%' | tr % \\0 | \
     sg_raw  --readonly /dev/sdb -s24 42 00 00 00 00 00 00 00 18 00
$ sha1sum /dev/sdb
aab335ccc669bbbc85d727870b5941dc3581f025  /dev/sdb

It doesn't look readonly, either.

> The various committees do try hard to ensure there's no opcode
> overlap ... although they don't always succeed as you see with the UNMAP
> above, so I'm not at all sure we need the huge complexity of per scsi
> device type command filter lists, which is what the rest of the feature
> additions is about.

That was introduced because Tejun didn't want to enable more than the bare
minimum for MMC devices.  He was worried about hypothetical scenarios with
vendors recycling opcodes in a way that is not suitable for exposing to
unprivileged userland (he reinforced this, for example, here:
http://article.gmane.org/gmane.linux.kernel/1429863).

> The third problem is that in order to verify that all the code motion
> doesn't actually introduce a bug, you have to wade through about seven
> patches ... the patch split really isn't at all conducive to reviewing
> this critical piece.

Not true.  Each patch can be reviewed independently.  The big and hard-to-
review movement is all in patch 2.

The next patches can be tedious to review, but that does not meen hard.  Just
do "grep ^[-+].*sgio_bitmap_set patchNN | sort -k2 -k1" for example

> Finally, the patch for the feature I think you actually want, which is
> 13/14, could have been implemented fairly simply as a single patch and
> doesn't have to be part of this series.

It was, and it was ignored.  I sent it together because of the common
dependency on the first patch.

However, it is not the only feature I need; that patch should be just for things
like reservations or vendor-specific commands.  I also need that SG_IO works
well without any privileges, neither CAP_SYS_RAWIO (needed for a process to
bypass the whitelist) neither CAP_SYS_ADMIN (needed for a process to disable
the whitelist for others as in patch 13/14).  I need that at least for disks,
tapes and media changers.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 13:41           ` Tejun Heo
@ 2013-05-22 14:12             ` Paolo Bonzini
  2013-05-22 14:30               ` Tejun Heo
  2013-05-22 15:03               ` Theodore Ts'o
  0 siblings, 2 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 14:12 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 15:41, Tejun Heo ha scritto:
> On Wed, May 22, 2013 at 12:23:56PM +0200, Paolo Bonzini wrote:
>> Yes, because I have no idea what _your_ point is.
> 
> Isolate the actual fixes and just submit them as it seems impossible
> for you to provide proper justifications for the things you want to
> add.

Quoting myself on January 26, 2013: "The vast majority of the commands
are added because Linux itself is using them".

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 14:12             ` Paolo Bonzini
@ 2013-05-22 14:30               ` Tejun Heo
  2013-05-22 15:00                 ` Paolo Bonzini
  2013-05-22 15:03               ` Theodore Ts'o
  1 sibling, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 14:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 04:12:04PM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 15:41, Tejun Heo ha scritto:
> > On Wed, May 22, 2013 at 12:23:56PM +0200, Paolo Bonzini wrote:
> >> Yes, because I have no idea what _your_ point is.
> > 
> > Isolate the actual fixes and just submit them as it seems impossible
> > for you to provide proper justifications for the things you want to
> > add.
> 
> Quoting myself on January 26, 2013: "The vast majority of the commands
> are added because Linux itself is using them".

See, this is exactly what I've been talking about.  Reviewing or
raising points is almost useless.  Gees, why did I start this again?
Why the hell aren't my points clear yet after so many exchanges on the
exact same frigging subject?  Stop repeting yourself and try to
understand the review points for once.

* Separate fixes from additions.  Transform existing code so that the
  visible behavior doesn't change but the required fix can be
  implemented on top.  Explicitly note what's going on in the commit
  messages.

* Fix the frigging CVE bug that you've been waving around and do
  *just* that.

* Add the frigging "count me out" feature that you want for your use
  case.  It isn't controversial and is what you need and the
  maintainer can apply to the point where [s]he thinks acceptable.

* If for whatever reason you have to add more command codes to the
  exception table, do them with explicit justifications.  How the hell
  "the vast majority of the commands are added because Linux itself is
  using them" a proper justification?  How are they used for what
  reason and why is adding them beneficial?  How many times have I
  asked you to give at least some useful use cases?  And WTF is "vast
  majority", what about others then?  Why do you need this at all if
  you have the "count me out" knob in the first place?  You first
  built that command list by scanning the spec and just adding the
  commands that seemed "right" to you.  I have near-zero confidence in
  your perception of the relationship between the specs and actual
  world.

So, stop quoting and repeating yourself.  You're overdoing yourself on
that department already.  Try to listen and understand for a change.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 14:30               ` Tejun Heo
@ 2013-05-22 15:00                 ` Paolo Bonzini
  2013-05-22 19:30                   ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 15:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 16:30, Tejun Heo ha scritto:
> * Separate fixes from additions.  Transform existing code so that the
>   visible behavior doesn't change but the required fix can be
>   implemented on top.  Explicitly note what's going on in the commit
>   messages.

Been there, done that.  Have you read the commit messages *at all*?

Patch 2: "Right now, there is still just one bitmap and the mask is
ignored, so there is no semantic change yet".

Patch 3: "This patch already introduces some semantic change, albeit
very limited; [...] a few commands are now forbidden for devices of type
other than TYPE_ROM, where they are reserved or vendor-specific".

> * Fix the frigging CVE bug that you've been waving around and do
>   *just* that.

Perhaps you've missed that this is v2, not v27.  Tell me a place in the
original review where you've told me to split the series.

> * Add the frigging "count me out" feature that you want for your use
>   case.  It isn't controversial and is what you need and the
>   maintainer can apply to the point where [s]he thinks acceptable.

Sure.  Except I had sent it six months ago, and it lied unreviewed
despite your acks for two months.  It is in the archives waiting to be
picked up.

> * If for whatever reason you have to add more command codes to the
>   exception table, do them with explicit justifications.  How the hell
>   "the vast majority of the commands are added because Linux itself is
>   using them" a proper justification?   How are they used for what
>   reason and why is adding them beneficial?

For example, WRITE SAME is used to discard blocks.  If a Linux guest
wants to discard blocks, it may send WRITE SAME.  If a disk advertises
support for WRITE SAME, it is not nice if WRITE SAME then fails because
of a stupid whitelist that was designed for CD-ROMs.

And another disk instead works because it uses UNMAP instead of WRITE
SAME.  It's a support nightmare, perhaps the cause is obvious to me by
the time it reaches me, but that takes time---which is wasted time for
everyone involved.

>   How many times have I
>   asked you to give at least some useful use cases?  And WTF is "vast
>   majority", what about others then?

The others are not in v2, or are sent by udev, or were added to the
standard for a reason and applications are using them (e.g. COMPARE AND
WRITE).

>   Why do you need this at all if
>   you have the "count me out" knob in the first place?

Because the "count me out" knob needs privileges.  It opens up way, way
more than what these patches do.

And the frigging patch to make the whole whitelisting
userspace-configurable with finer-grain (but still with appropriate
capabilities) was nacked.  By you, for God's sake.

http://permalink.gmane.org/gmane.linux.kernel/1378071

>   You first
>   built that command list by scanning the spec and just adding the
>   commands that seemed "right" to you.

And then in v2 I stated that I removed some disk commands because of
discussion with you.  But of course you don't know, because you've not
read the damn commit messages.

>   I have near-zero confidence in
>   your perception of the relationship between the specs and actual
>   world.

Thankyouverymuch.  Perhaps you should have read the commit messages (oh
sorry, have I said it already?) and seen that it's not about commands
that seemed "right":

   Only commands that affect the medium are added.
   Commands that affect other state of the LUN are all privileged, with
   the sole exception of START STOP UNIT (which has always been
   allowed for all file descriptors.  I do not really agree with that
   and it's probably an artifact of when /dev/cdrom had
   r--r--r-- permissions, but I'm not trying to change that.

> So, stop quoting and repeating yourself.  You're overdoing yourself on
> that department already.  Try to listen and understand for a change.

Guy, calm down.  We're two.

Paolo


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 14:12             ` Paolo Bonzini
  2013-05-22 14:30               ` Tejun Heo
@ 2013-05-22 15:03               ` Theodore Ts'o
  2013-05-22 15:53                 ` Paolo Bonzini
  1 sibling, 1 reply; 73+ messages in thread
From: Theodore Ts'o @ 2013-05-22 15:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Tejun Heo, James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Paolo,

I'll probably regret butting my head into this, but it might be
helpful if you talk about your particular use case which is driving
your desire to make these changes.  For example, what do you think the
SG_IO whitelist _should_ be used, and why should it be made more
general?  What's the use case that is being impaired by the current
state of how sg_io whitelists are being handled?

Secondly, when you are trying to get a security vulnerability fixed,
it's helpful if you give the precise nature of the problem, and what
the an attacker can do with it.  I think you are worried that if an
attacker has read-only access, they can still send the UNMAP command
which may (since it is advisory) result in a block no longer
containing valid data, such that a read will return zero's or some
other undefined garbage.   Yes?

Now consider that if this is a high-priority fix, it's important to
make the patch as small as possible, since distributions (like your
employer) may want to backport the patch to older kernels.  And
distribution release engineers will appreciate things if the patch is
as small as possible, making the _minimum_ necessary changes to fix
said security exposure.  Generally, a series of 14 patches is __not__
the minimum necessary patch.

Finally, please consider that your attitude is not going to win
friends and influence people.  I don't know if the capability to work
well with upstream developers (people which ***other*** Red Hat
engineers have had no problems work with, and which I can attest,
through personal experience, are very reasonable engineers who are
easy to work with), is something which is a part of your performance
review process.  But if it isn't, it probably should be, since the
ability to listen to review feedback is going to be important for your
long term career prospects, no matter whether it is with the Linux
kernel, some other open source project, or even a proprietary softare
project.

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 15:03               ` Theodore Ts'o
@ 2013-05-22 15:53                 ` Paolo Bonzini
  2013-05-22 16:32                   ` Martin K. Petersen
  2013-05-22 20:39                   ` Tejun Heo
  0 siblings, 2 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 15:53 UTC (permalink / raw)
  To: Theodore Ts'o, Tejun Heo, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

Il 22/05/2013 17:03, Theodore Ts'o ha scritto:
> Paolo,
> 
> I'll probably regret butting my head into this, but it might be
> helpful if you talk about your particular use case which is driving
> your desire to make these changes.

Ted,

thank you very much.  I understand that my discussion with Tejun is
leading nowhere, and any outside help can only improve the situation.  I
hope you won't regret it. :)

> For example, what do you think the
> SG_IO whitelist _should_ be used, and why should it be made more
> general?  What's the use case that is being impaired by the current
> state of how sg_io whitelists are being handled?

Thanks for writing the questions down.  I hope you don't mind the
verbosity; perhaps we can use the opportunity to rewind the discussion.

First of all, I'll note that SG_IO and block-device-specific ioctls both
have their place.  My usecase for SG_IO is virtualization, where I need
to pass information from the LUN to the virtual machine with as much
fidelity as possible if I choose to virtualize at the SCSI level.  In
that case, I'll use SG_IO.  If people are okay with virtualizing at a
higher-level, I'll use ioctls(BLK*) or fallocate (it depends on whether
my target is a block device or a file).  It depends on the application,
the user, the context,...  But in general, a program that cares about
things like sense data must use SG_IO, a program that only cares about
high-level concepts will use the ioctl.

Now, SG_IO whitelist started so that you could "burn CDs without being
root".  But that can be extended to other device types; the SG_IO
whitelist provides a way to allow low-level operations on devices while
ensuring: a) respect of file permissions and no need for special
privileges; b) no disruption of the devices or the storage fabric (for
disks, no disruption beyond what would be possible anyway with
read/write or other ioctls).

There are some non-disk devices that (like CD-ROMs) have their own set
of commands and can only be accessed via /dev/sg, for example media
changers.  Tapes also have some functionality that is not accessible via
/dev/st.  It would be useful (for virt, but I'm sure there are other
usecases) to make the common uses of these devices possible without
privileges, just by granting the appropriate permissions to the uids who
will operate on them.  This is why the SG_IO whitelist should be widened
for these devices.

But even for block devices, it should be made more general because the
limitations in the current list are not really justified, except as an
artifact of how the whitelist developed (again, "burn CDs without being
root").  I think that there's no reason to forbid unprivileged programs
from doing with SG_IO what they could do with other ioctls.  By
extension, if it makes sense to add an unprivileged ioctl in the future
(e.g. for atomic compare and write) it should also be available from now
via SG_IO.

> Secondly, when you are trying to get a security vulnerability fixed,
> it's helpful if you give the precise nature of the problem, and what
> the an attacker can do with it.  I think you are worried that if an
> attacker has read-only access, they can still send the UNMAP command
> which may (since it is advisory) result in a block no longer
> containing valid data, such that a read will return zero's or some
> other undefined garbage.   Yes?

Yes.

> Now consider that if this is a high-priority fix, it's important to
> make the patch as small as possible, since distributions (like your
> employer) may want to backport the patch to older kernels.  And
> distribution release engineers will appreciate things if the patch is
> as small as possible, making the _minimum_ necessary changes to fix
> said security exposure.  Generally, a series of 14 patches is __not__
> the minimum necessary patch.

It is not a high-priority fix, or Red Hat would have agreed on an
embargo date with other distros, and done all the security stuff that
they routinely do.  Still, it is a fix that I would like to get in, if
only because Red Hat's policy is to get things upstream as much as possible.

In fact, I would have very much preferred to get things upstream first.
 Unfortunately, this was not really possible in this case; FWIW, the
RHEL kernel already has something very similar to the first 4 patches in
v1 of this series.

The reason for posting it all together, it's that frankly getting
patches into block/ is an absolute pain.  I know it's not a good reason,
and I have certainly compounded my own mistakes.  But guys, you have a
review problem if things are allowed to sit in the mailing list for two
months without a single reply.

It's really the first time (and I've been working on free software for a
_long_ time, as both volunteer and employee, maintainer and contributor)
that I felt the helpless because I was not in the frequent contributors
"clique".  I know that's just an impression and doesn't match reality,
but emotions do count and they make it really hard for one to keep his
temper.  I deleted many f-words from my replies (I should have deleted
more).

In fact, the (low) level of this discussion is also a first for me.
Maybe I didn't know myself well enough, because even I wouldn't have
expected it.  I'm likely more surprised of this than anyone else.

> Finally, please consider that your attitude is not going to win
> friends and influence people.

I do listen to review feedback, but I also expect the other side to
listen to me, ask me what is not clear, and possess some knowledge of
the domain that he's reviewing patches for.  All of which, quite
frankly, I have not seen in this case.

> I don't know if the capability to work
> well with upstream developers (people which ***other*** Red Hat
> engineers have had no problems work with, and which I can attest,
> through personal experience, are very reasonable engineers who are
> easy to work with), is something which is a part of your performance
> review process.  But if it isn't, it probably should be,

It is, as it should be.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 14:07             ` Paolo Bonzini
@ 2013-05-22 16:31               ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 16:31 UTC (permalink / raw)
  Cc: James Bottomley, Tejun Heo, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 16:07, Paolo Bonzini ha scritto:
>> Finally, the patch for the feature I think you actually want, which is
>> 13/14, could have been implemented fairly simply as a single patch and
>> doesn't have to be part of this series.
> 
> It was, and it was ignored.  I sent it together because of the common
> dependency on the first patch.
> 
> However, it is not the only feature I need; that patch should be just for things
> like reservations or vendor-specific commands.  I also need that SG_IO works
> well without any privileges, neither CAP_SYS_RAWIO (needed for a process to
> bypass the whitelist) neither CAP_SYS_ADMIN (needed for a process to disable
> the whitelist for others as in patch 13/14).  I need that at least for disks,
> tapes and media changers.

In fact,  I'd much rather go back to userspace-configurable filters
(http://thread.gmane.org/gmane.linux.scsi/77783/focus=1378071); then
policy can be implemented entirely in userspace based on INQUIRY data.
There was a patch here for the bitmaps:
http://permalink.gmane.org/gmane.linux.kernel/1378071

IIRC turning it into a full implementation requires exposing the queue
parameters for all SCSI devices in sysfs, even for those devices that
are not visible as block devices.  (Again IIRC) non-block devices such
as tapes do not have a /sys/bus/scsi/devices/h:c:i:l/block directory,
hence they also have no block/queue to export the whitelist knobs.

You can then add a kernel config knob that would enable only the bare
minimum set of commands (basically INQUIRY, REPORT LUNS, TEST UNIT
READY; maybe also some of these: REQUEST SENSE, START STOP UNIT, MODE
SENSE, LOG SENSE).  If people know that udev (or something else) is new
enough to know how to initialize the required bitmaps, they can enable
the knob.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 15:53                 ` Paolo Bonzini
@ 2013-05-22 16:32                   ` Martin K. Petersen
  2013-05-22 17:00                     ` Paolo Bonzini
  2013-05-25  3:54                     ` Vladislav Bolkhovitin
  2013-05-22 20:39                   ` Tejun Heo
  1 sibling, 2 replies; 73+ messages in thread
From: Martin K. Petersen @ 2013-05-22 16:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Theodore Ts'o, Tejun Heo, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

>>>>> "Paolo" == Paolo Bonzini <pbonzini@redhat.com> writes:

Paolo> First of all, I'll note that SG_IO and block-device-specific
Paolo> ioctls both have their place.  My usecase for SG_IO is
Paolo> virtualization, where I need to pass information from the LUN to
Paolo> the virtual machine with as much fidelity as possible if I choose
Paolo> to virtualize at the SCSI level.  

Now there's your problem! Several people told you way back that the SCSI
virt approach was a really poor choice. The SG_IO permissions problem is
a classic "Doctor, it hurts when I do this".

The kernel's fundamental task is to provide abstraction between
applications and intricacies of hardware. The right way to solve the
problem would have been to provide a better device abstraction built on
top of the block/SCSI infrastructure we already have in place. If you
need more fidelity, add fidelity to the block layer instead of punching
a giant hole through it.

I seem to recall that reservations were part of your motivation for
going the SCSI route in the first place. A better approach would have
been to create a generic reservations mechanism that could be exposed to
the guest. And then let the baremetal kernel worry about the appropriate
way to communicate with the physical hardware. Just like we've done with
reads and writes, discard, write same, etc.

The fact that burning CDs requires SG_IO in the first place is just a
symptom that we got that interface totally wrong. cat iso.img > /dev/sr0
would have been much more in line with how Unix works...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 16:32                   ` Martin K. Petersen
@ 2013-05-22 17:00                     ` Paolo Bonzini
  2013-05-22 18:11                       ` Theodore Ts'o
  2013-05-25  3:54                     ` Vladislav Bolkhovitin
  1 sibling, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 17:00 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Theodore Ts'o, Tejun Heo, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

Il 22/05/2013 18:32, Martin K. Petersen ha scritto:
>>>>>> "Paolo" == Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> Paolo> First of all, I'll note that SG_IO and block-device-specific
> Paolo> ioctls both have their place.  My usecase for SG_IO is
> Paolo> virtualization, where I need to pass information from the LUN to
> Paolo> the virtual machine with as much fidelity as possible if I choose
> Paolo> to virtualize at the SCSI level.  
> 
> Now there's your problem! Several people told you way back that the SCSI
> virt approach was a really poor choice. The SG_IO permissions problem is
> a classic "Doctor, it hurts when I do this".

Unfortunately, it's not me who does this; I'm the doctor who was told it
hurts.

You have hardware providers selling cloud services that want to run
their own custom backup services from within a VM, which entails having
vendor-specific commands run from within a VM.  Or you have people that
run clusters that are half-physical and half-virtual and want to use the
same /dev/disk/by-id paths in both cases; perhaps, with NPIV, they want
to use one zoning approach for both physical and virtual machines.
Someone else they want to backup to tapes from a VM (for example s390
people who just put everything in a VM, so the distinction of physical
and virtual makes no sense for them).  Some people use virtual machines
as sandboxes, and want to burn the ISOs from the same VMs where they
download the ISOs.  Some people have vendor utilities that only run
under Windows, and want to run them in a VM.

Yes, it hurts when they do this.  But I'm not really in the position to
say "don't do that", especially if the reaction would be to pick another
hypervisor than KVM.

> The kernel's fundamental task is to provide abstraction between
> applications and intricacies of hardware. The right way to solve the
> problem would have been to provide a better device abstraction built on
> top of the block/SCSI infrastructure we already have in place. If you
> need more fidelity, add fidelity to the block layer instead of punching
> a giant hole through it.

That would require implementing:

- a interface to get rich error information

- a bunch of ioctls or syscalls to expose every single command, for
example extended copy or reservations

- a ioctl interface to media changers, similar to /dev/st

- ??? and what to do about vendor specific commands, etc.?

With all this to be done in the kernel, having an implementation of tape
and media changer SCSI targets in the virtual machine monitor ends up
being the easiest part.

Some of these two subtopics have been the subject of proverbially many
LWN.net articles.  The implications on userspace ABI are immense, and so
is the complexity of the task.  There is more than a temptation to take
a shortcut, and it's not by chance IMO that all of VMware, Hyper-V and
Xen did the same (though for Xen it's not upstream) before me.

> I seem to recall that reservations were part of your motivation for
> going the SCSI route in the first place.

Reservations are the main motivation for the possibility to bypass the
whitelist, the other being vendor-specific commands.  UNMAP/WRITE
SAME/COMPARE AND WRITE are the main motivation for per-class whitelists.

> A better approach would have
> been to create a generic reservations mechanism that could be exposed to
> the guest. And then let the baremetal kernel worry about the appropriate
> way to communicate with the physical hardware. Just like we've done with
> reads and writes, discard, write same, etc.

I agree for some cases.  For example, I did mean to send a patch to add
ioctls for BLKPING (test unit ready) and BLKCMPXCHG.  I haven't done yet
also because the huge latency in the review of this series wasn't
exactly encouraging me.

But the kernel is not the right place to provide a C API wrapper for the
whole SCSI standard.

> The fact that burning CDs requires SG_IO in the first place is just a
> symptom that we got that interface totally wrong. cat iso.img > /dev/sr0
> would have been much more in line with how Unix works...

In theory, but how do you do things like formatting, picking one of the
gazillion burning methods or media?  Reality is, you'd have a bunch of
ioctls and a program to use them.  "cat /dev/sg2 > page.jpg" would also
be nice for scanners, but are you going to put SANE into the kernel?

Again, the SG_IO shortcut is a necessity more than a temptation.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 17:00                     ` Paolo Bonzini
@ 2013-05-22 18:11                       ` Theodore Ts'o
  2013-05-22 19:37                         ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Theodore Ts'o @ 2013-05-22 18:11 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Martin K. Petersen, Tejun Heo, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

On Wed, May 22, 2013 at 07:00:14PM +0200, Paolo Bonzini wrote:
> You have hardware providers selling cloud services that want to run
> their own custom backup services from within a VM, which entails having
> vendor-specific commands run from within a VM.  Or you have people that
> run clusters that are half-physical and half-virtual and want to use the
> same /dev/disk/by-id paths in both cases; perhaps, with NPIV, they want
> to use one zoning approach for both physical and virtual machines.
> Someone else they want to backup to tapes from a VM (for example s390
> people who just put everything in a VM, so the distinction of physical
> and virtual makes no sense for them).  Some people use virtual machines
> as sandboxes, and want to burn the ISOs from the same VMs where they
> download the ISOs.  Some people have vendor utilities that only run
> under Windows, and want to run them in a VM.

So is this hypothetical or do you have a real customer in mind?  If
it's not theoretical, how does the cloud service control who has
access to the CD burner, and how are the disks loaded into the CD
burner?  Trying to do it where you run cdrecord from the guest OS
sounds **insane**.  The far better way of doing things is to export a
web service where you ship a iso image plus a shipping address to a
burn service, and you don't try to figure out how to make dozens of VM
on a cloud server share a physical CD burner attached to a physical
host OS.  That way leads to insanity.

						- Ted

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 15:00                 ` Paolo Bonzini
@ 2013-05-22 19:30                   ` Tejun Heo
  2013-05-22 21:18                     ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 19:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 05:00:52PM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 16:30, Tejun Heo ha scritto:
> > * Separate fixes from additions.  Transform existing code so that the
> >   visible behavior doesn't change but the required fix can be
> >   implemented on top.  Explicitly note what's going on in the commit
> >   messages.
> 
> Been there, done that.  Have you read the commit messages *at all*?

Yeah, I was summarizing all the points that I raised during the
review.  This part is almost there but still not quite there.

> Patch 2: "Right now, there is still just one bitmap and the mask is
> ignored, so there is no semantic change yet".
> 
> Patch 3: "This patch already introduces some semantic change, albeit
> very limited; [...] a few commands are now forbidden for devices of type
> other than TYPE_ROM, where they are reserved or vendor-specific".

The thing is that the behavior change is now implemented in an
inactive form by #2 and then flipped on by #3.  #2 both change the
format and the content of the table.  This should have been like the
following.

#2: Convert to the new table for mat but set all bits for all commands
    as that's what the old table looks like in the new format.

#3: Implement per-type filtering.  Now this wouldn't change any
    behavior as the new table is the same as the old one.  This can be
    rolled into #2.

#4: Update the filtering table to fix the security issue and *just*
    the identified security issue with UNMAP.  This way, the behavior
    change is limited to the identified security issue and distros can
    backport upto this to address the security issue.

#5: Update the filtering table further so that commands which don't
    make sense for the class are not allowed.  Note that this has some
    possibility of breaking existing users and we do want this in a
    separate patch from #4.

One thing I don't get is why you're saying the original patch 3
"introduces some semantic changes, albeit very limited" because the
update to the command table in the previous patch is rather extensive.
Why do you say it's very limited?

> > * Fix the frigging CVE bug that you've been waving around and do
> >   *just* that.
> 
> Perhaps you've missed that this is v2, not v27.  Tell me a place in the
> original review where you've told me to split the series.

Hmmmm.... so the thing is that it took us a lot of fighting and
bickering for the patchset to look like what it looks like right now
and I clearly recall talking about minimal fixes for the identified
security issue.  It's just exhausting that it still isn't split
properly after so much effort and to have to continue the same type of
point-by-point fight to get the rest of them across, especially when
none of these points are controversial at all.

> > * Add the frigging "count me out" feature that you want for your use
> >   case.  It isn't controversial and is what you need and the
> >   maintainer can apply to the point where [s]he thinks acceptable.
> 
> Sure.  Except I had sent it six months ago, and it lied unreviewed
> despite your acks for two months.  It is in the archives waiting to be
> picked up.

Yeah, well, probably Jens was busy at the moment or something and then
there were activities related to the changes, so it got delayed until
the whole series is resolved.  Anyways, my point was that you wanna
put that right after the security fix, not at the end of the series
with contentious changes in the middle.  That way, the security fix
and the part that your use case requires can be picked up separately.
In fact, at this point, it chould be helpful to make them separate
patchset.

> > * If for whatever reason you have to add more command codes to the
> >   exception table, do them with explicit justifications.  How the hell
> >   "the vast majority of the commands are added because Linux itself is
> >   using them" a proper justification?   How are they used for what
> >   reason and why is adding them beneficial?
> 
> For example, WRITE SAME is used to discard blocks.  If a Linux guest
> wants to discard blocks, it may send WRITE SAME.  If a disk advertises
> support for WRITE SAME, it is not nice if WRITE SAME then fails because
> of a stupid whitelist that was designed for CD-ROMs.

So, as I've said multiple times, let's *please* do this case-by-base -
build the command table gradually with explicit use cases at each
stage because SG_IO is very easy to get wrong for both the users and
the hardware devices and the kernel isn't policing the content of
what's being issued, because it is very nice to be able to know
exactly the intents behind such table contents later, and because we
wanna debate whether we even want to support specific use cases.  It's
one thing to allow WRITE SAME passthrough and it's a different thing
to try to allow all commands which may be issued by udev.

> >   Why do you need this at all if
> >   you have the "count me out" knob in the first place?
> 
> Because the "count me out" knob needs privileges.  It opens up way, way

So does giving out access to the device node itself.  While it could
be cumbersome, requiring privileges there for arbitration isn't a new
thing.

> more than what these patches do.

So, your use case wouldn't work with this?

> And the frigging patch to make the whole whitelisting
> userspace-configurable with finer-grain (but still with appropriate
> capabilities) was nacked.  By you, for God's sake.
> 
> http://permalink.gmane.org/gmane.linux.kernel/1378071

Yeah, that's still nack.

> >   You first
> >   built that command list by scanning the spec and just adding the
> >   commands that seemed "right" to you.
> 
> And then in v2 I stated that I removed some disk commands because of
> discussion with you.  But of course you don't know, because you've not
> read the damn commit messages.

I did read.  Here's the patch description from patch 5.

 Start cleaning up the table, moving out of the way four rare &
 obsolete device types: printers, communication devices (network
 cards) and scanners (both TYPE_PROCESSOR and TYPE_SCANNER).  Add
 missing commands for these four device types.

No real discussion or justifications for those "missing commands".
Similar deal for other patches.  If I argue about command X to make a
general point, what I get is "all - X".  Build from ground up, not the
other way around.

> >   I have near-zero confidence in
> >   your perception of the relationship between the specs and actual
> >   world.
> 
> Thankyouverymuch.  Perhaps you should have read the commit messages (oh
> sorry, have I said it already?) and seen that it's not about commands
> that seemed "right":
> 
>    Only commands that affect the medium are added.
>    Commands that affect other state of the LUN are all privileged, with
>    the sole exception of START STOP UNIT (which has always been
>    allowed for all file descriptors.  I do not really agree with that
>    and it's probably an artifact of when /dev/cdrom had
>    r--r--r-- permissions, but I'm not trying to change that.

Yeah, that one was better than others, but it still doesn't explain
and justify why those specific commands are being added.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 18:11                       ` Theodore Ts'o
@ 2013-05-22 19:37                         ` Paolo Bonzini
  2013-05-22 20:19                           ` Theodore Ts'o
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 19:37 UTC (permalink / raw)
  To: Theodore Ts'o, Martin K. Petersen, Tejun Heo,
	James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 20:11, Theodore Ts'o ha scritto:
> On Wed, May 22, 2013 at 07:00:14PM +0200, Paolo Bonzini wrote:
>> You have hardware providers selling cloud services that want to run
>> their own custom backup services from within a VM, which entails having
>> vendor-specific commands run from within a VM.  Or you have people that
>> run clusters that are half-physical and half-virtual and want to use the
>> same /dev/disk/by-id paths in both cases; perhaps, with NPIV, they want
>> to use one zoning approach for both physical and virtual machines.
>> Someone else they want to backup to tapes from a VM (for example s390
>> people who just put everything in a VM, so the distinction of physical
>> and virtual makes no sense for them).  Some people use virtual machines
>> as sandboxes, and want to burn the ISOs from the same VMs where they
>> download the ISOs.  Some people have vendor utilities that only run
>> under Windows, and want to run them in a VM.
> 
> So is this hypothetical or do you have a real customer in mind?

All of these come from real customers.

> If it's not theoretical, how does the cloud service control who has
> access to the CD burner, and how are the disks loaded into the CD
> burner?

CD burning would be used in a VM that runs on your local workstation, so
the VM gets access to the CD burner under your desk.  There was also a
developer of a CD burning tool that wanted to test it inside BSD,
Solaris and Windows VMs; the idea is the same.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 19:37                         ` Paolo Bonzini
@ 2013-05-22 20:19                           ` Theodore Ts'o
  2013-05-22 20:36                             ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Theodore Ts'o @ 2013-05-22 20:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Martin K. Petersen, Tejun Heo, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

On Wed, May 22, 2013 at 09:37:54PM +0200, Paolo Bonzini wrote:
> > If it's not theoretical, how does the cloud service control who has
> > access to the CD burner, and how are the disks loaded into the CD
> > burner?
> 
> CD burning would be used in a VM that runs on your local workstation, so
> the VM gets access to the CD burner under your desk.  There was also a
> developer of a CD burning tool that wanted to test it inside BSD,
> Solaris and Windows VMs; the idea is the same.

So in both cases all of the VM's and the host OS are within the same
trust boundary.  This simplifies the security requirements than in the
more generic cloud server caser where the VM's are mutually
suspicious.  This simplifies the requirements of what we need to push
into the kernel, yes?

						- Ted

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 20:19                           ` Theodore Ts'o
@ 2013-05-22 20:36                             ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 20:36 UTC (permalink / raw)
  To: Theodore Ts'o, Martin K. Petersen, Tejun Heo,
	James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 22:19, Theodore Ts'o ha scritto:
> On Wed, May 22, 2013 at 09:37:54PM +0200, Paolo Bonzini wrote:
>>> If it's not theoretical, how does the cloud service control who has
>>> access to the CD burner, and how are the disks loaded into the CD
>>> burner?
>>
>> CD burning would be used in a VM that runs on your local workstation, so
>> the VM gets access to the CD burner under your desk.  There was also a
>> developer of a CD burning tool that wanted to test it inside BSD,
>> Solaris and Windows VMs; the idea is the same.
> 
> So in both cases all of the VM's and the host OS are within the same
> trust boundary.  This simplifies the security requirements than in the
> more generic cloud server caser where the VM's are mutually
> suspicious.  This simplifies the requirements of what we need to push
> into the kernel, yes?

What do you mean by "push into the kernel"?

(Anyway the CD burner case is really the only one that the current
whitelist covers completely.  I was just listing it as a use case for
SG_IO in the context as virtualization).

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 15:53                 ` Paolo Bonzini
  2013-05-22 16:32                   ` Martin K. Petersen
@ 2013-05-22 20:39                   ` Tejun Heo
  2013-05-22 21:12                     ` Paolo Bonzini
  1 sibling, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 20:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Theodore Ts'o, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

Hey,

On Wed, May 22, 2013 at 05:53:34PM +0200, Paolo Bonzini wrote:
> I do listen to review feedback, but I also expect the other side to
> listen to me, ask me what is not clear, and possess some knowledge of
> the domain that he's reviewing patches for.  All of which, quite
> frankly, I have not seen in this case.

Heh, nice one.  As we've talked on RH mailing list, I don't doubt this
has been a pretty bad experience for you but it also has been one of
the worst review experiences for me too.  It's on both of us that we
do get frustrated easily and the discussions escalate very quickly;
however, the biggest issue from my side is that it's very difficult to
get a point across and even when the point seems to have been made
what comes out of it is the mimum possible change around that point
rather than wider interpretation and application of the point made.
I'm not saying you don't listen to reviews at all but the reception
definitely feels very low-gain.

Anyways, at this point, the easiest way to make forward progress is
completely separating out security fix from the rest along with the
"count me out" knob, which should be able to cover most of the
described use cases anyway.  Let's please do further modifications to
the filtering table as a separate step.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 20:39                   ` Tejun Heo
@ 2013-05-22 21:12                     ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 21:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Theodore Ts'o, James E.J. Bottomley, Jens Axboe,
	linux-kernel, linux-scsi

Il 22/05/2013 22:39, Tejun Heo ha scritto:
> Hey,
> 
> On Wed, May 22, 2013 at 05:53:34PM +0200, Paolo Bonzini wrote:
>> I do listen to review feedback, but I also expect the other side to
>> listen to me, ask me what is not clear, and possess some knowledge of
>> the domain that he's reviewing patches for.  All of which, quite
>> frankly, I have not seen in this case.
> 
> Heh, nice one.  As we've talked on RH mailing list, I don't doubt this
> has been a pretty bad experience for you but it also has been one of
> the worst review experiences for me too.

I can imagine.  Sorry about that.

> I'm not saying you don't listen to reviews at all but the reception
> definitely feels very low-gain.

Frankly, I can say the same with s/reviews/explanations/...

> Anyways, at this point, the easiest way to make forward progress is
> completely separating out security fix from the rest along with the
> "count me out" knob, which should be able to cover most of the
> described use cases anyway.  Let's please do further modifications to
> the filtering table as a separate step.

Okay, we seem to have reverted (both of us) to a more civil tone, and I
appreciate setting a way forward.

I'll send three separate series in the next few days.  I guess it's okay
to send the common patch twice.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 19:30                   ` Tejun Heo
@ 2013-05-22 21:18                     ` Paolo Bonzini
  2013-05-22 22:17                       ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-22 21:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 22/05/2013 21:30, Tejun Heo ha scritto:
> The thing is that the behavior change is now implemented in an
> inactive form by #2 and then flipped on by #3.  #2 both change the
> format and the content of the table.  This should have been like the
> following.
> 
> #2: Convert to the new table for mat but set all bits for all commands
>     as that's what the old table looks like in the new format.

Ok, that's easy.  I will still need a bulk change like
s/__set_bit/sgio_bitmap_set/g, but it should still be relatively trivial.

> #3: Implement per-type filtering.  Now this wouldn't change any
>     behavior as the new table is the same as the old one.  This can be
>     rolled into #2.

Ok, I prefer to keep it separate though.

> #4: Update the filtering table to fix the security issue and *just*
>     the identified security issue with UNMAP.  This way, the behavior
>     change is limited to the identified security issue and distros can
>     backport upto this to address the security issue.
> 
> #5: Update the filtering table further so that commands which don't
>     make sense for the class are not allowed.  Note that this has some
>     possibility of breaking existing users and we do want this in a
>     separate patch from #4.

Sure, I can do that.  Compared to the way I split the patches, it
would have "patch 3" and "patch 4" reversed.

> Anyways, my point was that you wanna
> put that right after the security fix, not at the end of the series
> with contentious changes in the middle.  That way, the security fix
> and the part that your use case requires can be picked up separately.
> In fact, at this point, it chould be helpful to make them separate
> patchset.

I would just make it a separate series, since it isn't contentious.

> So, as I've said multiple times, let's *please* do this case-by-base -
> build the command table gradually with explicit use cases at each
> stage because SG_IO is very easy to get wrong for both the users and
> the hardware devices and the kernel isn't policing the content of
> what's being issued, because it is very nice to be able to know
> exactly the intents behind such table contents later, and because we
> wanna debate whether we even want to support specific use cases.  It's
> one thing to allow WRITE SAME passthrough and it's a different thing
> to try to allow all commands which may be issued by udev.

Ok, so I can split it in 10 patches one per command, but at some point I
wonder if it is overkill.  For example, for disks:

- WRITE AND VERIFY(16) is needed to support >2TB disks, and the
corresponding 12-byte CDB is whitelisted already.  I didn't get reports
about _these_ command but I do get bug reports about >2TB disks.
SYNCHRONIZE CACHE(16) is similarly the 16-byte extension of another
10-byte command.

- I added ATA PASS-THROUGH(16) because ATA PASS-THROUGH(12) is present;
using the (16) version is preferrable because (12) conflicts with the
destructive MMC command BLANK, see the sg_sat_identify man page.

- WRITE SAME(16), WRITE SAME(10), UNMAP are needed for discard.

- COMPARE AND WRITE is used by cluster software.

Should this be four separate patches?  Or is it enough to write this in
the commit message?  I honestly find this level of detail to be way way
higher than the normal requirement of kernel patches, but I'm happy to
comply.

>>>   Why do you need this at all if
>>>   you have the "count me out" knob in the first place?
>>
>> Because the "count me out" knob needs privileges.  It opens up way, way
> 
> So does giving out access to the device node itself.

No, it doesn't.  You can use SCM_RIGHTS, and pass a file descriptor for
the device node to an unprivileged program.  You can choose the
users/groups that are allowed to access the device.  In either case, the
privileged action is limited in time or in scope.

The count-me-out knob affects all processes that use the device node,
and won't be cleaned up properly if you SIGKILL the (privileged)
process that sets it.  So if you can avoid it, you should.

> While it could be cumbersome, requiring privileges there for arbitration
> isn't a new thing.
> 
>> more than what these patches do.
> 
> So, your use case wouldn't work with this?

There are many use cases, I listed some in my reply to Martin.
Sometimes you have trust over the guest and can use count-me-out.  But
in some cases you don't, and yet the current whitelist is not enough
(e.g. tapes).

> I did read.  Here's the patch description from patch 5.
> 
>  Start cleaning up the table, moving out of the way four rare &
>  obsolete device types: printers, communication devices (network
>  cards) and scanners (both TYPE_PROCESSOR and TYPE_SCANNER).  Add
>  missing commands for these four device types.
> 
> No real discussion or justifications for those "missing commands".

For scanners, the justification is in the patch: I actually looked at
SANE and listed each single command that is used.

For the printer commands, there is exactly one additional command, and
since the devices are obsolete anyway (but you can't block the whole
class, it would be a regression) there is no justification except
sanity.  I want to keep the per-class lists separate, and I would have
no justification for leaving out a single printer command which was
already present in the 1994 SCSI-2 standard ("SLEW AND PRINT").

Please accept that leaving something "out" of the list is as arbitrary
as putting it "in".  The list anyway has plenty, plenty of commands
that will not be implemented in the large majority of SCSI hardware in
existence such as USB enclosures or pendrives.  If they could brick
some hardware, we'd have known by now.  Erroneous input to valid
commands is much, much more likely to crash the firmware.

At some point you have to admit that if these commands are in the
standard it is because "someone" pushed for them.  Since arbitrary
applications are going to run in the VM, singling which commands go in
makes little sense, no matter how many time you ask for it.  Especially
since, anyway, it's not going to be a particularly fine net.

In some cases, leaving commands out would amount to this:

   /*
    * SYNCHRONIZE CACHE(16) and WRITE AND VERIFY(16) are not whitelisted
    * because of some not-better-specified worry that badness will
    * happen.
    */

and I would have a harder time justifying that comment, than anything
that is in these patches.

> Similar deal for other patches.  If I argue about command X to make a
> general point, what I get is "all - X".  Build from ground up, not the
> other way around.

Ok, I'll drop patch 10 which covers obsolete commands.  I know that
some of these are used because users submitted patches to QEMU to
recognize them, but I can leave this patch out and treat them the same
way as vendor-specific commands (i.e., just ask users to opt out of the
whitelist).

The rest is tape stuff, for which I really added everything that is in
the standard.  The reason is that the current whitelist is totally
inapplicable to those devices, because the command set is completely
different from that of disks.  The standards are relatively small and
do not really have the kitchen sink in them.  Media changers only have
half a dozen commands, for example.  You can google the command names
and you'll get many references from hardware manuals.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 21:18                     ` Paolo Bonzini
@ 2013-05-22 22:17                       ` Tejun Heo
  2013-05-23  0:54                         ` Tejun Heo
  2013-05-23  7:45                         ` Paolo Bonzini
  0 siblings, 2 replies; 73+ messages in thread
From: Tejun Heo @ 2013-05-22 22:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Wed, May 22, 2013 at 11:18:05PM +0200, Paolo Bonzini wrote:
> Ok, so I can split it in 10 patches one per command, but at some point I
> wonder if it is overkill.  For example, for disks:
> 
> - WRITE AND VERIFY(16) is needed to support >2TB disks, and the
> corresponding 12-byte CDB is whitelisted already.  I didn't get reports
> about _these_ command but I do get bug reports about >2TB disks.
> SYNCHRONIZE CACHE(16) is similarly the 16-byte extension of another
> 10-byte command.
> 
> - I added ATA PASS-THROUGH(16) because ATA PASS-THROUGH(12) is present;
> using the (16) version is preferrable because (12) conflicts with the
> destructive MMC command BLANK, see the sg_sat_identify man page.
> 
> - WRITE SAME(16), WRITE SAME(10), UNMAP are needed for discard.
> 
> - COMPARE AND WRITE is used by cluster software.
> 
> Should this be four separate patches?  Or is it enough to write this in
> the commit message?  I honestly find this level of detail to be way way
> higher than the normal requirement of kernel patches, but I'm happy to
> comply.

I think the larger point is that it really is an extension of the
existing kludge.  As mentioned before, cdb filtering itself isn't
something desirable.  It was added as a kludge to work around CD
burning problem.  We should have limited to MMC devices from the
beginning but forgot that and it of course developed uses which were
never intended which in turn pushes the expansion of the kludge.

The primary point where we don't see eye-to-eye is how manageable we
find SG_IO to be.  I have deep seated distrust on exposing SG_IOs in
controlled way.  The drivers try really hard to avoid certain command
sequences or sub-operations.  We end up with all sorts of black and
white lists and careful code sequences grown over time to avoid
obscure issues.

If the userland is trusted enough to be responsible for all that, it's
fine and that's what "count me out" knob is about which makes sense to
me.  Per-cdb filtering to a lesser security domain, not so much.  An
argument could be made that, as a side effect of the misused cdb
filter, we already have it and might as well expand on it; however, as
Martin reiterated in this thread, this is a fundamentally broken
approach.  Well, it at least seems that way to me.

So, what I'm asking is not about splitting the patches in extreme
granularity or adding absurd amount of details, but more the general
discussion and justification of the approach taken for prospect use
cases and how the commands being added would be part of such approach.

> >>>   Why do you need this at all if
> >>>   you have the "count me out" knob in the first place?
> >>
> >> Because the "count me out" knob needs privileges.  It opens up way, way
> > 
> > So does giving out access to the device node itself.
> 
> No, it doesn't.  You can use SCM_RIGHTS, and pass a file descriptor for
> the device node to an unprivileged program.  You can choose the
> users/groups that are allowed to access the device.  In either case, the
> privileged action is limited in time or in scope.
> 
> The count-me-out knob affects all processes that use the device node,
> and won't be cleaned up properly if you SIGKILL the (privileged)
> process that sets it.  So if you can avoid it, you should.

Then let's make it fit the use case better.  I really can't see much
point in crafting the cdb filter when you basically have to entrust
the device to the user anyway.  Let's either trust the user with the
device or not.  I'm very doubtful that the filtered access via SG_IO
can be reliable or secure enough.  Let's please avoid extending a
broken thing.

> There are many use cases, I listed some in my reply to Martin.
> Sometimes you have trust over the guest and can use count-me-out.  But
> in some cases you don't, and yet the current whitelist is not enough
> (e.g. tapes).

Can you elaborate?  Why can't a tape device be entrusted to the user?

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 22:17                       ` Tejun Heo
@ 2013-05-23  0:54                         ` Tejun Heo
  2013-05-23  7:45                         ` Paolo Bonzini
  1 sibling, 0 replies; 73+ messages in thread
From: Tejun Heo @ 2013-05-23  0:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Thu, May 23, 2013 at 07:17:37AM +0900, Tejun Heo wrote:
> > No, it doesn't.  You can use SCM_RIGHTS, and pass a file descriptor for
> > the device node to an unprivileged program.  You can choose the
> > users/groups that are allowed to access the device.  In either case, the
> > privileged action is limited in time or in scope.
> > 
> > The count-me-out knob affects all processes that use the device node,
> > and won't be cleaned up properly if you SIGKILL the (privileged)
> > process that sets it.  So if you can avoid it, you should.
> 
> Then let's make it fit the use case better.  I really can't see much
> point in crafting the cdb filter when you basically have to entrust
> the device to the user anyway.  Let's either trust the user with the

One more thing, is it really necessary to have finer granularity than
provided by file permissions?  What would be the use case?  Do you
expect to have multiple - two - differing levels of access with and
without SG_IO?  Note that, for the same user, it's pointless to give
out SG_IO access to processes while denying for other processes.  As
long as ptrace can be attached, hijacking such fd is easy.  Making it
per-device should be suitable enough, no?

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 22:17                       ` Tejun Heo
  2013-05-23  0:54                         ` Tejun Heo
@ 2013-05-23  7:45                         ` Paolo Bonzini
  2013-05-23  9:02                           ` Tejun Heo
  1 sibling, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-23  7:45 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 23/05/2013 00:17, Tejun Heo ha scritto:
> Then let's make it fit the use case better.  I really can't see much
> point in crafting the cdb filter when you basically have to entrust
> the device to the user anyway.  Let's either trust the user with the
> device or not.  I'm very doubtful that the filtered access via SG_IO
> can be reliable or secure enough.  Let's please avoid extending a
> broken thing.

Sorry to say that, but "I'm very doubtful that..." is just conspiracy
theory.

It is not broken.  I'm not _that_ clueless, if it were broken I wouldn't
have had users use it in production.

> One more thing, is it really necessary to have finer granularity than
> provided by file permissions?  What would be the use case?  Do you
> expect to have multiple - two - differing levels of access with and
> without SG_IO?

No, I don't.  I want four levels:

1) no access;

2) read-only access;

3) read-write whitelisted access;

4) generic access;

but it's indeed fine to assume that 3 and 4 will never be given together
to the same disk.  The important point is that 2 and 3 should not
require any privileges except for opening the file.

With the opt-out knob, you still need a long-lived privileged process in
order to set the knob back to "no access", and that's undesirable.
Long-lived privileged processes can be SIGKILLed and leave things open
for misuse; instead, if I need something privileged I want to confine it
to a helper that opens the file and passes back the file descriptor.

> for the same user, it's pointless to give out SG_IO access to
> processes while denying for other processes.  As long as ptrace can
> be attached, hijacking such fd is easy. Making it per-device should
> be suitable enough, no?

Yes, and that's what I did.  Such hijacking is why a kernel whitelist is
necessary in untrusted cases (i.e. you cannot just implement it in
userspace).

>> There are many use cases, I listed some in my reply to Martin.
>> Sometimes you have trust over the guest and can use count-me-out.  But
>> in some cases you don't, and yet the current whitelist is not enough
>> (e.g. tapes).
> 
> Can you elaborate?  Why can't a tape device be entrusted to the user?

In general, any device may or may not be entrusted to the user.  In this
respect, tapes or disks have no difference.

But while the current whitelist is almost okay for disks, it is not
usable for tapes.  Too many essential commands are missing; this is why
extending the whitelist to cover other device types is important for me.
 And since you don't want to open new commands to all classes with no
distinction (which I understand), the only choice is per-class whitelists.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-23  7:45                         ` Paolo Bonzini
@ 2013-05-23  9:02                           ` Tejun Heo
  2013-05-23  9:47                             ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-23  9:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Thu, May 23, 2013 at 09:45:42AM +0200, Paolo Bonzini wrote:
> Il 23/05/2013 00:17, Tejun Heo ha scritto:
> > Then let's make it fit the use case better.  I really can't see much
> > point in crafting the cdb filter when you basically have to entrust
> > the device to the user anyway.  Let's either trust the user with the
> > device or not.  I'm very doubtful that the filtered access via SG_IO
> > can be reliable or secure enough.  Let's please avoid extending a
> > broken thing.
> 
> Sorry to say that, but "I'm very doubtful that..." is just conspiracy
> theory.
> 
> It is not broken.  I'm not _that_ clueless, if it were broken I wouldn't
> have had users use it in production.

No no, I'm not talking about it not working for the users - it's just
passing the commands, it of course works.  I'm doubting about it being
a worthy security isolation layer.  cdb filtering (of any form really)
has always been something on the border.  It always was something we
got stuck with due to lack of other immediate options.

> > One more thing, is it really necessary to have finer granularity than
> > provided by file permissions?  What would be the use case?  Do you
> > expect to have multiple - two - differing levels of access with and
> > without SG_IO?
> 
> No, I don't.  I want four levels:
> 
> 1) no access;
> 
> 2) read-only access;
> 
> 3) read-write whitelisted access;
> 
> 4) generic access;
> 
> but it's indeed fine to assume that 3 and 4 will never be given together
> to the same disk.  The important point is that 2 and 3 should not
> require any privileges except for opening the file.

I'm wondering whether combining 3 into 4 would be good enough.

> With the opt-out knob, you still need a long-lived privileged process in
> order to set the knob back to "no access", and that's undesirable.
> Long-lived privileged processes can be SIGKILLed and leave things open
> for misuse; instead, if I need something privileged I want to confine it
> to a helper that opens the file and passes back the file descriptor.

Hmmm?  Somebody has to give out the access rights anyway, be it a udev
rule or polkit.  While not having to depend on them could be nice, I
don't think involving userland is a big deal.  It's already involved
in closely related capacities anyway.

> > for the same user, it's pointless to give out SG_IO access to
> > processes while denying for other processes.  As long as ptrace can
> > be attached, hijacking such fd is easy. Making it per-device should
> > be suitable enough, no?
> 
> Yes, and that's what I did.  Such hijacking is why a kernel whitelist is
> necessary in untrusted cases (i.e. you cannot just implement it in
> userspace).

I was thinking about the mentions of SCM_RIGHTS.  Anyways, yeah, it
isn't possible to build any finer granularity than the existing
security domains.

> >> There are many use cases, I listed some in my reply to Martin.
> >> Sometimes you have trust over the guest and can use count-me-out.  But
> >> in some cases you don't, and yet the current whitelist is not enough
> >> (e.g. tapes).
> > 
> > Can you elaborate?  Why can't a tape device be entrusted to the user?
> 
> In general, any device may or may not be entrusted to the user.  In this
> respect, tapes or disks have no difference.
> 
> But while the current whitelist is almost okay for disks, it is not
> usable for tapes.  Too many essential commands are missing; this is why
> extending the whitelist to cover other device types is important for me.
>  And since you don't want to open new commands to all classes with no
> distinction (which I understand), the only choice is per-class whitelists.

I was curious why they wouldn't be able to use count-me-out knob.  The
white list was nasty but we did it anyway because there were some
horrible commands which, ISTR, could even affect other devices sharing
the bus and at least at the beginning the list of commands which
should be allowed were pretty compact.

While the list has grown over time, at times probably too carelessly,
I'm not sure it'd be a good idea to make it fully generic to the
extent where giving out write access to a user means SG_IO access to
most common commands by default.  Count-me-out is attractive mainly
because it involves an explicit step where the additional access is
granted.  It'd be nice if that's good enough for the common use cases.
If not, I don't know.

Whitelisting those commands is likely to be mostly harmless most of
the time but I would be surprised if there aren't devices out in the
wild which can be bricked / affected adversely with carefully crafted
commands within the allowed cdbs.  The level of security isolation
which can be provided by command code filtering is somewhat flimsy,
which is why cdb filtering was always seen as a kludge, which in turn
is why there's reluctance against extending it to more use cases.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-23  9:02                           ` Tejun Heo
@ 2013-05-23  9:47                             ` Paolo Bonzini
  2013-05-24  1:44                               ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-23  9:47 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 23/05/2013 11:02, Tejun Heo ha scritto:
> On Thu, May 23, 2013 at 09:45:42AM +0200, Paolo Bonzini wrote:
>> Il 23/05/2013 00:17, Tejun Heo ha scritto:
>>> Then let's make it fit the use case better.  I really can't see much
>>> point in crafting the cdb filter when you basically have to entrust
>>> the device to the user anyway.  Let's either trust the user with the
>>> device or not.  I'm very doubtful that the filtered access via SG_IO
>>> can be reliable or secure enough.  Let's please avoid extending a
>>> broken thing.
>>
>> Sorry to say that, but "I'm very doubtful that..." is just conspiracy
>> theory.
>>
>> It is not broken.  I'm not _that_ clueless, if it were broken I wouldn't
>> have had users use it in production.
> 
> No no, I'm not talking about it not working for the users - it's just
> passing the commands, it of course works.  I'm doubting about it being
> a worthy security isolation layer.  cdb filtering (of any form really)
> has always been something on the border.  It always was something we
> got stuck with due to lack of other immediate options.

I understood correctly then. :)  I agree it's not the best, but it's not
completely broken either.  It has bugs, and that's what these patches
try to fix.

>>> One more thing, is it really necessary to have finer granularity than
>>> provided by file permissions?  What would be the use case?  Do you
>>> expect to have multiple - two - differing levels of access with and
>>> without SG_IO?
>>
>> No, I don't.  I want four levels:
>>
>> 1) no access;
>> 2) read-only access;
>> 3) read-write whitelisted access;
>> 4) generic access;
>>
>> but it's indeed fine to assume that 3 and 4 will never be given together
>> to the same disk.  The important point is that 2 and 3 should not
>> require any privileges except for opening the file.
> 
> I'm wondering whether combining 3 into 4 would be good enough.

No, it wouldn't.  I learnt it the hard way (by having a patch nacked :))
at http://thread.gmane.org/gmane.linux.kernel/1311887.

>> With the opt-out knob, you still need a long-lived privileged process in
>> order to set the knob back to "no access", and that's undesirable.
>> Long-lived privileged processes can be SIGKILLed and leave things open
>> for misuse; instead, if I need something privileged I want to confine it
>> to a helper that opens the file and passes back the file descriptor.
> 
> Hmmm?  Somebody has to give out the access rights anyway, be it a udev
> rule or polkit.  While not having to depend on them could be nice, I
> don't think involving userland is a big deal.  It's already involved
> in closely related capacities anyway.

Polkit need not do anything to give out the access rights, it only has
to just confirm the credentials.  A helper setgid-disk can consult
polkit, open the file, pass the file descriptor back, and exit.  We
already do something similar for tap devices.

>>>> There are many use cases, I listed some in my reply to Martin.
>>>> Sometimes you have trust over the guest and can use count-me-out.  But
>>>> in some cases you don't, and yet the current whitelist is not enough
>>>> (e.g. tapes).
>>>
>>> Can you elaborate?  Why can't a tape device be entrusted to the user?
>
> I was curious why they wouldn't be able to use count-me-out knob.  The
> white list was nasty but we did it anyway because there were some
> horrible commands which, ISTR, could even affect other devices sharing
> the bus and at least at the beginning the list of commands which
> should be allowed were pretty compact.

Exactly.  Though these commands do not really affect other devices
sharing the bus, they affect other initiators trying to use the device.
 And these commands are generic, so they apply to disks as well as tapes.

Unfortunately, a purely userspace whitelist would be too easy to circumvent.

> Whitelisting those commands is likely to be mostly harmless most of
> the time but I would be surprised if there aren't devices out in the
> wild which can be bricked / affected adversely with carefully crafted
> commands within the allowed cdbs.  The level of security isolation
> which can be provided by command code filtering is somewhat flimsy,
> which is why cdb filtering was always seen as a kludge, which in turn
> is why there's reluctance against extending it to more use cases.

Yes, I understand that.  But I don't believe it is a serious concern.
In the case of tapes, for example, you're not talking about 10 dollar
USB pendrives whose firmware was written in fire-and-forget mode.
Firmware bugs would be updated by the manufacturers.

I cannot exclude there will _never_ be a bug like this, of course.  But
even if there will be one, IMHO it would be exceptional enough that we
can afford fixing it with a per-device quirk.

Scanners have never had any CDB filtering done on them; I searched for
this kind of bug, and I couldn't find any report.

What I am trying to do is to reach a compromise.  The one I'm proposing
is to forbid reserved or vendor-specific commands, while at the same
time opening the doors on more standardized commands.

Paolo


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-23  9:47                             ` Paolo Bonzini
@ 2013-05-24  1:44                               ` Tejun Heo
  2013-05-24  7:13                                 ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-24  1:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

On Thu, May 23, 2013 at 11:47:25AM +0200, Paolo Bonzini wrote:
> > No no, I'm not talking about it not working for the users - it's just
> > passing the commands, it of course works.  I'm doubting about it being
> > a worthy security isolation layer.  cdb filtering (of any form really)
> > has always been something on the border.  It always was something we
> > got stuck with due to lack of other immediate options.
> 
> I understood correctly then. :)  I agree it's not the best, but it's not
> completely broken either.  It has bugs, and that's what these patches
> try to fix.

The same filtering table being applied to different classes of
hardware is a software bug, but my point is that the practive
essentially entrusts non-insignificant part of security enforcement to
the hardware itself.  The variety of hardware in question is very wide
and significant portion has historically been known to be flaky.

> > I'm wondering whether combining 3 into 4 would be good enough.
> 
> No, it wouldn't.  I learnt it the hard way (by having a patch nacked :))
> at http://thread.gmane.org/gmane.linux.kernel/1311887.

Of course you can't do that by adding dangerous commands to the
existing filtering table which is allowed by default.  I'm saying
"count me out" knob could be good enough.  Neither is perfect but at
least the latter doesn't affect the default cases.

> > Hmmm?  Somebody has to give out the access rights anyway, be it a udev
> > rule or polkit.  While not having to depend on them could be nice, I
> > don't think involving userland is a big deal.  It's already involved
> > in closely related capacities anyway.
> 
> Polkit need not do anything to give out the access rights, it only has
> to just confirm the credentials.  A helper setgid-disk can consult
> polkit, open the file, pass the file descriptor back, and exit.  We
> already do something similar for tap devices.

I see.  It really just comes down to plumbing and doesn't seem like a
particularly challenging one at that.

> Yes, I understand that.  But I don't believe it is a serious concern.
> In the case of tapes, for example, you're not talking about 10 dollar
> USB pendrives whose firmware was written in fire-and-forget mode.
> Firmware bugs would be updated by the manufacturers.

This is being applied to all block devices by default and it's a part
of a vicious circle which is being reinforced by this change.  We
added flawed security mechanism to work around deficiency in software
stack.  The mechanism could be mis-used for other purposed which in
turn developed into other use cases, which then pushes the expansion
of the existing flawed mechanism.  This is a process with postive feed
back built into it.

> I cannot exclude there will _never_ be a bug like this, of course.  But
> even if there will be one, IMHO it would be exceptional enough that we
> can afford fixing it with a per-device quirk.
>
> Scanners have never had any CDB filtering done on them; I searched for
> this kind of bug, and I couldn't find any report.
> 
> What I am trying to do is to reach a compromise.  The one I'm proposing
> is to forbid reserved or vendor-specific commands, while at the same
> time opening the doors on more standardized commands.

I feel pretty uncomfortable about the direction and think we should
reach compromise from the other direction.  If VMs need raw device
access, just entrust the device to those VMs and enforce whatever
extra restriction from hypervisor side.  It sure isn't perfect but
neither is the other compromise and the risk is taken by the ones
wanting to do (relatively) exotic stuff rather than forced on all
others.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  1:44                               ` Tejun Heo
@ 2013-05-24  7:13                                 ` Paolo Bonzini
  2013-05-24  8:02                                   ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-24  7:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Il 24/05/2013 03:44, Tejun Heo ha scritto:
> On Thu, May 23, 2013 at 11:47:25AM +0200, Paolo Bonzini wrote:
>>> No no, I'm not talking about it not working for the users - it's just
>>> passing the commands, it of course works.  I'm doubting about it being
>>> a worthy security isolation layer.  cdb filtering (of any form really)
>>> has always been something on the border.  It always was something we
>>> got stuck with due to lack of other immediate options.
>>
>> I understood correctly then. :)  I agree it's not the best, but it's not
>> completely broken either.  It has bugs, and that's what these patches
>> try to fix.
> 
> The same filtering table being applied to different classes of
> hardware is a software bug, but my point is that the practive
> essentially entrusts non-insignificant part of security enforcement to
> the hardware itself.  The variety of hardware in question is very wide
> and significant portion has historically been known to be flaky.

Unproven theory, and contradicted by actual practice.  Bugs are more
common in the handling of borderline conditions, not in the handling of
unimplemented commands.

If you want to be secure aginst buggy firmware, the commands you have to
block are READ and WRITE.  Check out the list of existing USB quirks.

>>> I'm wondering whether combining 3 into 4 would be good enough.
>>
>> No, it wouldn't.  I learnt it the hard way (by having a patch nacked :))
>> at http://thread.gmane.org/gmane.linux.kernel/1311887.
> 
> Of course you can't do that by adding dangerous commands to the
> existing filtering table which is allowed by default.  I'm saying
> "count me out" knob could be good enough.  Neither is perfect but at
> least the latter doesn't affect the default cases.

You need to allow more commands.

The count-me-out knob allows all commands.

You cannot always allow all commands.

Ergo, you cannot always use the count-me-out knob.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  7:13                                 ` Paolo Bonzini
@ 2013-05-24  8:02                                   ` Tejun Heo
  2013-05-24  8:31                                     ` Paolo Bonzini
  2013-05-25  4:35                                     ` James Bottomley
  0 siblings, 2 replies; 73+ messages in thread
From: Tejun Heo @ 2013-05-24  8:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, lkml, linux-scsi

On Fri, May 24, 2013 at 4:13 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> The same filtering table being applied to different classes of
>> hardware is a software bug, but my point is that the practive
>> essentially entrusts non-insignificant part of security enforcement to
>> the hardware itself.  The variety of hardware in question is very wide
>> and significant portion has historically been known to be flaky.
>
> Unproven theory, and contradicted by actual practice.  Bugs are more
> common in the handling of borderline conditions, not in the handling of
> unimplemented commands.
>
> If you want to be secure aginst buggy firmware, the commands you have to
> block are READ and WRITE.  Check out the list of existing USB quirks.

Well, I'd actually much prefer disabling CDB whitelisting for all !MMC
devices if at all possible. You're basically arguing that because what
we have is already broken, it should be okay to break it further.
Also, RW commands having more quirks doesn't necessarily indicate that
they tend to be more broken. They just get hammered on a lot in
various ways so problems on those commands tend to be more noticeable.

> You need to allow more commands.
> The count-me-out knob allows all commands.
> You cannot always allow all commands.
> Ergo, you cannot always use the count-me-out knob.

The thing is that both approaches aren't perfect here so you can make
similar type of argument from the other side. If the system wants to
give out raw hardware access to VMs, requiring it to delegate the
device fully could be reasonable. Not ideal but widening direct
command access by default is pretty nasty too.

--
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  8:02                                   ` Tejun Heo
@ 2013-05-24  8:31                                     ` Paolo Bonzini
  2013-05-24  9:07                                       ` Tejun Heo
  2013-05-25  4:35                                     ` James Bottomley
  1 sibling, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-24  8:31 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, lkml, linux-scsi

Il 24/05/2013 10:02, Tejun Heo ha scritto:
> On Fri, May 24, 2013 at 4:13 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> The same filtering table being applied to different classes of
>>> hardware is a software bug, but my point is that the practive
>>> essentially entrusts non-insignificant part of security enforcement to
>>> the hardware itself.  The variety of hardware in question is very wide
>>> and significant portion has historically been known to be flaky.
>>
>> Unproven theory, and contradicted by actual practice.  Bugs are more
>> common in the handling of borderline conditions, not in the handling of
>> unimplemented commands.
>>
>> If you want to be secure aginst buggy firmware, the commands you have to
>> block are READ and WRITE.  Check out the list of existing USB quirks.
> 
> Well, I'd actually much prefer disabling CDB whitelisting for all !MMC
> devices if at all possible. You're basically arguing that because what
> we have is already broken, it should be okay to break it further.
> Also, RW commands having more quirks doesn't necessarily indicate that
> they tend to be more broken. They just get hammered on a lot in
> various ways so problems on those commands tend to be more noticeable.

I agree intuition may not count, and it's perfectly possible that
firmware writers forgot a "break;" or put the wrong location in a jump
table, so that unimplemented commands give interesting results.

However, the _fact_ is that this might happen anyway with the buttload
of commands that are already enabled by the whitelist and that most
disks will never implement.

>> You need to allow more commands.
>> The count-me-out knob allows all commands.
>> You cannot always allow all commands.
>> Ergo, you cannot always use the count-me-out knob.
> 
> The thing is that both approaches aren't perfect here so you can make
> similar type of argument from the other side. If the system wants to
> give out raw hardware access to VMs, requiring it to delegate the
> device fully could be reasonable.

No, it is not unfortunately.  Allowing to do discards is one thing,
allowing to disrupt the settings of a SAN is another.  You can only
delegate the device fully in these cases:

(a) of course, if the guest is trusted;

(b) if QEMU is running as a confined user, then you can get by with a
userspace whitelist.  (Otherwise you're just a ptrace away from
arbitrary access, as you pointed out).

Unfortunately, there are _real_ problems that this patches fix, such as
log spews due to blocked commands, and these happen even if neither of
the above conditions is true.

Is there anything else I can do?  Sure, I can check for the presence of
the whitelist and hack the VPD pages to hide features that I know the
whitelist will block.  Is it the right thing to do?  In my opinion no.
It makes no sense to have raw device access _disable_ features compared
to emulation.

> Not ideal but widening direct
> command access by default is pretty nasty too.

I actually agree, and that's why I'm trying to balance the widening and
restricting.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  8:31                                     ` Paolo Bonzini
@ 2013-05-24  9:07                                       ` Tejun Heo
  2013-05-24  9:45                                         ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-24  9:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, lkml, linux-scsi

On Fri, May 24, 2013 at 5:31 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> I agree intuition may not count, and it's perfectly possible that
> firmware writers forgot a "break;" or put the wrong location in a jump
> table, so that unimplemented commands give interesting results.

It's not just unimplemented commands. Exposing any new command exposes
its borderline problems together with it.

> However, the _fact_ is that this might happen anyway with the buttload
> of commands that are already enabled by the whitelist and that most
> disks will never implement.

Yes and that's why the whitelist is generally frowned upon. It's
inherently fragile. These devices simply aren't designed and
implemented to be exposed to lesser security domains directly. It's
true that it's already kinda broken that way but as I wrote before
it's a vicious cycle and we don't wanna keep building on top of it.
This expansion is gonna increase the usage of whitelisting which will
in turn attract further use cases which are likely to call for even
more expansion.

>> The thing is that both approaches aren't perfect here so you can make
>> similar type of argument from the other side. If the system wants to
>> give out raw hardware access to VMs, requiring it to delegate the
>> device fully could be reasonable.
>
> No, it is not unfortunately.  Allowing to do discards is one thing,
> allowing to disrupt the settings of a SAN is another.  You can only
> delegate the device fully in these cases:
>
> (a) of course, if the guest is trusted;
>
> (b) if QEMU is running as a confined user, then you can get by with a
> userspace whitelist.  (Otherwise you're just a ptrace away from
> arbitrary access, as you pointed out).
>
> Unfortunately, there are _real_ problems that this patches fix, such as
> log spews due to blocked commands, and these happen even if neither of
> the above conditions is true.
>
> Is there anything else I can do?  Sure, I can check for the presence of
> the whitelist and hack the VPD pages to hide features that I know the
> whitelist will block.  Is it the right thing to do?  In my opinion no.
> It makes no sense to have raw device access _disable_ features compared
> to emulation.

If the bulk of filtering can be solved with userland whitelisting as a
confined user, it should be possible to resolve peripheral problems
like log messages in simpler way, no? Can you please elaborate on the
log message problem? Who's spewing those messages?

--
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  9:07                                       ` Tejun Heo
@ 2013-05-24  9:45                                         ` Paolo Bonzini
  2013-05-24 22:20                                           ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-24  9:45 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James E.J. Bottomley, Jens Axboe, lkml, linux-scsi

Il 24/05/2013 11:07, Tejun Heo ha scritto:
> On Fri, May 24, 2013 at 5:31 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> I agree intuition may not count, and it's perfectly possible that
>> firmware writers forgot a "break;" or put the wrong location in a jump
>> table, so that unimplemented commands give interesting results.
> 
> It's not just unimplemented commands. Exposing any new command exposes
> its borderline problems together with it.

For commands that are used by Linux already, the right way to fix the
problems is not obscuring the commands from userspace's view.  You can
hit the same problems with ioctls or even with normal operation of the
device.

>> However, the _fact_ is that this might happen anyway with the buttload
>> of commands that are already enabled by the whitelist and that most
>> disks will never implement.
> 
> Yes and that's why the whitelist is generally frowned upon. It's
> inherently fragile. These devices simply aren't designed and
> implemented to be exposed to lesser security domains directly. It's
> true that it's already kinda broken that way but as I wrote before
> it's a vicious cycle and we don't wanna keep building on top of it.
> This expansion is gonna increase the usage of whitelisting which will
> in turn attract further use cases which are likely to call for even
> more expansion.

And prohibiting the extension of whitelists is gonna increase the
usage of unpriv_sgio and less-secure userspace whitelists.

Anvil, meet hammer.

>>> The thing is that both approaches aren't perfect here so you can make
>>> similar type of argument from the other side. If the system wants to
>>> give out raw hardware access to VMs, requiring it to delegate the
>>> device fully could be reasonable.
>>
>> No, it is not unfortunately.  Allowing to do discards is one thing,
>> allowing to disrupt the settings of a SAN is another.  You can only
>> delegate the device fully in these cases:
>>
>> (a) of course, if the guest is trusted;
>> (b) if QEMU is running as a confined user;
> 
> If the bulk of filtering can be solved with userland whitelisting as a
> confined user, it should be possible to resolve peripheral problems
> like log messages in simpler way, no? Can you please elaborate on the
> log message problem? Who's spewing those messages?

For example:

        if (bdev_write_same(bdev)) {
                unsigned char bdn[BDEVNAME_SIZE];

                if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
                                             ZERO_PAGE(0)))
                        return 0;

                bdevname(bdev, bdn);
                pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
        }

        return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);

The device exposes the ability to zero out LUN blocks, but the command is
not whitelisted and it fails.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  9:45                                         ` Paolo Bonzini
@ 2013-05-24 22:20                                           ` Tejun Heo
  0 siblings, 0 replies; 73+ messages in thread
From: Tejun Heo @ 2013-05-24 22:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James E.J. Bottomley, Jens Axboe, lkml, linux-scsi

On Fri, May 24, 2013 at 11:45:33AM +0200, Paolo Bonzini wrote:
> > It's not just unimplemented commands. Exposing any new command exposes
> > its borderline problems together with it.
> 
> For commands that are used by Linux already, the right way to fix the
> problems is not obscuring the commands from userspace's view.  You can
> hit the same problems with ioctls or even with normal operation of the
> device.

The kernel is providing an isolation layer between the userland and
the devices.  It isn't obscuring.  We can go through many adjectives
but it's still increasing the amount exposed surface and accelerating
expansion of cdb filter.

> And prohibiting the extension of whitelists is gonna increase the
> usage of unpriv_sgio and less-secure userspace whitelists.
> 
> Anvil, meet hammer.

Delegating full device access is still a fringe use case compared to
generic block RW access.  Given thta we're expecting to have an extra
separation layer albeit in userland, the overall picture doesn't seem
to favor extension of whitelists.

> > If the bulk of filtering can be solved with userland whitelisting as a
> > confined user, it should be possible to resolve peripheral problems
> > like log messages in simpler way, no? Can you please elaborate on the
> > log message problem? Who's spewing those messages?
> 
> For example:
> 
>         if (bdev_write_same(bdev)) {
>                 unsigned char bdn[BDEVNAME_SIZE];
> 
>                 if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
>                                              ZERO_PAGE(0)))
>                         return 0;
> 
>                 bdevname(bdev, bdn);
>                 pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
>         }
> 
>         return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
> 
> The device exposes the ability to zero out LUN blocks, but the command is
> not whitelisted and it fails.

Which can be solved by userland filtering, right?

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-22 16:32                   ` Martin K. Petersen
  2013-05-22 17:00                     ` Paolo Bonzini
@ 2013-05-25  3:54                     ` Vladislav Bolkhovitin
  2013-05-28 20:25                       ` Martin K. Petersen
  1 sibling, 1 reply; 73+ messages in thread
From: Vladislav Bolkhovitin @ 2013-05-25  3:54 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Paolo Bonzini, Theodore Ts'o, Tejun Heo,
	James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Martin K. Petersen, on 05/22/2013 09:32 AM wrote:
> Paolo> First of all, I'll note that SG_IO and block-device-specific
> Paolo> ioctls both have their place.  My usecase for SG_IO is
> Paolo> virtualization, where I need to pass information from the LUN to
> Paolo> the virtual machine with as much fidelity as possible if I choose
> Paolo> to virtualize at the SCSI level.  
> 
> Now there's your problem! Several people told you way back that the SCSI
> virt approach was a really poor choice. The SG_IO permissions problem is
> a classic "Doctor, it hurts when I do this".
> 
> The kernel's fundamental task is to provide abstraction between
> applications and intricacies of hardware. The right way to solve the
> problem would have been to provide a better device abstraction built on
> top of the block/SCSI infrastructure we already have in place. If you
> need more fidelity, add fidelity to the block layer instead of punching
> a giant hole through it.
> 
> I seem to recall that reservations were part of your motivation for
> going the SCSI route in the first place. A better approach would have
> been to create a generic reservations mechanism that could be exposed to
> the guest. And then let the baremetal kernel worry about the appropriate
> way to communicate with the physical hardware. Just like we've done with
> reads and writes, discard, write same, etc.

Well, any abstraction is good only if it isn't artificial, so solving more problems
than creating.

Reality is that de facto in the industry _SCSI_ is the abstraction for block/direct
access to data. Look around. How many of systems around you after all layers end up to
SCSI commands in their storage devices?

Linux block layer is purely artificial creature slowly reinventing wheel creating more
problems, than solving. It enforces approach, where often "impossible" means
"impossible in this interface". For instance, how about copy offload? How about
reservations? How about atomic writes? Look at history of barriers and compare then
with what can be done in SCSI. It's still worse, because doesn't allow usage of all
devices capabilities. Why was it needed to create special blk integrity interface with
the only end user - SCSI? Artificial task created - then well solved. Etc, etc.

The block layer keeps repeating SCSI. So, maybe, after all, it's better to acknowledge
that direct usage of SCSI without any intermediate layers and translations is more
productive? And for those minors not using SCSI internally, translate from SCSI to
their internal commands? Creating and filling CDB fields for most cases isn't anyhow
harder, than creating and feeling bio fields.

So, I appreciate work Paolo is doing in this direction. At least, the right thing will
be on the virtualization level.

I do understand that with all existing baggage replacing block layer by SCSI isn't
practical and not proposing it, but let's at least acknowledge limitations of the
academic block abstraction. Let's don't make those limitations global walls. Many
things better to do using direct SCSI, hence let's do the better way.

Vlad

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-24  8:02                                   ` Tejun Heo
  2013-05-24  8:31                                     ` Paolo Bonzini
@ 2013-05-25  4:35                                     ` James Bottomley
  2013-05-25  5:27                                       ` Christoph Hellwig
  2013-05-25  8:37                                       ` Tejun Heo
  1 sibling, 2 replies; 73+ messages in thread
From: James Bottomley @ 2013-05-25  4:35 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Paolo Bonzini, Jens Axboe, lkml, linux-scsi

On Fri, 2013-05-24 at 17:02 +0900, Tejun Heo wrote:
> On Fri, May 24, 2013 at 4:13 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> The same filtering table being applied to different classes of
> >> hardware is a software bug, but my point is that the practive
> >> essentially entrusts non-insignificant part of security enforcement to
> >> the hardware itself.  The variety of hardware in question is very wide
> >> and significant portion has historically been known to be flaky.
> >
> > Unproven theory, and contradicted by actual practice.  Bugs are more
> > common in the handling of borderline conditions, not in the handling of
> > unimplemented commands.
> >
> > If you want to be secure aginst buggy firmware, the commands you have to
> > block are READ and WRITE.  Check out the list of existing USB quirks.
> 
> Well, I'd actually much prefer disabling CDB whitelisting for all !MMC
> devices if at all possible.

I'll go along with this.  I'm also wondering what the problem would be
if we just allowed all commands on either CAP_SYS_RAWIO or opening the
device for write, so we just defer to the filesystem permissions and
restricted read only opens to the basic all device opcodes.

>  You're basically arguing that because what
> we have is already broken, it should be okay to break it further.
> Also, RW commands having more quirks doesn't necessarily indicate that
> they tend to be more broken. They just get hammered on a lot in
> various ways so problems on those commands tend to be more noticeable.

I agree with this, so finding a way to get rid of the opcode table seems
to be what we need.

> > You need to allow more commands.
> > The count-me-out knob allows all commands.
> > You cannot always allow all commands.
> > Ergo, you cannot always use the count-me-out knob.

Do we have a real world example of this?  Getting the kernel out of the
command filtering business does seem to be a good idea to me.

> The thing is that both approaches aren't perfect here so you can make
> similar type of argument from the other side. If the system wants to
> give out raw hardware access to VMs, requiring it to delegate the
> device fully could be reasonable. Not ideal but widening direct
> command access by default is pretty nasty too.

James


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  4:35                                     ` James Bottomley
@ 2013-05-25  5:27                                       ` Christoph Hellwig
  2013-05-25  7:05                                         ` Paolo Bonzini
  2013-05-25  8:37                                       ` Tejun Heo
  1 sibling, 1 reply; 73+ messages in thread
From: Christoph Hellwig @ 2013-05-25  5:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: Tejun Heo, Paolo Bonzini, Jens Axboe, lkml, linux-scsi

On Fri, May 24, 2013 at 09:35:02PM -0700, James Bottomley wrote:
> I'll go along with this.  I'm also wondering what the problem would be
> if we just allowed all commands on either CAP_SYS_RAWIO or opening the
> device for write, so we just defer to the filesystem permissions and
> restricted read only opens to the basic all device opcodes.

I've been out of this area for a bit, but the problem used to be that
you could send destructive commands to a partition.  The right fix
for that would be to not allow SG_IO on partitions at all, just
wondering if anything would be broken by this.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  5:27                                       ` Christoph Hellwig
@ 2013-05-25  7:05                                         ` Paolo Bonzini
  2013-05-25  7:11                                           ` Christoph Hellwig
  2013-06-21 11:57                                           ` Christoph Hellwig
  0 siblings, 2 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-25  7:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, Tejun Heo, Jens Axboe, lkml, linux-scsi

Il 25/05/2013 07:27, Christoph Hellwig ha scritto:
> On Fri, May 24, 2013 at 09:35:02PM -0700, James Bottomley wrote:
>> I'll go along with this.  I'm also wondering what the problem would be
>> if we just allowed all commands on either CAP_SYS_RAWIO or opening the
>> device for write, so we just defer to the filesystem permissions and
>> restricted read only opens to the basic all device opcodes.
> 
> I've been out of this area for a bit, but the problem used to be that
> you could send destructive commands to a partition.  The right fix
> for that would be to not allow SG_IO on partitions at all, just
> wondering if anything would be broken by this.

Linus wanted to keep that for CAP_SYS_RAWIO.  We found two uses of SG_IO
on partitions: zfs-fuse used SYNCHRONIZE CACHE; some proprietary driver
used TEST UNIT READY.

Really, the solution is to make the bitmaps configurable in userspace.
It is no less secure than unpriv_sgio.  Then the kernel can be
configured at build-time to have either an MMC bitmap and a basic
whitelist of a dozen commands.  We can even avoid working around those
few conflicting opcodes; if you're paranoid you can just configure your
kernel right.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  7:05                                         ` Paolo Bonzini
@ 2013-05-25  7:11                                           ` Christoph Hellwig
  2013-05-25  7:21                                             ` Paolo Bonzini
  2013-06-21 11:57                                           ` Christoph Hellwig
  1 sibling, 1 reply; 73+ messages in thread
From: Christoph Hellwig @ 2013-05-25  7:11 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christoph Hellwig, James Bottomley, Tejun Heo, Jens Axboe, lkml,
	linux-scsi

On Sat, May 25, 2013 at 09:05:25AM +0200, Paolo Bonzini wrote:
> > you could send destructive commands to a partition.  The right fix
> > for that would be to not allow SG_IO on partitions at all, just
> > wondering if anything would be broken by this.
> 
> Linus wanted to keep that for CAP_SYS_RAWIO.  We found two uses of SG_IO
> on partitions: zfs-fuse used SYNCHRONIZE CACHE; some proprietary driver
> used TEST UNIT READY.
> 
> Really, the solution is to make the bitmaps configurable in userspace.
> It is no less secure than unpriv_sgio.  Then the kernel can be
> configured at build-time to have either an MMC bitmap and a basic
> whitelist of a dozen commands.  We can even avoid working around those
> few conflicting opcodes; if you're paranoid you can just configure your
> kernel right.

Keep it simple.  Allowing SG_IO for CAP_SYS_RAWIO probably is fine,
allowing it for permissions only clearly isn't. All the per-command
filetering is just complete bullshit and the kind of bloat that
eventually will make Linux unmaintainable.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  7:11                                           ` Christoph Hellwig
@ 2013-05-25  7:21                                             ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-25  7:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James Bottomley, Tejun Heo, Jens Axboe, lkml, linux-scsi

Il 25/05/2013 09:11, Christoph Hellwig ha scritto:
> > Linus wanted to keep that for CAP_SYS_RAWIO.  We found two uses of SG_IO
> > on partitions: zfs-fuse used SYNCHRONIZE CACHE; some proprietary driver
> > used TEST UNIT READY.
> > 
> > Really, the solution is to make the bitmaps configurable in userspace.
> > It is no less secure than unpriv_sgio.  Then the kernel can be
> > configured at build-time to have either an MMC bitmap and a basic
> > whitelist of a dozen commands.  We can even avoid working around those
> > few conflicting opcodes; if you're paranoid you can just configure your
> > kernel right.
> 
> Keep it simple.  Allowing SG_IO for CAP_SYS_RAWIO probably is fine,
> allowing it for permissions only clearly isn't. All the per-command
> filetering is just complete bullshit and the kind of bloat that
> eventually will make Linux unmaintainable.

What I'm proposing is:

- by default, only allow INQUIRY / REPORT LUNS / TEST UNIT READY.  Keep
the old whitelist available for backwards-compatibility, configurable at
build-time.

- let CAP_SYS_ADMIN users set a different per-device whitelist.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  4:35                                     ` James Bottomley
  2013-05-25  5:27                                       ` Christoph Hellwig
@ 2013-05-25  8:37                                       ` Tejun Heo
  2013-05-25 11:14                                         ` Paolo Bonzini
  1 sibling, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-25  8:37 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paolo Bonzini, Jens Axboe, lkml, linux-scsi

Hey, James.

On Fri, May 24, 2013 at 09:35:02PM -0700, James Bottomley wrote:
> > Well, I'd actually much prefer disabling CDB whitelisting for all !MMC
> > devices if at all possible.
> 
> I'll go along with this.  I'm also wondering what the problem would be

Don't think we can.  It'd be a behavior change clearly visible to
userland at this point.

> if we just allowed all commands on either CAP_SYS_RAWIO or opening the
> device for write, so we just defer to the filesystem permissions and
> restricted read only opens to the basic all device opcodes.

Given that there are quite a few cases where we give out block device
permission accesses, changing the behavior by default is likely too
dangerous.

> Do we have a real world example of this?  Getting the kernel out of the
> command filtering business does seem to be a good idea to me.

Something like the following seems workable.

* Fix the security bug.  I don't really care how it's fixed as long as
  the amount of whitelisted commands goes down not up.

* It's not like we can remove the filter for !MMC devices at this
  point, so I think it makes sense to make it per-class so that we can
  *remove* commands which aren't relevant for the device type.  Also,
  we probably wanna add read blinking comment yelling that no further
  commands should be added.

* Merge the patch to give out SG_IO access along with write access, so
  the use cases which want to give out SG_IO access can do so
  explicitly and be fully responsible for the device.  This makes
  sense to me.  If one wants to be allowed to issue raw commands to
  the hardware, one takes the full responsibility.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  8:37                                       ` Tejun Heo
@ 2013-05-25 11:14                                         ` Paolo Bonzini
  2013-05-25 12:48                                           ` Tejun Heo
  0 siblings, 1 reply; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-25 11:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James Bottomley, Jens Axboe, lkml, linux-scsi

Il 25/05/2013 10:37, Tejun Heo ha scritto:
> Hey, James.
> 
> On Fri, May 24, 2013 at 09:35:02PM -0700, James Bottomley wrote:
>>> Well, I'd actually much prefer disabling CDB whitelisting for all !MMC
>>> devices if at all possible.
>>
>> I'll go along with this.  I'm also wondering what the problem would be
> 
> Don't think we can.  It'd be a behavior change clearly visible to
> userland at this point.

We can (and even for MMC) if it is a build-time configuration knob.  It
would satisfy those people who want the CVE fixed, as long as userspace
gets some configurability.

> * Fix the security bug.  I don't really care how it's fixed as long as
>   the amount of whitelisted commands goes down not up.
> 
> * It's not like we can remove the filter for !MMC devices at this
>   point, so I think it makes sense to make it per-class so that we can
>   *remove* commands which aren't relevant for the device type.  Also,
>   we probably wanna add read blinking comment yelling that no further
>   commands should be added.
> 
> * Merge the patch to give out SG_IO access along with write access, so
>   the use cases which want to give out SG_IO access can do so
>   explicitly and be fully responsible for the device.  This makes
>   sense to me.  If one wants to be allowed to issue raw commands to
>   the hardware, one takes the full responsibility.

That's not possible; it would make it impossible to do things like using
a privileged helper to open the file and passing it back via SCM_RIGHTS
to an unprivileged program (running as the user).  This is the ptrace
attack that you mentioned.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25 11:14                                         ` Paolo Bonzini
@ 2013-05-25 12:48                                           ` Tejun Heo
  2013-05-25 12:56                                             ` Paolo Bonzini
  0 siblings, 1 reply; 73+ messages in thread
From: Tejun Heo @ 2013-05-25 12:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James Bottomley, Jens Axboe, lkml, linux-scsi

On Sat, May 25, 2013 at 01:14:37PM +0200, Paolo Bonzini wrote:
> > Don't think we can.  It'd be a behavior change clearly visible to
> > userland at this point.
> 
> We can (and even for MMC) if it is a build-time configuration knob.  It
> would satisfy those people who want the CVE fixed, as long as userspace
> gets some configurability.

I don't think that's a good idea.  We can gradually try to phase it
out by triggering warning message if SG_IO commands are issued to !MMC
devices but I'm not sure that'd be worth the effort.

> > * Merge the patch to give out SG_IO access along with write access, so
> >   the use cases which want to give out SG_IO access can do so
> >   explicitly and be fully responsible for the device.  This makes
> >   sense to me.  If one wants to be allowed to issue raw commands to
> >   the hardware, one takes the full responsibility.
> 
> That's not possible; it would make it impossible to do things like using
> a privileged helper to open the file and passing it back via SCM_RIGHTS
> to an unprivileged program (running as the user).  This is the ptrace
> attack that you mentioned.

I have no idea what you're talking about.  I'm describing the same
thing you implemented and posted.

-- 
tejun

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25 12:48                                           ` Tejun Heo
@ 2013-05-25 12:56                                             ` Paolo Bonzini
  0 siblings, 0 replies; 73+ messages in thread
From: Paolo Bonzini @ 2013-05-25 12:56 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James Bottomley, Jens Axboe, lkml, linux-scsi

Il 25/05/2013 14:48, Tejun Heo ha scritto:
>>> * Merge the patch to give out SG_IO access along with write access, so
>>> > >   the use cases which want to give out SG_IO access can do so
>>> > >   explicitly and be fully responsible for the device.  This makes
>>> > >   sense to me.  If one wants to be allowed to issue raw commands to
>>> > >   the hardware, one takes the full responsibility.
>> > 
>> > That's not possible; it would make it impossible to do things like using
>> > a privileged helper to open the file and passing it back via SCM_RIGHTS
>> > to an unprivileged program (running as the user).  This is the ptrace
>> > attack that you mentioned.
> I have no idea what you're talking about.  I'm describing the same
> thing you implemented and posted.

Ok, I think we need a rewind.  I'll try to post what I mean next week.

Paolo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  3:54                     ` Vladislav Bolkhovitin
@ 2013-05-28 20:25                       ` Martin K. Petersen
  2013-05-29  6:12                         ` Vladislav Bolkhovitin
  0 siblings, 1 reply; 73+ messages in thread
From: Martin K. Petersen @ 2013-05-28 20:25 UTC (permalink / raw)
  To: Vladislav Bolkhovitin
  Cc: Martin K. Petersen, Paolo Bonzini, Theodore Ts'o, Tejun Heo,
	James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

>>>>> "Vladislav" == Vladislav Bolkhovitin <vst@vlnb.net> writes:

Vladislav> Linux block layer is purely artificial creature slowly
Vladislav> reinventing wheel creating more problems, than solving.

On the contrary. I do think we solve a whole bunch of problems.


Vladislav> It enforces approach, where often "impossible" means
Vladislav> "impossible in this interface".

I agree we have limitations. I do not agree that all limitations are
bad. Sometimes it's OK to say no.


Vladislav> For instance, how about copy offload?  How about atomic
Vladislav> writes?

I'm actively working on copy offload. Nobody appears to be interested in
atomic writes. Otherwise I'd work on those as well.


Vladislav> Why was it needed to create special blk integrity interface
Vladislav> with the only end user - SCSI?

Simple. Because we did not want to interleave data and PI 512+8+512+8
neither in memory, nor at DMA time. Furthermore, the ATA EPP proposal
was still on the table so I also needed to support ATA.

And finally, NVM Express uses the blk_integrity interface as well.


Vladislav> The block layer keeps repeating SCSI. So, maybe, after all,
Vladislav> it's better to acknowledge that direct usage of SCSI without
Vladislav> any intermediate layers and translations is more productive?
Vladislav> And for those minors not using SCSI internally, translate
Vladislav> from SCSI to their internal commands? Creating and filling
Vladislav> CDB fields for most cases isn't anyhow harder, than creating
Vladislav> and feeling bio fields.

This is quite possibly the worst idea I have heard all week.

As it stands it's a headache for the disk ULD driver to figure out which
of the bazillion READ/WRITE variants to send to a SCSI/ATA device. What
makes you think that an application or filesystem would be better
equipped to make that call?

See also: WRITE SAME w/ zeroes vs. WRITE SAME w/ UNMAP vs. UNMAP 

See also: EXTENDED COPY vs. the PROXY command set

See also: USB-ATA bridge chips

You make it sound like all the block layer does is filling out
CDBs. Which it doesn't in fact have anything to do with at all.

When you are talking about CDBs we're down in the SBC/SSC territory.
Which is such a tiny bit of what's going on. We have transports, we have
SAM, we have HBA controller DMA constraints, system DMA constraints,
buffer bouncing, etc. There's a ton of stuff that needs to happen before
the CDB and the data physically reach the storage.

You seem to be advocating that everything up to the point where the
device receives the command is in the way. Well, by all means. Why limit
ourselves to the confines of SCSI? Why not get rid of POSIX
read()/write(), page cache, filesystems and let applications speak
ST-506 directly?

I know we're doing different things. My job is to make a general purpose
operating system with interfaces that make sense to normal applications.
That does not preclude special cases where it may make sense to poke at
the device directly. For testing purposes, for instance. But I consider
it a failure when we start having applications that know about hardware
intricacies, cylinders/heads/sectors, etc. That road leads straight to
the 1980s...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-28 20:25                       ` Martin K. Petersen
@ 2013-05-29  6:12                         ` Vladislav Bolkhovitin
  0 siblings, 0 replies; 73+ messages in thread
From: Vladislav Bolkhovitin @ 2013-05-29  6:12 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Paolo Bonzini, Theodore Ts'o, Tejun Heo,
	James E.J. Bottomley, Jens Axboe, linux-kernel, linux-scsi

Martin K. Petersen, on 05/28/2013 01:25 PM wrote:
> Vladislav> Linux block layer is purely artificial creature slowly
> Vladislav> reinventing wheel creating more problems, than solving.
> 
> On the contrary. I do think we solve a whole bunch of problems.
> 
> 
> Vladislav> It enforces approach, where often "impossible" means
> Vladislav> "impossible in this interface".
> 
> I agree we have limitations. I do not agree that all limitations are
> bad. Sometimes it's OK to say no.
> 
> 
> Vladislav> For instance, how about copy offload?  How about atomic
> Vladislav> writes?
> 
> I'm actively working on copy offload. Nobody appears to be interested in
> atomic writes. Otherwise I'd work on those as well.
> 
> 
> Vladislav> Why was it needed to create special blk integrity interface
> Vladislav> with the only end user - SCSI?
> 
> Simple. Because we did not want to interleave data and PI 512+8+512+8
> neither in memory, nor at DMA time.

It can similarly be done in SCSI-like interface without need for any middleman.

> Furthermore, the ATA EPP proposal
> was still on the table so I also needed to support ATA.
> 
> And finally, NVM Express uses the blk_integrity interface as well.
> 
> 
> Vladislav> The block layer keeps repeating SCSI. So, maybe, after all,
> Vladislav> it's better to acknowledge that direct usage of SCSI without
> Vladislav> any intermediate layers and translations is more productive?
> Vladislav> And for those minors not using SCSI internally, translate
> Vladislav> from SCSI to their internal commands? Creating and filling
> Vladislav> CDB fields for most cases isn't anyhow harder, than creating
> Vladislav> and feeling bio fields.
> 
> This is quite possibly the worst idea I have heard all week.
> 
> As it stands it's a headache for the disk ULD driver to figure out which
> of the bazillion READ/WRITE variants to send to a SCSI/ATA device. What
> makes you think that an application or filesystem would be better
> equipped to make that call?
> 
> See also: WRITE SAME w/ zeroes vs. WRITE SAME w/ UNMAP vs. UNMAP 
> 
> See also: EXTENDED COPY vs. the PROXY command set
> 
> See also: USB-ATA bridge chips
> 
> You make it sound like all the block layer does is filling out
> CDBs. Which it doesn't in fact have anything to do with at all.
> 
> When you are talking about CDBs we're down in the SBC/SSC territory.
> Which is such a tiny bit of what's going on. We have transports, we have
> SAM, we have HBA controller DMA constraints, system DMA constraints,
> buffer bouncing, etc. There's a ton of stuff that needs to happen before
> the CDB and the data physically reach the storage.
> 
> You seem to be advocating that everything up to the point where the
> device receives the command is in the way. Well, by all means. Why limit
> ourselves to the confines of SCSI? Why not get rid of POSIX
> read()/write(), page cache, filesystems and let applications speak
> ST-506 directly?
> 
> I know we're doing different things. My job is to make a general purpose
> operating system with interfaces that make sense to normal applications.
> That does not preclude special cases where it may make sense to poke at
> the device directly. For testing purposes, for instance. But I consider
> it a failure when we start having applications that know about hardware
> intricacies, cylinders/heads/sectors, etc. That road leads straight to
> the 1980s...

What you mean is true, but my point is that this abstraction is better to be done in
SCSI, i.e. SAM, manner. Now need to write fields inside of CDBs, it would be pretty
inconvenient ;). But CDBs fields can be fields in some scsi_io structure. Exact opcodes
can be easily abstracted to be filled on the last stage, where end CDB is constructed
from those fields.

Problem with block abstraction is that it is the least common denominator of all block
devices capabilities, hence advanced capabilities, available only some class of
devices, are automatically become "impossible". Hence, it would be more productive
instead to use the most capable abstraction, which is SAM. In this abstraction there's
no need to reinvent complex interfaces and write complex middleman code for every
advanced capability. All advanced capabilities there are available by definition, if
supported by underlying hardware. That's my point.

POSIX is for simple applications, for which read()/write() calls are sufficient. They
are outside of our discussions. But advanced applications need more. I know plenty of
applications issuing direct SCSI commands, but how many can you name applications using
block interface (bsg)? I can recall only one quite relatively used Linux specific
library. That's all. This interface is not demanded by applications.

Vlad

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))
  2013-05-25  7:05                                         ` Paolo Bonzini
  2013-05-25  7:11                                           ` Christoph Hellwig
@ 2013-06-21 11:57                                           ` Christoph Hellwig
  1 sibling, 0 replies; 73+ messages in thread
From: Christoph Hellwig @ 2013-06-21 11:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christoph Hellwig, James Bottomley, Tejun Heo, Jens Axboe, lkml,
	linux-scsi, asias.hejun

On Sat, May 25, 2013 at 09:05:25AM +0200, Paolo Bonzini wrote:
> Linus wanted to keep that for CAP_SYS_RAWIO.  We found two uses of SG_IO
> on partitions: zfs-fuse used SYNCHRONIZE CACHE; some proprietary driver
> used TEST UNIT READY.

FYI I looked at the zfs code and the way it uses it is part of a huge
braindamage.

Given that the upstream appears dead I've Cc'd Asias He as he appears
to somehow maintain it as part of the Debian package.

Asias,

the flushwc() routine in zfs-fuse is overly complicated and will not do
the right thing for many cases like using device mapper, many modern SSD
device, or much simple file backed filesystems.

Fortunately fixing this is trivial - the call to flushwc can be replaced
with a call to fdatasync() and the the kernel will do the right thing
for any backing device or file.


^ permalink raw reply	[flat|nested] 73+ messages in thread

end of thread, other threads:[~2013-06-21 11:57 UTC | newest]

Thread overview: 73+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-06 15:15 [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 01/14] sg_io: pass request_queue to blk_verify_command Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 02/14] sg_io: reorganize list of allowed commands Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 03/14] sg_io: use different default filters for each device class Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 04/14] sg_io: resolve conflicts between commands assigned to multiple classes (CVE-2012-4542) Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 05/14] sg_io: whitelist a few more commands for rare & obsolete device types Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 06/14] sg_io: whitelist another command for multimedia devices Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 07/14] sg_io: whitelist a few more commands for media changers Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 08/14] sg_io: whitelist a few more commands for tapes Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 09/14] sg_io: whitelist a few more commands for disks Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 10/14] sg_io: whitelist a few obsolete commands Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 11/14] sg_io: mark blk_set_cmd_filter_defaults as __init Paolo Bonzini
2013-02-06 15:15 ` [PATCH v2 12/14] sg_io: remove remnants of sysfs SG_IO filters Paolo Bonzini
2013-02-06 15:16 ` [PATCH v2 13/14] sg_io: introduce unpriv_sgio queue flag Paolo Bonzini
2013-02-06 15:16 ` [PATCH v2 14/14] sg_io: use unpriv_sgio to disable whitelisting for scanners Paolo Bonzini
2013-02-13  8:32 ` [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542) Paolo Bonzini
2013-02-13 15:35   ` Douglas Gilbert
2013-02-13 15:48     ` Paolo Bonzini
2013-02-20 16:12 ` Paolo Bonzini
2013-03-22 22:30   ` PING^2 " Paolo Bonzini
2013-04-04 18:18     ` PING^3 " Paolo Bonzini
2013-04-17 12:26       ` PING^4 aka The Jon Corbet Effect " Paolo Bonzini
2013-04-27 13:31         ` PING^5 aka New ways to attract attentions " Paolo Bonzini
2013-05-06 20:43   ` PING^6 " Paolo Bonzini
2013-05-22  6:35 ` PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542)) Paolo Bonzini
2013-05-22  9:32   ` Tejun Heo
2013-05-22  9:53     ` Paolo Bonzini
2013-05-22 10:02       ` Tejun Heo
2013-05-22 10:23         ` Paolo Bonzini
2013-05-22 12:07           ` James Bottomley
2013-05-22 14:07             ` Paolo Bonzini
2013-05-22 16:31               ` Paolo Bonzini
2013-05-22 13:41           ` Tejun Heo
2013-05-22 14:12             ` Paolo Bonzini
2013-05-22 14:30               ` Tejun Heo
2013-05-22 15:00                 ` Paolo Bonzini
2013-05-22 19:30                   ` Tejun Heo
2013-05-22 21:18                     ` Paolo Bonzini
2013-05-22 22:17                       ` Tejun Heo
2013-05-23  0:54                         ` Tejun Heo
2013-05-23  7:45                         ` Paolo Bonzini
2013-05-23  9:02                           ` Tejun Heo
2013-05-23  9:47                             ` Paolo Bonzini
2013-05-24  1:44                               ` Tejun Heo
2013-05-24  7:13                                 ` Paolo Bonzini
2013-05-24  8:02                                   ` Tejun Heo
2013-05-24  8:31                                     ` Paolo Bonzini
2013-05-24  9:07                                       ` Tejun Heo
2013-05-24  9:45                                         ` Paolo Bonzini
2013-05-24 22:20                                           ` Tejun Heo
2013-05-25  4:35                                     ` James Bottomley
2013-05-25  5:27                                       ` Christoph Hellwig
2013-05-25  7:05                                         ` Paolo Bonzini
2013-05-25  7:11                                           ` Christoph Hellwig
2013-05-25  7:21                                             ` Paolo Bonzini
2013-06-21 11:57                                           ` Christoph Hellwig
2013-05-25  8:37                                       ` Tejun Heo
2013-05-25 11:14                                         ` Paolo Bonzini
2013-05-25 12:48                                           ` Tejun Heo
2013-05-25 12:56                                             ` Paolo Bonzini
2013-05-22 15:03               ` Theodore Ts'o
2013-05-22 15:53                 ` Paolo Bonzini
2013-05-22 16:32                   ` Martin K. Petersen
2013-05-22 17:00                     ` Paolo Bonzini
2013-05-22 18:11                       ` Theodore Ts'o
2013-05-22 19:37                         ` Paolo Bonzini
2013-05-22 20:19                           ` Theodore Ts'o
2013-05-22 20:36                             ` Paolo Bonzini
2013-05-25  3:54                     ` Vladislav Bolkhovitin
2013-05-28 20:25                       ` Martin K. Petersen
2013-05-29  6:12                         ` Vladislav Bolkhovitin
2013-05-22 20:39                   ` Tejun Heo
2013-05-22 21:12                     ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).