linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups
@ 2017-08-09 14:11 Benjamin Block
  2017-08-09 14:11 ` [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer Benjamin Block
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

Hello all,

Steffen noticed recently that we have a regression in the BSG code that
prevents us from sending any traffic over this interface. After I
researched this a bit, it turned out that this affects not only zFCP, but
likely all LLDs that implements the BSG API. This was introduced in 4.11
(details in Patch 1 of this series).

I imagine the regression happened because of some very "unfortunate"
variable- namings. I can not fix them, as they involve the base struct
request, but I tried to add some cleanups that should make the
relationships between stuff more visible in the future I hope.

Patch 1   - Regression Fix; Also tagged for stable
Patch 2-6 - Cleanups

I tagged this as RFC. Patches 2-6 are a 'nice to have' IMO, Patch 1 is
obviously necessary, and if it is OK, I can re-send it separately if
necessary. If you don't like the changes in the other patches, I don't mind
dropping them.

I am not sure about Patch 4. It certainly works, but it changes
user-visible behavior, in what I believe is within the behavior described
by the SG interface. It makes the different methods of how BSG passes
commands down to the LLDs more conform with each other - even though I
can't make them the exact same. More details in the patch description.

I rebased the series on Jens' for-next and I have function-tested the
series on s390x's zFCP with the tools provided in the zfcp HBA API library
(https://www.ibm.com/developerworks/linux/linux390/zfcp-hbaapi.html) and
some custom code to test the read/write interface of BSG.

Reviews are more than welcome :)


                                                Beste Grüße / Best regards,
                                                  - Benjamin Block

Benjamin Block (6):
  bsg: fix kernel panic resulting from missing allocation of a
    reply-buffer
  bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE
  bsg: scsi-transport: add compile-tests to prevent reply-buffer
    overflows
  bsg: refactor ioctl to use regular BSG-command infrastructure for
    SG_IO
  bsg: reduce unecessary arguments for bsg_map_hdr()
  bsg: reduce unecessary arguments for blk_complete_sgv4_hdr_rq()

 block/bsg-lib.c                     |  4 +-
 block/bsg.c                         | 90 ++++++++++++++++++++++---------------
 drivers/scsi/scsi_transport_fc.c    |  3 ++
 drivers/scsi/scsi_transport_iscsi.c |  3 ++
 include/linux/bsg-lib.h             |  2 +
 5 files changed, 65 insertions(+), 37 deletions(-)

--
2.12.2

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  9:32   ` Christoph Hellwig
  2017-08-09 14:11 ` [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE Benjamin Block
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

In contrast to the normal SCSI-lib, the BSG block-queue doesn't make use of
any extra init_rq_fn() to make additional allocations during
request-creation, and the request sense-pointer is not used to transport
SCSI sense data, but is used as backing for the bsg_job->reply pointer;
that in turn is used in the LLDs to store protocol IUs or similar stuff.
This 're-purposing' of the sense-pointer is done in the BSG blk-lib
(bsg_create_job()), during the queue-processing.

Failing to allocate/assign it results in illegal dereferences because LLDs
use this pointer unquestioned, as can be seen in the various
BSG-implementations:

drivers/scsi/libfc/fc_lport.c:  fc_lport_bsg_request()
drivers/scsi/qla2xxx/qla_bsg.c: qla24xx_bsg_request()
drivers/scsi/qla4xxx/ql4_bsg.c: qla4xxx_process_vendor_specific()
drivers/s390/scsi/zfcp_fc.c:    zfcp_fc_ct_els_job_handler()
...

An example panic on s390x, using the zFCP driver, looks like this (I had
debugging on, otherwise NULL-pointer dereferences wouldn't even panic on
s390x):

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403
Fault in home space mode while using kernel ASCE.
AS:0000000001590007 R3:0000000000000024
Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: <Long List>
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
task: 0000000065cb0100 task.stack: 0000000065cb4000
Krnl PSW : 0704e00180000000 000003ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp])
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 000000005fa9d0d0 000000005fa9d078 0000000000e16866
           000003ff00000290 6b6b6b6b6b6b6b6b 0000000059f78f00 000000000000000f
           00000000593a0958 00000000593a0958 0000000060d88800 000000005ddd4c38
           0000000058b50100 07000000659cba08 000003ff801e8556 00000000659cb9a8
Krnl Code: 000003ff801e4146: e31020500004        lg      %r1,80(%r2)
           000003ff801e414c: 58402040           l       %r4,64(%r2)
          #000003ff801e4150: e35020200004       lg      %r5,32(%r2)
          >000003ff801e4156: 50405004           st      %r4,4(%r5)
           000003ff801e415a: e54c50080000       mvhi    8(%r5),0
           000003ff801e4160: e33010280012       lt      %r3,40(%r1)
           000003ff801e4166: a718fffb           lhi     %r1,-5
           000003ff801e416a: 1803               lr      %r0,%r3
Call Trace:
([<000003ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp])
 [<000003ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp]
 [<000003ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp]
 [<00000000009b91b6>] qdio_kick_handler+0x2ae/0x2c8
 [<00000000009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10
 [<00000000001684c2>] tasklet_action+0x15a/0x1d8
 [<0000000000bd28ec>] __do_softirq+0x3ec/0x848
 [<00000000001675a4>] irq_exit+0x74/0xf8
 [<000000000010dd6a>] do_IRQ+0xba/0xf0
 [<0000000000bd19e8>] io_int_handler+0x104/0x2d4
 [<00000000001033b6>] enabled_wait+0xb6/0x188
([<000000000010339e>] enabled_wait+0x9e/0x188)
 [<000000000010396a>] arch_cpu_idle+0x32/0x50
 [<0000000000bd0112>] default_idle_call+0x52/0x68
 [<00000000001cd0fa>] do_idle+0x102/0x188
 [<00000000001cd41e>] cpu_startup_entry+0x3e/0x48
 [<0000000000118c64>] smp_start_secondary+0x11c/0x130
 [<0000000000bd2016>] restart_int_handler+0x62/0x78
 [<0000000000000000>]           (null)
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<000003ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp]

Kernel panic - not syncing: Fatal exception in interrupt

To prevent this, allocate a buffer when the BSG blk-request is setup, and
before it is queued for LLD processing.

Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
---
 block/bsg.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 37663b664666..285b1b8126c3 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -74,6 +74,8 @@ static int bsg_major;
 
 static struct kmem_cache *bsg_cmd_cachep;
 
+#define BSG_COMMAND_REPLY_BUFFERSIZE	SCSI_SENSE_BUFFERSIZE
+
 /*
  * our internal command type
  */
@@ -85,6 +87,7 @@ struct bsg_command {
 	struct bio *bidi_bio;
 	int err;
 	struct sg_io_v4 hdr;
+	u8 reply_buffer[BSG_COMMAND_REPLY_BUFFERSIZE];
 };
 
 static void bsg_free_command(struct bsg_command *bc)
@@ -137,7 +140,7 @@ static inline struct hlist_head *bsg_dev_idx_hash(int index)
 
 static int blk_fill_sgv4_hdr_rq(struct request_queue *q, struct request *rq,
 				struct sg_io_v4 *hdr, struct bsg_device *bd,
-				fmode_t has_write_perm)
+				u8 *reply_buffer, fmode_t has_write_perm)
 {
 	struct scsi_request *req = scsi_req(rq);
 
@@ -162,6 +165,10 @@ static int blk_fill_sgv4_hdr_rq(struct request_queue *q, struct request *rq,
 	 */
 	req->cmd_len = hdr->request_len;
 
+	/* this is later asigned to bsg_job as reply */
+	req->sense = reply_buffer;
+	req->sense_len = BSG_COMMAND_REPLY_BUFFERSIZE;
+
 	rq->timeout = msecs_to_jiffies(hdr->timeout);
 	if (!rq->timeout)
 		rq->timeout = q->sg_timeout;
@@ -206,7 +213,8 @@ bsg_validate_sgv4_hdr(struct sg_io_v4 *hdr, int *op)
  * map sg_io_v4 to a request.
  */
 static struct request *
-bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm)
+bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm,
+	    u8 *reply_buffer)
 {
 	struct request_queue *q = bd->queue;
 	struct request *rq, *next_rq = NULL;
@@ -237,7 +245,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm)
 	if (IS_ERR(rq))
 		return rq;
 
-	ret = blk_fill_sgv4_hdr_rq(q, rq, hdr, bd, has_write_perm);
+	ret = blk_fill_sgv4_hdr_rq(q, rq, hdr, bd, reply_buffer,
+				   has_write_perm);
 	if (ret)
 		goto out;
 
@@ -619,7 +628,8 @@ static int __bsg_write(struct bsg_device *bd, const char __user *buf,
 		/*
 		 * get a request, fill in the blanks, and add to request queue
 		 */
-		rq = bsg_map_hdr(bd, &bc->hdr, has_write_perm);
+		rq = bsg_map_hdr(bd, &bc->hdr, has_write_perm,
+				 bc->reply_buffer);
 		if (IS_ERR(rq)) {
 			ret = PTR_ERR(rq);
 			rq = NULL;
@@ -908,6 +918,7 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	}
 	case SG_IO: {
 		struct request *rq;
+		u8 reply_buffer[BSG_COMMAND_REPLY_BUFFERSIZE] = { 0, };
 		struct bio *bio, *bidi_bio = NULL;
 		struct sg_io_v4 hdr;
 		int at_head;
@@ -915,7 +926,8 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		if (copy_from_user(&hdr, uarg, sizeof(hdr)))
 			return -EFAULT;
 
-		rq = bsg_map_hdr(bd, &hdr, file->f_mode & FMODE_WRITE);
+		rq = bsg_map_hdr(bd, &hdr, file->f_mode & FMODE_WRITE,
+				 reply_buffer);
 		if (IS_ERR(rq))
 			return PTR_ERR(rq);
 
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
  2017-08-09 14:11 ` [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  9:32   ` Christoph Hellwig
  2017-08-09 14:11 ` [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows Benjamin Block
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

We do set rq->sense_len when we assigne the reply-buffer in
blk_fill_sgv4_hdr_rq(). No point in possibly deviating from this value
later on.

bsg-lib.h specifies:
    unsigned int reply_len;
    /*
     * On entry : reply_len indicates the buffer size allocated for
     * the reply.
     *
     * ...
     */

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 block/bsg-lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index c4513b23f57a..c7c2c6bbb5ae 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -147,8 +147,8 @@ static int bsg_create_job(struct device *dev, struct request *req)
 	job->request = rq->cmd;
 	job->request_len = rq->cmd_len;
 	job->reply = rq->sense;
-	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
-						 * allocated */
+	job->reply_len = rq->sense_len;
+
 	if (req->bio) {
 		ret = bsg_map_buffer(&job->request_payload, req);
 		if (ret)
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
  2017-08-09 14:11 ` [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer Benjamin Block
  2017-08-09 14:11 ` [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  9:32   ` Christoph Hellwig
  2017-08-09 14:11 ` [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO Benjamin Block
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

The BSG implementations use the bsg_job's reply buffer as storage for their
own custom reply structures (e.g.: struct fc_bsg_reply or
struct iscsi_bsg_reply). The size of bsg_job's reply buffer and those of
the implementations is not dependent in any way the compiler can currently
check.

To make it easier to notice accidental violations add an explicit compile-
time check that tests whether the implementations' reply buffer is at most
as large as bsg_job's.

To do so, we have to move the size-define from bsg.c to a common header.

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 block/bsg.c                         | 3 +--
 drivers/scsi/scsi_transport_fc.c    | 3 +++
 drivers/scsi/scsi_transport_iscsi.c | 3 +++
 include/linux/bsg-lib.h             | 2 ++
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 285b1b8126c3..b924f1c23c58 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -20,6 +20,7 @@
 #include <linux/uio.h>
 #include <linux/idr.h>
 #include <linux/bsg.h>
+#include <linux/bsg-lib.h>
 #include <linux/slab.h>
 
 #include <scsi/scsi.h>
@@ -74,8 +75,6 @@ static int bsg_major;
 
 static struct kmem_cache *bsg_cmd_cachep;
 
-#define BSG_COMMAND_REPLY_BUFFERSIZE	SCSI_SENSE_BUFFERSIZE
-
 /*
  * our internal command type
  */
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 892fbd9800d9..ce6654b5d329 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3736,6 +3736,9 @@ static int fc_bsg_dispatch(struct bsg_job *job)
 {
 	struct Scsi_Host *shost = fc_bsg_to_shost(job);
 
+	BUILD_BUG_ON(sizeof(struct fc_bsg_reply) >
+		     BSG_COMMAND_REPLY_BUFFERSIZE);
+
 	if (scsi_is_fc_rport(job->dev))
 		return fc_bsg_rport_dispatch(shost, job);
 	else
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index a424eaeafeb0..4e021c949ad7 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1483,6 +1483,9 @@ static int iscsi_bsg_host_dispatch(struct bsg_job *job)
 	int cmdlen = sizeof(uint32_t);	/* start with length of msgcode */
 	int ret;
 
+	BUILD_BUG_ON(sizeof(struct iscsi_bsg_reply) >
+		     BSG_COMMAND_REPLY_BUFFERSIZE);
+
 	/* check if we have the msgcode value at least */
 	if (job->request_len < sizeof(uint32_t)) {
 		ret = -ENOMSG;
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index e34dde2da0ef..85d7c7678cc6 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -25,6 +25,8 @@
 
 #include <linux/blkdev.h>
 
+#define BSG_COMMAND_REPLY_BUFFERSIZE	SCSI_SENSE_BUFFERSIZE
+
 struct request;
 struct device;
 struct scatterlist;
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
                   ` (2 preceding siblings ...)
  2017-08-09 14:11 ` [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  8:24   ` Johannes Thumshirn
  2017-08-09 14:11 ` [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr() Benjamin Block
  2017-08-09 14:11 ` [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq() Benjamin Block
  5 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

Before, the SG_IO ioctl for BSG devices used to use its own on-stack data
to assemble and send the specified command. The read and write calls use
their own infrastructure build around the struct bsg_command and a custom
slab-pool for that.

Rafactor this, so that SG_IO ioctl also uses struct bsg_command and the
surrounding infrastructure. This way we use global defines like
BSG_COMMAND_REPLY_BUFFERSIZE only in one place, rather than two, the
handling of BSG commands gets more consistent, and it reduces some code-
duplications (the bio-pointer handling). It also reduces the stack
footprint by 320 to 384 bytes (depending on how large pointers are), and
uses the active slab-implementation for efficient alloc/free.

There are two other side-effects:
 - the 'duration' field in the sg header is now also filled for SG_IO
   calls, unlike before were it was always zero.
 - the BSG device queue-limit is also applied to SG_IO, unlike before were
   you could flood one BSG device with as many commands as you'd like. If
   one can trust older SG documentation this limit is applicable to either
   normal writes, or SG_IO calls; but this wasn't enforced before for
   SG_IO.

A complete unification is not possible, as it then would also enqueue SG_IO
commands in the BGS devices's command list, but this is only for the read-
and write-calls.

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 block/bsg.c | 60 ++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index b924f1c23c58..8517361a9b3f 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -320,6 +320,17 @@ static void bsg_rq_end_io(struct request *rq, blk_status_t status)
 	wake_up(&bd->wq_done);
 }
 
+static int bsg_prep_add_command(struct bsg_command *bc, struct request *rq)
+{
+	bc->rq = rq;
+	bc->bio = rq->bio;
+	if (rq->next_rq)
+		bc->bidi_bio = rq->next_rq->bio;
+	bc->hdr.duration = jiffies;
+
+	return 0 == (bc->hdr.flags & BSG_FLAG_Q_AT_TAIL);
+}
+
 /*
  * do final setup of a 'bc' and submit the matching 'rq' to the block
  * layer for io
@@ -327,16 +338,11 @@ static void bsg_rq_end_io(struct request *rq, blk_status_t status)
 static void bsg_add_command(struct bsg_device *bd, struct request_queue *q,
 			    struct bsg_command *bc, struct request *rq)
 {
-	int at_head = (0 == (bc->hdr.flags & BSG_FLAG_Q_AT_TAIL));
+	int at_head = bsg_prep_add_command(bc, rq);
 
 	/*
 	 * add bc command to busy queue and submit rq for io
 	 */
-	bc->rq = rq;
-	bc->bio = rq->bio;
-	if (rq->next_rq)
-		bc->bidi_bio = rq->next_rq->bio;
-	bc->hdr.duration = jiffies;
 	spin_lock_irq(&bd->lock);
 	list_add_tail(&bc->list, &bd->busy_list);
 	spin_unlock_irq(&bd->lock);
@@ -916,31 +922,37 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		return scsi_cmd_ioctl(bd->queue, NULL, file->f_mode, cmd, uarg);
 	}
 	case SG_IO: {
-		struct request *rq;
-		u8 reply_buffer[BSG_COMMAND_REPLY_BUFFERSIZE] = { 0, };
-		struct bio *bio, *bidi_bio = NULL;
-		struct sg_io_v4 hdr;
+		struct bsg_command *bc;
 		int at_head;
 
-		if (copy_from_user(&hdr, uarg, sizeof(hdr)))
-			return -EFAULT;
+		bc = bsg_alloc_command(bd);
+		if (IS_ERR(bc))
+			return PTR_ERR(bc);
 
-		rq = bsg_map_hdr(bd, &hdr, file->f_mode & FMODE_WRITE,
-				 reply_buffer);
-		if (IS_ERR(rq))
-			return PTR_ERR(rq);
+		if (copy_from_user(&bc->hdr, uarg, sizeof(bc->hdr))) {
+			ret = -EFAULT;
+			goto sg_io_out;
+		}
 
-		bio = rq->bio;
-		if (rq->next_rq)
-			bidi_bio = rq->next_rq->bio;
+		bc->rq = bsg_map_hdr(bd, &bc->hdr, file->f_mode & FMODE_WRITE,
+				     bc->reply_buffer);
+		if (IS_ERR(bc->rq)) {
+			ret = PTR_ERR(bc->rq);
+			goto sg_io_out;
+		}
 
-		at_head = (0 == (hdr.flags & BSG_FLAG_Q_AT_TAIL));
-		blk_execute_rq(bd->queue, NULL, rq, at_head);
-		ret = blk_complete_sgv4_hdr_rq(rq, &hdr, bio, bidi_bio);
+		at_head = bsg_prep_add_command(bc, bc->rq);
+		blk_execute_rq(bd->queue, NULL, bc->rq, at_head);
+		bc->hdr.duration = jiffies_to_msecs(jiffies - bc->hdr.duration);
 
-		if (copy_to_user(uarg, &hdr, sizeof(hdr)))
-			return -EFAULT;
+		ret = blk_complete_sgv4_hdr_rq(bc->rq, &bc->hdr, bc->bio,
+					       bc->bidi_bio);
 
+		if (copy_to_user(uarg, &bc->hdr, sizeof(bc->hdr)))
+			ret = -EFAULT;
+
+	sg_io_out:
+		bsg_free_command(bc);
 		return ret;
 	}
 	/*
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr()
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
                   ` (3 preceding siblings ...)
  2017-08-09 14:11 ` [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  8:26   ` Johannes Thumshirn
  2017-08-10  9:35   ` Christoph Hellwig
  2017-08-09 14:11 ` [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq() Benjamin Block
  5 siblings, 2 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

Since struct bsg_command is now used in every calling case, we don't
need separation of arguments anymore that are contained in the same
bsg_command.

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 block/bsg.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 8517361a9b3f..6ee2ffca808a 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -212,11 +212,12 @@ bsg_validate_sgv4_hdr(struct sg_io_v4 *hdr, int *op)
  * map sg_io_v4 to a request.
  */
 static struct request *
-bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm,
-	    u8 *reply_buffer)
+bsg_map_hdr(struct bsg_device *bd, struct bsg_command *bc,
+	    fmode_t has_write_perm)
 {
 	struct request_queue *q = bd->queue;
 	struct request *rq, *next_rq = NULL;
+	struct sg_io_v4 *hdr = &bc->hdr;
 	int ret;
 	unsigned int op, dxfer_len;
 	void __user *dxferp = NULL;
@@ -244,7 +245,7 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm,
 	if (IS_ERR(rq))
 		return rq;
 
-	ret = blk_fill_sgv4_hdr_rq(q, rq, hdr, bd, reply_buffer,
+	ret = blk_fill_sgv4_hdr_rq(q, rq, hdr, bd, bc->reply_buffer,
 				   has_write_perm);
 	if (ret)
 		goto out;
@@ -633,8 +634,7 @@ static int __bsg_write(struct bsg_device *bd, const char __user *buf,
 		/*
 		 * get a request, fill in the blanks, and add to request queue
 		 */
-		rq = bsg_map_hdr(bd, &bc->hdr, has_write_perm,
-				 bc->reply_buffer);
+		rq = bsg_map_hdr(bd, bc, has_write_perm);
 		if (IS_ERR(rq)) {
 			ret = PTR_ERR(rq);
 			rq = NULL;
@@ -934,8 +934,7 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 			goto sg_io_out;
 		}
 
-		bc->rq = bsg_map_hdr(bd, &bc->hdr, file->f_mode & FMODE_WRITE,
-				     bc->reply_buffer);
+		bc->rq = bsg_map_hdr(bd, bc, file->f_mode & FMODE_WRITE);
 		if (IS_ERR(bc->rq)) {
 			ret = PTR_ERR(bc->rq);
 			goto sg_io_out;
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq()
  2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
                   ` (4 preceding siblings ...)
  2017-08-09 14:11 ` [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr() Benjamin Block
@ 2017-08-09 14:11 ` Benjamin Block
  2017-08-10  8:27   ` Johannes Thumshirn
  2017-08-10  9:35   ` Christoph Hellwig
  5 siblings, 2 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-09 14:11 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen, Jens Axboe
  Cc: Benjamin Block, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Christoph Hellwig, Steffen Maier, open-iscsi

Since struct bsg_command is now used in every calling case, we don't
need separation of arguments anymore that are contained in the same
bsg_command.

Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
---
 block/bsg.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/block/bsg.c b/block/bsg.c
index 6ee2ffca808a..09f767cdf816 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -399,13 +399,14 @@ static struct bsg_command *bsg_get_done_cmd(struct bsg_device *bd)
 	return bc;
 }
 
-static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
-				    struct bio *bio, struct bio *bidi_bio)
+static int blk_complete_sgv4_hdr_rq(struct bsg_command *bc)
 {
+	struct sg_io_v4 *hdr = &bc->hdr;
+	struct request *rq = bc->rq;
 	struct scsi_request *req = scsi_req(rq);
 	int ret = 0;
 
-	dprintk("rq %p bio %p 0x%x\n", rq, bio, req->result);
+	dprintk("rq %p bio %p 0x%x\n", rq, bc->bio, req->result);
 	/*
 	 * fill in all the output members
 	 */
@@ -432,7 +433,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
 	if (rq->next_rq) {
 		hdr->dout_resid = req->resid_len;
 		hdr->din_resid = scsi_req(rq->next_rq)->resid_len;
-		blk_rq_unmap_user(bidi_bio);
+		blk_rq_unmap_user(bc->bidi_bio);
 		blk_put_request(rq->next_rq);
 	} else if (rq_data_dir(rq) == READ)
 		hdr->din_resid = req->resid_len;
@@ -448,7 +449,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
 	if (!ret && req->result < 0)
 		ret = req->result;
 
-	blk_rq_unmap_user(bio);
+	blk_rq_unmap_user(bc->bio);
 	scsi_req_free_cmd(req);
 	blk_put_request(rq);
 
@@ -507,8 +508,7 @@ static int bsg_complete_all_commands(struct bsg_device *bd)
 		if (IS_ERR(bc))
 			break;
 
-		tret = blk_complete_sgv4_hdr_rq(bc->rq, &bc->hdr, bc->bio,
-						bc->bidi_bio);
+		tret = blk_complete_sgv4_hdr_rq(bc);
 		if (!ret)
 			ret = tret;
 
@@ -542,8 +542,7 @@ __bsg_read(char __user *buf, size_t count, struct bsg_device *bd,
 		 * after completing the request. so do that here,
 		 * bsg_complete_work() cannot do that for us
 		 */
-		ret = blk_complete_sgv4_hdr_rq(bc->rq, &bc->hdr, bc->bio,
-					       bc->bidi_bio);
+		ret = blk_complete_sgv4_hdr_rq(bc);
 
 		if (copy_to_user(buf, &bc->hdr, sizeof(bc->hdr)))
 			ret = -EFAULT;
@@ -944,8 +943,7 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		blk_execute_rq(bd->queue, NULL, bc->rq, at_head);
 		bc->hdr.duration = jiffies_to_msecs(jiffies - bc->hdr.duration);
 
-		ret = blk_complete_sgv4_hdr_rq(bc->rq, &bc->hdr, bc->bio,
-					       bc->bidi_bio);
+		ret = blk_complete_sgv4_hdr_rq(bc);
 
 		if (copy_to_user(uarg, &bc->hdr, sizeof(bc->hdr)))
 			ret = -EFAULT;
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO
  2017-08-09 14:11 ` [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO Benjamin Block
@ 2017-08-10  8:24   ` Johannes Thumshirn
  2017-08-10  9:34     ` Christoph Hellwig
  2017-08-10 22:12     ` Benjamin Block
  0 siblings, 2 replies; 31+ messages in thread
From: Johannes Thumshirn @ 2017-08-10  8:24 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Christoph Hellwig,
	Steffen Maier, open-iscsi

On Wed, Aug 09, 2017 at 04:11:18PM +0200, Benjamin Block wrote:
> +	return 0 == (bc->hdr.flags & BSG_FLAG_Q_AT_TAIL);

return !(bc->hdr.flags & BSG_FLAG_Q_AT_TAIL); and make the function return
bool? I have to admit, this is the 1st time I have seen the above construct.

Thanks,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr()
  2017-08-09 14:11 ` [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr() Benjamin Block
@ 2017-08-10  8:26   ` Johannes Thumshirn
  2017-08-10  9:35   ` Christoph Hellwig
  1 sibling, 0 replies; 31+ messages in thread
From: Johannes Thumshirn @ 2017-08-10  8:26 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Christoph Hellwig,
	Steffen Maier, open-iscsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq()
  2017-08-09 14:11 ` [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq() Benjamin Block
@ 2017-08-10  8:27   ` Johannes Thumshirn
  2017-08-10  9:35   ` Christoph Hellwig
  1 sibling, 0 replies; 31+ messages in thread
From: Johannes Thumshirn @ 2017-08-10  8:27 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Christoph Hellwig,
	Steffen Maier, open-iscsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-09 14:11 ` [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer Benjamin Block
@ 2017-08-10  9:32   ` Christoph Hellwig
  2017-08-10 22:10     ` Benjamin Block
  0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:32 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Christoph Hellwig, Steffen Maier, open-iscsi

We can't use an on-stack buffer for the sense data, as drivers will
dma to it.  So we should reuse the SCSI init_rq_fn() for the BSG
queues and/or implement the same scheme.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE
  2017-08-09 14:11 ` [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE Benjamin Block
@ 2017-08-10  9:32   ` Christoph Hellwig
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:32 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Christoph Hellwig, Steffen Maier, open-iscsi

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows
  2017-08-09 14:11 ` [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows Benjamin Block
@ 2017-08-10  9:32   ` Christoph Hellwig
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:32 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Christoph Hellwig, Steffen Maier, open-iscsi

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO
  2017-08-10  8:24   ` Johannes Thumshirn
@ 2017-08-10  9:34     ` Christoph Hellwig
  2017-08-10 22:12     ` Benjamin Block
  1 sibling, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:34 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Benjamin Block, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Christoph Hellwig, Steffen Maier, open-iscsi

On Thu, Aug 10, 2017 at 10:24:56AM +0200, Johannes Thumshirn wrote:
> On Wed, Aug 09, 2017 at 04:11:18PM +0200, Benjamin Block wrote:
> > +	return 0 == (bc->hdr.flags & BSG_FLAG_Q_AT_TAIL);
> 
> return !(bc->hdr.flags & BSG_FLAG_Q_AT_TAIL); and make the function return
> bool? I have to admit, this is the 1st time I have seen the above construct.

It's a somewhat odd style.  I agree with your comment, but otherwise
the patch looks ok to me.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr()
  2017-08-09 14:11 ` [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr() Benjamin Block
  2017-08-10  8:26   ` Johannes Thumshirn
@ 2017-08-10  9:35   ` Christoph Hellwig
  2017-08-10 22:19     ` Benjamin Block
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:35 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Christoph Hellwig, Steffen Maier, open-iscsi

On Wed, Aug 09, 2017 at 04:11:19PM +0200, Benjamin Block wrote:
> Since struct bsg_command is now used in every calling case, we don't
> need separation of arguments anymore that are contained in the same
> bsg_command.
> 
> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
> ---
>  block/bsg.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/block/bsg.c b/block/bsg.c
> index 8517361a9b3f..6ee2ffca808a 100644
> --- a/block/bsg.c
> +++ b/block/bsg.c
> @@ -212,11 +212,12 @@ bsg_validate_sgv4_hdr(struct sg_io_v4 *hdr, int *op)
>   * map sg_io_v4 to a request.
>   */
>  static struct request *
> -bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm,
> -	    u8 *reply_buffer)
> +bsg_map_hdr(struct bsg_device *bd, struct bsg_command *bc,
> +	    fmode_t has_write_perm)

I wish we could just rename the argument to mode and pass on the
whole file->f_mode while you are cleaning up this code.  That should
be a separate patch, though.

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq()
  2017-08-09 14:11 ` [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq() Benjamin Block
  2017-08-10  8:27   ` Johannes Thumshirn
@ 2017-08-10  9:35   ` Christoph Hellwig
  1 sibling, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:35 UTC (permalink / raw)
  To: Benjamin Block
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Christoph Hellwig, Steffen Maier, open-iscsi

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-10  9:32   ` Christoph Hellwig
@ 2017-08-10 22:10     ` Benjamin Block
  2017-08-10 22:45       ` Benjamin Block
  2017-08-11  8:38       ` Christoph Hellwig
  0 siblings, 2 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-10 22:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Thu, Aug 10, 2017 at 11:32:17AM +0200, Christoph Hellwig wrote:
> We can't use an on-stack buffer for the sense data, as drivers will
> dma to it.  So we should reuse the SCSI init_rq_fn() for the BSG
> queues and/or implement the same scheme.
> 

BSG is odd in this regard. Here is the data model as far as I understood
it from reading the source.

The user of the interface provides his input via a sg_io_v4 object.

 struct sg_io_v4
 +--------------+
 |              |  
 | request-------->+----------------------------+
 |   + _len     |  |                            |
 |   (A)        |  | BSG Request                |
 |              |  | e.g. struct fc_bsg_request | Depends on BSG implementation
 |              |  |                            | FC vs. iSCSI vs. ...
 |              |  +----------------------------+
 |              |
 | response------->+----------------------------+ Used as _Output_
 |   + max_len  |  |                            | User doesn't initialize
 |   (B)        |  | BSG Reply                  | User provides (optional)
 |              |  | e.g. struct fc_bsg_reply   |   memory; May be NULL.
 |              |  |                            |
 |              |  +----------------------------+
 |              |
 | dout_xferp----->+-----------------------+ Stuff send on the wire by
 |   + _len     |  |                       |   the LLD
 |   (C)        |  | Transport Protocol IU | Aligned on PAGE_SIZE
 |              |  | e.g. FC-GS-7 CT_IU    |
 |              |  |                       |
 |              |  +-----------------------+
 |              |
 | din_xferp------>+-----------------------+ Buffer for response data by
 |   + _len     |  |                       |   the LLD
 |   (D)        |  | Transport Protocol IU | Aligned on PAGE_SIZE
 |              |  | e.g. FC-GS-7 CT_IU    |
 |              |  |                       |
 |              |  +-----------------------+
 +--------------+

This is just a part of it, but the most important for this issue. The
BSG driver then encapsulates this into two linked block-requests he
passes down to the LLDs. The first block-request (E) is for the Request
Data, and the second request (F) for the Response Data. (F) is optional,
depending on the whether the user passes both dout_xferp and din_xferp.

 struct request (E)
 +--------------+
 |              |  struct scsi_request
 | scsi_request--->+-----------------+
 |              |  |                 |
 |              |  | cmd---------------------> Copy of (A)
 |              |  |  + _len         |         Space in struct or kzalloc
 |              |  |  (G)            |
 |              |  |                 |
 |              |  | sense-------------------> Space for BSG Reply
 |              |  |  + _len         |         Same Data-Structure as (B)
 |              |  |  (H)            |         NOT actually pointer (B)
 |              |  |                 |         'reply_buffer' in my patch 
 |              |  +-----------------+
 |              |
 | bio------------> Mapped via blk_rq_map_user() to (C) dout_xferp
 |              |
 | next_rq---------+
 |              |  |
 +--------------+  |
                   |
 struct request (F)|(if used)
 +--------------+<-+
 |              |
 | scsi_request---> Unused here
 |              |
 | bio------------> Mapped via blk_rq_map_user() to (D) din_xferp
 |              |
 +--------------+

This is enqueued in the (legacy) blk-queue. In bsg-lib.c this is
processed and encapsulated into an other data-structure struct bsg_job

 struct bsg_job
 +-----------------+
 |                 |
 | request-----------> (G) scsi_request->cmd -> Copy of (A)
 |   + _len        |       e.g. struct fc_bsg_request
 |                 |
 | reply-------------> (H) scsi_request->sense -> 'reply_buffer'
 |   + _len        |       e.g. struct fc_bsg_reply
 |                 |
 | request_payload---> struct scatterlist ... map (E)->bio
 |   + _len        |
 |   (I)           |
 |                 |
 | reply_payload-----> struct scatterlist ... map (F)->bio
 |   + _len        |
 |   (J)           |
 |                 |
 +-----------------+

This then is finally what the LLD gets to work with, with the callback
from the request-queue. Depending on contents of (G) the LLD gets to
decide whatever the user-space wants him to do. This depends highly on
the transport.

In case of zFCP we map (I) and (J) directly into the ring that passes
the data to our hardware and the one that the hardware uses to pass back
the responses.

(This is actually pretty cool if you think about it. Apart from the copy
we make of (A) into (G), this whole send was completely 'zero-copy'.
Depending on the hardware it can directly DMA onto the wire... I guess
most modern cards can do such a thing.)

When the answer is coming back, the payload data is expected to be put
into (J). If your HW can DMA into this, no more effort.

Again, depending on (H), the LLD fills in some information for
accounting and such. In case of FC, there is also some
protocol-information, but this is by no means a standard IU.

This is then passed up the stack pretty quick and if the user passed
something with (B), data from (H) is copied into (B). But this is
optional. The main payload is transferred to the user via (J) which is a
remap of (D).

So this is my understanding here. (I don't wanna say that this is
necessarily completely correct ;-), pleas correct me, if I am wrong. The
lack of proper documentation is also not helping at all.)

This worked till it broke. Right now every driver that tries to access
(H) will panic the system, or cause very undefined behavior. I suspect
no driver right now tries to do any DMA into (H); before the regression,
this has been also an on-stack variable (I suspect since BSG was
introduced, haven't checked though).

The asymmetries between the first struct request (E) and the following
(F) also makes it hard to use the same scheme as in other drivers, where
init_rq_fn() gets to initialize each request in the same'ish way. Or?
Just looking at it right now, this would require some bigger rework that
is not appropriate for a stable bug-fix.

I think it would be best if we fix the possible panics every user of
this is gonna experience and after that we can still think about
improving it further beyond what the rest of the patch set already does,
if that is necessary.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
                  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO
  2017-08-10  8:24   ` Johannes Thumshirn
  2017-08-10  9:34     ` Christoph Hellwig
@ 2017-08-10 22:12     ` Benjamin Block
  1 sibling, 0 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-10 22:12 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Christoph Hellwig,
	Steffen Maier, open-iscsi

On Thu, Aug 10, 2017 at 10:24:56AM +0200, Johannes Thumshirn wrote:
> On Wed, Aug 09, 2017 at 04:11:18PM +0200, Benjamin Block wrote:
> > +	return 0 == (bc->hdr.flags & BSG_FLAG_Q_AT_TAIL);
> 
> return !(bc->hdr.flags & BSG_FLAG_Q_AT_TAIL); and make the function return
> bool? I have to admit, this is the 1st time I have seen the above construct.
> 

Hmm yeah, odd. To be honest, I just copied the expression so that it is
obvious that no behavior changed, but just the place. I'll replace it
with what you suggest.



                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr()
  2017-08-10  9:35   ` Christoph Hellwig
@ 2017-08-10 22:19     ` Benjamin Block
  0 siblings, 0 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-10 22:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Thu, Aug 10, 2017 at 11:35:31AM +0200, Christoph Hellwig wrote:
> On Wed, Aug 09, 2017 at 04:11:19PM +0200, Benjamin Block wrote:
> > Since struct bsg_command is now used in every calling case, we don't
> > need separation of arguments anymore that are contained in the same
> > bsg_command.
> > 
> > Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
> > ---
> >  block/bsg.c | 13 ++++++-------
> >  1 file changed, 6 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/bsg.c b/block/bsg.c
> > index 8517361a9b3f..6ee2ffca808a 100644
> > --- a/block/bsg.c
> > +++ b/block/bsg.c
> > @@ -212,11 +212,12 @@ bsg_validate_sgv4_hdr(struct sg_io_v4 *hdr, int *op)
> >   * map sg_io_v4 to a request.
> >   */
> >  static struct request *
> > -bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t has_write_perm,
> > -	    u8 *reply_buffer)
> > +bsg_map_hdr(struct bsg_device *bd, struct bsg_command *bc,
> > +	    fmode_t has_write_perm)
> 
> I wish we could just rename the argument to mode and pass on the
> whole file->f_mode while you are cleaning up this code.  That should
> be a separate patch, though.
> 

Hmm, I did a quick pass through the code and the only place this seems
to be used, is to pass it to blk_verify_command() if the subcommand used
in the BSG request is a SCSI Command. And this has the same semantics.
So I guess this would require adjustments to the whole stack, as this is
also used from the 'normal' SG side of the world.



                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-10 22:10     ` Benjamin Block
@ 2017-08-10 22:45       ` Benjamin Block
  2017-08-11  8:38       ` Christoph Hellwig
  1 sibling, 0 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-10 22:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 12:10:38AM +0200, Benjamin Block wrote:
> On Thu, Aug 10, 2017 at 11:32:17AM +0200, Christoph Hellwig wrote:
> > We can't use an on-stack buffer for the sense data, as drivers will
> > dma to it.  So we should reuse the SCSI init_rq_fn() for the BSG
> > queues and/or implement the same scheme.
> > 
> 
...
> 
>  struct sg_io_v4
>  +--------------+
>  |              |  
>  | request-------->+----------------------------+
>  |   + _len     |  |                            |
>  |   (A)        |  | BSG Request                |
>  |              |  | e.g. struct fc_bsg_request | Depends on BSG implementation
>  |              |  |                            | FC vs. iSCSI vs. ...
>  |              |  +----------------------------+
>  |              |
>  | response------->+----------------------------+ Used as _Output_
>  |   + max_len  |  |                            | User doesn't initialize
>  |   (B)        |  | BSG Reply                  | User provides (optional)
>  |              |  | e.g. struct fc_bsg_reply   |   memory; May be NULL.
>  |              |  |                            |
>  |              |  +----------------------------+
>  |              |
>  | dout_xferp----->+-----------------------+ Stuff send on the wire by
>  |   + _len     |  |                       |   the LLD
>  |   (C)        |  | Transport Protocol IU | Aligned on PAGE_SIZE
>  |              |  | e.g. FC-GS-7 CT_IU    |
>  |              |  |                       |
>  |              |  +-----------------------+
>  |              |
>  | din_xferp------>+-----------------------+ Buffer for response data by
>  |   + _len     |  |                       |   the LLD
>  |   (D)        |  | Transport Protocol IU | Aligned on PAGE_SIZE
>  |              |  | e.g. FC-GS-7 CT_IU    |
>  |              |  |                       |
>  |              |  +-----------------------+
>  +--------------+
> 
...
> 
>  struct request (E)
>  +--------------+
>  |              |  struct scsi_request
>  | scsi_request--->+-----------------+
>  |              |  |                 |
>  |              |  | cmd---------------------> Copy of (A)
>  |              |  |  + _len         |         Space in struct or kzalloc
>  |              |  |  (G)            |
>  |              |  |                 |
>  |              |  | sense-------------------> Space for BSG Reply
>  |              |  |  + _len         |         Same Data-Structure as (B)
>  |              |  |  (H)            |         NOT actually pointer (B)
>  |              |  |                 |         'reply_buffer' in my patch 
>  |              |  +-----------------+
>  |              |
>  | bio------------> Mapped via blk_rq_map_user() to (C) dout_xferp
>  |              |
>  | next_rq---------+
>  |              |  |
>  +--------------+  |
>                    |
>  struct request (F)|(if used)
>  +--------------+<-+
>  |              |
>  | scsi_request---> Unused here
>  |              |
>  | bio------------> Mapped via blk_rq_map_user() to (D) din_xferp
>  |              |
>  +--------------+
> 
...
> 
>  struct bsg_job
>  +-----------------+
>  |                 |
>  | request-----------> (G) scsi_request->cmd -> Copy of (A)
>  |   + _len        |       e.g. struct fc_bsg_request
>  |                 |
>  | reply-------------> (H) scsi_request->sense -> 'reply_buffer'
>  |   + _len        |       e.g. struct fc_bsg_reply
>  |                 |
>  | request_payload---> struct scatterlist ... map (E)->bio
>  |   + _len        |
>  |   (I)           |
>  |                 |
>  | reply_payload-----> struct scatterlist ... map (F)->bio
>  |   + _len        |
>  |   (J)           |
>  |                 |
>  +-----------------+
> 
....
> 
> This worked till it broke. Right now every driver that tries to access
> (H) will panic the system, or cause very undefined behavior. I suspect
> no driver right now tries to do any DMA into (H); before the regression,
> this has been also an on-stack variable (I suspect since BSG was
> introduced, haven't checked though).
> 
> The asymmetries between the first struct request (E) and the following
> (F) also makes it hard to use the same scheme as in other drivers, where
> init_rq_fn() gets to initialize each request in the same'ish way. Or?
> Just looking at it right now, this would require some bigger rework that
> is not appropriate for a stable bug-fix.
> 

Just some more brain-dump here.

One more problem for direct DMA into (H) in the current BSG setup is
probably, that the transport classes have each their own private format
for the BSG reply (struct fc_bsg_reply and struct iscsi_bsg_reply right
now I think). The current stack doesn't take any precaution to properly
align this in accords to what the LLDs specifies for the blk-layer... so
lets assume struct fc_bsg_reply. This has fields for actual protocol IUs
(in contrast to iSCSI, where it only has some vendor-reply buffer [an
array with 0 length...]), but they start after some BSG meta-data that
are non-standard. We would have to rewrite that to allow mapping the
start of the protocol IUs in accords to the expectations the LLDs have..
page-aligned and such things. Otherwise we would break whatever handles
the meta-data the LLD pass up the stack - added that this is actually
user visible data, passed back via struct sg_io_v4.

This could be something of a new feature I guess, it would be an
improvement in terms that it could reduce copies even more, but w/o
further research I guess it is a bit more work.



                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-10 22:10     ` Benjamin Block
  2017-08-10 22:45       ` Benjamin Block
@ 2017-08-11  8:38       ` Christoph Hellwig
  2017-08-11  9:14         ` Christoph Hellwig
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-11  8:38 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

My point was that we now gurantee that that the sense data is not
a stack pointer an a driver can DMA to it.  Now for BSG the sense
data is "just" abused as reply, but the point still stands - we
don't want to pass a possible stack pointer to drivers in a data
buffer because we want to allow DMA to it.

That being said with your patch 4 that becomes a moot point as we'll
now always dynamically allocate it.  So maybe just reorder that to go
first and we should be fine.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11  8:38       ` Christoph Hellwig
@ 2017-08-11  9:14         ` Christoph Hellwig
  2017-08-11 13:49           ` Benjamin Block
  0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-11  9:14 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

But patch 1 still creates an additional copy of the sense data for
all bsg users.

Can you test the patch below which implements my suggestion?  Your
other patches should still apply fine on top modulo minor context
changes.

---
>From 4cd32ee48e334b62b55bff0d380833b978454040 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 11 Aug 2017 11:03:29 +0200
Subject: bsg-lib: allocate sense data for each request

Since we split the scsi_request out of the request the driver is supposed
to provide storage for the sense buffer.  The bsg-lib code failed to do so,
though and will crash anytime it is used.

This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
and allocate the sense data, which is used as reply buffer in bsg.

Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
---
 block/bsg-lib.c         | 53 +++++++++++++++++++++++++++----------------------
 include/linux/blkdev.h  |  1 -
 include/linux/bsg-lib.h |  2 ++
 3 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index c4513b23f57a..215893dbd038 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -100,7 +100,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
  */
 static void bsg_softirq_done(struct request *rq)
 {
-	struct bsg_job *job = rq->special;
+	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
 
 	bsg_job_put(job);
 }
@@ -129,26 +129,9 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
 static int bsg_create_job(struct device *dev, struct request *req)
 {
 	struct request *rsp = req->next_rq;
-	struct request_queue *q = req->q;
-	struct scsi_request *rq = scsi_req(req);
-	struct bsg_job *job;
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 	int ret;
 
-	BUG_ON(req->special);
-
-	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
-	if (!job)
-		return -ENOMEM;
-
-	req->special = job;
-	job->req = req;
-	if (q->bsg_job_size)
-		job->dd_data = (void *)&job[1];
-	job->request = rq->cmd;
-	job->request_len = rq->cmd_len;
-	job->reply = rq->sense;
-	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
-						 * allocated */
 	if (req->bio) {
 		ret = bsg_map_buffer(&job->request_payload, req);
 		if (ret)
@@ -187,7 +170,6 @@ static void bsg_request_fn(struct request_queue *q)
 {
 	struct device *dev = q->queuedata;
 	struct request *req;
-	struct bsg_job *job;
 	int ret;
 
 	if (!get_device(dev))
@@ -207,8 +189,7 @@ static void bsg_request_fn(struct request_queue *q)
 			continue;
 		}
 
-		job = req->special;
-		ret = q->bsg_job_fn(job);
+		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
 		spin_lock_irq(q->queue_lock);
 		if (ret)
 			break;
@@ -219,6 +200,29 @@ static void bsg_request_fn(struct request_queue *q)
 	spin_lock_irq(q->queue_lock);
 }
 
+static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	memset(job, 0, sizeof(*job));
+	job->req = req;
+	job->dd_data = job + 1;
+	job->request = job->sreq.cmd;
+	job->request_len = job->sreq.cmd_len;
+	job->reply_len = SCSI_SENSE_BUFFERSIZE;
+	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
+	if (!job->reply)
+		return -ENOMEM;
+	return 0;
+}
+
+static void bsg_exit_rq(struct request_queue *q, struct request *req)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	kfree(job->reply);
+}
+
 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
@@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 	q = blk_alloc_queue(GFP_KERNEL);
 	if (!q)
 		return ERR_PTR(-ENOMEM);
-	q->cmd_size = sizeof(struct scsi_request);
+	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
+	q->init_rq_fn = bsg_init_rq;
+	q->exit_rq_fn = bsg_exit_rq;
 	q->request_fn = bsg_request_fn;
 
 	ret = blk_init_allocated_queue(q);
@@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 		goto out_cleanup_queue;
 
 	q->queuedata = dev;
-	q->bsg_job_size = dd_job_size;
 	q->bsg_job_fn = job_fn;
 	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
 	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f45f157b2910..6ae9aa6f93f0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -568,7 +568,6 @@ struct request_queue {
 
 #if defined(CONFIG_BLK_DEV_BSG)
 	bsg_job_fn		*bsg_job_fn;
-	int			bsg_job_size;
 	struct bsg_class_device bsg_dev;
 #endif
 
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index e34dde2da0ef..637a20cfb237 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -24,6 +24,7 @@
 #define _BLK_BSG_
 
 #include <linux/blkdev.h>
+#include <scsi/scsi_request.h>
 
 struct request;
 struct device;
@@ -37,6 +38,7 @@ struct bsg_buffer {
 };
 
 struct bsg_job {
+	struct scsi_request sreq;
 	struct device *dev;
 	struct request *req;
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11  9:14         ` Christoph Hellwig
@ 2017-08-11 13:49           ` Benjamin Block
  2017-08-11 14:36             ` Christoph Hellwig
  0 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-11 13:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 11:14:15AM +0200, Christoph Hellwig wrote:
> But patch 1 still creates an additional copy of the sense data for
> all bsg users.
>

Huh? What additional copy? There is one reply-buffer and that is copied
into the user-buffer should it contain valid data. Just like in your
patch, neither you, nor me touches any of the copy-code. There is also
no changes to how the driver get their data into that buffer, it will
still be copied in both cases.

> 
> Can you test the patch below which implements my suggestion?  Your
> other patches should still apply fine on top modulo minor context
> changes.

Only your patch on top of 4.13-rc4. din_xferp (D) is also empty, which is
not taken from the sense-buffer.

=============================================================================
BUG kmalloc-1024 (Not tainted): Invalid object pointer 0x000000004ad9e0f0
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Slab 0x000003d1012b6600 objects=24 used=11 fp=0x000000004ad9da58 flags=0x3fffc0000008101
CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G    B           4.13.0-rc4-bsg-regression+ #2
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<0000000000117532>] show_stack+0x8a/0xe0)
 [<0000000000bcbaee>] dump_stack+0x96/0xd8 
 [<00000000003cd5fc>] slab_err+0xac/0xc0 
 [<00000000003d68e4>] free_debug_processing+0x554/0x570 
 [<00000000003d69ae>] __slab_free+0xae/0x618 
 [<00000000003d7dce>] kfree+0x44e/0x4a0 
 [<0000000000859b4e>] blk_done_softirq+0x146/0x160 
 [<0000000000bf4ec0>] __do_softirq+0x3d0/0x840 
 [<00000000001662a6>] run_ksoftirqd+0x3e/0xb8 
 [<00000000001957fc>] smpboot_thread_fn+0x2f4/0x318 
 [<000000000018f6f6>] kthread+0x166/0x178 
 [<0000000000bf3cf2>] kernel_thread_starter+0x6/0xc 
 [<0000000000bf3cec>] kernel_thread_starter+0x0/0xc 
INFO: lockdep is turned off.
FIX kmalloc-1024: Object at 0x000000004ad9e0f0 not freed
=============================================================================
BUG kmalloc-1024 (Tainted: G    B          ): Invalid object pointer 0x000000004ad9f630
-----------------------------------------------------------------------------

INFO: Slab 0x000003d1012b6600 objects=24 used=11 fp=0x000000004ad98558 flags=0x3fffc0000008101
CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G    B           4.13.0-rc4-bsg-regression+ #2
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<0000000000117532>] show_stack+0x8a/0xe0)
 [<0000000000bcbaee>] dump_stack+0x96/0xd8 
 [<00000000003cd5fc>] slab_err+0xac/0xc0 
 [<00000000003d68e4>] free_debug_processing+0x554/0x570 
 [<00000000003d69ae>] __slab_free+0xae/0x618 
 [<00000000003d7dce>] kfree+0x44e/0x4a0 
 [<0000000000859b4e>] blk_done_softirq+0x146/0x160 
 [<0000000000bf4ec0>] __do_softirq+0x3d0/0x840 
 [<00000000001662a6>] run_ksoftirqd+0x3e/0xb8 
 [<00000000001957fc>] smpboot_thread_fn+0x2f4/0x318 
 [<000000000018f6f6>] kthread+0x166/0x178 
 [<0000000000bf3cf2>] kernel_thread_starter+0x6/0xc 
 [<0000000000bf3cec>] kernel_thread_starter+0x0/0xc 
INFO: lockdep is turned off.
FIX kmalloc-1024: Object at 0x000000004ad9f630 not freed
=============================================================================
BUG kmalloc-1024 (Tainted: G    B          ): Invalid object pointer 0x000000004ad986a0
-----------------------------------------------------------------------------

INFO: Slab 0x000003d1012b6600 objects=24 used=13 fp=0x000000004ad9d508 flags=0x3fffc0000008101
CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G    B           4.13.0-rc4-bsg-regression+ #2
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<0000000000117532>] show_stack+0x8a/0xe0)
 [<0000000000bcbaee>] dump_stack+0x96/0xd8 
 [<00000000003cd5fc>] slab_err+0xac/0xc0 
 [<00000000003d68e4>] free_debug_processing+0x554/0x570 
 [<00000000003d69ae>] __slab_free+0xae/0x618 
 [<00000000003d7dce>] kfree+0x44e/0x4a0
 [<0000000000859b4e>] blk_done_softirq+0x146/0x160
 [<0000000000bf4ec0>] __do_softirq+0x3d0/0x840
 [<00000000001662a6>] run_ksoftirqd+0x3e/0xb8
 [<00000000001957fc>] smpboot_thread_fn+0x2f4/0x318
 [<000000000018f6f6>] kthread+0x166/0x178
 [<0000000000bf3cf2>] kernel_thread_starter+0x6/0xc
 [<0000000000bf3cec>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
FIX kmalloc-1024: Object at 0x000000004ad986a0 not freed
=============================================================================
BUG kmalloc-1024 (Tainted: G    B          ): Invalid object pointer 0x000000004ad9d650
-----------------------------------------------------------------------------

INFO: Slab 0x000003d1012b6600 objects=24 used=15 fp=0x000000004ad9ea48 flags=0x3fffc0000008101
CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G    B           4.13.0-rc4-bsg-regression+ #2
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<0000000000117532>] show_stack+0x8a/0xe0)
 [<0000000000bcbaee>] dump_stack+0x96/0xd8
 [<00000000003cd5fc>] slab_err+0xac/0xc0
 [<00000000003d68e4>] free_debug_processing+0x554/0x570
 [<00000000003d69ae>] __slab_free+0xae/0x618
 [<00000000003d7dce>] kfree+0x44e/0x4a0
 [<0000000000859b4e>] blk_done_softirq+0x146/0x160
 [<0000000000bf4ec0>] __do_softirq+0x3d0/0x840
 [<00000000001662a6>] run_ksoftirqd+0x3e/0xb8
 [<00000000001957fc>] smpboot_thread_fn+0x2f4/0x318
 [<000000000018f6f6>] kthread+0x166/0x178
 [<0000000000bf3cf2>] kernel_thread_starter+0x6/0xc
 [<0000000000bf3cec>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
FIX kmalloc-1024: Object at 0x000000004ad9d650 not freed
=============================================================================
BUG kmalloc-1024 (Tainted: G    B          ): Invalid object pointer 0x000000004ad9eb90
-----------------------------------------------------------------------------

INFO: Slab 0x000003d1012b6600 objects=24 used=17 fp=0x000000004ad99548 flags=0x3fffc0000008101
CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G    B           4.13.0-rc4-bsg-regression+ #2
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<0000000000117532>] show_stack+0x8a/0xe0)
 [<0000000000bcbaee>] dump_stack+0x96/0xd8
 [<00000000003cd5fc>] slab_err+0xac/0xc0
 [<00000000003d68e4>] free_debug_processing+0x554/0x570
 [<00000000003d69ae>] __slab_free+0xae/0x618
 [<00000000003d7dce>] kfree+0x44e/0x4a0
 [<0000000000859b4e>] blk_done_softirq+0x146/0x160
 [<0000000000bf4ec0>] __do_softirq+0x3d0/0x840
 [<00000000001662a6>] run_ksoftirqd+0x3e/0xb8
 [<00000000001957fc>] smpboot_thread_fn+0x2f4/0x318
 [<000000000018f6f6>] kthread+0x166/0x178
 [<0000000000bf3cf2>] kernel_thread_starter+0x6/0xc
 [<0000000000bf3cec>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
FIX kmalloc-1024: Object at 0x000000004ad9eb90 not freed

> 
> ---
> From 4cd32ee48e334b62b55bff0d380833b978454040 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 11 Aug 2017 11:03:29 +0200
> Subject: bsg-lib: allocate sense data for each request
> 
> Since we split the scsi_request out of the request the driver is supposed
> to provide storage for the sense buffer.  The bsg-lib code failed to do so,
> though and will crash anytime it is used.
> 
> This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
> and allocate the sense data, which is used as reply buffer in bsg.
> 
> Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
> Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
> Cc: <stable@vger.kernel.org> #4.11+
> ---
>  block/bsg-lib.c         | 53 +++++++++++++++++++++++++++----------------------
>  include/linux/blkdev.h  |  1 -
>  include/linux/bsg-lib.h |  2 ++
>  3 files changed, 31 insertions(+), 25 deletions(-)
> 
> diff --git a/block/bsg-lib.c b/block/bsg-lib.c
> index c4513b23f57a..215893dbd038 100644
> --- a/block/bsg-lib.c
> +++ b/block/bsg-lib.c
> @@ -100,7 +100,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
>   */
>  static void bsg_softirq_done(struct request *rq)
>  {
> -	struct bsg_job *job = rq->special;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
> 
>  	bsg_job_put(job);
>  }
> @@ -129,26 +129,9 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
>  static int bsg_create_job(struct device *dev, struct request *req)
>  {
>  	struct request *rsp = req->next_rq;
> -	struct request_queue *q = req->q;
> -	struct scsi_request *rq = scsi_req(req);
> -	struct bsg_job *job;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
>  	int ret;
> 
> -	BUG_ON(req->special);
> -
> -	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
> -	if (!job)
> -		return -ENOMEM;
> -
> -	req->special = job;
> -	job->req = req;
> -	if (q->bsg_job_size)
> -		job->dd_data = (void *)&job[1];
> -	job->request = rq->cmd;
> -	job->request_len = rq->cmd_len;
> -	job->reply = rq->sense;
> -	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
> -						 * allocated */
>  	if (req->bio) {
>  		ret = bsg_map_buffer(&job->request_payload, req);
>  		if (ret)
> @@ -187,7 +170,6 @@ static void bsg_request_fn(struct request_queue *q)
>  {
>  	struct device *dev = q->queuedata;
>  	struct request *req;
> -	struct bsg_job *job;
>  	int ret;
> 
>  	if (!get_device(dev))
> @@ -207,8 +189,7 @@ static void bsg_request_fn(struct request_queue *q)
>  			continue;
>  		}
> 
> -		job = req->special;
> -		ret = q->bsg_job_fn(job);
> +		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
>  		spin_lock_irq(q->queue_lock);
>  		if (ret)
>  			break;
> @@ -219,6 +200,29 @@ static void bsg_request_fn(struct request_queue *q)
>  	spin_lock_irq(q->queue_lock);
>  }
> 
> +static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	memset(job, 0, sizeof(*job));
> +	job->req = req;
> +	job->dd_data = job + 1;
> +	job->request = job->sreq.cmd;
> +	job->request_len = job->sreq.cmd_len;
> +	job->reply_len = SCSI_SENSE_BUFFERSIZE;
> +	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
> +	if (!job->reply)
> +		return -ENOMEM;
> +	return 0;
> +}

This will now create an additional and never used (sizeof(struct
bsg_job) + dd_job_size + SCSI_SENSE_BUFFERSIZE - sizeof(struct
scsi_request)) size buffer for each bidirectional BSG request (each FC
request for zFCP for example). I mentioned that in my reply.

I really don't see the point in this exercise. We know the old way
worked well for BSG, no driver has been changed in expectations to now
do DMA to the reply-buffer.. its a custom non-standardised kernel
struct.. there is no HW that would do this (and like I said, the reply
protocol Data is already DMA'ed directly into the user-provided
buffers, if the HW can). And more-over, like you said, later patches
make all allocations on-heap anyway.

If you really want, I can change the first patch to do more of what
Patch 4 does right now, so that there is no on-stack usage, even just
intermediate in a patch-series.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block

> +
> +static void bsg_exit_rq(struct request_queue *q, struct request *req)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	kfree(job->reply);
> +}
> +
>  /**
>   * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
>   * @dev: device to attach bsg device to
> @@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  	q = blk_alloc_queue(GFP_KERNEL);
>  	if (!q)
>  		return ERR_PTR(-ENOMEM);
> -	q->cmd_size = sizeof(struct scsi_request);
> +	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
> +	q->init_rq_fn = bsg_init_rq;
> +	q->exit_rq_fn = bsg_exit_rq;
>  	q->request_fn = bsg_request_fn;
> 
>  	ret = blk_init_allocated_queue(q);
> @@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  		goto out_cleanup_queue;
> 
>  	q->queuedata = dev;
> -	q->bsg_job_size = dd_job_size;
>  	q->bsg_job_fn = job_fn;
>  	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
>  	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index f45f157b2910..6ae9aa6f93f0 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -568,7 +568,6 @@ struct request_queue {
> 
>  #if defined(CONFIG_BLK_DEV_BSG)
>  	bsg_job_fn		*bsg_job_fn;
> -	int			bsg_job_size;
>  	struct bsg_class_device bsg_dev;
>  #endif
> 
> diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
> index e34dde2da0ef..637a20cfb237 100644
> --- a/include/linux/bsg-lib.h
> +++ b/include/linux/bsg-lib.h
> @@ -24,6 +24,7 @@
>  #define _BLK_BSG_
> 
>  #include <linux/blkdev.h>
> +#include <scsi/scsi_request.h>
> 
>  struct request;
>  struct device;
> @@ -37,6 +38,7 @@ struct bsg_buffer {
>  };
> 
>  struct bsg_job {
> +	struct scsi_request sreq;
>  	struct device *dev;
>  	struct request *req;
> 
> -- 
> 2.11.0
> 

-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 13:49           ` Benjamin Block
@ 2017-08-11 14:36             ` Christoph Hellwig
  2017-08-11 15:32               ` Benjamin Block
  0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-11 14:36 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 03:49:29PM +0200, Benjamin Block wrote:
> On Fri, Aug 11, 2017 at 11:14:15AM +0200, Christoph Hellwig wrote:
> > But patch 1 still creates an additional copy of the sense data for
> > all bsg users.
> >
> 
> Huh? What additional copy? There is one reply-buffer and that is copied
> into the user-buffer should it contain valid data. Just like in your
> patch, neither you, nor me touches any of the copy-code. There is also
> no changes to how the driver get their data into that buffer, it will
> still be copied in both cases.

You're right - I misread your patch.  But that does make it worse as
this means that with your patch we re-assign the scsi_request.sense
pointer when using bsg.  That will lead to crashes if using the bsg
code against e.g. a normal scsi device using bsg when that request
later gets reused for something that is not bsg.

> 
> > 
> > Can you test the patch below which implements my suggestion?  Your
> > other patches should still apply fine on top modulo minor context
> > changes.
> 
> Only your patch on top of 4.13-rc4. din_xferp (D) is also empty, which is
> not taken from the sense-buffer.

Can't parse this.

> =============================================================================
> BUG kmalloc-1024 (Not tainted): Invalid object pointer 0x000000004ad9e0f0
> -----------------------------------------------------------------------------

Oops - if we don't allocate the job separately we should not free it either.
Updated patch for that below:

---
>From 4cd32ee48e334b62b55bff0d380833b978454040 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 11 Aug 2017 11:03:29 +0200
Subject: bsg-lib: allocate sense data for each request

Since we split the scsi_request out of the request the driver is supposed
to provide storage for the sense buffer.  The bsg-lib code failed to do so,
though and will crash anytime it is used.

This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
and allocate the sense data, which is used as reply buffer in bsg.

Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
---
 block/bsg-lib.c         | 53 +++++++++++++++++++++++++++----------------------
 include/linux/blkdev.h  |  1 -
 include/linux/bsg-lib.h |  2 ++
 3 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index c4513b23f57a..215893dbd038 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -100,7 +100,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
  */
 static void bsg_softirq_done(struct request *rq)
 {
-	struct bsg_job *job = rq->special;
+	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
 
 	bsg_job_put(job);
 }
@@ -129,26 +129,9 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
 static int bsg_create_job(struct device *dev, struct request *req)
 {
 	struct request *rsp = req->next_rq;
-	struct request_queue *q = req->q;
-	struct scsi_request *rq = scsi_req(req);
-	struct bsg_job *job;
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 	int ret;
 
-	BUG_ON(req->special);
-
-	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
-	if (!job)
-		return -ENOMEM;
-
-	req->special = job;
-	job->req = req;
-	if (q->bsg_job_size)
-		job->dd_data = (void *)&job[1];
-	job->request = rq->cmd;
-	job->request_len = rq->cmd_len;
-	job->reply = rq->sense;
-	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
-						 * allocated */
 	if (req->bio) {
 		ret = bsg_map_buffer(&job->request_payload, req);
 		if (ret)
@@ -187,7 +170,6 @@ static void bsg_request_fn(struct request_queue *q)
 {
 	struct device *dev = q->queuedata;
 	struct request *req;
-	struct bsg_job *job;
 	int ret;
 
 	if (!get_device(dev))
@@ -207,8 +189,7 @@ static void bsg_request_fn(struct request_queue *q)
 			continue;
 		}
 
-		job = req->special;
-		ret = q->bsg_job_fn(job);
+		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
 		spin_lock_irq(q->queue_lock);
 		if (ret)
 			break;
@@ -219,6 +200,29 @@ static void bsg_request_fn(struct request_queue *q)
 	spin_lock_irq(q->queue_lock);
 }
 
+static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	memset(job, 0, sizeof(*job));
+	job->req = req;
+	job->dd_data = job + 1;
+	job->request = job->sreq.cmd;
+	job->request_len = job->sreq.cmd_len;
+	job->reply_len = SCSI_SENSE_BUFFERSIZE;
+	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
+	if (!job->reply)
+		return -ENOMEM;
+	return 0;
+}
+
+static void bsg_exit_rq(struct request_queue *q, struct request *req)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	kfree(job->reply);
+}
+
 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
@@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 	q = blk_alloc_queue(GFP_KERNEL);
 	if (!q)
 		return ERR_PTR(-ENOMEM);
-	q->cmd_size = sizeof(struct scsi_request);
+	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
+	q->init_rq_fn = bsg_init_rq;
+	q->exit_rq_fn = bsg_exit_rq;
 	q->request_fn = bsg_request_fn;
 
 	ret = blk_init_allocated_queue(q);
@@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 		goto out_cleanup_queue;
 
 	q->queuedata = dev;
-	q->bsg_job_size = dd_job_size;
 	q->bsg_job_fn = job_fn;
 	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
 	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f45f157b2910..6ae9aa6f93f0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -568,7 +568,6 @@ struct request_queue {
 
 #if defined(CONFIG_BLK_DEV_BSG)
 	bsg_job_fn		*bsg_job_fn;
-	int			bsg_job_size;
 	struct bsg_class_device bsg_dev;
 #endif
 
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index e34dde2da0ef..637a20cfb237 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -24,6 +24,7 @@
 #define _BLK_BSG_
 
 #include <linux/blkdev.h>
+#include <scsi/scsi_request.h>
 
 struct request;
 struct device;
@@ -37,6 +38,7 @@ struct bsg_buffer {
 };
 
 struct bsg_job {
+	struct scsi_request sreq;
 	struct device *dev;
 	struct request *req;
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 14:36             ` Christoph Hellwig
@ 2017-08-11 15:32               ` Benjamin Block
  2017-08-11 15:35                 ` Christoph Hellwig
  0 siblings, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-11 15:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 04:36:49PM +0200, Christoph Hellwig wrote:
> On Fri, Aug 11, 2017 at 03:49:29PM +0200, Benjamin Block wrote:
> > On Fri, Aug 11, 2017 at 11:14:15AM +0200, Christoph Hellwig wrote:
> > > But patch 1 still creates an additional copy of the sense data for
> > > all bsg users.
> > >
> > 
> > Huh? What additional copy? There is one reply-buffer and that is copied
> > into the user-buffer should it contain valid data. Just like in your
> > patch, neither you, nor me touches any of the copy-code. There is also
> > no changes to how the driver get their data into that buffer, it will
> > still be copied in both cases.
> 
> You're right - I misread your patch.  But that does make it worse as
> this means that with your patch we re-assign the scsi_request.sense
> pointer when using bsg.  That will lead to crashes if using the bsg
> code against e.g. a normal scsi device using bsg when that request
> later gets reused for something that is not bsg.
>

So when the bsg interface is used with something different than the
bsg-lib request queue? I haven't actually thought about that (presuming
the bsg-lib queue was the only one being used). Fair enough, I haven't
completely read that code now, but that seems bad then, to reassign a
space allocated in someone else's request queue. 

That still leaves open that we now over-allocate space in bsg-lib, or?

> 
> > 
> > > 
> > > Can you test the patch below which implements my suggestion?  Your
> > > other patches should still apply fine on top modulo minor context
> > > changes.
> > 
> > Only your patch on top of 4.13-rc4. din_xferp (D) is also empty, which is
> > not taken from the sense-buffer.
> 
> Can't parse this.
> 
> > =============================================================================
> > BUG kmalloc-1024 (Not tainted): Invalid object pointer 0x000000004ad9e0f0
> > -----------------------------------------------------------------------------
> 
> Oops - if we don't allocate the job separately we should not free it either.
> Updated patch for that below:
>

My diff tells that this was the same patch as before.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 15:32               ` Benjamin Block
@ 2017-08-11 15:35                 ` Christoph Hellwig
  2017-08-11 16:01                   ` Benjamin Block
  0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-11 15:35 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 05:32:03PM +0200, Benjamin Block wrote:
> So when the bsg interface is used with something different than the
> bsg-lib request queue?

Yes.

> I haven't actually thought about that (presuming
> the bsg-lib queue was the only one being used). Fair enough, I haven't
> completely read that code now, but that seems bad then, to reassign a
> space allocated in someone else's request queue. 
> 
> That still leaves open that we now over-allocate space in bsg-lib, or?

Which space do we over-allocate?

> My diff tells that this was the same patch as before.

Next try:

---
>From f5b03b82df0569c035022c1c2535696186907f1a Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 11 Aug 2017 11:03:29 +0200
Subject: bsg-lib: allocate sense data for each request

Since we split the scsi_request out of the request the driver is supposed
to provide storage for the sense buffer.  The bsg-lib code failed to do so,
though and will crash anytime it is used.

This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
and allocate the sense data, which is used as reply buffer in bsg.

Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
---
 block/bsg-lib.c         | 51 +++++++++++++++++++++++++++----------------------
 include/linux/blkdev.h  |  1 -
 include/linux/bsg-lib.h |  2 ++
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index c4513b23f57a..c07333c3b785 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -37,13 +37,11 @@ static void bsg_destroy_job(struct kref *kref)
 	struct bsg_job *job = container_of(kref, struct bsg_job, kref);
 	struct request *rq = job->req;
 
-	blk_end_request_all(rq, BLK_STS_OK);
-
 	put_device(job->dev);	/* release reference for the request */
 
 	kfree(job->request_payload.sg_list);
 	kfree(job->reply_payload.sg_list);
-	kfree(job);
+	blk_end_request_all(rq, BLK_STS_OK);
 }
 
 void bsg_job_put(struct bsg_job *job)
@@ -100,7 +98,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
  */
 static void bsg_softirq_done(struct request *rq)
 {
-	struct bsg_job *job = rq->special;
+	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
 
 	bsg_job_put(job);
 }
@@ -129,24 +127,11 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
 static int bsg_create_job(struct device *dev, struct request *req)
 {
 	struct request *rsp = req->next_rq;
-	struct request_queue *q = req->q;
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
 	struct scsi_request *rq = scsi_req(req);
-	struct bsg_job *job;
 	int ret;
 
-	BUG_ON(req->special);
-
-	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
-	if (!job)
-		return -ENOMEM;
-
-	req->special = job;
-	job->req = req;
-	if (q->bsg_job_size)
-		job->dd_data = (void *)&job[1];
-	job->request = rq->cmd;
 	job->request_len = rq->cmd_len;
-	job->reply = rq->sense;
 	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
 						 * allocated */
 	if (req->bio) {
@@ -187,7 +172,6 @@ static void bsg_request_fn(struct request_queue *q)
 {
 	struct device *dev = q->queuedata;
 	struct request *req;
-	struct bsg_job *job;
 	int ret;
 
 	if (!get_device(dev))
@@ -207,8 +191,7 @@ static void bsg_request_fn(struct request_queue *q)
 			continue;
 		}
 
-		job = req->special;
-		ret = q->bsg_job_fn(job);
+		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
 		spin_lock_irq(q->queue_lock);
 		if (ret)
 			break;
@@ -219,6 +202,27 @@ static void bsg_request_fn(struct request_queue *q)
 	spin_lock_irq(q->queue_lock);
 }
 
+static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	memset(job, 0, sizeof(*job));
+	job->req = req;
+	job->request = job->sreq.cmd;
+	job->dd_data = job + 1;
+	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
+	if (!job->reply)
+		return -ENOMEM;
+	return 0;
+}
+
+static void bsg_exit_rq(struct request_queue *q, struct request *req)
+{
+	struct bsg_job *job = blk_mq_rq_to_pdu(req);
+
+	kfree(job->reply);
+}
+
 /**
  * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
  * @dev: device to attach bsg device to
@@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 	q = blk_alloc_queue(GFP_KERNEL);
 	if (!q)
 		return ERR_PTR(-ENOMEM);
-	q->cmd_size = sizeof(struct scsi_request);
+	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
+	q->init_rq_fn = bsg_init_rq;
+	q->exit_rq_fn = bsg_exit_rq;
 	q->request_fn = bsg_request_fn;
 
 	ret = blk_init_allocated_queue(q);
@@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
 		goto out_cleanup_queue;
 
 	q->queuedata = dev;
-	q->bsg_job_size = dd_job_size;
 	q->bsg_job_fn = job_fn;
 	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
 	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f45f157b2910..6ae9aa6f93f0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -568,7 +568,6 @@ struct request_queue {
 
 #if defined(CONFIG_BLK_DEV_BSG)
 	bsg_job_fn		*bsg_job_fn;
-	int			bsg_job_size;
 	struct bsg_class_device bsg_dev;
 #endif
 
diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
index e34dde2da0ef..637a20cfb237 100644
--- a/include/linux/bsg-lib.h
+++ b/include/linux/bsg-lib.h
@@ -24,6 +24,7 @@
 #define _BLK_BSG_
 
 #include <linux/blkdev.h>
+#include <scsi/scsi_request.h>
 
 struct request;
 struct device;
@@ -37,6 +38,7 @@ struct bsg_buffer {
 };
 
 struct bsg_job {
+	struct scsi_request sreq;
 	struct device *dev;
 	struct request *req;
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 15:35                 ` Christoph Hellwig
@ 2017-08-11 16:01                   ` Benjamin Block
  2017-08-13 14:39                     ` Christoph Hellwig
  2017-08-14 16:32                     ` Benjamin Block
  0 siblings, 2 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-11 16:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 05:35:53PM +0200, Christoph Hellwig wrote:
> On Fri, Aug 11, 2017 at 05:32:03PM +0200, Benjamin Block wrote:
> > So when the bsg interface is used with something different than the
> > bsg-lib request queue?
> 
> Yes.
> 
> > I haven't actually thought about that (presuming
> > the bsg-lib queue was the only one being used). Fair enough, I haven't
> > completely read that code now, but that seems bad then, to reassign a
> > space allocated in someone else's request queue. 
> > 
> > That still leaves open that we now over-allocate space in bsg-lib, or?
> 
> Which space do we over-allocate?

When the BSG interface is used with bsg-lib, and the user sends a
Bidirectional command - so when he gives an input- and output-buffer
(most users of our interface will likely do that, if they wanna get the
transport-level response data) - bsg will allocate two requests from the
queue. The first request's bio is used to map the input and the second
request's bio for the output (see bsg_map_hdr() in the if-statement with
(op == REQ_OP_SCSI_OUT && hdr->din_xfer_len)).

When we now allocate the full space of bsg_job, sense, dd_data for each
request, these will be wasted on the (linked) second request. They will
go unused all the time, as only the first request's bsg_job, sense and
dd_data is used by the LLDs and BSG itself.

Right now, because we don't allocate this on each request, those spaces
are only allocated for the first request in bsg-lib.

Maybe we can ignore this, if it gets to complicated, I don't wanne
prolong this unnecessary.

> 
> > My diff tells that this was the same patch as before.
> 
> Next try:

I will have a look when I am back in the office next week.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
> 
> ---
> From f5b03b82df0569c035022c1c2535696186907f1a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 11 Aug 2017 11:03:29 +0200
> Subject: bsg-lib: allocate sense data for each request
> 
> Since we split the scsi_request out of the request the driver is supposed
> to provide storage for the sense buffer.  The bsg-lib code failed to do so,
> though and will crash anytime it is used.
> 
> This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
> and allocate the sense data, which is used as reply buffer in bsg.
> 
> Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
> Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
> Cc: <stable@vger.kernel.org> #4.11+
> ---
>  block/bsg-lib.c         | 51 +++++++++++++++++++++++++++----------------------
>  include/linux/blkdev.h  |  1 -
>  include/linux/bsg-lib.h |  2 ++
>  3 files changed, 30 insertions(+), 24 deletions(-)
> 
> diff --git a/block/bsg-lib.c b/block/bsg-lib.c
> index c4513b23f57a..c07333c3b785 100644
> --- a/block/bsg-lib.c
> +++ b/block/bsg-lib.c
> @@ -37,13 +37,11 @@ static void bsg_destroy_job(struct kref *kref)
>  	struct bsg_job *job = container_of(kref, struct bsg_job, kref);
>  	struct request *rq = job->req;
> 
> -	blk_end_request_all(rq, BLK_STS_OK);
> -
>  	put_device(job->dev);	/* release reference for the request */
> 
>  	kfree(job->request_payload.sg_list);
>  	kfree(job->reply_payload.sg_list);
> -	kfree(job);
> +	blk_end_request_all(rq, BLK_STS_OK);
>  }
> 
>  void bsg_job_put(struct bsg_job *job)
> @@ -100,7 +98,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
>   */
>  static void bsg_softirq_done(struct request *rq)
>  {
> -	struct bsg_job *job = rq->special;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
> 
>  	bsg_job_put(job);
>  }
> @@ -129,24 +127,11 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
>  static int bsg_create_job(struct device *dev, struct request *req)
>  {
>  	struct request *rsp = req->next_rq;
> -	struct request_queue *q = req->q;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
>  	struct scsi_request *rq = scsi_req(req);
> -	struct bsg_job *job;
>  	int ret;
> 
> -	BUG_ON(req->special);
> -
> -	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
> -	if (!job)
> -		return -ENOMEM;
> -
> -	req->special = job;
> -	job->req = req;
> -	if (q->bsg_job_size)
> -		job->dd_data = (void *)&job[1];
> -	job->request = rq->cmd;
>  	job->request_len = rq->cmd_len;
> -	job->reply = rq->sense;
>  	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
>  						 * allocated */
>  	if (req->bio) {
> @@ -187,7 +172,6 @@ static void bsg_request_fn(struct request_queue *q)
>  {
>  	struct device *dev = q->queuedata;
>  	struct request *req;
> -	struct bsg_job *job;
>  	int ret;
> 
>  	if (!get_device(dev))
> @@ -207,8 +191,7 @@ static void bsg_request_fn(struct request_queue *q)
>  			continue;
>  		}
> 
> -		job = req->special;
> -		ret = q->bsg_job_fn(job);
> +		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
>  		spin_lock_irq(q->queue_lock);
>  		if (ret)
>  			break;
> @@ -219,6 +202,27 @@ static void bsg_request_fn(struct request_queue *q)
>  	spin_lock_irq(q->queue_lock);
>  }
> 
> +static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	memset(job, 0, sizeof(*job));
> +	job->req = req;
> +	job->request = job->sreq.cmd;
> +	job->dd_data = job + 1;
> +	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
> +	if (!job->reply)
> +		return -ENOMEM;
> +	return 0;
> +}
> +
> +static void bsg_exit_rq(struct request_queue *q, struct request *req)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	kfree(job->reply);
> +}
> +
>  /**
>   * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
>   * @dev: device to attach bsg device to
> @@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  	q = blk_alloc_queue(GFP_KERNEL);
>  	if (!q)
>  		return ERR_PTR(-ENOMEM);
> -	q->cmd_size = sizeof(struct scsi_request);
> +	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
> +	q->init_rq_fn = bsg_init_rq;
> +	q->exit_rq_fn = bsg_exit_rq;
>  	q->request_fn = bsg_request_fn;
> 
>  	ret = blk_init_allocated_queue(q);
> @@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  		goto out_cleanup_queue;
> 
>  	q->queuedata = dev;
> -	q->bsg_job_size = dd_job_size;
>  	q->bsg_job_fn = job_fn;
>  	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
>  	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index f45f157b2910..6ae9aa6f93f0 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -568,7 +568,6 @@ struct request_queue {
> 
>  #if defined(CONFIG_BLK_DEV_BSG)
>  	bsg_job_fn		*bsg_job_fn;
> -	int			bsg_job_size;
>  	struct bsg_class_device bsg_dev;
>  #endif
> 
> diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
> index e34dde2da0ef..637a20cfb237 100644
> --- a/include/linux/bsg-lib.h
> +++ b/include/linux/bsg-lib.h
> @@ -24,6 +24,7 @@
>  #define _BLK_BSG_
> 
>  #include <linux/blkdev.h>
> +#include <scsi/scsi_request.h>
> 
>  struct request;
>  struct device;
> @@ -37,6 +38,7 @@ struct bsg_buffer {
>  };
> 
>  struct bsg_job {
> +	struct scsi_request sreq;
>  	struct device *dev;
>  	struct request *req;
> 
> -- 
> 2.11.0
> 

-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 16:01                   ` Benjamin Block
@ 2017-08-13 14:39                     ` Christoph Hellwig
  2017-08-14 16:33                       ` Benjamin Block
  2017-08-14 16:32                     ` Benjamin Block
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-13 14:39 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

On Fri, Aug 11, 2017 at 06:01:42PM +0200, Benjamin Block wrote:
> When the BSG interface is used with bsg-lib, and the user sends a
> Bidirectional command - so when he gives an input- and output-buffer
> (most users of our interface will likely do that, if they wanna get the
> transport-level response data) - bsg will allocate two requests from the
> queue. The first request's bio is used to map the input and the second
> request's bio for the output (see bsg_map_hdr() in the if-statement with
> (op == REQ_OP_SCSI_OUT && hdr->din_xfer_len)).
> 
> When we now allocate the full space of bsg_job, sense, dd_data for each
> request, these will be wasted on the (linked) second request. They will
> go unused all the time, as only the first request's bsg_job, sense and
> dd_data is used by the LLDs and BSG itself.
> 
> Right now, because we don't allocate this on each request, those spaces
> are only allocated for the first request in bsg-lib.
> 
> Maybe we can ignore this, if it gets to complicated, I don't wanne
> prolong this unnecessary.

We have the same 'issue' with bidirection scsi commands - it's a side
effect of having two request structures for these commands, and the
only real fix would be to have a single request structure, which would
be nice especially if we can't do it without growing struct request.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-11 16:01                   ` Benjamin Block
  2017-08-13 14:39                     ` Christoph Hellwig
@ 2017-08-14 16:32                     ` Benjamin Block
  2017-08-16 10:53                       ` Christoph Hellwig
  1 sibling, 1 reply; 31+ messages in thread
From: Benjamin Block @ 2017-08-14 16:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

Hey Christoph,

I looked over the patch in detail, see below.

> From f5b03b82df0569c035022c1c2535696186907f1a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Fri, 11 Aug 2017 11:03:29 +0200
> Subject: bsg-lib: allocate sense data for each request
>
> Since we split the scsi_request out of the request the driver is supposed
> to provide storage for the sense buffer.  The bsg-lib code failed to do so,
> though and will crash anytime it is used.
>
> This patch moves bsg-lib to allocate and setup the bsg_job ahead of time,
> and allocate the sense data, which is used as reply buffer in bsg.
>
> Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
> Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
> Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
> Cc: <stable@vger.kernel.org> #4.11+
> ---
>  block/bsg-lib.c         | 51 +++++++++++++++++++++++++++----------------------
>  include/linux/blkdev.h  |  1 -
>  include/linux/bsg-lib.h |  2 ++
>  3 files changed, 30 insertions(+), 24 deletions(-)
>
> diff --git a/block/bsg-lib.c b/block/bsg-lib.c
> index c4513b23f57a..c07333c3b785 100644
> --- a/block/bsg-lib.c
> +++ b/block/bsg-lib.c
> @@ -37,13 +37,11 @@ static void bsg_destroy_job(struct kref *kref)
>  	struct bsg_job *job = container_of(kref, struct bsg_job, kref);
>  	struct request *rq = job->req;
>
> -	blk_end_request_all(rq, BLK_STS_OK);
> -
>  	put_device(job->dev);	/* release reference for the request */
>
>  	kfree(job->request_payload.sg_list);
>  	kfree(job->reply_payload.sg_list);
> -	kfree(job);
> +	blk_end_request_all(rq, BLK_STS_OK);

What is the reason for moving that last line? Just wondering whether
that might change the behavior somehow, although it doesn't look like it
from the code.

>  }
>
>  void bsg_job_put(struct bsg_job *job)
> @@ -100,7 +98,7 @@ EXPORT_SYMBOL_GPL(bsg_job_done);
>   */
>  static void bsg_softirq_done(struct request *rq)
>  {
> -	struct bsg_job *job = rq->special;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(rq);
>
>  	bsg_job_put(job);
>  }
> @@ -129,24 +127,11 @@ static int bsg_map_buffer(struct bsg_buffer *buf, struct request *req)
>  static int bsg_create_job(struct device *dev, struct request *req)
>  {
>  	struct request *rsp = req->next_rq;
> -	struct request_queue *q = req->q;
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
>  	struct scsi_request *rq = scsi_req(req);
> -	struct bsg_job *job;
>  	int ret;
>
> -	BUG_ON(req->special);
> -
> -	job = kzalloc(sizeof(struct bsg_job) + q->bsg_job_size, GFP_KERNEL);
> -	if (!job)
> -		return -ENOMEM;
> -
> -	req->special = job;
> -	job->req = req;
> -	if (q->bsg_job_size)
> -		job->dd_data = (void *)&job[1];
> -	job->request = rq->cmd;
>  	job->request_len = rq->cmd_len;
> -	job->reply = rq->sense;
>  	job->reply_len = SCSI_SENSE_BUFFERSIZE;	/* Size of sense buffer
>  						 * allocated */
>  	if (req->bio) {
> @@ -187,7 +172,6 @@ static void bsg_request_fn(struct request_queue *q)
>  {
>  	struct device *dev = q->queuedata;
>  	struct request *req;
> -	struct bsg_job *job;
>  	int ret;
>
>  	if (!get_device(dev))
> @@ -207,8 +191,7 @@ static void bsg_request_fn(struct request_queue *q)
>  			continue;
>  		}
>
> -		job = req->special;
> -		ret = q->bsg_job_fn(job);
> +		ret = q->bsg_job_fn(blk_mq_rq_to_pdu(req));
>  		spin_lock_irq(q->queue_lock);
>  		if (ret)
>  			break;
> @@ -219,6 +202,27 @@ static void bsg_request_fn(struct request_queue *q)
>  	spin_lock_irq(q->queue_lock);
>  }
>
> +static int bsg_init_rq(struct request_queue *q, struct request *req, gfp_t gfp)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	memset(job, 0, sizeof(*job));
> +	job->req = req;
> +	job->request = job->sreq.cmd;

That doesn't work with bsg.c if the request submitted by the user is
bigger than BLK_MAX_CDB. There is code in blk_fill_sgv4_hdr_rq() that
will reassign the req->cmd point in that case to something else.

This is maybe wrong in the same vein as my Patch 1 is. If not for the
legacy code in bsg.c, setting this here, will miss changes to that
pointer between request-allocation and job-submission.

So maybe just move this to bsg_create_job().

> +	job->dd_data = job + 1;
> +	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);

job->reply_len will be 0 here, won't it? You probably have to pull the
"job->reply_len = SCSI_SENSE_BUFFERSIZE" here from bsg_create_job().

> +	if (!job->reply)
> +		return -ENOMEM;
> +	return 0;
> +}
> +
> +static void bsg_exit_rq(struct request_queue *q, struct request *req)
> +{
> +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> +
> +	kfree(job->reply);

Don't we need to free the reply-buffer as well?

Not sure how you wanna proceed. I can also sketch something up and
already test it on our cards, to prevent the ping-pong.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block

> +}
> +
>  /**
>   * bsg_setup_queue - Create and add the bsg hooks so we can receive requests
>   * @dev: device to attach bsg device to
> @@ -235,7 +239,9 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  	q = blk_alloc_queue(GFP_KERNEL);
>  	if (!q)
>  		return ERR_PTR(-ENOMEM);
> -	q->cmd_size = sizeof(struct scsi_request);
> +	q->cmd_size = sizeof(struct bsg_job) + dd_job_size;
> +	q->init_rq_fn = bsg_init_rq;
> +	q->exit_rq_fn = bsg_exit_rq;
>  	q->request_fn = bsg_request_fn;
>
>  	ret = blk_init_allocated_queue(q);
> @@ -243,7 +249,6 @@ struct request_queue *bsg_setup_queue(struct device *dev, char *name,
>  		goto out_cleanup_queue;
>
>  	q->queuedata = dev;
> -	q->bsg_job_size = dd_job_size;
>  	q->bsg_job_fn = job_fn;
>  	queue_flag_set_unlocked(QUEUE_FLAG_BIDI, q);
>  	queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index f45f157b2910..6ae9aa6f93f0 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -568,7 +568,6 @@ struct request_queue {
>
>  #if defined(CONFIG_BLK_DEV_BSG)
>  	bsg_job_fn		*bsg_job_fn;
> -	int			bsg_job_size;
>  	struct bsg_class_device bsg_dev;
>  #endif
>
> diff --git a/include/linux/bsg-lib.h b/include/linux/bsg-lib.h
> index e34dde2da0ef..637a20cfb237 100644
> --- a/include/linux/bsg-lib.h
> +++ b/include/linux/bsg-lib.h
> @@ -24,6 +24,7 @@
>  #define _BLK_BSG_
>
>  #include <linux/blkdev.h>
> +#include <scsi/scsi_request.h>
>
>  struct request;
>  struct device;
> @@ -37,6 +38,7 @@ struct bsg_buffer {
>  };
>
>  struct bsg_job {
> +	struct scsi_request sreq;
>  	struct device *dev;
>  	struct request *req;
>
> --
> 2.11.0
>

--
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-13 14:39                     ` Christoph Hellwig
@ 2017-08-14 16:33                       ` Benjamin Block
  0 siblings, 0 replies; 31+ messages in thread
From: Benjamin Block @ 2017-08-14 16:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: James E . J . Bottomley, Martin K . Petersen, Jens Axboe,
	linux-block, linux-kernel, linux-scsi, Johannes Thumshirn,
	Steffen Maier, open-iscsi

On Sun, Aug 13, 2017 at 04:39:40PM +0200, Christoph Hellwig wrote:
> On Fri, Aug 11, 2017 at 06:01:42PM +0200, Benjamin Block wrote:
> > When the BSG interface is used with bsg-lib, and the user sends a
> > Bidirectional command - so when he gives an input- and output-buffer
> > (most users of our interface will likely do that, if they wanna get the
> > transport-level response data) - bsg will allocate two requests from the
> > queue. The first request's bio is used to map the input and the second
> > request's bio for the output (see bsg_map_hdr() in the if-statement with
> > (op == REQ_OP_SCSI_OUT && hdr->din_xfer_len)).
> > 
> > When we now allocate the full space of bsg_job, sense, dd_data for each
> > request, these will be wasted on the (linked) second request. They will
> > go unused all the time, as only the first request's bsg_job, sense and
> > dd_data is used by the LLDs and BSG itself.
> > 
> > Right now, because we don't allocate this on each request, those spaces
> > are only allocated for the first request in bsg-lib.
> > 
> > Maybe we can ignore this, if it gets to complicated, I don't wanne
> > prolong this unnecessary.
> 
> We have the same 'issue' with bidirection scsi commands - it's a side
> effect of having two request structures for these commands, and the
> only real fix would be to have a single request structure, which would
> be nice especially if we can't do it without growing struct request.
> 

Alright, I was not aware of that. That is fair then. Thx.


                                                    Beste Grüße / Best regards,
                                                      - Benjamin Block
-- 
Linux on z Systems Development         /         IBM Systems & Technology Group
		  IBM Deutschland Research & Development GmbH 
Vorsitz. AufsR.: Martina Koederitz     /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer
  2017-08-14 16:32                     ` Benjamin Block
@ 2017-08-16 10:53                       ` Christoph Hellwig
  0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2017-08-16 10:53 UTC (permalink / raw)
  To: Benjamin Block
  Cc: Christoph Hellwig, James E . J . Bottomley, Martin K . Petersen,
	Jens Axboe, linux-block, linux-kernel, linux-scsi,
	Johannes Thumshirn, Steffen Maier, open-iscsi

On Mon, Aug 14, 2017 at 06:32:17PM +0200, Benjamin Block wrote:
> > -	blk_end_request_all(rq, BLK_STS_OK);
> > -
> >  	put_device(job->dev);	/* release reference for the request */
> >
> >  	kfree(job->request_payload.sg_list);
> >  	kfree(job->reply_payload.sg_list);
> > -	kfree(job);
> > +	blk_end_request_all(rq, BLK_STS_OK);
> 
> What is the reason for moving that last line? Just wondering whether
> that might change the behavior somehow, although it doesn't look like it
> from the code.

The job is now allocated as part of the request, so we must fre
it last.  The only change in behavior is that the reference gets dropped
a bit earlier, but once ownership is handed to the block layer
it's not needed, as are the memory allocations for the S/G lists.

> > +{
> > +	struct bsg_job *job = blk_mq_rq_to_pdu(req);
> > +
> > +	memset(job, 0, sizeof(*job));
> > +	job->req = req;
> > +	job->request = job->sreq.cmd;
> 
> That doesn't work with bsg.c if the request submitted by the user is
> bigger than BLK_MAX_CDB. There is code in blk_fill_sgv4_hdr_rq() that
> will reassign the req->cmd point in that case to something else.
> 
> This is maybe wrong in the same vein as my Patch 1 is. If not for the
> legacy code in bsg.c, setting this here, will miss changes to that
> pointer between request-allocation and job-submission.
> 
> So maybe just move this to bsg_create_job().

Yes, this should be in  indeed.

> 
> > +	job->dd_data = job + 1;
> > +	job->reply = job->sreq.sense = kzalloc(job->reply_len, gfp);
> 
> job->reply_len will be 0 here, won't it? You probably have to pull the
> "job->reply_len = SCSI_SENSE_BUFFERSIZE" here from bsg_create_job().

True.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-08-16 10:53 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-09 14:11 [RFC PATCH 0/6] bsg: fix regression resulting in panics when sending commands via BSG and some sanity cleanups Benjamin Block
2017-08-09 14:11 ` [RFC PATCH 1/6] bsg: fix kernel panic resulting from missing allocation of a reply-buffer Benjamin Block
2017-08-10  9:32   ` Christoph Hellwig
2017-08-10 22:10     ` Benjamin Block
2017-08-10 22:45       ` Benjamin Block
2017-08-11  8:38       ` Christoph Hellwig
2017-08-11  9:14         ` Christoph Hellwig
2017-08-11 13:49           ` Benjamin Block
2017-08-11 14:36             ` Christoph Hellwig
2017-08-11 15:32               ` Benjamin Block
2017-08-11 15:35                 ` Christoph Hellwig
2017-08-11 16:01                   ` Benjamin Block
2017-08-13 14:39                     ` Christoph Hellwig
2017-08-14 16:33                       ` Benjamin Block
2017-08-14 16:32                     ` Benjamin Block
2017-08-16 10:53                       ` Christoph Hellwig
2017-08-09 14:11 ` [RFC PATCH 2/6] bsg: assign sense_len instead of fixed SCSI_SENSE_BUFFERSIZE Benjamin Block
2017-08-10  9:32   ` Christoph Hellwig
2017-08-09 14:11 ` [RFC PATCH 3/6] bsg: scsi-transport: add compile-tests to prevent reply-buffer overflows Benjamin Block
2017-08-10  9:32   ` Christoph Hellwig
2017-08-09 14:11 ` [RFC PATCH 4/6] bsg: refactor ioctl to use regular BSG-command infrastructure for SG_IO Benjamin Block
2017-08-10  8:24   ` Johannes Thumshirn
2017-08-10  9:34     ` Christoph Hellwig
2017-08-10 22:12     ` Benjamin Block
2017-08-09 14:11 ` [RFC PATCH 5/6] bsg: reduce unnecessary arguments for bsg_map_hdr() Benjamin Block
2017-08-10  8:26   ` Johannes Thumshirn
2017-08-10  9:35   ` Christoph Hellwig
2017-08-10 22:19     ` Benjamin Block
2017-08-09 14:11 ` [RFC PATCH 6/6] bsg: reduce unnecessary arguments for blk_complete_sgv4_hdr_rq() Benjamin Block
2017-08-10  8:27   ` Johannes Thumshirn
2017-08-10  9:35   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).