All of lore.kernel.org
 help / color / mirror / Atom feed
* fix large I/O regression with iSER in 4.4+
@ 2016-04-11 22:47 Christoph Hellwig
       [not found] ` <1460414846-29540-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
  2016-04-11 22:47 ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-11 22:47 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

Since iSER was converted to use the generic virt_boundary mechanism
(which was called something else in 4.4), it didn't handle the case
where a request is using up the full size of max_hw_segments, but
not actually aligned to the virt boundary.  This series sets the
maximum segment size limit to fix this workload (xfs_repair is a good
reproducer, btw).

Should probably go into 4.4 and 4.5-stable.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
       [not found] ` <1460414846-29540-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
@ 2016-04-11 22:47   ` Christoph Hellwig
  2016-04-11 23:22     ` Laurence Oberman
  2016-04-11 23:32     ` Bart Van Assche
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-11 22:47 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

RDMA drivers need segments that aren't larger than a single HCA page
for memory registrations to work properly, so wire up this limitation
in the host.

While we could just call blk_queue_max_segment_size from ->slave_configure,
that would override the global limit based on the DMA device, so let's do
it the traditional way by adding a field to the Scsi_Host structure.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8106515..04c660d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 	blk_queue_segment_boundary(q, shost->dma_boundary);
 	dma_set_seg_boundary(dev, shost->dma_boundary);
 
-	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
+	blk_queue_max_segment_size(q,
+		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
 
 	if (!shost->use_clustering)
 		q->limits.cluster = 0;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index fcfa3d7..f11d3fe 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -621,6 +621,7 @@ struct Scsi_Host {
 	short unsigned int sg_tablesize;
 	short unsigned int sg_prot_tablesize;
 	unsigned int max_sectors;
+	unsigned int max_segment_size;
 	unsigned long dma_boundary;
 	/*
 	 * In scsi-mq mode, the number of hardware queues supported by the LLD.
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] IB/iser: set max_segment_size
  2016-04-11 22:47 fix large I/O regression with iSER in 4.4+ Christoph Hellwig
       [not found] ` <1460414846-29540-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
@ 2016-04-11 22:47 ` Christoph Hellwig
  2016-04-12 10:48   ` Sagi Grimberg
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-11 22:47 UTC (permalink / raw)
  To: linux-rdma, linux-scsi

So that we don't overflow the number of MR segments allocated because
we have to split on SGL segment into multiple MR segments.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 80b6bed..784504a 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -623,6 +623,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 	shost->max_id = 0;
 	shost->max_channel = 0;
 	shost->max_cmd_len = 16;
+	shost->max_segment_size = PAGE_SIZE;
 
 	/*
 	 * older userspace tools (before 2.0-870) did not pass us
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
  2016-04-11 22:47   ` [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host Christoph Hellwig
@ 2016-04-11 23:22     ` Laurence Oberman
  2016-04-11 23:32     ` Bart Van Assche
  1 sibling, 0 replies; 20+ messages in thread
From: Laurence Oberman @ 2016-04-11 23:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-rdma, linux-scsi

This looks fine to me.
I am pulling this in to my SRP initiator and target testing ongoing at the moment so will be testing.
Up to now this has likely not affected me but I am pulling in all RDMA patches to test.

Reviewed-by: Laurence Oberman <loberman@redhat.com>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Christoph Hellwig" <hch@lst.de>
To: linux-rdma@vger.kernel.org, linux-scsi@vger.kernel.org
Sent: Monday, April 11, 2016 6:47:25 PM
Subject: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

RDMA drivers need segments that aren't larger than a single HCA page
for memory registrations to work properly, so wire up this limitation
in the host.

While we could just call blk_queue_max_segment_size from ->slave_configure,
that would override the global limit based on the DMA device, so let's do
it the traditional way by adding a field to the Scsi_Host structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8106515..04c660d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 	blk_queue_segment_boundary(q, shost->dma_boundary);
 	dma_set_seg_boundary(dev, shost->dma_boundary);
 
-	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
+	blk_queue_max_segment_size(q,
+		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
 
 	if (!shost->use_clustering)
 		q->limits.cluster = 0;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index fcfa3d7..f11d3fe 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -621,6 +621,7 @@ struct Scsi_Host {
 	short unsigned int sg_tablesize;
 	short unsigned int sg_prot_tablesize;
 	unsigned int max_sectors;
+	unsigned int max_segment_size;
 	unsigned long dma_boundary;
 	/*
 	 * In scsi-mq mode, the number of hardware queues supported by the LLD.
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
  2016-04-11 22:47   ` [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host Christoph Hellwig
  2016-04-11 23:22     ` Laurence Oberman
@ 2016-04-11 23:32     ` Bart Van Assche
       [not found]       ` <570C3400.9080503-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-04-11 23:32 UTC (permalink / raw)
  To: Christoph Hellwig, linux-rdma, linux-scsi

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>   	blk_queue_segment_boundary(q, shost->dma_boundary);
>   	dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +	blk_queue_max_segment_size(q,
> +		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>   	if (!shost->use_clustering)
>   		q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() 
be changed into min_not_zero()?

Bart.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
       [not found]       ` <570C3400.9080503-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-04-11 23:35         ` Christoph Hellwig
  2016-04-11 23:44         ` Laurence Oberman
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-11 23:35 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Mon, Apr 11, 2016 at 04:32:16PM -0700, Bart Van Assche wrote:
> Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() be 
> changed into min_not_zero()?

It probably should.  Thanks for catching this.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
       [not found]       ` <570C3400.9080503-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  2016-04-11 23:35         ` Christoph Hellwig
@ 2016-04-11 23:44         ` Laurence Oberman
       [not found]           ` <1317640777.28365032.1460418264272.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Laurence Oberman @ 2016-04-11 23:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

Thanks Bart
Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
To: "Christoph Hellwig" <hch-jcswGhMUV9g@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>   	blk_queue_segment_boundary(q, shost->dma_boundary);
>   	dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +	blk_queue_max_segment_size(q,
> +		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>   	if (!shost->use_clustering)
>   		q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() 
be changed into min_not_zero()?

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
  2016-04-11 22:47 ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
@ 2016-04-12 10:48   ` Sagi Grimberg
       [not found]     ` <570CD26E.70502-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Sagi Grimberg @ 2016-04-12 10:48 UTC (permalink / raw)
  To: Christoph Hellwig, linux-rdma, linux-scsi


> So that we don't overflow the number of MR segments allocated because
> we have to split on SGL segment into multiple MR segments.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index 80b6bed..784504a 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -623,6 +623,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
>   	shost->max_id = 0;
>   	shost->max_channel = 0;
>   	shost->max_cmd_len = 16;
> +	shost->max_segment_size = PAGE_SIZE;

In iser we sorta rely on 4k pages so we avoid
PAGE_SIZE but rather set SIZE_4K for these sort
of things (like we did in the virt_boundary).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host
       [not found]           ` <1317640777.28365032.1460418264272.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-04-12 12:23             ` Laurence Oberman
  0 siblings, 0 replies; 20+ messages in thread
From: Laurence Oberman @ 2016-04-12 12:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

Modified patch to use min_not_zero()
Ran a number of tests overnight on F/C, SCSI/SAS and SRP (RDMA) and no issues found.

Tested-by: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Cc: "Christoph Hellwig" <hch-jcswGhMUV9g@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Sent: Monday, April 11, 2016 7:44:24 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

Thanks Bart
Good catch, I completely missed it.

Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services

----- Original Message -----
From: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
To: "Christoph Hellwig" <hch-jcswGhMUV9g@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Sent: Monday, April 11, 2016 7:32:16 PM
Subject: Re: [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host

On 04/11/2016 03:47 PM, Christoph Hellwig wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8106515..04c660d 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2120,7 +2120,8 @@ static void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
>   	blk_queue_segment_boundary(q, shost->dma_boundary);
>   	dma_set_seg_boundary(dev, shost->dma_boundary);
>
> -	blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
> +	blk_queue_max_segment_size(q,
> +		min(shost->max_segment_size, dma_get_max_seg_size(dev)));
>
>   	if (!shost->use_clustering)
>   		q->limits.cluster = 0;

Hello Christoph,

Since Scsi_Host.max_segment_size is initialized to zero, shouldn't min() 
be changed into min_not_zero()?

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]     ` <570CD26E.70502-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-04-12 14:27       ` James Bottomley
       [not found]         ` <1460471256.2338.5.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: James Bottomley @ 2016-04-12 14:27 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Tue, 2016-04-12 at 13:48 +0300, Sagi Grimberg wrote:
> > So that we don't overflow the number of MR segments allocated
> > because
> > we have to split on SGL segment into multiple MR segments.
> > 
> > Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
> > ---
> >   drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c
> > b/drivers/infiniband/ulp/iser/iscsi_iser.c
> > index 80b6bed..784504a 100644
> > --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> > +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> > @@ -623,6 +623,7 @@ iscsi_iser_session_create(struct iscsi_endpoint
> > *ep,
> >   	shost->max_id = 0;
> >   	shost->max_channel = 0;
> >   	shost->max_cmd_len = 16;
> > +	shost->max_segment_size = PAGE_SIZE;
> 
> In iser we sorta rely on 4k pages so we avoid
> PAGE_SIZE but rather set SIZE_4K for these sort
> of things (like we did in the virt_boundary).

So you still want only 4k segments even on PPC where the PAGE_SIZE is
16k?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]         ` <1460471256.2338.5.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2016-04-12 15:27           ` Christoph Hellwig
  2016-04-13  8:01           ` Sagi Grimberg
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-12 15:27 UTC (permalink / raw)
  To: James Bottomley
  Cc: Sagi Grimberg, Christoph Hellwig,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 12, 2016 at 07:27:36AM -0700, James Bottomley wrote:
> > In iser we sorta rely on 4k pages so we avoid
> > PAGE_SIZE but rather set SIZE_4K for these sort
> > of things (like we did in the virt_boundary).
> 
> So you still want only 4k segments even on PPC where the PAGE_SIZE is
> 16k?

I'll leave the high level question to Sagi and friends, but if virt_boundary
and the MR page size are set to 4k we need to set the segment size
to the same, so this patch needs to be SIZE_4K indeed unless the
other two are changed.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]         ` <1460471256.2338.5.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2016-04-12 15:27           ` Christoph Hellwig
@ 2016-04-13  8:01           ` Sagi Grimberg
  1 sibling, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2016-04-13  8:01 UTC (permalink / raw)
  To: James Bottomley, Christoph Hellwig,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA


>> In iser we sorta rely on 4k pages so we avoid
>> PAGE_SIZE but rather set SIZE_4K for these sort
>> of things (like we did in the virt_boundary).
>
> So you still want only 4k segments even on PPC where the PAGE_SIZE is
> 16k?

Yes, iSER has the "no-gaps" constraint (like nvme) and some
applications in the past were known to issue vectored IO that doesn't
necessarily align with the system PAGE_SIZE. These sort of workloads
resulted in suboptimal performance on ppc, ia64 etc (almost every I/O
had gaps).

Before the block layer was able to enforce scatterlists without gaps,
iSER used bounce buffering in order to service "gappy" I/O which was
probably a lot worse than bio splitting like the block layer is doing
today, but still we felt that having 4k segments even on larger
PAGE_SIZE systems was a decent compromise.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]     ` <1460470405-11673-3-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
@ 2016-04-28  7:40       ` Or Gerlitz
  0 siblings, 0 replies; 20+ messages in thread
From: Or Gerlitz @ 2016-04-28  7:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 12, 2016 at 5:13 PM, Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org> wrote:
> So that we don't overflow the number of MR segments allocated because
> we have to split on SGL segment into multiple MR segments.
>
> Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>

nit, but please fix


IB/iser: set [...] --> IB/iser: Set max segment size in the SCSI host template

I prefer to avoid names of variables and functions in commit titles
for the iser initiator driver.

I find the usage of higher language very beneficial for maintenance  and such.

Sagi, I would appreciate if you avoid acking patches lacking this
practice for iser-I,
specifically we want people to use capital letter in the 1st word that
follows that IB/iser: prefix - ok?

Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]                 ` <20160412184309.GA3333-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-04-13 14:07                   ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-13 14:07 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 12, 2016 at 11:43:09AM -0700, Christoph Hellwig wrote:
> > I think this means that there is a mismatch between the current block layer
> > limits and what NVMe / RDMA drivers need ...
> 
> If we tell the block layer that we can only handle page sized comments
> using max_segent_size it should do the right thing.
> 
> Now one thing that might be useful is to force the max_segent_size
> when setting the virt boundary, as they seem to be closely related.

I looked into this, and found something that also means the the patches
in this series unfortunately won't work as expected:

blk_queue_max_segment_size checks that we at least set the maximum
segment size to the systen page size.  This means we can't actually
set the segment size to the virt boundary for the case where it's fixed
4k (iSER, NVMe).  So I think we'll have to stick to setting max_sectors
to

	((max_segments - 1) * page_size) >> 9

for all these drivers.  I'll retract this patch and will send a new
one implementing this in iSER, and also research if a common helper
for it makes sense.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
  2016-04-12 14:13   ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
  2016-04-12 15:34     ` Bart Van Assche
@ 2016-04-13  9:39     ` Sagi Grimberg
       [not found]     ` <1460470405-11673-3-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
  2 siblings, 0 replies; 20+ messages in thread
From: Sagi Grimberg @ 2016-04-13  9:39 UTC (permalink / raw)
  To: Christoph Hellwig, linux-rdma, linux-scsi

Acked-by: Sagi Grimberg <sagi@grimberg.me>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]             ` <570D3AE0.4040104-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
@ 2016-04-12 18:43               ` Christoph Hellwig
       [not found]                 ` <20160412184309.GA3333-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-12 18:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 12, 2016 at 11:13:52AM -0700, Bart Van Assche wrote:
> >That's what NVMe does, but I don't think it's a good idea.  Because
> >of the unaligned start into the page this means you have to set the limit
> >to one lower than the actual hardware limit.
> 
> I think this means that there is a mismatch between the current block layer
> limits and what NVMe / RDMA drivers need ...

If we tell the block layer that we can only handle page sized comments
using max_segent_size it should do the right thing.

Now one thing that might be useful is to force the max_segent_size
when setting the virt boundary, as they seem to be closely related.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
       [not found]         ` <20160412165130.GB9568-jcswGhMUV9g@public.gmane.org>
@ 2016-04-12 18:13           ` Bart Van Assche
       [not found]             ` <570D3AE0.4040104-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-04-12 18:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

On 04/12/2016 09:51 AM, Christoph Hellwig wrote:
> On Tue, Apr 12, 2016 at 08:34:03AM -0700, Bart Van Assche wrote:
>> ib_sg_to_pages() can handle segments that are larger than mr->page_size.
>
> The interface handles it fine, but we'll still end up using a segment
> per (MR) page.
>
>> Have you considered to set queue_limits.max_hw_sectors instead of
>> max_segment_size?
>
> That's what NVMe does, but I don't think it's a good idea.  Because
> of the unaligned start into the page this means you have to set the limit
> to one lower than the actual hardware limit.

I think this means that there is a mismatch between the current block 
layer limits and what NVMe / RDMA drivers need ...

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
  2016-04-12 15:34     ` Bart Van Assche
@ 2016-04-12 16:51       ` Christoph Hellwig
       [not found]         ` <20160412165130.GB9568-jcswGhMUV9g@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-12 16:51 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Christoph Hellwig, linux-rdma, linux-scsi

On Tue, Apr 12, 2016 at 08:34:03AM -0700, Bart Van Assche wrote:
> Hello Christoph,
>
> ib_sg_to_pages() can handle segments that are larger than mr->page_size. 

The interface handles it fine, but we'll still end up using a segment
per (MR) page.

> Have you considered to set queue_limits.max_hw_sectors instead of 
> max_segment_size?

That's what NVMe does, but I don't think it's a good idea.  Because
of the unaligned start into the page this means you have to set the limit
to one lower than the actual hardware limit.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] IB/iser: set max_segment_size
  2016-04-12 14:13   ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
@ 2016-04-12 15:34     ` Bart Van Assche
  2016-04-12 16:51       ` Christoph Hellwig
  2016-04-13  9:39     ` Sagi Grimberg
       [not found]     ` <1460470405-11673-3-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
  2 siblings, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-04-12 15:34 UTC (permalink / raw)
  To: Christoph Hellwig, linux-rdma, linux-scsi

On 04/12/2016 07:13 AM, Christoph Hellwig wrote:
> So that we don't overflow the number of MR segments allocated because
> we have to split on SGL segment into multiple MR segments.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index 80b6bed..dc55950 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -623,6 +623,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
>   	shost->max_id = 0;
>   	shost->max_channel = 0;
>   	shost->max_cmd_len = 16;
> +	shost->max_segment_size = SIZE_4K;
>
>   	/*
>   	 * older userspace tools (before 2.0-870) did not pass us

Hello Christoph,

ib_sg_to_pages() can handle segments that are larger than mr->page_size. 
Have you considered to set queue_limits.max_hw_sectors instead of 
max_segment_size?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/2] IB/iser: set max_segment_size
       [not found] ` <1460470405-11673-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
@ 2016-04-12 14:13   ` Christoph Hellwig
  2016-04-12 15:34     ` Bart Van Assche
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Christoph Hellwig @ 2016-04-12 14:13 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA

So that we don't overflow the number of MR segments allocated because
we have to split on SGL segment into multiple MR segments.

Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 80b6bed..dc55950 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -623,6 +623,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 	shost->max_id = 0;
 	shost->max_channel = 0;
 	shost->max_cmd_len = 16;
+	shost->max_segment_size = SIZE_4K;
 
 	/*
 	 * older userspace tools (before 2.0-870) did not pass us
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-04-28  7:40 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-11 22:47 fix large I/O regression with iSER in 4.4+ Christoph Hellwig
     [not found] ` <1460414846-29540-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
2016-04-11 22:47   ` [PATCH 1/2] scsi: add a max_segment_size limitation to struct Scsi_Host Christoph Hellwig
2016-04-11 23:22     ` Laurence Oberman
2016-04-11 23:32     ` Bart Van Assche
     [not found]       ` <570C3400.9080503-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-04-11 23:35         ` Christoph Hellwig
2016-04-11 23:44         ` Laurence Oberman
     [not found]           ` <1317640777.28365032.1460418264272.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-04-12 12:23             ` Laurence Oberman
2016-04-11 22:47 ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
2016-04-12 10:48   ` Sagi Grimberg
     [not found]     ` <570CD26E.70502-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-04-12 14:27       ` James Bottomley
     [not found]         ` <1460471256.2338.5.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2016-04-12 15:27           ` Christoph Hellwig
2016-04-13  8:01           ` Sagi Grimberg
2016-04-12 14:13 fix large I/O regression with iSER in 4.4+ V2 Christoph Hellwig
     [not found] ` <1460470405-11673-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
2016-04-12 14:13   ` [PATCH 2/2] IB/iser: set max_segment_size Christoph Hellwig
2016-04-12 15:34     ` Bart Van Assche
2016-04-12 16:51       ` Christoph Hellwig
     [not found]         ` <20160412165130.GB9568-jcswGhMUV9g@public.gmane.org>
2016-04-12 18:13           ` Bart Van Assche
     [not found]             ` <570D3AE0.4040104-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2016-04-12 18:43               ` Christoph Hellwig
     [not found]                 ` <20160412184309.GA3333-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-04-13 14:07                   ` Christoph Hellwig
2016-04-13  9:39     ` Sagi Grimberg
     [not found]     ` <1460470405-11673-3-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
2016-04-28  7:40       ` Or Gerlitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.