* T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
@ 2016-04-08 22:15 Christoph Hellwig
2016-04-08 23:48 ` Nicholas A. Bellinger
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2016-04-08 22:15 UTC (permalink / raw)
To: sagig; +Cc: linux-rdma, target-devel
Hi all,
I've recently started looking into adding support for signature MRs
in the RDMA R/W API, but it seems like it's already broken upstream.
I suspect it's the target that's at fault, but I'm not sure. I've
tested 4.5 and todays 4.6-rc tree on both the target and the client
side.
The setup is pretty simple: on the target I've loaded scsi_debug
with dif=1 and dix=1, and set I've set fabric_prot_type=1, t10_pi=1
and ImmediateData to No on the tpg, and then did a simple mkfs.ext4.
Just doing the login I get the following error on the target:
[ 369.890182] Bad block size given: 0
[ 369.894131] mlx5_0:mlx5_ib_post_send:3296:(pid 331):
[ 369.894228] isert: isert_reg_sig_mr: fast registration failed, ret:-22
[ 369.901580] isert: isert_handle_prot_cmd: conn ffff880904d54000 failed to fast reg mr
[ 369.906348] iSCSI/iqn.1993-08.org.debian:01:197dad5d1f17: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 369.910395] isert: isert_put_datain: Cmd: ffff880902b26b00 failed to prepare RDMA res
If I then do an actual I/O think go downhill quickly.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-08 22:15 T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc Christoph Hellwig
@ 2016-04-08 23:48 ` Nicholas A. Bellinger
2016-04-09 0:14 ` Christoph Hellwig
0 siblings, 1 reply; 9+ messages in thread
From: Nicholas A. Bellinger @ 2016-04-08 23:48 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: sagig, linux-rdma, target-devel
On Fri, 2016-04-08 at 15:15 -0700, Christoph Hellwig wrote:
> Hi all,
>
> I've recently started looking into adding support for signature MRs
> in the RDMA R/W API, but it seems like it's already broken upstream.
>
> I suspect it's the target that's at fault, but I'm not sure. I've
> tested 4.5 and todays 4.6-rc tree on both the target and the client
> side.
>
> The setup is pretty simple: on the target I've loaded scsi_debug
> with dif=1 and dix=1, and set I've set fabric_prot_type=1, t10_pi=1
> and ImmediateData to No on the tpg, and then did a simple mkfs.ext4.
>
> Just doing the login I get the following error on the target:
>
> [ 369.890182] Bad block size given: 0
> [ 369.894131] mlx5_0:mlx5_ib_post_send:3296:(pid 331):
> [ 369.894228] isert: isert_reg_sig_mr: fast registration failed, ret:-22
> [ 369.901580] isert: isert_handle_prot_cmd: conn ffff880904d54000 failed to fast reg mr
> [ 369.906348] iSCSI/iqn.1993-08.org.debian:01:197dad5d1f17: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
> [ 369.910395] isert: isert_put_datain: Cmd: ffff880902b26b00 failed to prepare RDMA res
>
> If I then do an actual I/O think go downhill quickly.
It looks like the isert_set_dif_domain() assignments of
ib_sig_domain->sig.dif.* from dev->dev_attrib.block_size + friends is
being zeroed before making it into mlx5_ib_post_send() ->
IB_WR_REG_SIG_MR -> set_sig_umr_wr() -> set_sig_data_segment().
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-08 23:48 ` Nicholas A. Bellinger
@ 2016-04-09 0:14 ` Christoph Hellwig
2016-04-09 2:36 ` Nicholas A. Bellinger
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2016-04-09 0:14 UTC (permalink / raw)
To: Nicholas A. Bellinger; +Cc: Christoph Hellwig, sagig, linux-rdma, target-devel
On Fri, Apr 08, 2016 at 04:48:10PM -0700, Nicholas A. Bellinger wrote:
> It looks like the isert_set_dif_domain() assignments of
> ib_sig_domain->sig.dif.* from dev->dev_attrib.block_size + friends is
> being zeroed before making it into mlx5_ib_post_send() ->
> IB_WR_REG_SIG_MR -> set_sig_umr_wr() -> set_sig_data_segment().
>From my limited understanding that's not the case, even if the weird
printk about the bad block size might suggest it.
I think what we have is the TARGET_PROT_DIN_INSERT /
TARGET_PROT_DOUT_STRIP case (probably the first), and thus
sig_attrs->mem.sig_type is set to IB_SIG_TYPE_NONE, but for some
reason the prot pointer is set, which will lead to this misleading
printk.
Does anyone know a recent working version of T10 PI in isert? There
aren't any recent-ish changes that look related.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-09 0:14 ` Christoph Hellwig
@ 2016-04-09 2:36 ` Nicholas A. Bellinger
2016-04-09 4:58 ` Christoph Hellwig
[not found] ` <1460169374.5010.4.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org>
0 siblings, 2 replies; 9+ messages in thread
From: Nicholas A. Bellinger @ 2016-04-09 2:36 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: sagig, linux-rdma, target-devel
On Fri, 2016-04-08 at 17:14 -0700, Christoph Hellwig wrote:
> On Fri, Apr 08, 2016 at 04:48:10PM -0700, Nicholas A. Bellinger wrote:
> > It looks like the isert_set_dif_domain() assignments of
> > ib_sig_domain->sig.dif.* from dev->dev_attrib.block_size + friends is
> > being zeroed before making it into mlx5_ib_post_send() ->
> > IB_WR_REG_SIG_MR -> set_sig_umr_wr() -> set_sig_data_segment().
>
> From my limited understanding that's not the case, even if the weird
> printk about the bad block size might suggest it.
>
> I think what we have is the TARGET_PROT_DIN_INSERT /
> TARGET_PROT_DOUT_STRIP case (probably the first), and thus
> sig_attrs->mem.sig_type is set to IB_SIG_TYPE_NONE, but for some
> reason the prot pointer is set, which will lead to this misleading
> printk.
Ah, yes. However, TARGET_PROT_DIN_INSERT / TARGET_PROT_DOUT_STRIP
will only be happening in sbc_set_prot_op_checks() if fabric_prot = true
for a backend that does not support PI.
Eg: fabric_prot is for the special case where the fabric supports PI,
but the backend does not, and the normal feature bits a device would
expose for PI are emulated based upon the fabric protection features.
Since your backend does support PI, cmd->prot_op should always be
TARGET_PROT_*_PASS regardless.
The other thing to check is that isert_get_sup_prot_ops() is returning
TARGET_PROT_ALL when tpg->tpg_attrib.t10_pi = true.
>
> Does anyone know a recent working version of T10 PI in isert? There
> aren't any recent-ish changes that look related.
Sagi..?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-09 2:36 ` Nicholas A. Bellinger
@ 2016-04-09 4:58 ` Christoph Hellwig
[not found] ` <20160409045846.GA9269-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
[not found] ` <1460169374.5010.4.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org>
1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2016-04-09 4:58 UTC (permalink / raw)
To: Nicholas A. Bellinger; +Cc: sagig, linux-rdma, target-devel
On Fri, Apr 08, 2016 at 07:36:14PM -0700, Nicholas A. Bellinger wrote:
> Ah, yes. However, TARGET_PROT_DIN_INSERT / TARGET_PROT_DOUT_STRIP
> will only be happening in sbc_set_prot_op_checks() if fabric_prot = true
> for a backend that does not support PI.
>
> Eg: fabric_prot is for the special case where the fabric supports PI,
> but the backend does not, and the normal feature bits a device would
> expose for PI are emulated based upon the fabric protection features.
>
> Since your backend does support PI, cmd->prot_op should always be
> TARGET_PROT_*_PASS regardless.
>
> The other thing to check is that isert_get_sup_prot_ops() is returning
> TARGET_PROT_ALL when tpg->tpg_attrib.t10_pi = true.
Ok, it turns out this problem was a second LUN that doesn't support
PI. So the reported bug is for a this fabrics_prot case, the scsi_debug
LUN was actually doing fine.
The root cause seems to be that transport_generic_new_cmd allocates
a prot_sg for all protection cases, which causes isert_reg_sig_mr
to assign a value to sig_wr.prot even for the strip/insert case,
which will then cause mlx5 to blow up. But even when fixing that
I run into data compare errors, so there's defintively something deeper
hiding here.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
[not found] ` <1460169374.5010.4.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org>
@ 2016-04-10 8:42 ` sagig
0 siblings, 0 replies; 9+ messages in thread
From: sagig @ 2016-04-10 8:42 UTC (permalink / raw)
To: Nicholas A. Bellinger, Christoph Hellwig
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
target-devel-u79uwXL29TY76Z2rM5mHXA, Max Gurtovoy,
jennyf-VPRAkNaXOzVWk0Htik3J/w
>> Does anyone know a recent working version of T10 PI in isert? There
>> aren't any recent-ish changes that look related.
> Sagi..?
The T10-PI isert support should be working fine AFAICT.
I know Mellanox folks are testing it nightly.
CCing Max & Jenny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
[not found] ` <20160409045846.GA9269-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-04-10 8:53 ` Sagi Grimberg
2016-04-10 14:37 ` Christoph Hellwig
0 siblings, 1 reply; 9+ messages in thread
From: Sagi Grimberg @ 2016-04-10 8:53 UTC (permalink / raw)
To: Christoph Hellwig, Nicholas A. Bellinger
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, target-devel-u79uwXL29TY76Z2rM5mHXA
On 09/04/16 07:58, Christoph Hellwig wrote:
> On Fri, Apr 08, 2016 at 07:36:14PM -0700, Nicholas A. Bellinger wrote:
>> Ah, yes. However, TARGET_PROT_DIN_INSERT / TARGET_PROT_DOUT_STRIP
>> will only be happening in sbc_set_prot_op_checks() if fabric_prot = true
>> for a backend that does not support PI.
>>
>> Eg: fabric_prot is for the special case where the fabric supports PI,
>> but the backend does not, and the normal feature bits a device would
>> expose for PI are emulated based upon the fabric protection features.
>>
>> Since your backend does support PI, cmd->prot_op should always be
>> TARGET_PROT_*_PASS regardless.
>>
>> The other thing to check is that isert_get_sup_prot_ops() is returning
>> TARGET_PROT_ALL when tpg->tpg_attrib.t10_pi = true.
> Ok, it turns out this problem was a second LUN that doesn't support
> PI. So the reported bug is for a this fabrics_prot case, the scsi_debug
> LUN was actually doing fine.
OK, good to know.
Since the fabric_prot was added later, I don't think it was properly
tested (and from what I've seen, not a very interesting use-case).
>
> The root cause seems to be that transport_generic_new_cmd allocates
> a prot_sg for all protection cases, which causes isert_reg_sig_mr
> to assign a value to sig_wr.prot even for the strip/insert case,
> which will then cause mlx5 to blow up. But even when fixing that
> I run into data compare errors, so there's defintively something deeper
> hiding here.
I don't have access to mlx5 devices at the moment so I can't help a lot :(
But data integrity errors sound strange...
Is it possible that the the sess_prot_type is not set correctly?
P.S.
I'm no longer available in @mellanox.com
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-10 8:53 ` Sagi Grimberg
@ 2016-04-10 14:37 ` Christoph Hellwig
2016-04-10 15:30 ` Sagi Grimberg
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2016-04-10 14:37 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Christoph Hellwig, Nicholas A. Bellinger, linux-rdma, target-devel
On Sun, Apr 10, 2016 at 11:53:14AM +0300, Sagi Grimberg wrote:
> OK, good to know.
>
> Since the fabric_prot was added later, I don't think it was properly
> tested (and from what I've seen, not a very interesting use-case).
It's been there for more a year, and once your have non-T10 capable
LUNs in a TPG it gets turned on automatically it seems..
> P.S.
> I'm no longer available in @mellanox.com
I keep messing that all the time by grabbing an address from the list
archives, sorry. That's why people should only use personal addresses
on mailing lists :)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc
2016-04-10 14:37 ` Christoph Hellwig
@ 2016-04-10 15:30 ` Sagi Grimberg
0 siblings, 0 replies; 9+ messages in thread
From: Sagi Grimberg @ 2016-04-10 15:30 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Nicholas A. Bellinger, linux-rdma, target-devel
>> Since the fabric_prot was added later, I don't think it was properly
>> tested (and from what I've seen, not a very interesting use-case).
>
> It's been there for more a year, and once your have non-T10 capable
> LUNs in a TPG it gets turned on automatically it seems..
Yea, but I'm not aware of anyone running non-PI together with PI.
Usually people use either/or (although it can make sense to mix
if you have different classes of data sets).
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-04-10 15:30 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-08 22:15 T10 PI offloading seems to be broken in iser/isert in 4.5/4.6-rc Christoph Hellwig
2016-04-08 23:48 ` Nicholas A. Bellinger
2016-04-09 0:14 ` Christoph Hellwig
2016-04-09 2:36 ` Nicholas A. Bellinger
2016-04-09 4:58 ` Christoph Hellwig
[not found] ` <20160409045846.GA9269-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-04-10 8:53 ` Sagi Grimberg
2016-04-10 14:37 ` Christoph Hellwig
2016-04-10 15:30 ` Sagi Grimberg
[not found] ` <1460169374.5010.4.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org>
2016-04-10 8:42 ` sagig
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.