* [PATCH target-pending] iscsi-target: make sure to wake up sleeping login worker
From: Florian Westphal @ 2018-01-19 13:36 UTC
To: target-devel; +Cc: mchristi, nab, netdev, linux-scsi, Florian Westphal
Mike Christie reports:
  Starting in 4.14 iscsi logins will fail around 50% of the time.

Problem appears to be that iscsi_target_sk_data_ready() callback may
return without doing anything in case it finds the login work queue
is still blocked in sock_recvmsg().

Nicholas Bellinger says:
  It would indicate users providing their own ->sk_data_ready() callback
  must be responsible for waking up a kthread context blocked on
  sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is
  received before the first sock_recvmsg(..., MSG_WAITALL) completes.

So, do this and invoke the original data_ready() callback -- in
case of tcp sockets this takes care of waking the thread.

Disclaimer: I do not understand why this problem did not show up before
tcp prequeue removal.
Reported-by: Mike Christie <mchristi@redhat.com>
Bisected-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Diagnosed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fixes: e7942d0633c4 ("tcp: remove prequeue support")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
drivers/target/iscsi/iscsi_target_nego.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index b686e2ce9c0e..3723f8f419aa 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -432,6 +432,9 @@ static void iscsi_target_sk_data_ready(struct sock *sk)
 	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
 		write_unlock_bh(&sk->sk_callback_lock);
 		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
+		if (WARN_ON(iscsi_target_sk_data_ready == conn->orig_data_ready))
+			return;
+		conn->orig_data_ready(sk);
 		return;
 	}
--
2.13.6
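
Editorial sketch (not part of the posted patch): the idea described in the commit message, forwarding a second data-ready event to the saved original callback while a read is already in progress, can be modelled in plain userspace C. Everything here is a hypothetical stand-in: struct fake_sock, default_data_ready(), login_data_ready(), install_login_callback() and the two counters are invented for illustration, and the atomic_exchange() call only approximates what test_and_set_bit() does in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock plus the iSCSI connection state. */
struct fake_sock;
typedef void (*data_ready_fn)(struct fake_sock *sk);

struct fake_sock {
	data_ready_fn data_ready;      /* current callback, like sk->sk_data_ready */
	data_ready_fn orig_data_ready; /* saved callback, like conn->orig_data_ready */
	atomic_bool read_active;       /* plays the role of LOGIN_FLAGS_READ_ACTIVE */
	int wakeups;                   /* times a sleeping reader would be woken */
	int work_queued;               /* times login work would be scheduled */
};

/* Models the default callback: wake whoever sleeps on the socket. */
static void default_data_ready(struct fake_sock *sk)
{
	sk->wakeups++;
}

/* Models the patched login callback. */
static void login_data_ready(struct fake_sock *sk)
{
	/* atomic_exchange() returns the old value: true means a read is active. */
	if (atomic_exchange(&sk->read_active, true)) {
		/* The reader may be blocked waiting for more data: forward the event. */
		sk->orig_data_ready(sk);
		return;
	}
	/* No read in progress: schedule the login worker. */
	sk->work_queued++;
}

static void fake_sock_init(struct fake_sock *sk)
{
	sk->data_ready = default_data_ready;
	sk->orig_data_ready = NULL;
	atomic_init(&sk->read_active, false);
	sk->wakeups = 0;
	sk->work_queued = 0;
}

/* Save the current callback and install ours in its place. */
static void install_login_callback(struct fake_sock *sk)
{
	/* Installing twice would make the callback chain to itself. */
	assert(sk->data_ready != login_data_ready);
	sk->orig_data_ready = sk->data_ready;
	sk->data_ready = login_data_ready;
}
```

In this model the first event schedules work and the second one, arriving while read_active is still set, reaches default_data_ready() instead of being dropped. The assert() in install_login_callback() is a design choice of the sketch: it rules out self-chaining at install time rather than checking on every event.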
* Re: [PATCH target-pending] iscsi-target: make sure to wake up sleeping login worker
From: Eric Dumazet @ 2018-01-19 15:46 UTC
To: Florian Westphal, target-devel; +Cc: mchristi, nab, netdev, linux-scsi
On Fri, 2018-01-19 at 14:36 +0100, Florian Westphal wrote:
> Mike Christie reports:
> Starting in 4.14 iscsi logins will fail around 50% of the time.
>
> Problem appears to be that iscsi_target_sk_data_ready() callback may
> return without doing anything in case it finds the login work queue
> is still blocked in sock_recvmsg().
>
> Nicholas Bellinger says:
> It would indicate users providing their own ->sk_data_ready() callback
> must be responsible for waking up a kthread context blocked on
> sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is
> received before the first sock_recvmsg(..., MSG_WAITALL) completes.
>
> So, do this and invoke the original data_ready() callback -- in
> case of tcp sockets this takes care of waking the thread.
>
> Disclaimer: I do not understand why this problem did not show up before
> tcp prequeue removal.
>
> Reported-by: Mike Christie <mchristi@redhat.com>
> Bisected-by: Mike Christie <mchristi@redhat.com>
> Tested-by: Mike Christie <mchristi@redhat.com>
> Diagnosed-by: Nicholas Bellinger <nab@linux-iscsi.org>
> Fixes: e7942d0633c4 ("tcp: remove prequeue support")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
> drivers/target/iscsi/iscsi_target_nego.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
> index b686e2ce9c0e..3723f8f419aa 100644
> --- a/drivers/target/iscsi/iscsi_target_nego.c
> +++ b/drivers/target/iscsi/iscsi_target_nego.c
> @@ -432,6 +432,9 @@ static void iscsi_target_sk_data_ready(struct sock *sk)
>  	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
>  		write_unlock_bh(&sk->sk_callback_lock);
>  		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
> +		if (WARN_ON(iscsi_target_sk_data_ready == conn->orig_data_ready))
> +			return;

Does this WARN_ON() belong to this fix?
At least make it WARN_ON_ONCE() or pr_err_once().

> +		conn->orig_data_ready(sk);
>  		return;
>  	}
>
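
Editorial sketch (not part of the thread): the difference between warning on every hit and warning once matters because a data-ready callback can fire for every incoming segment. The userspace C below only illustrates that difference. warn_on() and warn_on_once() are hypothetical helpers, not the kernel macros, and warnings_emitted exists purely so the behaviour can be checked. The caller-supplied flag is not thread-safe.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts reports so the two helpers can be compared. */
static int warnings_emitted;

/* Hypothetical analogue of WARN_ON(): reports every time cond is true. */
static bool warn_on(bool cond, const char *what)
{
	if (cond) {
		fprintf(stderr, "warning: %s\n", what);
		warnings_emitted++;
	}
	return cond;
}

/*
 * Hypothetical analogue of WARN_ON_ONCE(): reports only the first time.
 * The caller supplies one flag per call site. The return value still
 * reflects cond, so the caller can bail out on every hit.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "warning (once): %s\n", what);
		warnings_emitted++;
	}
	return cond;
}
```

Calling warn_on(true, ...) three times produces three reports, while three calls to warn_on_once() sharing one flag produce a single report and still return true each time.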
* Re: [PATCH target-pending] iscsi-target: make sure to wake up sleeping login worker
From: Florian Westphal @ 2018-01-19 17:26 UTC
To: Eric Dumazet
Cc: Florian Westphal, target-devel, mchristi, nab, netdev, linux-scsi
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2018-01-19 at 14:36 +0100, Florian Westphal wrote:
> > diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
> > index b686e2ce9c0e..3723f8f419aa 100644
> > --- a/drivers/target/iscsi/iscsi_target_nego.c
> > +++ b/drivers/target/iscsi/iscsi_target_nego.c
> > @@ -432,6 +432,9 @@ static void iscsi_target_sk_data_ready(struct sock *sk)
> >  	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
> >  		write_unlock_bh(&sk->sk_callback_lock);
> >  		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
> > +		if (WARN_ON(iscsi_target_sk_data_ready == conn->orig_data_ready))
> > +			return;
>
> Does this WARN_ON() belong to this fix?
> At least make it WARN_ON_ONCE() or pr_err_once().

Nicholas, I don't know this code at all so it would be good if you could
give advice here (omit it altogether, WARN_ON_ONCE, ...).
Thanks!
* Re: [PATCH target-pending] iscsi-target: make sure to wake up sleeping login worker
From: Nicholas A. Bellinger @ 2018-01-24 7:01 UTC
To: Florian Westphal; +Cc: Eric Dumazet, target-devel, mchristi, netdev, linux-scsi
Hey Florian & Co,
On Fri, 2018-01-19 at 18:26 +0100, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Fri, 2018-01-19 at 14:36 +0100, Florian Westphal wrote:
> > > diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
> > > index b686e2ce9c0e..3723f8f419aa 100644
> > > --- a/drivers/target/iscsi/iscsi_target_nego.c
> > > +++ b/drivers/target/iscsi/iscsi_target_nego.c
> > > @@ -432,6 +432,9 @@ static void iscsi_target_sk_data_ready(struct sock *sk)
> > >  	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
> > >  		write_unlock_bh(&sk->sk_callback_lock);
> > >  		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
> > > +		if (WARN_ON(iscsi_target_sk_data_ready == conn->orig_data_ready))
> > > +			return;
> >
> > Does this WARN_ON() belong to this fix?
> > At least make it WARN_ON_ONCE() or pr_err_once().
>
> Nicholas, I don't know this code at all so it would be good if you could
> give advice here (omit it altogether, WARN_ON_ONCE, ...).
>
This is regular behavior during multi-PDU login sequences, and should
not include a WARN_ON.

So with MNC's Tested-by in place, applying to target-pending/for-next
minus the WARN_ON, with an extra 4.14.y stable tag.

Thanks again for taking a look at this.

To your earlier point wrt net.ipv4.tcp_low_latency=1 on 4.13 code not
triggering pre-queue logic: from grokking the original patch to drop
prequeue I agree this should really be the case, but am still at a loss
how MNC is triggering on 4.14+ unless something else has changed to
uncover this iscsi-target bug.

Still curious to verify the root cause, but I haven't been able to
reproduce this in VMs on small scale, and haven't had cycles to
reproduce on HW yet.

That said, since the bug appears to be masked on <= 4.13.y +
tcp_low_latency=1, unless someone can reproduce this on earlier code
with tcp_low_latency=0, I'll leave off the older stable tag for now.
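
Editorial sketch (not part of the thread): the multi-PDU case above relies on a read that waits for a full-length buffer while the data arrives in more than one piece. The userspace C below shows only the MSG_WAITALL side of that, using a socketpair() instead of TCP. read_split_header(), HDR_LEN and FIRST_CHUNK are hypothetical names; 48 is the length of an iSCSI basic header segment and the 20/28 split is arbitrary. To stay single-threaded, both pieces are queued before the read, so the call does not actually block here; reproducing the blocked reader would need a second thread.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define HDR_LEN     48 /* length of an iSCSI basic header segment */
#define FIRST_CHUNK 20 /* arbitrary split point for the demo */

/*
 * Sends a HDR_LEN-byte header as two separate writes, then reads it back
 * with a single MSG_WAITALL call. Returns the number of bytes read, or -1
 * if the socket pair could not be set up or a send came up short.
 */
static ssize_t read_split_header(unsigned char *out)
{
	int sv[2];
	unsigned char hdr[HDR_LEN];
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	memset(hdr, 0xab, sizeof(hdr));

	/* Two sends stand in for two segments, each raising a data-ready event. */
	if (send(sv[0], hdr, FIRST_CHUNK, 0) == FIRST_CHUNK &&
	    send(sv[0], hdr + FIRST_CHUNK, HDR_LEN - FIRST_CHUNK, 0) ==
		    HDR_LEN - FIRST_CHUNK)
		/* MSG_WAITALL asks recv() to return only once HDR_LEN bytes are in. */
		n = recv(sv[1], out, HDR_LEN, MSG_WAITALL);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

If the second piece were sent only after the reader had gone to sleep, something would have to wake the reader when it arrived, which is the job the patch hands to the original data_ready() callback.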