All of lore.kernel.org
 help / color / mirror / Atom feed
* WARN_ON added to rpc_create()
@ 2016-08-03 15:27 Chuck Lever
  2016-08-03 17:47 ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2016-08-03 15:27 UTC (permalink / raw)
  To: Bruce Fields; +Cc: Linux NFS Mailing List

Hi Bruce-

I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
Author:     J. Bruce Fields <bfields@redhat.com>
AuthorDate: Tue May 17 12:38:21 2016 -0400

    rpc: share one xps between all backchannels

has added this piece of code:

@@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
        struct rpc_clnt *clnt = NULL;
        struct rpc_xprt_switch *xps;
 
-       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
-       if (xps == NULL) {
-               xprt_put(xprt);
-               return ERR_PTR(-ENOMEM);
+       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
+               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
+               xps = args->bc_xprt->xpt_bc_xps;
+               xprt_switch_get(xps);
+       } else {


the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.

Can you say why it was added? Is there something RPC/RDMA needs to
do to make the code safe?


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-03 15:27 WARN_ON added to rpc_create() Chuck Lever
@ 2016-08-03 17:47 ` J. Bruce Fields
  2016-08-03 19:40   ` Chuck Lever
  0 siblings, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-03 17:47 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Bruce Fields, Linux NFS Mailing List

On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
> Hi Bruce-
> 
> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
> Author:     J. Bruce Fields <bfields@redhat.com>
> AuthorDate: Tue May 17 12:38:21 2016 -0400
> 
>     rpc: share one xps between all backchannels
> 
> has added this piece of code:
> 
> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>         struct rpc_clnt *clnt = NULL;
>         struct rpc_xprt_switch *xps;
>  
> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
> -       if (xps == NULL) {
> -               xprt_put(xprt);
> -               return ERR_PTR(-ENOMEM);
> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> +               xps = args->bc_xprt->xpt_bc_xps;
> +               xprt_switch_get(xps);
> +       } else {
> 
> 
> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
> 
> Can you say why it was added? Is there something RPC/RDMA needs to
> do to make the code safe?

What is args->protocol in this case?

Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
transport.  That makes sense.

So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
was "is this a backchannel".

Does that fix the problem?

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-03 17:47 ` J. Bruce Fields
@ 2016-08-03 19:40   ` Chuck Lever
  2016-08-10 18:01     ` Chuck Lever
  2016-08-18 21:56     ` J. Bruce Fields
  0 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever @ 2016-08-03 19:40 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
> 
> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>> Hi Bruce-
>> 
>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>> Author:     J. Bruce Fields <bfields@redhat.com>
>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>> 
>>    rpc: share one xps between all backchannels
>> 
>> has added this piece of code:
>> 
>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>        struct rpc_clnt *clnt = NULL;
>>        struct rpc_xprt_switch *xps;
>> 
>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>> -       if (xps == NULL) {
>> -               xprt_put(xprt);
>> -               return ERR_PTR(-ENOMEM);
>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>> +               xps = args->bc_xprt->xpt_bc_xps;
>> +               xprt_switch_get(xps);
>> +       } else {
>> 
>> 
>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>> 
>> Can you say why it was added? Is there something RPC/RDMA needs to
>> do to make the code safe?
> 
> What is args->protocol in this case?
> 
> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
> transport.  That makes sense.
> 
> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
> was "is this a backchannel".
> 
> Does that fix the problem?

This simple fix eliminates the log noise:

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 2808d55..f94caf7 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
        char servername[48];
 
        if (args->bc_xprt) {
-               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
+               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
                xprt = args->bc_xprt->xpt_bc_xprt;
                if (xprt) {
                        xprt_get(xprt);


This code seems to come from:

commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
Author:     J. Bruce Fields <bfields@redhat.com>
AuthorDate: Mon May 16 17:03:42 2016 -0400

    nfsd4/rpc: move backchannel create logic into rpc code


Where it may have been copied from:

-static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
-{
-       struct rpc_xprt *xprt;
-
-       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
-               return rpc_create(args);
-
-       xprt = args->bc_xprt->xpt_bc_xprt;
-       if (xprt) {
-               xprt_get(xprt);
-               return rpc_create_xprt(args, xprt);
-       }
-
-       return rpc_create(args);
-}

There's no warning here. In fact, protocol != BC_TCP seems to
be expected.

I'm wondering if the warning is needed at all?


--
Chuck Lever




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-03 19:40   ` Chuck Lever
@ 2016-08-10 18:01     ` Chuck Lever
  2016-08-18 21:56       ` J. Bruce Fields
  2016-08-18 21:56     ` J. Bruce Fields
  1 sibling, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2016-08-10 18:01 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List

Following up.


> On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
>> 
>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>> Hi Bruce-
>>> 
>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>> 
>>>   rpc: share one xps between all backchannels
>>> 
>>> has added this piece of code:
>>> 
>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>       struct rpc_clnt *clnt = NULL;
>>>       struct rpc_xprt_switch *xps;
>>> 
>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>> -       if (xps == NULL) {
>>> -               xprt_put(xprt);
>>> -               return ERR_PTR(-ENOMEM);
>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>> +               xprt_switch_get(xps);
>>> +       } else {
>>> 
>>> 
>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>> 
>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>> do to make the code safe?
>> 
>> What is args->protocol in this case?
>> 
>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>> transport.  That makes sense.
>> 
>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>> was "is this a backchannel".
>> 
>> Does that fix the problem?
> 
> This simple fix eliminates the log noise:
> 
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 2808d55..f94caf7 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>        char servername[48];
> 
>        if (args->bc_xprt) {
> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>                xprt = args->bc_xprt->xpt_bc_xprt;
>                if (xprt) {
>                        xprt_get(xprt);
> 
> 
> This code seems to come from:
> 
> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
> Author:     J. Bruce Fields <bfields@redhat.com>
> AuthorDate: Mon May 16 17:03:42 2016 -0400
> 
>    nfsd4/rpc: move backchannel create logic into rpc code
> 
> 
> Where it may have been copied from:
> 
> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
> -{
> -       struct rpc_xprt *xprt;
> -
> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
> -               return rpc_create(args);
> -
> -       xprt = args->bc_xprt->xpt_bc_xprt;
> -       if (xprt) {
> -               xprt_get(xprt);
> -               return rpc_create_xprt(args, xprt);
> -       }
> -
> -       return rpc_create(args);
> -}
> 
> There's no warning here. In fact, protocol != BC_TCP seems to
> be expected.
> 
> I'm wondering if the warning is needed at all?

Using NFSv4.1/RDMA against my v4.7 NFS server seems to result
in a system deadlock in short order on the server. I haven't
looked further into this because you mentioned you were going
to have a look at these commits that change the backchannel
code.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-03 19:40   ` Chuck Lever
  2016-08-10 18:01     ` Chuck Lever
@ 2016-08-18 21:56     ` J. Bruce Fields
  2016-08-18 22:11       ` Chuck Lever
  1 sibling, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-18 21:56 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
> 
> > On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
> > 
> > On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
> >> Hi Bruce-
> >> 
> >> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
> >> Author:     J. Bruce Fields <bfields@redhat.com>
> >> AuthorDate: Tue May 17 12:38:21 2016 -0400
> >> 
> >>    rpc: share one xps between all backchannels
> >> 
> >> has added this piece of code:
> >> 
> >> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
> >>        struct rpc_clnt *clnt = NULL;
> >>        struct rpc_xprt_switch *xps;
> >> 
> >> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
> >> -       if (xps == NULL) {
> >> -               xprt_put(xprt);
> >> -               return ERR_PTR(-ENOMEM);
> >> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
> >> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> >> +               xps = args->bc_xprt->xpt_bc_xps;
> >> +               xprt_switch_get(xps);
> >> +       } else {
> >> 
> >> 
> >> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
> >> 
> >> Can you say why it was added? Is there something RPC/RDMA needs to
> >> do to make the code safe?
> > 
> > What is args->protocol in this case?
> > 
> > Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
> > OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
> > transport.  That makes sense.
> > 
> > So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
> > was "is this a backchannel".
> > 
> > Does that fix the problem?
> 
> This simple fix eliminates the log noise:
> 
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 2808d55..f94caf7 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>         char servername[48];
>  
>         if (args->bc_xprt) {
> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>                 xprt = args->bc_xprt->xpt_bc_xprt;
>                 if (xprt) {
>                         xprt_get(xprt);
> 
> 
> This code seems to come from:
> 
> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
> Author:     J. Bruce Fields <bfields@redhat.com>
> AuthorDate: Mon May 16 17:03:42 2016 -0400
> 
>     nfsd4/rpc: move backchannel create logic into rpc code
> 
> 
> Where it may have been copied from:
> 
> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
> -{
> -       struct rpc_xprt *xprt;
> -
> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
> -               return rpc_create(args);
> -
> -       xprt = args->bc_xprt->xpt_bc_xprt;
> -       if (xprt) {
> -               xprt_get(xprt);
> -               return rpc_create_xprt(args, xprt);
> -       }
> -
> -       return rpc_create(args);
> -}
> 
> There's no warning here. In fact, protocol != BC_TCP seems to
> be expected.

The protocol should be BC_TCP (OK, actually just BC) if and only if
bc_xprt is set.

(The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
4.1+ case, the new client uses an existing (client-initiated)
connection, in the 4.0 case, the new client must also have a new
connection.

In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
or might not--depends on whether that particular connection has been
used for a backchannel previously.)

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-10 18:01     ` Chuck Lever
@ 2016-08-18 21:56       ` J. Bruce Fields
  2016-08-18 21:59         ` Chuck Lever
  0 siblings, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-18 21:56 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote:
> Following up.
> 
> 
> > On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> >> 
> >> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
> >> 
> >> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
> >>> Hi Bruce-
> >>> 
> >>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
> >>> Author:     J. Bruce Fields <bfields@redhat.com>
> >>> AuthorDate: Tue May 17 12:38:21 2016 -0400
> >>> 
> >>>   rpc: share one xps between all backchannels
> >>> 
> >>> has added this piece of code:
> >>> 
> >>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
> >>>       struct rpc_clnt *clnt = NULL;
> >>>       struct rpc_xprt_switch *xps;
> >>> 
> >>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
> >>> -       if (xps == NULL) {
> >>> -               xprt_put(xprt);
> >>> -               return ERR_PTR(-ENOMEM);
> >>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
> >>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> >>> +               xps = args->bc_xprt->xpt_bc_xps;
> >>> +               xprt_switch_get(xps);
> >>> +       } else {
> >>> 
> >>> 
> >>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
> >>> 
> >>> Can you say why it was added? Is there something RPC/RDMA needs to
> >>> do to make the code safe?
> >> 
> >> What is args->protocol in this case?
> >> 
> >> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
> >> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
> >> transport.  That makes sense.
> >> 
> >> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
> >> was "is this a backchannel".
> >> 
> >> Does that fix the problem?
> > 
> > This simple fix eliminates the log noise:
> > 
> > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> > index 2808d55..f94caf7 100644
> > --- a/net/sunrpc/clnt.c
> > +++ b/net/sunrpc/clnt.c
> > @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
> >        char servername[48];
> > 
> >        if (args->bc_xprt) {
> > -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> > +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
> >                xprt = args->bc_xprt->xpt_bc_xprt;
> >                if (xprt) {
> >                        xprt_get(xprt);
> > 
> > 
> > This code seems to come from:
> > 
> > commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
> > Author:     J. Bruce Fields <bfields@redhat.com>
> > AuthorDate: Mon May 16 17:03:42 2016 -0400
> > 
> >    nfsd4/rpc: move backchannel create logic into rpc code
> > 
> > 
> > Where it may have been copied from:
> > 
> > -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
> > -{
> > -       struct rpc_xprt *xprt;
> > -
> > -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
> > -               return rpc_create(args);
> > -
> > -       xprt = args->bc_xprt->xpt_bc_xprt;
> > -       if (xprt) {
> > -               xprt_get(xprt);
> > -               return rpc_create_xprt(args, xprt);
> > -       }
> > -
> > -       return rpc_create(args);
> > -}
> > 
> > There's no warning here. In fact, protocol != BC_TCP seems to
> > be expected.
> > 
> > I'm wondering if the warning is needed at all?
> 
> Using NFSv4.1/RDMA against my v4.7 NFS server seems to result
> in a system deadlock in short order on the server. I haven't
> looked further into this because you mentioned you were going
> to have a look at these commits that change the backchannel
> code.

I'm not seeing an obvious bug in those commits, for what it's worth.

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-18 21:56       ` J. Bruce Fields
@ 2016-08-18 21:59         ` Chuck Lever
  2016-08-19 14:51           ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2016-08-18 21:59 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote:
>> Following up.
>> 
>> 
>>> On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>>> 
>>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
>>>> 
>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>>>> Hi Bruce-
>>>>> 
>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>>>> 
>>>>>  rpc: share one xps between all backchannels
>>>>> 
>>>>> has added this piece of code:
>>>>> 
>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>>>      struct rpc_clnt *clnt = NULL;
>>>>>      struct rpc_xprt_switch *xps;
>>>>> 
>>>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>>>> -       if (xps == NULL) {
>>>>> -               xprt_put(xprt);
>>>>> -               return ERR_PTR(-ENOMEM);
>>>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>>>> +               xprt_switch_get(xps);
>>>>> +       } else {
>>>>> 
>>>>> 
>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>>>> 
>>>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>>>> do to make the code safe?
>>>> 
>>>> What is args->protocol in this case?
>>>> 
>>>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>>>> transport.  That makes sense.
>>>> 
>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>>>> was "is this a backchannel".
>>>> 
>>>> Does that fix the problem?
>>> 
>>> This simple fix eliminates the log noise:
>>> 
>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>> index 2808d55..f94caf7 100644
>>> --- a/net/sunrpc/clnt.c
>>> +++ b/net/sunrpc/clnt.c
>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>>>       char servername[48];
>>> 
>>>       if (args->bc_xprt) {
>>> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>>>               xprt = args->bc_xprt->xpt_bc_xprt;
>>>               if (xprt) {
>>>                       xprt_get(xprt);
>>> 
>>> 
>>> This code seems to come from:
>>> 
>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>> AuthorDate: Mon May 16 17:03:42 2016 -0400
>>> 
>>>   nfsd4/rpc: move backchannel create logic into rpc code
>>> 
>>> 
>>> Where it may have been copied from:
>>> 
>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
>>> -{
>>> -       struct rpc_xprt *xprt;
>>> -
>>> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
>>> -               return rpc_create(args);
>>> -
>>> -       xprt = args->bc_xprt->xpt_bc_xprt;
>>> -       if (xprt) {
>>> -               xprt_get(xprt);
>>> -               return rpc_create_xprt(args, xprt);
>>> -       }
>>> -
>>> -       return rpc_create(args);
>>> -}
>>> 
>>> There's no warning here. In fact, protocol != BC_TCP seems to
>>> be expected.
>>> 
>>> I'm wondering if the warning is needed at all?
>> 
>> Using NFSv4.1/RDMA against my v4.7 NFS server seems to result
>> in a system deadlock in short order on the server. I haven't
>> looked further into this because you mentioned you were going
>> to have a look at these commits that change the backchannel
>> code.
> 
> I'm not seeing an obvious bug in those commits, for what it's worth.

Thanks for checking.

I've tracked the misbehavior down to a DMA mapping mismatch. It's
not related to the backchannel at all. I'm putting together a fix
right now.

But I would like to use NFSv4.1/RDMA without that warning firing.
Any reason to keep it (in either place) ?


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-18 21:56     ` J. Bruce Fields
@ 2016-08-18 22:11       ` Chuck Lever
  2016-08-19 14:50         ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2016-08-18 22:11 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
>> 
>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
>>> 
>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>>> Hi Bruce-
>>>> 
>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>>> 
>>>>   rpc: share one xps between all backchannels
>>>> 
>>>> has added this piece of code:
>>>> 
>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>>       struct rpc_clnt *clnt = NULL;
>>>>       struct rpc_xprt_switch *xps;
>>>> 
>>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>>> -       if (xps == NULL) {
>>>> -               xprt_put(xprt);
>>>> -               return ERR_PTR(-ENOMEM);
>>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>>> +               xprt_switch_get(xps);
>>>> +       } else {
>>>> 
>>>> 
>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>>> 
>>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>>> do to make the code safe?
>>> 
>>> What is args->protocol in this case?
>>> 
>>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>>> transport.  That makes sense.
>>> 
>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>>> was "is this a backchannel".
>>> 
>>> Does that fix the problem?
>> 
>> This simple fix eliminates the log noise:
>> 
>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>> index 2808d55..f94caf7 100644
>> --- a/net/sunrpc/clnt.c
>> +++ b/net/sunrpc/clnt.c
>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>>        char servername[48];
>> 
>>        if (args->bc_xprt) {
>> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>>                xprt = args->bc_xprt->xpt_bc_xprt;
>>                if (xprt) {
>>                        xprt_get(xprt);
>> 
>> 
>> This code seems to come from:
>> 
>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
>> Author:     J. Bruce Fields <bfields@redhat.com>
>> AuthorDate: Mon May 16 17:03:42 2016 -0400
>> 
>>    nfsd4/rpc: move backchannel create logic into rpc code
>> 
>> 
>> Where it may have been copied from:
>> 
>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
>> -{
>> -       struct rpc_xprt *xprt;
>> -
>> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
>> -               return rpc_create(args);
>> -
>> -       xprt = args->bc_xprt->xpt_bc_xprt;
>> -       if (xprt) {
>> -               xprt_get(xprt);
>> -               return rpc_create_xprt(args, xprt);
>> -       }
>> -
>> -       return rpc_create(args);
>> -}
>> 
>> There's no warning here. In fact, protocol != BC_TCP seems to
>> be expected.
> 
> The protocol should be BC_TCP (OK, actually just BC) if and only if
> bc_xprt is set.
> 
> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
> 4.1+ case, the new client uses an existing (client-initiated)
> connection, in the 4.0 case, the new client must also have a new
> connection.
> 
> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
> or might not--depends on whether that particular connection has been
> used for a backchannel previously.)

OK, but why is a WARN_ON needed here? Why not return -EINVAL,
for example (once you've corrected BC_TCP -> BC) ?


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-18 22:11       ` Chuck Lever
@ 2016-08-19 14:50         ` J. Bruce Fields
  2016-08-19 15:06           ` Chuck Lever
  0 siblings, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-19 14:50 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
> 
> > On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
> >> 
> >>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
> >>> 
> >>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
> >>>> Hi Bruce-
> >>>> 
> >>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
> >>>> Author:     J. Bruce Fields <bfields@redhat.com>
> >>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
> >>>> 
> >>>>   rpc: share one xps between all backchannels
> >>>> 
> >>>> has added this piece of code:
> >>>> 
> >>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
> >>>>       struct rpc_clnt *clnt = NULL;
> >>>>       struct rpc_xprt_switch *xps;
> >>>> 
> >>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
> >>>> -       if (xps == NULL) {
> >>>> -               xprt_put(xprt);
> >>>> -               return ERR_PTR(-ENOMEM);
> >>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
> >>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> >>>> +               xps = args->bc_xprt->xpt_bc_xps;
> >>>> +               xprt_switch_get(xps);
> >>>> +       } else {
> >>>> 
> >>>> 
> >>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
> >>>> 
> >>>> Can you say why it was added? Is there something RPC/RDMA needs to
> >>>> do to make the code safe?
> >>> 
> >>> What is args->protocol in this case?
> >>> 
> >>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
> >>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
> >>> transport.  That makes sense.
> >>> 
> >>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
> >>> was "is this a backchannel".
> >>> 
> >>> Does that fix the problem?
> >> 
> >> This simple fix eliminates the log noise:
> >> 
> >> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> >> index 2808d55..f94caf7 100644
> >> --- a/net/sunrpc/clnt.c
> >> +++ b/net/sunrpc/clnt.c
> >> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
> >>        char servername[48];
> >> 
> >>        if (args->bc_xprt) {
> >> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
> >> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
> >>                xprt = args->bc_xprt->xpt_bc_xprt;
> >>                if (xprt) {
> >>                        xprt_get(xprt);
> >> 
> >> 
> >> This code seems to come from:
> >> 
> >> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
> >> Author:     J. Bruce Fields <bfields@redhat.com>
> >> AuthorDate: Mon May 16 17:03:42 2016 -0400
> >> 
> >>    nfsd4/rpc: move backchannel create logic into rpc code
> >> 
> >> 
> >> Where it may have been copied from:
> >> 
> >> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
> >> -{
> >> -       struct rpc_xprt *xprt;
> >> -
> >> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
> >> -               return rpc_create(args);
> >> -
> >> -       xprt = args->bc_xprt->xpt_bc_xprt;
> >> -       if (xprt) {
> >> -               xprt_get(xprt);
> >> -               return rpc_create_xprt(args, xprt);
> >> -       }
> >> -
> >> -       return rpc_create(args);
> >> -}
> >> 
> >> There's no warning here. In fact, protocol != BC_TCP seems to
> >> be expected.
> > 
> > The protocol should be BC_TCP (OK, actually just BC) if and only if
> > bc_xprt is set.
> > 
> > (The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
> > 4.1+ case, the new client uses an existing (client-initiated)
> > connection, in the 4.0 case, the new client must also have a new
> > connection.
> > 
> > In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
> > or might not--depends on whether that particular connection has been
> > used for a backchannel previously.)
> 
> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
> for example (once you've corrected BC_TCP -> BC) ?

Well, it would be a programming bug, so I'd want a WARN_ON or similar
somewhere, I don't care particularly where it is if you see a better way
to organize things.

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-18 21:59         ` Chuck Lever
@ 2016-08-19 14:51           ` J. Bruce Fields
  0 siblings, 0 replies; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-19 14:51 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Thu, Aug 18, 2016 at 05:59:12PM -0400, Chuck Lever wrote:
> Thanks for checking.
> 
> I've tracked the misbehavior down to a DMA mapping mismatch. It's
> not related to the backchannel at all. I'm putting together a fix
> right now.
> 
> But I would like to use NFSv4.1/RDMA without that warning firing.
> Any reason to keep it (in either place) ?

I thought you said the BC_TCP->BC change fixed it?  If so, we should
just do that.

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-19 14:50         ` J. Bruce Fields
@ 2016-08-19 15:06           ` Chuck Lever
  2016-08-19 15:19             ` Chuck Lever
  2016-08-19 15:47             ` J. Bruce Fields
  0 siblings, 2 replies; 15+ messages in thread
From: Chuck Lever @ 2016-08-19 15:06 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> 
> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
>> 
>>> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>>> 
>>> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
>>>> 
>>>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
>>>>> 
>>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>>>>> Hi Bruce-
>>>>>> 
>>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>>>>> 
>>>>>>  rpc: share one xps between all backchannels
>>>>>> 
>>>>>> has added this piece of code:
>>>>>> 
>>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>>>>      struct rpc_clnt *clnt = NULL;
>>>>>>      struct rpc_xprt_switch *xps;
>>>>>> 
>>>>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>>>>> -       if (xps == NULL) {
>>>>>> -               xprt_put(xprt);
>>>>>> -               return ERR_PTR(-ENOMEM);
>>>>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>>>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>>>>> +               xprt_switch_get(xps);
>>>>>> +       } else {
>>>>>> 
>>>>>> 
>>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>>>>> 
>>>>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>>>>> do to make the code safe?
>>>>> 
>>>>> What is args->protocol in this case?
>>>>> 
>>>>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>>>>> transport.  That makes sense.
>>>>> 
>>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>>>>> was "is this a backchannel".
>>>>> 
>>>>> Does that fix the problem?
>>>> 
>>>> This simple fix eliminates the log noise:
>>>> 
>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>> index 2808d55..f94caf7 100644
>>>> --- a/net/sunrpc/clnt.c
>>>> +++ b/net/sunrpc/clnt.c
>>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>>>>       char servername[48];
>>>> 
>>>>       if (args->bc_xprt) {
>>>> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>>>>               xprt = args->bc_xprt->xpt_bc_xprt;
>>>>               if (xprt) {
>>>>                       xprt_get(xprt);
>>>> 
>>>> 
>>>> This code seems to come from:
>>>> 
>>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>> AuthorDate: Mon May 16 17:03:42 2016 -0400
>>>> 
>>>>   nfsd4/rpc: move backchannel create logic into rpc code
>>>> 
>>>> 
>>>> Where it may have been copied from:
>>>> 
>>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
>>>> -{
>>>> -       struct rpc_xprt *xprt;
>>>> -
>>>> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
>>>> -               return rpc_create(args);
>>>> -
>>>> -       xprt = args->bc_xprt->xpt_bc_xprt;
>>>> -       if (xprt) {
>>>> -               xprt_get(xprt);
>>>> -               return rpc_create_xprt(args, xprt);
>>>> -       }
>>>> -
>>>> -       return rpc_create(args);
>>>> -}
>>>> 
>>>> There's no warning here. In fact, protocol != BC_TCP seems to
>>>> be expected.
>>> 
>>> The protocol should be BC_TCP (OK, actually just BC) if and only if
>>> bc_xprt is set.
>>> 
>>> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
>>> 4.1+ case, the new client uses an existing (client-initiated)
>>> connection, in the 4.0 case, the new client must also have a new
>>> connection.
>>> 
>>> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
>>> or might not--depends on whether that particular connection has been
>>> used for a backchannel previously.)
>> 
>> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
>> for example (once you've corrected BC_TCP -> BC) ?
> 
> Well, it would be a programming bug, so I'd want a WARN_ON or similar
> somewhere, I don't care particularly where it is if you see a better way
> to organize things.

The way it works now, the WARN_ON fires, but the logic goes ahead
and creates the transport anyway.

If this is a programming bug, it should fail and return an error,
no transport should be created. I can see a WARN_ON being useful
because it displays a backtrace which identifies the broken
caller.

If it is not a programming bug (which is implied by the fact that
a transport is created anyway) then no WARN_ON is needed.

If you think it is correct that a WARN_ON fires _and_ a transport
is created, could a comment be added explaining that? The new
logic seems less straightforward to me than what it replaces.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-19 15:06           ` Chuck Lever
@ 2016-08-19 15:19             ` Chuck Lever
  2016-08-19 15:47             ` J. Bruce Fields
  1 sibling, 0 replies; 15+ messages in thread
From: Chuck Lever @ 2016-08-19 15:19 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 19, 2016, at 11:06 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>> 
>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
>>> 
>>>> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>>>> 
>>>> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
>>>>> 
>>>>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote:
>>>>>> 
>>>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>>>>>> Hi Bruce-
>>>>>>> 
>>>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>>>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>>>>>> 
>>>>>>> rpc: share one xps between all backchannels
>>>>>>> 
>>>>>>> has added this piece of code:
>>>>>>> 
>>>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>>>>>     struct rpc_clnt *clnt = NULL;
>>>>>>>     struct rpc_xprt_switch *xps;
>>>>>>> 
>>>>>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>>>>>> -       if (xps == NULL) {
>>>>>>> -               xprt_put(xprt);
>>>>>>> -               return ERR_PTR(-ENOMEM);
>>>>>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>>>>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>>>>>> +               xprt_switch_get(xps);
>>>>>>> +       } else {
>>>>>>> 
>>>>>>> 
>>>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>>>>>> 
>>>>>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>>>>>> do to make the code safe?
>>>>>> 
>>>>>> What is args->protocol in this case?
>>>>>> 
>>>>>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>>>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>>>>>> transport.  That makes sense.
>>>>>> 
>>>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>>>>>> was "is this a backchannel".
>>>>>> 
>>>>>> Does that fix the problem?
>>>>> 
>>>>> This simple fix eliminates the log noise:
>>>>> 
>>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>>> index 2808d55..f94caf7 100644
>>>>> --- a/net/sunrpc/clnt.c
>>>>> +++ b/net/sunrpc/clnt.c
>>>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>>>>>      char servername[48];
>>>>> 
>>>>>      if (args->bc_xprt) {
>>>>> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>>>>>              xprt = args->bc_xprt->xpt_bc_xprt;
>>>>>              if (xprt) {
>>>>>                      xprt_get(xprt);
>>>>> 
>>>>> 
>>>>> This code seems to come from:
>>>>> 
>>>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
>>>>> Author:     J. Bruce Fields <bfields@redhat.com>
>>>>> AuthorDate: Mon May 16 17:03:42 2016 -0400
>>>>> 
>>>>>  nfsd4/rpc: move backchannel create logic into rpc code
>>>>> 
>>>>> 
>>>>> Where it may have been copied from:
>>>>> 
>>>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
>>>>> -{
>>>>> -       struct rpc_xprt *xprt;
>>>>> -
>>>>> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
>>>>> -               return rpc_create(args);
>>>>> -
>>>>> -       xprt = args->bc_xprt->xpt_bc_xprt;
>>>>> -       if (xprt) {
>>>>> -               xprt_get(xprt);
>>>>> -               return rpc_create_xprt(args, xprt);
>>>>> -       }
>>>>> -
>>>>> -       return rpc_create(args);
>>>>> -}
>>>>> 
>>>>> There's no warning here. In fact, protocol != BC_TCP seems to
>>>>> be expected.
>>>> 
>>>> The protocol should be BC_TCP (OK, actually just BC) if and only if
>>>> bc_xprt is set.
>>>> 
>>>> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
>>>> 4.1+ case, the new client uses an existing (client-initiated)
>>>> connection, in the 4.0 case, the new client must also have a new
>>>> connection.
>>>> 
>>>> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
>>>> or might not--depends on whether that particular connection has been
>>>> used for a backchannel previously.)
>>> 
>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
>>> for example (once you've corrected BC_TCP -> BC) ?
>> 
>> Well, it would be a programming bug, so I'd want a WARN_ON or similar
>> somewhere, I don't care particularly where it is if you see a better way
>> to organize things.
> 
> The way it works now, the WARN_ON fires, but the logic goes ahead
> and creates the transport anyway.
> 
> If this is a programming bug, it should fail and return an error,
> no transport should be created. I can see a WARN_ON being useful
> because it displays a backtrace which identifies the broken
> caller.
> 
> If it is not a programming bug (which is implied by the fact that
> a transport is created anyway) then no WARN_ON is needed.
> 
> If you think it is correct that a WARN_ON fires _and_ a transport
> is created, could a comment be added explaining that? The new
> logic seems less straightforward to me than what it replaces.

Also, WARN_ONCE might be preferable.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-19 15:06           ` Chuck Lever
  2016-08-19 15:19             ` Chuck Lever
@ 2016-08-19 15:47             ` J. Bruce Fields
  2016-08-19 15:51               ` Chuck Lever
  1 sibling, 1 reply; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-19 15:47 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote:
> > On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> > On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
> >> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
> >> for example (once you've corrected BC_TCP -> BC) ?
> > 
> > Well, it would be a programming bug, so I'd want a WARN_ON or similar
> > somewhere, I don't care particularly where it is if you see a better way
> > to organize things.
> 
> The way it works now, the WARN_ON fires, but the logic goes ahead
> and creates the transport anyway.
> 
> If this is a programming bug, it should fail and return an error,

I haven't been following that rule.

Once upon a time, I would have put a BUG() there.  Then Linus pointed
out that sometimes a BUG() can bork the machine badly enough that the
backtrace doesn't even make it to the logs, rendering it useless.  (And
I believe that could be the case here since this is running as a work
item.)  So, I stick a WARN() there instead and don't worry much what
happens afterwards.

> If it is not a programming bug (which is implied by the fact that
> a transport is created anyway) then no WARN_ON is needed.

So, could we just agree that WARN_ON means "there's a programming
error", regardless of what happens next?  Backtraces should never happen
on a working kernel.

And then ignore the following code path.  Unless it's something that's
obviously going to immediately oops in the warned case, in which case if
we really want the warning then we should return if that looks safer.

But I don't have really strong feelings about this case, the warning may
be academic since setup_callback_client() makes this look obviously
impossible, so if you want to reorganize this somehow, feel free to give
it a shot.

--b.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-19 15:47             ` J. Bruce Fields
@ 2016-08-19 15:51               ` Chuck Lever
  2016-08-19 15:55                 ` J. Bruce Fields
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Lever @ 2016-08-19 15:51 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: J. Bruce Fields, Linux NFS Mailing List


> On Aug 19, 2016, at 11:47 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote:
>>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote:
>>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
>>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
>>>> for example (once you've corrected BC_TCP -> BC) ?
>>> 
>>> Well, it would be a programming bug, so I'd want a WARN_ON or similar
>>> somewhere, I don't care particularly where it is if you see a better way
>>> to organize things.
>> 
>> The way it works now, the WARN_ON fires, but the logic goes ahead
>> and creates the transport anyway.
>> 
>> If this is a programming bug, it should fail and return an error,
> 
> I haven't been following that rule.
> 
> Once upon a time, I would have put a BUG() there.  Then Linus pointed
> out that sometimes a BUG() can bork the machine badly enough that the
> backtrace doesn't even make it to the logs, rendering it useless.  (And
> I believe that could be the case here since this is running as a work
> item.)  So, I stick a WARN() there instead and don't worry much what
> happens afterwards.

I still don't understand. If you would have put a BUG here, then
why does this logic continue and create the transport anyway?

Well, it's a nit, so I'll drop it.


>> If it is not a programming bug (which is implied by the fact that
>> a transport is created anyway) then no WARN_ON is needed.
> 
> So, could we just agree that WARN_ON means "there's a programming
> error", regardless of what happens next?  Backtraces should never happen
> on a working kernel.
> 
> And then ignore the following code path.  Unless it's something that's
> obviously going to immediately oops in the warned case, in which case if
> we really want the warning then we should return if that looks safer.
> 
> But I don't have really strong feelings about this case, the warning may
> be academic since setup_callback_client() makes this look obviously
> impossible, so if you want to reorganize this somehow, feel free to give
> it a shot.

Right, it's that obviously impossible part that made me wonder why
there was a warning here in the first place.

OK, the fix is to do BC_TCP -> BC and put our pencils down. Do you
want me to send you a patch, or do you plan to take care of it?

--
Chuck Lever




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARN_ON added to rpc_create()
  2016-08-19 15:51               ` Chuck Lever
@ 2016-08-19 15:55                 ` J. Bruce Fields
  0 siblings, 0 replies; 15+ messages in thread
From: J. Bruce Fields @ 2016-08-19 15:55 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, Linux NFS Mailing List

On Fri, Aug 19, 2016 at 11:51:13AM -0400, Chuck Lever wrote:
> 
> > On Aug 19, 2016, at 11:47 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote:
> >>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> >>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
> >>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
> >>>> for example (once you've corrected BC_TCP -> BC) ?
> >>> 
> >>> Well, it would be a programming bug, so I'd want a WARN_ON or similar
> >>> somewhere, I don't care particularly where it is if you see a better way
> >>> to organize things.
> >> 
> >> The way it works now, the WARN_ON fires, but the logic goes ahead
> >> and creates the transport anyway.
> >> 
> >> If this is a programming bug, it should fail and return an error,
> > 
> > I haven't been following that rule.
> > 
> > Once upon a time, I would have put a BUG() there.  Then Linus pointed
> > out that sometimes a BUG() can bork the machine badly enough that the
> > backtrace doesn't even make it to the logs, rendering it useless.  (And
> > I believe that could be the case here since this is running as a work
> > item.)  So, I stick a WARN() there instead and don't worry much what
> > happens afterwards.
> 
> I still don't understand. If you would have put a BUG here, then
> why does this logic continue and create the transport anyway?

It seemed worth 1 line of screen real estate to say "this should never
happen", but not 3?

> Well, it's a nit, so I'll drop it.

But, I mean, I don't care that much either.

> >> If it is not a programming bug (which is implied by the fact that
> >> a transport is created anyway) then no WARN_ON is needed.
> > 
> > So, could we just agree that WARN_ON means "there's a programming
> > error", regardless of what happens next?  Backtraces should never happen
> > on a working kernel.
> > 
> > And then ignore the following code path.  Unless it's something that's
> > obviously going to immediately oops in the warned case, in which case if
> > we really want the warning then we should return if that looks safer.
> > 
> > But I don't have really strong feelings about this case, the warning may
> > be academic since setup_callback_client() makes this look obviously
> > impossible, so if you want to reorganize this somehow, feel free to give
> > it a shot.
> 
> Right, it's that obviously impossible part that made me wonder why
> there was a warning here in the first place.
> 
> OK, the fix is to do BC_TCP -> BC and put our pencils down. Do you
> want me to send you a patch

That would be great.--b.

> or do you plan to take care of it?

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-08-19 15:55 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-03 15:27 WARN_ON added to rpc_create() Chuck Lever
2016-08-03 17:47 ` J. Bruce Fields
2016-08-03 19:40   ` Chuck Lever
2016-08-10 18:01     ` Chuck Lever
2016-08-18 21:56       ` J. Bruce Fields
2016-08-18 21:59         ` Chuck Lever
2016-08-19 14:51           ` J. Bruce Fields
2016-08-18 21:56     ` J. Bruce Fields
2016-08-18 22:11       ` Chuck Lever
2016-08-19 14:50         ` J. Bruce Fields
2016-08-19 15:06           ` Chuck Lever
2016-08-19 15:19             ` Chuck Lever
2016-08-19 15:47             ` J. Bruce Fields
2016-08-19 15:51               ` Chuck Lever
2016-08-19 15:55                 ` J. Bruce Fields

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.