linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: trondmy <trondmy@hammerspace.com>
Cc: linux-nfs <linux-nfs@vger.kernel.org>,
	Anna Schumaker <anna.schumaker@netapp.com>
Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels
Date: Tue, 1 Dec 2020 15:44:46 +0100 (CET)	[thread overview]
Message-ID: <1164910234.1237300.1606833886593.JavaMail.zimbra@desy.de> (raw)
In-Reply-To: <449631871.1158780.1606820362845.JavaMail.zimbra@desy.de>



Actually, this was a co-incidence, which means that buf points to
more or less a random place.

Tigran.

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "trondmy" <trondmy@hammerspace.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker" <anna.schumaker@netapp.com>
> Sent: Tuesday, 1 December, 2020 11:59:22
> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels

> More investigation...
> 
> After xdr_read_pages call the xdr stream points to the
> beginning of RPC package and return the xid value on
> the next read (which should be the size of the notification bitmap).
> 
> 
> Regards,
>   Tigran.
> 
> ----- Original Message -----
>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>> To: "trondmy" <trondmy@hammerspace.com>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker"
>> <anna.schumaker@netapp.com>
>> Sent: Thursday, 26 November, 2020 18:17:41
>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>> channels
> 
>> I have added some debug info Ind got this:
>> 
>> decode_getdeviceinfo: layout type 4
>> decode_getdeviceinfo: layout size 64
>> decode_getdeviceinfo: layout notification bitmap size 1094994757
>> 
>> So it looks like that xdr_read_pages set xdr pointer to a wrong position
>> and next read, which is bitmap size
>> 
>> 5857         pdev->mincount = be32_to_cpup(p);
>> 5858         dprintk("%s: layout size %u\n", __func__, pdev->mincount);
>> 5859         if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>> 5860                 return -EIO;
>> 5861
>> 5862         /* Parse notification bitmap, verifying that it is zero. */
>> 5863         p = xdr_inline_decode(xdr, 4);
>> 5864         if (unlikely(!p))
>> 5865                 return -EIO;
>> 5866         len = be32_to_cpup(p);
>> 5867         dprintk("%s: layout notification bitmap size %u\n", __func__, len);
>> 5868         if (len) {
>> 5869                 uint32_t i;
>> 
>> 
>> Tigran.
>> 
>> 
>> ----- Original Message -----
>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>> To: "trondmy" <trondmy@hammerspace.com>
>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>, "Anna Schumaker"
>>> <anna.schumaker@netapp.com>
>>> Sent: Tuesday, 17 November, 2020 15:50:57
>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>> channels
>> 
>>> Here is the result:
>>> 
>>> 
>>> $ git bisect bad
>>> c567552612ece787b178e3b147b5854ad422a836 is the first bad commit
>>> commit c567552612ece787b178e3b147b5854ad422a836
>>> Author: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>> Date:   Wed May 28 13:41:22 2014 -0400
>>> 
>>>    NFS: Add READ_PLUS data segment support
>>>    
>>>    This patch adds client support for decoding a single NFS4_CONTENT_DATA
>>>    segment returned by the server. This is the simplest implementation
>>>    possible, since it does not account for any hole segments in the reply.
>>>    
>>>    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>> 
>>> fs/nfs/nfs42xdr.c         | 141 ++++++++++++++++++++++++++++++++++++++++++++++
>>> fs/nfs/nfs4client.c       |   2 +
>>> fs/nfs/nfs4proc.c         |  43 +++++++++++++-
>>> fs/nfs/nfs4xdr.c          |   1 +
>>> include/linux/nfs4.h      |   2 +-
>>> include/linux/nfs_fs_sb.h |   1 +
>>> include/linux/nfs_xdr.h   |   2 +-
>>> 7 files changed, 187 insertions(+), 5 deletions(-)
>>> 
>>> 
>>> Regards,
>>>   Tigran.
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>> To: "trondmy" <trondmy@hammerspace.com>
>>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>> Sent: Monday, 16 November, 2020 21:55:50
>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>> channels
>>> 
>>>> Hi Trond,
>>>> 
>>>> I am afraid, that the fix didn't work. I bisecting it....
>>>> 
>>>> 
>>>> Tigran.
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>>> From: "trondmy" <trondmy@hammerspace.com>
>>>>> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>>> Sent: Saturday, 14 November, 2020 15:29:01
>>>>> Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data
>>>>> channels
>>>> 
>>>>> On Sat, 2020-11-14 at 00:46 +0100, Mkrtchyan, Tigran wrote:
>>>>>> 
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>> > From: "trondmy" <trondmy@hammerspace.com>
>>>>>> > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
>>>>>> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>>>>>> > Sent: Friday, 13 November, 2020 23:45:00
>>>>>> > Subject: Re: [PATCH v3 00/11] Add RDMA support to the pNFS
>>>>>> > file+flexfiles data channels
>>>>>> 
>>>>>> > On Fri, 2020-11-13 at 22:30 +0100, Mkrtchyan, Tigran wrote:
>>>>>> > > 
>>>>>> > > After more testing, it looks like that client doesn't like
>>>>>> > > notification bitmap:
>>>>>> > > 
>>>>>> > > 
>>>>>> > > [31576.789492] --> _nfs4_proc_getdeviceinfo
>>>>>> > > [31576.789503] --> nfs41_call_sync_prepare data->seq_server
>>>>>> > > 000000001d17c43e
>>>>>> > > [31576.789507] --> nfs4_alloc_slot used_slots=0000
>>>>>> > > highest_used=4294967295 max_slots=16
>>>>>> > > [31576.789510] <-- nfs4_alloc_slot used_slots=0001 highest_used=0
>>>>>> > > slotid=0
>>>>>> > > [31576.789527] encode_sequence:
>>>>>> > > sessionid=2910695007:150995712:0:16777216 seqid=92462 slotid=0
>>>>>> > > max_slotid=0 cache_this=0
>>>>>> > > [31576.789991] decode_getdeviceinfo: unsupported notification
>>>>>> > 
>>>>>> > According to this, you appear to be returning a deviceinfo bitmap
>>>>>> > with
>>>>>> > at least one non-zero entry that is not in the first 32-bit word.
>>>>>> > We
>>>>>> > only ask for notifications for NOTIFY_DEVICEID4_CHANGE and
>>>>>> > NOTIFY_DEVICEID4_DELETE, so we only expect bitmap[0] to have non-
>>>>>> > zero
>>>>>> > entries.
>>>>>> 
>>>>>> 
>>>>>> according to packet capture only bitmap[0] has non zero bits set.
>>>>>> This is the reply of compound starting from nfs staus code, tag
>>>>>> length and so on.
>>>>>> 
>>>>>> 
>>>>>> 0000   00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 35
>>>>>> 0010   00 00 00 00 5f ae 7d ad 00 03 00 09 00 00 00 00
>>>>>> 0020   00 00 00 01 00 00 00 4c 00 00 00 00 00 00 00 0f
>>>>>> 0030   00 00 00 0f 00 00 00 00 00 00 00 2f 00 00 00 00
>>>>>> 0040   00 00 00 04 00 00 00 40 00 00 00 01 00 00 00 03
>>>>>> 0050   74 63 70 00 00 00 00 16 31 33 31 2e 31 36 39 2e
>>>>>> 0060   31 39 31 2e 31 34 33 2e 31 32 35 2e 34 39 00 00
>>>>>> 0070   00 00 00 01 00 00 00 04 00 00 00 01 00 10 00 00
>>>>>> 0080   00 10 00 00 00 00 00 01 00 00 00 02 00 00 00 06
>>>>>> 0090   00 00 00 00
>>>>>> 
>>>>>> 
>>>>>> the last 12 bytes : bitmap size, bitmap[0], bitmap[1]
>>>>>> 
>>>>>> 
>>>>>> This part of code in the didn't change since 2010, and I
>>>>>> have no issues to use 5.8 kernel. I am pretty sure, that
>>>>>> tests with 5.9 did pass as expected. I will try to bisec it.
>>>>> 
>>>>> I don't think I've introduced this bug. I did not touch anything in the
>>>>> getdeviceinfo proc or XDR code.
>>>>> Does the following patch help?
>>>>> 
>>>>> 8<-------------------------------------------------------
>>>>> From e92b2d4e39e91d379ec1147115820ab5dfe4c89a Mon Sep 17 00:00:00 2001
>>>>> From: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>>> Date: Fri, 13 Nov 2020 21:42:16 -0500
>>>>> Subject: [PATCH] NFSv4: Fix the alignment of page data in the getdeviceinfo
>>>>> reply
>>>>> 
>>>>> We can fit the device_addr4 opaque data padding in the pages.
>>>>> 
>>>>> Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()")
>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
>>>>> ---
>>>>> fs/nfs/nfs4xdr.c | 14 ++++++++++----
>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>>> 
>>>>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>>>>> index c6dbfcae7517..c8714381d511 100644
>>>>> --- a/fs/nfs/nfs4xdr.c
>>>>> +++ b/fs/nfs/nfs4xdr.c
>>>>> @@ -3009,15 +3009,19 @@ static void nfs4_xdr_enc_getdeviceinfo(struct rpc_rqst
>>>>> *req,
>>>>> 	struct compound_hdr hdr = {
>>>>> 		.minorversion = nfs4_xdr_minorversion(&args->seq_args),
>>>>> 	};
>>>>> +	uint32_t replen;
>>>>> 
>>>>> 	encode_compound_hdr(xdr, req, &hdr);
>>>>> 	encode_sequence(xdr, &args->seq_args, &hdr);
>>>>> +
>>>>> +	replen = hdr.replen + op_decode_hdr_maxsz;
>>>>> +
>>>>> 	encode_getdeviceinfo(xdr, args, &hdr);
>>>>> 
>>>>> -	/* set up reply kvec. Subtract notification bitmap max size (2)
>>>>> -	 * so that notification bitmap is put in xdr_buf tail */
>>>>> +	/* set up reply kvec. device_addr4 opaque data is read into the
>>>>> +	 * pages */
>>>>> 	rpc_prepare_reply_pages(req, args->pdev->pages, args->pdev->pgbase,
>>>>> -				args->pdev->pglen, hdr.replen - 2);
>>>>> +				args->pdev->pglen, replen + 2);
>>>>> 	encode_nops(&hdr);
>>>>> }
>>>>> 
>>>>> @@ -5848,7 +5852,9 @@ static int decode_getdeviceinfo(struct xdr_stream *xdr,
>>>>> 	 * and places the remaining xdr data in xdr_buf->tail
>>>>> 	 */
>>>>> 	pdev->mincount = be32_to_cpup(p);
>>>>> -	if (xdr_read_pages(xdr, pdev->mincount) != pdev->mincount)
>>>>> +	/* Calculate padding */
>>>>> +	len = xdr_align_size(pdev->mincount);
>>>>> +	if (xdr_read_pages(xdr, len) != len)
>>>>> 		return -EIO;
>>>>> 
>>>>> 	/* Parse notification bitmap, verifying that it is zero. */
>>>>> --
>>>>> 2.28.0
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Trond Myklebust
>>>>> Linux NFS client maintainer, Hammerspace
> > > > > trond.myklebust@hammerspace.com

      reply	other threads:[~2020-12-01 14:45 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-10 23:18 [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels trondmy
2020-11-10 23:18 ` [PATCH v3 01/11] SUNRPC: xprt_load_transport() needs to support the netid "rdma6" trondmy
2020-11-10 23:18   ` [PATCH v3 02/11] SUNRPC: Close a race with transport setup and module put trondmy
2020-11-10 23:18     ` [PATCH v3 03/11] SUNRPC: Add a helper to return the transport identifier given a netid trondmy
2020-11-10 23:18       ` [PATCH v3 04/11] NFS: Switch mount code to use xprt_find_transport_ident() trondmy
2020-11-10 23:19         ` [PATCH v3 05/11] SUNRPC: Remove unused function xprt_load_transport() trondmy
2020-11-10 23:19           ` [PATCH v3 06/11] NFSv4/pNFS: Use connections to a DS that are all of the same protocol family trondmy
2020-11-10 23:19             ` [PATCH v3 07/11] pNFS: Add helpers for allocation/free of struct nfs4_pnfs_ds_addr trondmy
2020-11-10 23:19               ` [PATCH v3 08/11] NFSv4/pNFS: Store the transport type in " trondmy
2020-11-10 23:19                 ` [PATCH v3 09/11] pNFS/flexfiles: Fix up layoutstats reporting for non-TCP transports trondmy
2020-11-10 23:19                   ` [PATCH v3 10/11] SUNRPC: Fix up open coded kmemdup_nul() trondmy
2020-11-10 23:19                     ` [PATCH v3 11/11] pNFS: Clean up open coded xdr string decoding trondmy
2020-11-10 23:42 ` [PATCH v3 00/11] Add RDMA support to the pNFS file+flexfiles data channels Trond Myklebust
2020-11-13 12:48   ` Mkrtchyan, Tigran
2020-11-13 21:30     ` Mkrtchyan, Tigran
2020-11-13 22:45       ` Trond Myklebust
2020-11-13 23:46         ` Mkrtchyan, Tigran
2020-11-14 14:29           ` Trond Myklebust
2020-11-16 20:55             ` Mkrtchyan, Tigran
2020-11-17 14:50               ` Mkrtchyan, Tigran
2020-11-26 17:17                 ` Mkrtchyan, Tigran
2020-12-01 10:59                   ` Mkrtchyan, Tigran
2020-12-01 14:44                     ` Mkrtchyan, Tigran [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1164910234.1237300.1606833886593.JavaMail.zimbra@desy.de \
    --to=tigran.mkrtchyan@desy.de \
    --cc=anna.schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).