From: Chuck Lever <chuck.lever@oracle.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [RFC PATCH] fix krb5p mount not providing large enough buffer in rq_rcvsize
Date: Thu, 12 Mar 2020 16:10:39 -0400 [thread overview]
Message-ID: <E8169251-A626-4FD6-9A62-42C218AB79DF@oracle.com> (raw)
In-Reply-To: <CAN-5tyHjrNcSc+h62dBiYhNmLxWcR1Pj7fLJOnSfgR6JDZbEAA@mail.gmail.com>
> On Mar 12, 2020, at 3:17 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Tue, Mar 10, 2020 at 7:56 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>>
>>
>>> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>
>>> Hi Chuck,
>>>
>>> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> Hi Olga-
>>>>
>>>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>
>>>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
>>>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
>>>>> used to use au_cslack value which was nice and large and changed it to
>>>>> au_rslack value which turns out to be too small.
>>>>>
>>>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
>>>>> because client's receive buffer it too small.
>>>>
>>>> Can you be more specific? For instance, why is 100 bytes adequate for
>>>> Linux servers, but not OnTAP?
>>>
>>> I don't know why Ontap sends more data than Linux server.
>>
>> Let's be sure we are fixing the right problem. Yes, au_rslack is
>> smaller in v5.1, and that results in a behavioral regression. But
>> exactly which part of the new calculation is incorrect is not yet
>> clear. Simply bumping GSS_VERF_SLACK could very well plaster over
>> the real problem.
>>
>>
>>> The opaque_len is just a lot larger. For the first message Linux
>>> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
>>> for FSINFO that sends the file handle, for Netapp the file handle is
>>> 44bytes and for Linux it's only 28bytes.
>>
>> The maximum filehandle size should already be accounted for in the
>> maxsize macro for FSINFO.
>>
>> Is this problem evident only with NFSv3 plus krb5p?
>
> So far that seems to be the case. Every other version and security flavor works.
>
>>>> Is this explanation for the current value not correct?
>>>>
>>>> 51 /* length of a krb5 verifier (48), plus data added before arguments when
>>>> 52 * using integrity (two 4-byte integers): */
>>>
>>> I'm not sure what it is suppose to be. Isn't "data before arguments"
>>> can vary in length and thus explain why linux and onto sizes are
>>> different?
>>> Looking at the network trace the krb5 verifier I see is 36bytes.
>>
>> GSS_VERF_SLACK is only for the extra length added by GSS data. The
>> length of the RPC message itself is handled separately, see above.
>>
>> Can you post a Wireshark dissection of the problematic FSINFO reply?
>> (Having a working reply from Linux and a failing reply from OnTAP
>> would be even better).
>
> I'm attaching two files. I mount against linux and mount against ontap.
>
>
>
>
>>>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
>>>>> currently 100. And it's not enough. Changing it to 104 works and then
>>>>> au_rslack is recalculated based on actual received mic.len and not
>>>>> just the default buffer size.
>>
>> What are the computed au_ralign and au_rslack values after the first
>> successful operation?
>
> With GSS_VERF_SLACK 100
> Linux run:
>
> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e14fdc39 cslack=200 and rslack=25
> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e14fdc39 ralign=25
> Mar 12 13:14:29 localhost kernel: NFS call fsinfo
> ... <gssd upcall>
> Mar 12 13:14:29 localhost kernel: AGLO: call_allocate
> auth=00000000e14fdc39 au_cslack=200 au_rslack=25 rq_rcvsize=256
> p_replen=35
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=176 is ok offset=56 opaque=120
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=00000000e14fdc39 resetting au_rslack=26
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=00000000e14fdc39 resetting au_ralign=26
> Mar 12 13:14:29 localhost kernel: NFS reply fsinfo: 0
>
> Ontap run:
> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e02d9e6e cslack=200 and rslack=25
> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e02d9e6e ralign=25
> Mar 12 13:16:46 localhost kernel: NFS call fsinfo
> ... <gssd upcall>
> Mar 12 13:16:46 localhost kernel: AGLO: call_allocate
> auth=00000000e02d9e6e au_cslack=200 au_rslack=25 rq_rcvsize=256
> p_replen=35
> Mar 12 13:16:46 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=256 too small offset=56 opaque=204
> Mar 12 13:16:46 localhost kernel: NFS reply fsinfo: -5
>
> With GSS_VERF_SLACK 104
> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> auth=000000004a545ea2 cslack=200 and rslack=26
> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> auth=000000004a545ea2 ralign=26
> Mar 12 13:33:23 localhost kernel: NFS call fsinfo
> ... <gssd upcall>
> Mar 12 13:33:23 localhost kernel: AGLO: call_allocate
> auth=000000004a545ea2 au_cslack=200 au_rslack=26 rq_rcvsize=260
> p_replen=35
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=260 is ok offset=56 opaque=204
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=000000004a545ea2 resetting au_rslack=26
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=000000004a545ea2 resetting au_ralign=26
> Mar 12 13:33:23 localhost kernel: NFS reply fsinfo: 0
>
> difference in actual packets in fsinfo is that ontap sends postattrs
> so that's 84bytes.
>
> req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
> max_t(size_t, proc->p_replen, 2);
>
> RPC_REPHDRSIZE is defined to be 4 (*4) (it says it doesn't include
> the verifier ???)
> rslack needs to cover kerberos blob 25 (*4) (but that's the kerberos
> part a part of the wrap and not the verifier)
> p_replen to cover fs_info args 35 (*4) (seems like the right number)
>
> So we are missing the GSS to include the verifier and the kerberos
> blob of the wrapper (and lengths!!). Basically we need GSS_VERF_SLACK
> to cover 2 kerberos blobs (or more specifically KRB_TOKEN_CFX_GetMic
> 9*4 and KRB_TOKEN_CFS_WRAP 15*4 + 2 lengths before the kerberos blobs
> = 104 and we are only giving 100).
GSS_VERF_SLACK is also used for setting au_verfsize, so please don't
change its value. Define a new constant for initializing au_rslack.
Let's construct that constant using the KRB5_TOKEN constants you mention
here... include/linux/sunrpc/gss_krb5.h has
221 /*
222 * This compile-time check verifies that we will not exceed the
223 * slack space allotted by the client and server auth_gss code
224 * before they call gss_wrap().
225 */
226 #define GSS_KRB5_MAX_SLACK_NEEDED \
227 (GSS_KRB5_TOK_HDR_LEN /* gss token header */ \
228 + GSS_KRB5_MAX_CKSUM_LEN /* gss token checksum */ \
229 + GSS_KRB5_MAX_BLOCKSIZE /* confounder */ \
230 + GSS_KRB5_MAX_BLOCKSIZE /* possible padding */ \
231 + GSS_KRB5_TOK_HDR_LEN /* encrypted hdr in v2 token */\
232 + GSS_KRB5_MAX_CKSUM_LEN /* encryption hmac */ \
233 + 4 + 4 /* RPC verifier */ \
234 + GSS_KRB5_TOK_HDR_LEN \
235 + GSS_KRB5_MAX_CKSUM_LEN)
So this, or something like this, plus the comment below.
> The reason things work against linux is because it has a nice buffer
> of 84bytes of post attributes that it doesn't send.
Missing post-attributes makes sense. Thank you for the analysis.
> To address your later point that kerberos blob is encryption type
> depended and that once some other encryption is added to gss-kerberos
> that's larger than existing checksum then this value would need to be
> adjusted again.
> If you agree with my reasoning for the number then I'd like to send
> out a patch now.
The current numbers are based on the kernel GSS implementation supporting
only Kerberos with a narrow set of enctypes. That needs to be made clear
in a documenting comment.
The reason this has been bothersome is because the existing setting is
a magic number (100), and its documenting comment has been stale since
2006. Any proposed fix has to address the missing documentation.
>>>>> I would like to propose to change it to something a little larger than
>>>>> 104, like 120 to give room if some other server might reply with
>>>>> something even larger.
>>>>
>>>> Why does it need to be larger than 104?
>>>
>>> I don't know why 100 was chosen and given that I think arguments are
>>> taken into the account and arguments can change. I think NetApp has
>>> changed their file handle sizes (at some point, not in the near past
>>> but i think so?). Perhaps they might want to do that again so the size
>>> will change again.
>>>
>>> Honestly, I would have like for 100 to be 200 to be safe.
>>
>> To be safe, I would like to have a good understanding of the details,
>> rather than guessing at an arbitrary maximum value. Let's choose a
>> rational maximum and include a descriptive comment about why that value
>> is the best choice.
>>
>>
>>>>> Thoughts? Will send an actual patch if no objections to this one.
>>>>>
>>>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
>>>>> index 24ca861..44ae6bc 100644
>>>>> --- a/net/sunrpc/auth_gss/auth_gss.c
>>>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
>>>>> @@ -50,7 +50,7 @@
>>>>> #define GSS_CRED_SLACK (RPC_MAX_AUTH_SIZE * 2)
>>>>> /* length of a krb5 verifier (48), plus data added before arguments when
>>>>> * using integrity (two 4-byte integers): */
>>>>> -#define GSS_VERF_SLACK 100
>>>>> +#define GSS_VERF_SLACK 120
>>>>>
>>>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
>>>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
>>>>
>>>> --
>>>> Chuck Lever
>>
>> --
>> Chuck Lever
>>
>>
>>
> <linux-v3-krb5p-mount.pcap.gz><ontap-v3-krb5-mount.pcap.gz>
--
Chuck Lever
next prev parent reply other threads:[~2020-03-12 20:10 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-10 18:58 [RFC PATCH] fix krb5p mount not providing large enough buffer in rq_rcvsize Olga Kornievskaia
2020-03-10 19:57 ` Chuck Lever
2020-03-10 21:07 ` Olga Kornievskaia
2020-03-10 23:56 ` Chuck Lever
2020-03-11 14:57 ` Chuck Lever
[not found] ` <CAN-5tyHjrNcSc+h62dBiYhNmLxWcR1Pj7fLJOnSfgR6JDZbEAA@mail.gmail.com>
2020-03-12 20:10 ` Chuck Lever [this message]
2020-03-20 19:28 ` Olga Kornievskaia
2020-03-20 20:18 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=E8169251-A626-4FD6-9A62-42C218AB79DF@oracle.com \
--to=chuck.lever@oracle.com \
--cc=aglo@umich.edu \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).