* ip link vf info truncating with many VFs
@ 2020-02-29  0:33 Jacob Keller
  2020-03-02 23:17 ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Jacob Keller @ 2020-02-29  0:33 UTC
  To: netdev, Florian Westphal, Thomas Graf

Hi,

I recently noticed an issue in the rtnetlink API for obtaining VF
information.

If a device creates 222 or more VF devices, the rtnl_fill_vf function
will incorrectly label the size of the IFLA_VFINFO_LIST attribute. This
occurs because rtnl_fill_vfinfo will have added more than 65535 bytes
(the maximum size of a single attribute, since nla_len is a __u16).

This causes the calculation in nla_nest_end to overflow and report a
significantly shorter length value. Worst case, with 222 VFs, the "ip
link show <device>" reports no VF info at all.
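
A minimal user-space sketch of the truncation described above. The
struct and function names are hypothetical stand-ins, not kernel code;
the only property borrowed from the report is that the length field is
16 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an attribute header with a 16-bit length. */
struct demo_attr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/*
 * Store a nest's real byte length without any range check.
 * The conversion to uint16_t keeps only the low 16 bits, so any
 * length above 65535 is silently reduced modulo 65536.
 */
static uint16_t demo_store_len(struct demo_attr *attr, size_t real_len)
{
	attr->nla_len = (uint16_t)real_len;
	return attr->nla_len;
}
```

With a real length of 65536 + 300 bytes the stored length is 300, so a
reader that trusts the field would stop after the first 300 bytes of
the nest.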

For some reason, the nla_put calls do not trigger an EMSGSIZE error,
because the skb itself is capable of holding the data.

I think the right thing is probably to do some sort of
overflow-protected calculation and print a warning... or find a way to
fix nla_put to error with -EMSGSIZE if we would exceed the nested
attribute size limit... I am not sure how to do that at a glance.

Thanks,
Jake

* Re: ip link vf info truncating with many VFs
  2020-02-29  0:33 ip link vf info truncating with many VFs Jacob Keller
@ 2020-03-02 23:17 ` Jakub Kicinski
  2020-03-02 23:21   ` Jacob Keller
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-03-02 23:17 UTC
  To: Jacob Keller; +Cc: netdev, Florian Westphal, Thomas Graf

On Fri, 28 Feb 2020 16:33:40 -0800 Jacob Keller wrote:
> Hi,
> 
> I recently noticed an issue in the rtnetlink API for obtaining VF
> information.
> 
> If a device creates 222 or more VF devices, the rtnl_fill_vf function
> will incorrectly label the size of the IFLA_VFINFO_LIST attribute. This
> occurs because rtnl_fill_vfinfo will have added more than 65k (maximum
> size of a single attribute since nla_len is a __u16).
> 
> This causes the calculation in nla_nest_end to overflow and report a
> significantly shorter length value. Worst case, with 222 VFs, the "ip
> link show <device>" reports no VF info at all.
> 
> For some reason, the nla_put calls do not trigger an EMSGSIZE error,
> because the skb itself is capable of holding the data.
> 
> I think the right thing is probably to do some sort of
> overflow-protected calculation and print a warning... or find a way to
> fix nla_put to error with -EMSGSIZE if we would exceed the nested
> attribute size limit... I am not sure how to do that at a glance.

Making nla_nest_end() return an error on overflow seems like 
the most reasonable way forward to me, FWIW. Simply compare
the result to U16_MAX, I don't think anything more clever is
needed.

Some of the callers actually already check for errors of
nla_nest_end() (qdiscs' dump methods use the result which 
is later checked for less than zero).

Then rtnetlink code should be made aware that nla_nest_end() 
may fail.
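
A rough user-space sketch of that idea, under stated assumptions:
demo_nest_end() and demo_fill() are hypothetical names, the nest is
modelled as a header followed by raw bytes, and returning the nest
length on success is a choice made for this sketch rather than a claim
about what the kernel helper returns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_U16_MAX 65535

/* Hypothetical stand-in for an attribute header with a 16-bit length. */
struct demo_attr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/*
 * Close a nest that begins at 'start' and whose data ends at 'tail'.
 * The length is computed in a wide type and compared against the
 * 16-bit limit before it is written into the header.
 * Returns the nest length, or -EMSGSIZE if it does not fit.
 */
static int demo_nest_end(struct demo_attr *start, const unsigned char *tail)
{
	ptrdiff_t len = tail - (const unsigned char *)start;

	if (len < 0 || len > DEMO_U16_MAX)
		return -EMSGSIZE;

	start->nla_len = (uint16_t)len;
	return (int)len;
}

/* A caller that is aware the nest may fail to close. */
static int demo_fill(struct demo_attr *start, const unsigned char *tail)
{
	int err = demo_nest_end(start, tail);

	if (err < 0)
		return err; /* propagate -EMSGSIZE instead of truncating */
	return 0;
}
```

On failure the header is left untouched, so the caller can cancel the
nest or retry with less data.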

(When you post it's probably a good idea to widen the CC list 
to Johannes Berg, Pablo, DaveA, Jiri..)

* Re: ip link vf info truncating with many VFs
  2020-03-02 23:17 ` Jakub Kicinski
@ 2020-03-02 23:21   ` Jacob Keller
  2020-03-03 17:16     ` David Ahern
  0 siblings, 1 reply; 5+ messages in thread
From: Jacob Keller @ 2020-03-02 23:21 UTC
  To: Jakub Kicinski; +Cc: netdev, Florian Westphal, Thomas Graf

On 3/2/2020 3:17 PM, Jakub Kicinski wrote:
> On Fri, 28 Feb 2020 16:33:40 -0800 Jacob Keller wrote:
>> Hi,
>>
>> I recently noticed an issue in the rtnetlink API for obtaining VF
>> information.
>>
>> If a device creates 222 or more VF devices, the rtnl_fill_vf function
>> will incorrectly label the size of the IFLA_VFINFO_LIST attribute. This
>> occurs because rtnl_fill_vfinfo will have added more than 65k (maximum
>> size of a single attribute since nla_len is a __u16).
>>
>> This causes the calculation in nla_nest_end to overflow and report a
>> significantly shorter length value. Worst case, with 222 VFs, the "ip
>> link show <device>" reports no VF info at all.
>>
>> For some reason, the nla_put calls do not trigger an EMSGSIZE error,
>> because the skb itself is capable of holding the data.
>>
>> I think the right thing is probably to do some sort of
>> overflow-protected calculation and print a warning... or find a way to
>> fix nla_put to error with -EMSGSIZE if we would exceed the nested
>> attribute size limit... I am not sure how to do that at a glance.
> 
> Making nla_nest_end() return an error on overflow seems like 
> the most reasonable way forward to me, FWIW. Simply compare
> the result to U16_MAX, I don't think anything more clever is
> needed.
> 

Sure, I also think that's the right approach to fix this.

As long as we calculate the value using something larger than a u16 first,
that should work.
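
To illustrate why the order matters, here is a small sketch with two
hypothetical helpers. One narrows the length to 16 bits before
comparing, the other compares the wide value first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Wrong order: the length passes through a 16-bit type before the
 * comparison, so the value being tested can never exceed 65535 and
 * the overflow goes undetected.
 */
static bool demo_overflows_after_narrowing(size_t real_len)
{
	unsigned int narrowed = (uint16_t)real_len;

	return narrowed > UINT16_MAX;
}

/* Right order: compare the wide value, narrow only if it fits. */
static bool demo_overflows_before_narrowing(size_t real_len)
{
	return real_len > UINT16_MAX;
}
```

For a 70000-byte nest the first helper reports no overflow while the
second one does.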

> Some of the callers actually already check for errors of
> nla_nest_end() (qdiscs' dump methods use the result which 
> is later checked for less than zero).

I'll take a look at the qdisc code.

> 
> Then rtnetlink code should be made aware that nla_nest_end() 
> may fail.
> 

Right.

> (When you post it's probably a good idea to widen the CC list 
> to Johannes Berg, Pablo, DaveA, Jiri..)
> 

Yep. I wasn't sure who all to add here, so I tried looking at the
MAINTAINERS file.

Thanks,
Jake

* Re: ip link vf info truncating with many VFs
  2020-03-02 23:21   ` Jacob Keller
@ 2020-03-03 17:16     ` David Ahern
  2020-03-03 17:59       ` Jacob Keller
  0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2020-03-03 17:16 UTC
  To: Jacob Keller, Jakub Kicinski; +Cc: netdev, Florian Westphal, Thomas Graf

On 3/2/20 4:21 PM, Jacob Keller wrote:
> On 3/2/2020 3:17 PM, Jakub Kicinski wrote:
>> On Fri, 28 Feb 2020 16:33:40 -0800 Jacob Keller wrote:
>>> Hi,
>>>
>>> I recently noticed an issue in the rtnetlink API for obtaining VF
>>> information.
>>>
>>> If a device creates 222 or more VF devices, the rtnl_fill_vf function
>>> will incorrectly label the size of the IFLA_VFINFO_LIST attribute. This
>>> occurs because rtnl_fill_vfinfo will have added more than 65k (maximum
>>> size of a single attribute since nla_len is a __u16).
>>>
>>> This causes the calculation in nla_nest_end to overflow and report a
>>> significantly shorter length value. Worst case, with 222 VFs, the "ip
>>> link show <device>" reports no VF info at all.
>>>
>>> For some reason, the nla_put calls do not trigger an EMSGSIZE error,
>>> because the skb itself is capable of holding the data.
>>>
>>> I think the right thing is probably to do some sort of
>>> overflow-protected calculation and print a warning... or find a way to
>>> fix nla_put to error with -EMSGSIZE if we would exceed the nested
>>> attribute size limit... I am not sure how to do that at a glance.
>>
>> Making nla_nest_end() return an error on overflow seems like 
>> the most reasonable way forward to me, FWIW. Simply compare
>> the result to U16_MAX, I don't think anything more clever is
>> needed.
>>
> 
> Sure, I also think that's the right approach to fix this.
> 
> As long as we calculate the value using something larger than a u16 first,
> that should work.
> 

Another Pandora's box.

Seems like there are a few other places that set nla_len that should be
checked as well - like __nla_reserve and that bleeds into __nla_put.
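
A sketch of what such a check could look like for a single attribute,
assuming a 4-byte header (two 16-bit fields) in front of the payload.
demo_attr_len() is a hypothetical helper for illustration and does not
mirror the actual __nla_reserve code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header size: a 16-bit length plus a 16-bit type. */
#define DEMO_HDRLEN 4

/*
 * Compute the value for a 16-bit length field covering the header
 * plus 'payload_len' bytes of payload. Writes the result to '*out'
 * and returns 0, or returns -EMSGSIZE (leaving '*out' untouched)
 * when the total would not fit in 16 bits.
 */
static int demo_attr_len(size_t payload_len, uint16_t *out)
{
	if (payload_len > (size_t)(UINT16_MAX - DEMO_HDRLEN))
		return -EMSGSIZE;

	*out = (uint16_t)(DEMO_HDRLEN + payload_len);
	return 0;
}
```

Under that assumption the largest payload that still fits is 65531
bytes.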

* Re: ip link vf info truncating with many VFs
  2020-03-03 17:16     ` David Ahern
@ 2020-03-03 17:59       ` Jacob Keller
  0 siblings, 0 replies; 5+ messages in thread
From: Jacob Keller @ 2020-03-03 17:59 UTC
  To: David Ahern, Jakub Kicinski; +Cc: netdev, Florian Westphal, Thomas Graf

On 3/3/2020 9:16 AM, David Ahern wrote:
> On 3/2/20 4:21 PM, Jacob Keller wrote:
>> On 3/2/2020 3:17 PM, Jakub Kicinski wrote:
>>> On Fri, 28 Feb 2020 16:33:40 -0800 Jacob Keller wrote:
>>>> Hi,
>>>>
>>>> I recently noticed an issue in the rtnetlink API for obtaining VF
>>>> information.
>>>>
>>>> If a device creates 222 or more VF devices, the rtnl_fill_vf function
>>>> will incorrectly label the size of the IFLA_VFINFO_LIST attribute. This
>>>> occurs because rtnl_fill_vfinfo will have added more than 65k (maximum
>>>> size of a single attribute since nla_len is a __u16).
>>>>
>>>> This causes the calculation in nla_nest_end to overflow and report a
>>>> significantly shorter length value. Worst case, with 222 VFs, the "ip
>>>> link show <device>" reports no VF info at all.
>>>>
>>>> For some reason, the nla_put calls do not trigger an EMSGSIZE error,
>>>> because the skb itself is capable of holding the data.
>>>>
>>>> I think the right thing is probably to do some sort of
>>>> overflow-protected calculation and print a warning... or find a way to
>>>> fix nla_put to error with -EMSGSIZE if we would exceed the nested
>>>> attribute size limit... I am not sure how to do that at a glance.
>>>
>>> Making nla_nest_end() return an error on overflow seems like 
>>> the most reasonable way forward to me, FWIW. Simply compare
>>> the result to U16_MAX, I don't think anything more clever is
>>> needed.
>>>
>>
>> Sure, I also think that's the right approach to fix this.
>>
>> As long as we calculate the value using something larger than a u16 first,
>> that should work.
>>
> 
> Another Pandora's box.
> 
> Seems like there are a few other places that set nla_len that should be
> checked as well - like __nla_reserve and that bleeds into __nla_put.
> 
Yea, I bet. I'm working on a patch for this nla_nest_end first.

Thanks,
Jake
