netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation
@ 2023-06-07  1:05 wangchuanlei
  2023-06-07  9:09 ` Eelco Chaudron
  0 siblings, 1 reply; 4+ messages in thread
From: wangchuanlei @ 2023-06-07  1:05 UTC (permalink / raw)
  To: echaudro
  Cc: aconole, dev, netdev, simon.horman, wangpeihui, kuba, pabeni, davem

Thanks for fix this, in common enviroment, it's a 
small probability event.

> Eelco Chaudron <echaudro@redhat.com> writes:

> Currently, the per cpu upcall counters are allocated after the vport 
> is created and inserted into the system. This could lead to the 
> datapath accessing the counters before they are allocated resulting in 
> a kernel Oops.
>
> Here is an example:
>
>   PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
>    #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
>    #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
>    #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
>    #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
>    #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
>    #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
>    #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
>    #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
>    ...
>
>   PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"
>    #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
>    #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
>    #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
>    #3 [ffff80000a5d3120] die at ffffb70f0628234c
>    #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
>    #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
>    #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
>    #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
>    #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
>    #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
>   #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
>   #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
>   #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
>   #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
>   #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
>   #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
>   #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
>   #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at 
> ffffb70f06079f90
>
> We moved the per cpu upcall counter allocation to the existing vport 
> alloc and free functions to solve this.
>
> Fixes: 95637d91fefd ("net: openvswitch: release vport resources on 
> failure")
> Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall 
> packets")
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---

Acked-by: Aaron Conole <aconole@redhat.com>

_______________________________________________
dev mailing list
dev@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation
  2023-06-07  1:05 [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation wangchuanlei
@ 2023-06-07  9:09 ` Eelco Chaudron
  2023-06-08  8:50   ` Simon Horman
  0 siblings, 1 reply; 4+ messages in thread
From: Eelco Chaudron @ 2023-06-07  9:09 UTC (permalink / raw)
  To: wangchuanlei
  Cc: aconole, dev, netdev, simon.horman, wangpeihui, kuba, pabeni, davem



On 7 Jun 2023, at 3:05, wangchuanlei wrote:

> Thanks for fix this, in common enviroment, it's a
> small probability event.

Well, on ARM, they could replicate it a couple of times, but I guess the system was under memory pressure and has a lot of cores.

>> Eelco Chaudron <echaudro@redhat.com> writes:
>
>> Currently, the per cpu upcall counters are allocated after the vport
>> is created and inserted into the system. This could lead to the
>> datapath accessing the counters before they are allocated resulting in
>> a kernel Oops.
>>
>> Here is an example:
>>
>>   PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
>>    #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
>>    #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
>>    #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
>>    #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
>>    #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
>>    #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
>>    #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
>>    #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
>>    ...
>>
>>   PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"
>>    #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
>>    #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
>>    #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
>>    #3 [ffff80000a5d3120] die at ffffb70f0628234c
>>    #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
>>    #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
>>    #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
>>    #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
>>    #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
>>    #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
>>   #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
>>   #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
>>   #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
>>   #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
>>   #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
>>   #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
>>   #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
>>   #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at
>> ffffb70f06079f90
>>
>> We moved the per cpu upcall counter allocation to the existing vport
>> alloc and free functions to solve this.
>>
>> Fixes: 95637d91fefd ("net: openvswitch: release vport resources on
>> failure")
>> Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall
>> packets")
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>
> Acked-by: Aaron Conole <aconole@redhat.com>

Were you intentionally ACKing this on Aaron’s behalf? Or just a cut/paste error ;)

> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation
  2023-06-07  9:09 ` Eelco Chaudron
@ 2023-06-08  8:50   ` Simon Horman
  2023-06-12 15:06     ` Aaron Conole
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Horman @ 2023-06-08  8:50 UTC (permalink / raw)
  To: Eelco Chaudron
  Cc: wangchuanlei, aconole, dev, netdev, wangpeihui, kuba, pabeni, davem

On Wed, Jun 07, 2023 at 11:09:58AM +0200, Eelco Chaudron wrote:

...

> >> We moved the per cpu upcall counter allocation to the existing vport
> >> alloc and free functions to solve this.
> >>
> >> Fixes: 95637d91fefd ("net: openvswitch: release vport resources on
> >> failure")
> >> Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall
> >> packets")
> >> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> >> ---
> >
> > Acked-by: Aaron Conole <aconole@redhat.com>
> 
> Were you intentionally ACKing this on Aaron’s behalf? Or just a cut/paste error ;)

I was wondering that too.
But then I concluded it was an artifact of top-posting or some
other behaviour of the mail client.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation
  2023-06-08  8:50   ` Simon Horman
@ 2023-06-12 15:06     ` Aaron Conole
  0 siblings, 0 replies; 4+ messages in thread
From: Aaron Conole @ 2023-06-12 15:06 UTC (permalink / raw)
  To: Simon Horman
  Cc: Eelco Chaudron, wangchuanlei, dev, netdev, wangpeihui, kuba,
	pabeni, davem

Simon Horman <simon.horman@corigine.com> writes:

> On Wed, Jun 07, 2023 at 11:09:58AM +0200, Eelco Chaudron wrote:
>
> ...
>
>> >> We moved the per cpu upcall counter allocation to the existing vport
>> >> alloc and free functions to solve this.
>> >>
>> >> Fixes: 95637d91fefd ("net: openvswitch: release vport resources on
>> >> failure")
>> >> Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall
>> >> packets")
>> >> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> >> ---
>> >
>> > Acked-by: Aaron Conole <aconole@redhat.com>
>> 
>> Were you intentionally ACKing this on Aaron’s behalf? Or just a cut/paste error ;)
>
> I was wondering that too.
> But then I concluded it was an artifact of top-posting or some
> other behaviour of the mail client.

Thankfully, I did ack the patch, but yes, this is something to be
careful.

To wangchuanlei <wangchuanlei@inspur.com>, please read

  https://people.kernel.org/tglx/

Specifically, do not top-post for this and other reasons.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-06-12 15:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-07  1:05 [ovs-dev] [PATCH net v2] net: openvswitch: fix upcall counter access before allocation wangchuanlei
2023-06-07  9:09 ` Eelco Chaudron
2023-06-08  8:50   ` Simon Horman
2023-06-12 15:06     ` Aaron Conole

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).