linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [bug report] net/mlx5e: Extend encap entry with reference counter
@ 2019-08-14 10:53 Dan Carpenter
  2019-08-14 13:49 ` Vlad Buslov
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2019-08-14 10:53 UTC (permalink / raw)
  To: vladbu; +Cc: linux-rdma

[ I already wrote this email, but it looks like I deleted it instead of
  sending it.  So weird.  I hopefully don't send it twice! ]

Hi Vlad,

I noticed a possible refcounting bug in commit 948993f2beeb ("net/mlx5e:
Extend encap entry with reference counter") from Jun 3, 2018.

	drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:1435 mlx5e_tc_update_neigh_used_value()
	error: dereferencing freed memory 'e'

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c

  1415  void mlx5e_tc_update_neigh_used_value(struct mlx5e_neigh_hash_entry *nhe)
  1416  {
  1417          struct mlx5e_neigh *m_neigh = &nhe->m_neigh;
  1418          struct mlx5e_tc_flow *flow;
  1419          struct mlx5e_encap_entry *e;
  1420          struct mlx5_fc *counter;
  1421          struct neigh_table *tbl;
  1422          bool neigh_used = false;
  1423          struct neighbour *n;
  1424          u64 lastuse;
  1425  
  1426          if (m_neigh->family == AF_INET)
  1427                  tbl = &arp_tbl;
  1428  #if IS_ENABLED(CONFIG_IPV6)
  1429          else if (m_neigh->family == AF_INET6)
  1430                  tbl = &nd_tbl;
  1431  #endif
  1432          else
  1433                  return;
  1434  
  1435          list_for_each_entry_safe(e, tmp, &nhe->encap_list, encap_list) {
  1436                  struct encap_flow_item *efi, *tmp;
  1437  
  1438                  if (!(e->flags & MLX5_ENCAP_ENTRY_VALID) ||
  1439                      !mlx5e_encap_take(e))
                            ^^^^^^^^^^^^^^^^^^^
We take a reference here.

  1440                          continue;
  1441  
  1442                  list_for_each_entry_safe(efi, tmp, &e->flows, list) {
  1443                          flow = container_of(efi, struct mlx5e_tc_flow,
  1444                                              encaps[efi->index]);
  1445                          if (IS_ERR(mlx5e_flow_get(flow)))
  1446                                  continue;
  1447  
  1448                          if (mlx5e_is_offloaded_flow(flow)) {
  1449                                  counter = mlx5e_tc_get_counter(flow);
  1450                                  lastuse = mlx5_fc_query_lastuse(counter);
  1451                                  if (time_after((unsigned long)lastuse, nhe->reported_lastuse)) {
  1452                                          mlx5e_flow_put(netdev_priv(e->out_dev), flow);
  1453                                          neigh_used = true;
  1454                                          break;

I think we need to call mlx5e_encap_put(netdev_priv(e->out_dev), e);
before this break;

  1455                                  }
  1456                          }
  1457  
  1458                          mlx5e_flow_put(netdev_priv(e->out_dev), flow);
  1459                  }
  1460  
  1461                  mlx5e_encap_put(netdev_priv(e->out_dev), e);
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1462                  if (neigh_used)
  1463                          break;
  1464          }
  1465  

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [bug report] net/mlx5e: Extend encap entry with reference counter
  2019-08-14 10:53 [bug report] net/mlx5e: Extend encap entry with reference counter Dan Carpenter
@ 2019-08-14 13:49 ` Vlad Buslov
  2019-08-14 14:00   ` Dan Carpenter
  0 siblings, 1 reply; 3+ messages in thread
From: Vlad Buslov @ 2019-08-14 13:49 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Vlad Buslov, linux-rdma


On Wed 14 Aug 2019 at 13:53, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> [ I already wrote this email, but it looks like I deleted it instead of
>   sending it.  So weird.  I hopefully don't send it twice! ]
>
> Hi Vlad,
>
> I noticed a possible refcounting bug in commit 948993f2beeb ("net/mlx5e:
> Extend encap entry with reference counter") from Jun 3, 2018.
>
> 	drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:1435 mlx5e_tc_update_neigh_used_value()
> 	error: dereferencing freed memory 'e'

Hi Dan,

Thanks for reporting!

>
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
>
>   1415  void mlx5e_tc_update_neigh_used_value(struct mlx5e_neigh_hash_entry *nhe)
>   1416  {
>   1417          struct mlx5e_neigh *m_neigh = &nhe->m_neigh;
>   1418          struct mlx5e_tc_flow *flow;
>   1419          struct mlx5e_encap_entry *e;
>   1420          struct mlx5_fc *counter;
>   1421          struct neigh_table *tbl;
>   1422          bool neigh_used = false;
>   1423          struct neighbour *n;
>   1424          u64 lastuse;
>   1425
>   1426          if (m_neigh->family == AF_INET)
>   1427                  tbl = &arp_tbl;
>   1428  #if IS_ENABLED(CONFIG_IPV6)
>   1429          else if (m_neigh->family == AF_INET6)
>   1430                  tbl = &nd_tbl;
>   1431  #endif
>   1432          else
>   1433                  return;
>   1434
>   1435          list_for_each_entry_safe(e, tmp, &nhe->encap_list, encap_list) {
>   1436                  struct encap_flow_item *efi, *tmp;
>   1437
>   1438                  if (!(e->flags & MLX5_ENCAP_ENTRY_VALID) ||
>   1439                      !mlx5e_encap_take(e))
>                             ^^^^^^^^^^^^^^^^^^^
> We take a reference here.

Okay, we take reference to encap at the beginning of outer loop.

>
>   1440                          continue;
>   1441
>   1442                  list_for_each_entry_safe(efi, tmp, &e->flows, list) {
>   1443                          flow = container_of(efi, struct mlx5e_tc_flow,
>   1444                                              encaps[efi->index]);
>   1445                          if (IS_ERR(mlx5e_flow_get(flow)))
>   1446                                  continue;

Now we take reference to flow at the beginning of the inner loop.

>   1447
>   1448                          if (mlx5e_is_offloaded_flow(flow)) {
>   1449                                  counter = mlx5e_tc_get_counter(flow);
>   1450                                  lastuse = mlx5_fc_query_lastuse(counter);
>   1451                                  if (time_after((unsigned long)lastuse, nhe->reported_lastuse)) {
>   1452                                          mlx5e_flow_put(netdev_priv(e->out_dev), flow);
>   1453                                          neigh_used = true;
>   1454                                          break;
>
> I think we need to call mlx5e_encap_put(netdev_priv(e->out_dev), e);
> before this break;

We are breaking from the inner loop (not returning from the function,
just breaking the innermost loop) here after releasing the reference to
flow which was obtained at the beginning of the loop.

>
>   1455                                  }
>   1456                          }
>   1457
>   1458                          mlx5e_flow_put(netdev_priv(e->out_dev), flow);
>   1459                  }
>   1460
>   1461                  mlx5e_encap_put(netdev_priv(e->out_dev), e);
>                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now we are releasing the reference to encap, both when inner loop
completed normally or when it was terminated with 'break'. If we were to
release the encap before the break, this would be the second time we
'put' it.

What am I missing?

>   1462                  if (neigh_used)
>   1463                          break;
>   1464          }
>   1465
>
> regards,
> dan carpenter

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [bug report] net/mlx5e: Extend encap entry with reference counter
  2019-08-14 13:49 ` Vlad Buslov
@ 2019-08-14 14:00   ` Dan Carpenter
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Carpenter @ 2019-08-14 14:00 UTC (permalink / raw)
  To: Vlad Buslov; +Cc: linux-rdma

On Wed, Aug 14, 2019 at 01:49:54PM +0000, Vlad Buslov wrote:
> >   1447
> >   1448                          if (mlx5e_is_offloaded_flow(flow)) {
> >   1449                                  counter = mlx5e_tc_get_counter(flow);
> >   1450                                  lastuse = mlx5_fc_query_lastuse(counter);
> >   1451                                  if (time_after((unsigned long)lastuse, nhe->reported_lastuse)) {
> >   1452                                          mlx5e_flow_put(netdev_priv(e->out_dev), flow);
> >   1453                                          neigh_used = true;
> >   1454                                          break;
> >
> > I think we need to call mlx5e_encap_put(netdev_priv(e->out_dev), e);
> > before this break;
> 
> We are breaking from the inner loop (not returning from the function,
> just breaking the innermost loop) here after releasing the reference to
> flow which was obtained at the beginning of the loop.
> 

Oh...  Duh.  I am embarrassed.  I misread it to think we were breaking
from the outer loop.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-08-14 14:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-14 10:53 [bug report] net/mlx5e: Extend encap entry with reference counter Dan Carpenter
2019-08-14 13:49 ` Vlad Buslov
2019-08-14 14:00   ` Dan Carpenter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).