netdev.vger.kernel.org archive mirror
* [PATCH] netpoll: Don't call driver methods from interrupt context.
@ 2014-03-03 20:40 Eric W. Biederman
  2014-03-04  4:23 ` Cong Wang
  2014-03-04 21:08 ` David Miller
  0 siblings, 2 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-03 20:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Cong Wang, Matt Mackall, Satyam Sharma


The attraction of the netpoll design is that with just one simple extra
method .ndo_poll_controller added to the driver a network adapter can be
polled.  This promise of simplicity and no special maintenance falls
down in the case of using network adapters from interrupt context.

There are multiple failure modes.  A typical example is:
WARNING: at net/core/skbuff.c:451 skb_release_head_state+0x7b/0xe1()
Pid: 0, comm: swapper/2 Not tainted 3.4 #1
Call Trace:
<IRQ>  [<ffffffff8104934c>] warn_slowpath_common+0x85/0x9d
[<ffffffff8104937e>] warn_slowpath_null+0x1a/0x1c
[<ffffffff81429aa7>] skb_release_head_state+0x7b/0xe1
[<ffffffff814297e1>] __kfree_skb+0x16/0x81
[<ffffffff814298a0>] consume_skb+0x54/0x69
[<ffffffffa015925b>] bnx2_tx_int.clone.6+0x1b0/0x33e [bnx2]
[<ffffffff8129c54d>] ? unmask_msi_irq+0x10/0x12
[<ffffffffa015aa06>] bnx2_poll_work+0x3a/0x73 [bnx2]
[<ffffffffa015aa73>] bnx2_poll_msix+0x34/0xb4 [bnx2]
[<ffffffff814466a2>] netpoll_poll_dev+0xb9/0x1b7
[<ffffffff814467d7>] ? find_skb+0x37/0x82
[<ffffffff814461ed>] netpoll_send_skb_on_dev+0x117/0x200
[<ffffffff81446a52>] netpoll_send_udp+0x230/0x242
[<ffffffffa0174296>] write_msg+0xa7/0xfb [netconsole]
[<ffffffff814258a4>] ? sk_free+0x1c/0x1e
[<ffffffff810495ad>] __call_console_drivers+0x7d/0x8f
[<ffffffff81049674>] _call_console_drivers+0xb5/0xd0
[<ffffffff8104a134>] console_unlock+0x131/0x219
[<ffffffff8104a7f9>] vprintk+0x3bc/0x405
[<ffffffff81460073>] ? NF_HOOK.clone.1+0x4c/0x53
[<ffffffff81460308>] ? ip_rcv+0x23c/0x268
[<ffffffff814ddd4f>] printk+0x68/0x71
[<ffffffff813315b3>] __dev_printk+0x78/0x7a
[<ffffffff813316b2>] dev_warn+0x53/0x55
[<ffffffff8127f181>] ? swiotlb_unmap_sg_attrs+0x47/0x5c
[<ffffffffa004f876>] complete_scsi_command+0x28a/0x4a0 [hpsa]
[<ffffffffa004fadb>] finish_cmd+0x4f/0x66 [hpsa]
[<ffffffffa004fd97>] process_indexed_cmd+0x48/0x54 [hpsa]
[<ffffffffa004ff25>] do_hpsa_intr_msi+0x4e/0x77 [hpsa]
[<ffffffff810baebb>] handle_irq_event_percpu+0x5e/0x1b6
[<ffffffff81088a0b>] ? timekeeping_update+0x43/0x45
[<ffffffff810bb04b>] handle_irq_event+0x38/0x54
[<ffffffff8102bd1e>] ? ack_apic_edge+0x36/0x3a
[<ffffffff810bd762>] handle_edge_irq+0xa5/0xc8
[<ffffffff81010d56>] handle_irq+0x127/0x135
[<ffffffff814e3426>] ? __atomic_notifier_call_chain+0x12/0x14
[<ffffffff814e343c>] ? atomic_notifier_call_chain+0x14/0x16
[<ffffffff814e897d>] do_IRQ+0x4d/0xb4
[<ffffffff814dffea>] common_interrupt+0x6a/0x6a
<EOI>  [<ffffffff812b7603>] ? intel_idle+0xd8/0x112
[<ffffffff812b7603>] ? intel_idle+0xd8/0x112
[<ffffffff812b75e9>] ? intel_idle+0xbe/0x112
[<ffffffff814012fc>] cpuidle_enter+0x12/0x14
[<ffffffff814019c2>] cpuidle_idle_call+0xd1/0x19b
[<ffffffff81016551>] cpu_idle+0xb6/0xff
[<ffffffff814d726b>] start_secondary+0xc8/0xca

To avoid this class of problem, modify netpoll so that it does not call
driver methods from interrupt context.

To achieve this all that is required is the addition of two simple tests
of in_irq(), and the utilization of the existing logic.

Instead of attempting to transmit a packet from interrupt context,
update the code to queue the skb in struct netpoll_info txq.

Similarly, when attempting to allocate a skb to hold the packet to be
transmitted while in interrupt context, don't poll the device to see if
we can free some packet buffers.

In all cases where netpoll works reliably today this should result in no
change, but in nasty cases where there are messages printed from
interrupt context this should result in queued skbs that will be
transmitted with a small delay, instead of executing code in conditions
the network driver code has never been tested in, which results in
unpredictable behavior.

One easy-to-trigger nasty pathology this avoids is generating a message
in interrupt context that generates a warning message for calling the
code in interrupt context, which then generates another warning message
for calling the code in interrupt context, potentially indefinitely.
That is a pathology I have observed triggered with sysrq-t.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a664f7829a6d..a1877621bf31 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -330,7 +330,7 @@ repeat:
 		skb = skb_dequeue(&skb_pool);
 
 	if (!skb) {
-		if (++count < 10) {
+		if (++count < 10 && !in_irq()) {
 			netpoll_poll_dev(np->dev);
 			goto repeat;
 		}
@@ -371,8 +371,8 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 		return;
 	}
 
-	/* don't get messages out of order, and no recursion */
-	if (skb_queue_len(&npinfo->txq) == 0 && !netpoll_owner_active(dev)) {
+	/* don't get messages out of order, and no recursion, and don't operate in irq context */
+	if (skb_queue_len(&npinfo->txq) == 0 && !netpoll_owner_active(dev) && !in_irq()) {
 		struct netdev_queue *txq;
 
 		txq = netdev_pick_tx(dev, skb, NULL);
-- 
1.7.5.4


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-03 20:40 [PATCH] netpoll: Don't call driver methods from interrupt context Eric W. Biederman
@ 2014-03-04  4:23 ` Cong Wang
  2014-03-04 10:29   ` Eric W. Biederman
  2014-03-04 21:09   ` David Miller
  2014-03-04 21:08 ` David Miller
  1 sibling, 2 replies; 288+ messages in thread
From: Cong Wang @ 2014-03-04  4:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, Linux Kernel Network Developers, Matt Mackall,
	Satyam Sharma

On Mon, Mar 3, 2014 at 12:40 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>  net/core/netpoll.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index a664f7829a6d..a1877621bf31 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -330,7 +330,7 @@ repeat:
>                 skb = skb_dequeue(&skb_pool);
>
>         if (!skb) {
> -               if (++count < 10) {
> +               if (++count < 10 && !in_irq()) {
>                         netpoll_poll_dev(np->dev);

This looks like a workaround.

Here you are trying to avoid calling netpoll_poll_dev()
in IRQ context, but it has a side effect for netpoll_send_udp()
which could possibly return early after find_skb().

Also, netpoll_poll_dev() does more than just calling driver
poll method, I am not sure if it is safe to skip it either.

netpoll code needs to rewrite.

Thanks.


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-04  4:23 ` Cong Wang
@ 2014-03-04 10:29   ` Eric W. Biederman
  2014-03-04 21:09   ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-04 10:29 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Linux Kernel Network Developers, Matt Mackall,
	Satyam Sharma

Cong Wang <xiyou.wangcong@gmail.com> writes:

> On Mon, Mar 3, 2014 at 12:40 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>>  net/core/netpoll.c |    6 +++---
>>  1 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
>> index a664f7829a6d..a1877621bf31 100644
>> --- a/net/core/netpoll.c
>> +++ b/net/core/netpoll.c
>> @@ -330,7 +330,7 @@ repeat:
>>                 skb = skb_dequeue(&skb_pool);
>>
>>         if (!skb) {
>> -               if (++count < 10) {
>> +               if (++count < 10 && !in_irq()) {
>>                         netpoll_poll_dev(np->dev);
>
> This looks like a workaround.

It is not a workaround.  It is a neutering of the netpoll code (when
run in irq context) to just allocate memory for the message we are going
to send and queue that message for delivery in a safer context.

> Here you are trying to avoid calling netpoll_poll_dev()
> in IRQ context, but it has a side effect for netpoll_send_udp()
> which could possibly return early after find_skb().

It isn't a side effect, it is an unfortunate fact of life that sometimes
you cannot allocate memory in irq context.

> Also, netpoll_poll_dev() does more than just calling driver
> poll method, I am not sure if it is safe to skip it either.

netpoll_poll_dev is absolutely safe to skip at this location because
it is not called at this location 99% of the time.

Also note my patch is about much much more than not calling
the driver's napi poll method from interrupt context.  It is about
not calling any driver method from interrupt context.  This includes
ndo_poll_controller, and ndo_start_xmit.

The only work netpoll_poll_dev does that does not call into driver
methods is zap_completion_queue and find_skb has already called
zap_completion_queue.  Which means even a more fine-grained approach
could not find code in netpoll_poll_dev that is desirable to call in
interrupt context.
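
To illustrate the retry behaviour being discussed, here is a minimal
userspace sketch of the shape of the patched find_skb() loop.  It is a
model only, not the kernel code: model_alloc, model_try_alloc() and
model_find_buf() are hypothetical names, the allocator is simply made to
fail a fixed number of times, and a counter stands in for the call to
netpoll_poll_dev().

```c
/* Userspace model only; every name here is a hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_TRIES 10

struct model_alloc {
	int failures_left;	/* allocation attempts that will still fail */
	int polls;		/* times the "device" was polled to free buffers */
};

/* Hypothetical allocator: fails until failures_left reaches zero. */
static bool model_try_alloc(struct model_alloc *a)
{
	if (a->failures_left > 0) {
		a->failures_left--;
		return false;
	}
	return true;
}

/*
 * Same shape as the patched retry test: on failure, poll the device and
 * retry, but only while under the retry limit and outside hard irq
 * context.
 */
static bool model_find_buf(struct model_alloc *a, bool in_hard_irq)
{
	int count = 0;

	for (;;) {
		if (model_try_alloc(a))
			return true;
		if (++count < MODEL_MAX_TRIES && !in_hard_irq) {
			a->polls++;	/* stands in for netpoll_poll_dev() */
			continue;
		}
		return false;
	}
}
```

In this model a failed allocation in hard irq context returns at once
without polling, while outside hard irq context the device is polled at
most nine times before giving up.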

I expect you were referring to netpoll_neigh_reply and there are issues
with that code.

Semantically netpoll_neigh_reply is a path into the driver methods for
sending packets and as such is not appropriate to call from interrupt
context.

Practically speaking netpoll_neigh_reply is dead code because there
is not a single netpoll user in the kernel that sets rx_skb_hook.

Functionally netpoll_neigh_reply scares me to read.  It has the
potential to infinitely recurse and unless I am missing some deep magic
it leaks every packet that makes it on to the neigh_tx queue.

The code path that has the potential to infinitely recurse is:
find_skb
   netpoll_poll_dev
      service_neigh_queue
         netpoll_neigh_reply
             find_skb
                ...

It is my personal recommendation that all support for receiving packets
in netpoll be removed.  It has been a decade and we still have yet to
see a user of that code merged into the tree, and the code is extremely
fishy if not totally horrifically broken.

> netpoll code needs to rewrite.

I don't have a clue what you mean there.  My guess is you are saying
netpoll needs a rewrite.  After having read through the netpoll code I
would argue that my two line change is the rewrite the netpoll code
needs.  (Well, other than dead code removal.)

What is desired to do in interrupt context is to allocate a buffer, put
our data in it, and queue that buffer to be handled later.  That is
exactly what happens with my changes when the code is run in interrupt
context.

Better than that I have tested my changes, and the code works.

In this instance I got lucky that everything netpoll needed to do to
handle being called from interrupt context was already present in the
code and I just needed to tell the netpoll code to use those other paths
in interrupt context.
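
As a rough illustration of that flow, here is a userspace sketch of the
send-path decision after the patch.  It is only a model under stated
assumptions: model_dev, model_send() and model_flush() are hypothetical
names, packets are plain integers, and model_flush() stands in for
whatever deferred work later drains the queue outside hard irq context.

```c
/* Userspace model only; these names are not the kernel's netpoll API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_TXQ_MAX 16

struct model_txq {
	int pkts[MODEL_TXQ_MAX];	/* queued packet ids, oldest first */
	size_t len;
};

struct model_dev {
	struct model_txq txq;	/* stands in for npinfo->txq */
	bool owner_active;	/* stands in for netpoll_owner_active() */
	int transmitted;	/* packets handed to the "driver" so far */
};

/*
 * Returns true if the packet went straight to the driver, false if it
 * was queued for a safer context (or dropped because the queue is full).
 */
static bool model_send(struct model_dev *dev, int pkt, bool in_hard_irq)
{
	/* The three conditions of the patched test: keep ordering, avoid
	 * recursion, and stay out of the driver in hard irq context. */
	if (dev->txq.len == 0 && !dev->owner_active && !in_hard_irq) {
		dev->transmitted++;
		return true;
	}
	if (dev->txq.len < MODEL_TXQ_MAX)
		dev->txq.pkts[dev->txq.len++] = pkt;
	return false;
}

/* Models the later drain of the queue; returns how many packets it sent. */
static int model_flush(struct model_dev *dev)
{
	int flushed = (int)dev->txq.len;

	dev->transmitted += flushed;
	dev->txq.len = 0;
	return flushed;
}
```

Note that once anything is queued, later packets are queued behind it
even outside hard irq context, which is what keeps messages in order.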

Eric


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-03 20:40 [PATCH] netpoll: Don't call driver methods from interrupt context Eric W. Biederman
  2014-03-04  4:23 ` Cong Wang
@ 2014-03-04 21:08 ` David Miller
  2014-03-05  0:03   ` Eric W. Biederman
                     ` (2 more replies)
  1 sibling, 3 replies; 288+ messages in thread
From: David Miller @ 2014-03-04 21:08 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Mon, 03 Mar 2014 12:40:05 -0800

> <IRQ>  [<ffffffff8104934c>] warn_slowpath_common+0x85/0x9d
> [<ffffffff8104937e>] warn_slowpath_null+0x1a/0x1c
> [<ffffffff81429aa7>] skb_release_head_state+0x7b/0xe1
> [<ffffffff814297e1>] __kfree_skb+0x16/0x81
> [<ffffffff814298a0>] consume_skb+0x54/0x69
> [<ffffffffa015925b>] bnx2_tx_int.clone.6+0x1b0/0x33e [bnx2]

Other drivers, such as bnx2x, use dev_kfree_skb_any(), probably
exactly to deal with this situation.

If in_irq() is true or interrupts are disabled, __dev_kfree_skb_any
will use __dev_kfree_skb_irq, which will queue up the SKB and schedule
a software interrupt to do the actual work.
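
That dispatch can be sketched in plain userspace C.  This is only a
model of the rule described above, not the kernel implementation:
model_ctx, free_path and model_kfree_skb_any() are hypothetical names,
and two booleans stand in for in_irq() and irqs_disabled().

```c
/* Userspace model only; every name here is a hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>

enum free_path {
	FREE_NOW,	/* free immediately in the current context */
	FREE_DEFERRED	/* queue the buffer and let a softirq free it */
};

struct model_ctx {
	bool in_hard_irq;	/* models in_irq() */
	bool irqs_disabled;	/* models irqs_disabled() */
};

/* Picks the free path the way the text above describes. */
static enum free_path model_kfree_skb_any(const struct model_ctx *ctx)
{
	if (ctx->in_hard_irq || ctx->irqs_disabled)
		return FREE_DEFERRED;
	return FREE_NOW;
}
```

The point of the "any" variant is that the caller does not need to know
which context it is running in; the check is made at free time.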

For example, see this commit, which we probably just need to duplicate
into other poll supporting drivers:

commit 40955532bc9d865999dfc58b7896605d58650655
Author: Vladislav Zolotarov <vladz@broadcom.com>
Date:   Sun May 22 10:06:58 2011 +0000

    bnx2x: call dev_kfree_skb_any instead of dev_kfree_skb
    
    replace function calls when possible call in both irq/non-irq contexts
    
    Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
    Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
index 64d01e7..d5bd35b 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/bnx2x/bnx2x_cmn.c
@@ -131,7 +131,7 @@ static u16 bnx2x_free_tx_pkt(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 
 	/* release skb */
 	WARN_ON(!skb);
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	tx_buf->first_bd = 0;
 	tx_buf->skb = NULL;
 
@@ -465,7 +465,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 		} else {
 			DP(NETIF_MSG_RX_STATUS, "Failed to allocate new pages"
 			   " - dropping packet!\n");
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 		}
 
 
diff --git a/drivers/net/bnx2x/bnx2x_cmn.h b/drivers/net/bnx2x/bnx2x_cmn.h
index fab161e..1a3545b 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/bnx2x/bnx2x_cmn.h
@@ -840,7 +840,7 @@ static inline int bnx2x_alloc_rx_skb(struct bnx2x *bp,
 	mapping = dma_map_single(&bp->pdev->dev, skb->data, fp->rx_buf_size,
 				 DMA_FROM_DEVICE);
 	if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return -ENOMEM;
 	}
 


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-04  4:23 ` Cong Wang
  2014-03-04 10:29   ` Eric W. Biederman
@ 2014-03-04 21:09   ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-04 21:09 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: ebiederm, netdev, mpm, satyam.sharma

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 3 Mar 2014 20:23:03 -0800

> netpoll code needs to rewrite.

People (myself included) have been saying this for a decade, nobody
has come up with a better design that achieves what the current one is
at least able to.


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-04 21:08 ` David Miller
@ 2014-03-05  0:03   ` Eric W. Biederman
  2014-03-05  0:26     ` David Miller
  2014-03-05 19:14   ` Eric W. Biederman
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-05  0:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Mon, 03 Mar 2014 12:40:05 -0800
>
>> <IRQ>  [<ffffffff8104934c>] warn_slowpath_common+0x85/0x9d
>> [<ffffffff8104937e>] warn_slowpath_null+0x1a/0x1c
>> [<ffffffff81429aa7>] skb_release_head_state+0x7b/0xe1
>> [<ffffffff814297e1>] __kfree_skb+0x16/0x81
>> [<ffffffff814298a0>] consume_skb+0x54/0x69
>> [<ffffffffa015925b>] bnx2_tx_int.clone.6+0x1b0/0x33e [bnx2]
>
> Other drivers, such as bnx2x, use dev_kfree_skb_any(), probably
> exactly to deal with this situation.
>
> If in_irq() is true or interrupts are disabled, __dev_kfree_skb_any
> will use __dev_kfree_skb_irq, which will queue up the SKB and schedule
> a software interrupt to do the actual work.
>
> For example, see this commit, which we probably just need to duplicate
> into other poll supporting drivers:

When a patch making that change was submitted to you, you rejected it:

> From:	David Miller <davem@davemloft.net>
> Subject: Re: [PATCH] bnx2: Use dev_kfree_skb_any() in bnx2_tx_int()
> To:	tdmackey@booleanhaiku.com
> Cc:	mchan@broadcom.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Date: Tue, 29 Oct 2013 22:42:27 -0400 (EDT) (17 weeks, 6 days, 20 hours ago)
> 
> From: David Mackey <tdmackey@booleanhaiku.com>
> Date: Tue, 29 Oct 2013 15:16:38 -0700
> 
> > Using dev_kfree_skb_any() will resolve the below issue when a
> > netconsole message is transmitted in an irq.
>  ...
> > Signed-off-by: David Mackey <tdmackey@booleanhaiku.com>
> 
> This is absolutely not the correct fix.
> 
> The netpoll facility must invoke ->poll() in an environment which
> is compatible, locking and interrupt/soft-interrupt wise, as that
> in which it is normally called.
> 
> Therefore, bnx2_tx_int(), which is invoked from the driver's ->poll()
> method, should not need to use dev_kfree_skb_any().  The real problem
> is somewhere else.

In that discussion you were also strongly objecting to making the poll
methods safe in hard irq context.  Although there may be something
subtle I am missing.

So when I looked at this problem this weekend I realized it was not at
all hard to just queue packets in hard irq context, so we would not call
driver methods in a context they were not prepared for.

If converting dev_kfree_skb to dev_kfree_skb_any were the only issue
(which it seems to be in many cases), I wouldn't much care.

However it appears we can run huge portions of the networking stack from
napi context and there are other issues beyond dev_kfree_skb.

The worst issues I have seen are with the mlx4 driver.  In part mlx4
looks crazy calling napi_synchronize which calls msleep with interrupts
disabled.  But I have some warnings that should be triggerable with
other drivers using netpoll as well.

So I would like some clear guidance.  Will you accept patches to make
it safe to call the napi poll routines from hard irq context, or should
we simply defer messages printed with netconsole in hard irq context
into another context where we can run the napi code?

If there is not a clear way to fix the problems that crop up we should
just delete all of the netpoll code altogether, as it seems deadly in
its current form.



This first mlx4 trace looks a little questionable but I can find
an equivalent trace through the tg3 driver so this looks like
another legitimate netpoll issue.
tg3_rx
  napi_gro_receive
    napi_skb_finish
      netif_receive_skb_internal
         __netif_receive_skb
            ip_rcv
              NF_HOOK()
                 nf_iterate
                       iptable_mangle_hook
                       ipt_do_table
            
------------[ cut here ]------------
WARNING: at kernel/softirq.c:159 _local_bh_enable_ip+0x41/0x8b()
Hardware name: PowerEdge C6220
Modules linked in: xt_DSCP iptable_mangle netconsole configfs ipv6 ppdev parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf ipmi_msghandler hed dcdbas coretemp crc32c_intel ghash_clmulni_intel microcode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca mlx4_en mlx4_core wmi [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper/9 Not tainted 3.4 #1
Call Trace:
 <NMI>  [<ffffffff8103c20c>] warn_slowpath_common+0x85/0x9d
 [<ffffffff8103c23e>] warn_slowpath_null+0x1a/0x1c
 [<ffffffff810429ef>] _local_bh_enable_ip+0x41/0x8b
 [<ffffffff81042a5b>] local_bh_enable+0x12/0x14
 [<ffffffff8147df9a>] ipt_do_table+0x652/0x68b
 [<ffffffffa005e177>] iptable_mangle_hook+0xff/0x11c [iptable_mangle]
 [<ffffffff81435d44>] nf_iterate+0x48/0x7d
 [<ffffffff8143e7a4>] ? inet_del_protocol+0x3a/0x3a
 [<ffffffff81435eaf>] nf_hook_slow+0x6c/0xff
 [<ffffffff8143e7a4>] ? inet_del_protocol+0x3a/0x3a
 [<ffffffff8143e7a4>] ? inet_del_protocol+0x3a/0x3a
 [<ffffffff8143edd4>] NF_HOOK.clone.1+0x41/0x53
 [<ffffffff8143f074>] ip_rcv+0x23c/0x268
 [<ffffffff8140fe4f>] __netif_receive_skb+0x3a5/0x3fe
 [<ffffffff81411874>] netif_receive_skb+0x4b/0x7b
 [<ffffffff81411e24>] ? __napi_gro_receive+0xf8/0x107
 [<ffffffff814118f4>] napi_frags_finish+0x50/0xc4
 [<ffffffff81411e6c>] napi_gro_frags+0x39/0x3e
 [<ffffffffa003b6bf>] mlx4_en_process_rx_cq+0x30c/0x567 [mlx4_en]
 [<ffffffffa003d4a6>] mlx4_en_netpoll+0x8d/0xb1 [mlx4_en]
 [<ffffffff8142541b>] netpoll_poll_dev+0x4a/0x1b7
 [<ffffffff814255bf>] ? find_skb+0x37/0x82
 [<ffffffff8111cf3a>] ? virt_to_head_page+0x9/0x2c
 [<ffffffff81424fd5>] netpoll_send_skb_on_dev+0x117/0x200
 [<ffffffff8142583a>] netpoll_send_udp+0x230/0x242
 [<ffffffffa0067296>] write_msg+0xa7/0xfb [netconsole]
 [<ffffffff8103c46d>] __call_console_drivers+0x7d/0x8f
 [<ffffffff8103c534>] _call_console_drivers+0xb5/0xd0
 [<ffffffff8103d02f>] console_unlock+0x16c/0x219
 [<ffffffff8103d6b9>] vprintk+0x3bc/0x405
 [<ffffffff814ba4b7>] printk+0x68/0x71
 [<ffffffff810891dc>] print_modules+0x6a/0xf3
 [<ffffffff81004164>] show_registers+0x48/0x214
 [<ffffffff814ba4b7>] ? printk+0x68/0x71
 [<ffffffff81009a18>] show_regs+0x16/0x2d
 [<ffffffff814bdcf1>] arch_trigger_all_cpu_backtrace_handler+0x60/0x79
 [<ffffffff814bd663>] default_do_nmi+0x66/0x1d4
 [<ffffffff814bd841>] do_nmi+0x70/0xbb
 [<ffffffff814bcd0c>] end_repeat_nmi+0x1a/0x1e
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 <<EOE>>  [<ffffffff813e23d8>] ? menu_select+0x1ac/0x303
 [<ffffffff813e0d6c>] cpuidle_enter+0x12/0x14
 [<ffffffff813e1432>] cpuidle_idle_call+0xd1/0x19b
 [<ffffffff81009545>] cpu_idle+0xb6/0xff
 [<ffffffff814b3975>] start_secondary+0xc8/0xca
---[ end trace b7bd3fb31d1fc0d7 ]---
------------[ cut here ]------------

I see a lot of this one, but this one is clearly bogus. Calling
napi_synchronize with irqs disabled!

BUG: scheduling while atomic: swapper/9/0/0x04010000
Modules linked in: xt_DSCP iptable_mangle netconsole configfs ipv6 ppdev parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf ipmi_msghandler hed dcdbas coretemp crc32c_intel ghash_clmulni_intel microcode i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma dca mlx4_en mlx4_core wmi [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper/9 Tainted: G        W    3.4 #1
Call Trace:
 <NMI>  [<ffffffff8106493e>] __schedule_bug+0x4d/0x4f
 [<ffffffff814bb21a>] __schedule+0xa3/0x4ec
 [<ffffffff814bb925>] schedule+0x64/0x66
 [<ffffffff814ba584>] schedule_timeout+0xab/0xe3
 [<ffffffff81048fd1>] ? del_timer+0x82/0x82
 [<ffffffff814ba5da>] schedule_timeout_uninterruptible+0x1e/0x20
 [<ffffffff810494c0>] msleep+0x1b/0x22
                      napi_synchronize()
 [<ffffffffa003d47e>] mlx4_en_netpoll+0x65/0xb1 [mlx4_en]
 [<ffffffff8142541b>] netpoll_poll_dev+0x4a/0x1b7
 [<ffffffff814255bf>] ? find_skb+0x37/0x82
 [<ffffffff8111cf3a>] ? virt_to_head_page+0x9/0x2c
 [<ffffffff81424fd5>] netpoll_send_skb_on_dev+0x117/0x200
 [<ffffffff8142583a>] netpoll_send_udp+0x230/0x242
 [<ffffffffa0067296>] write_msg+0xa7/0xfb [netconsole]
 [<ffffffff8103c46d>] __call_console_drivers+0x7d/0x8f
 [<ffffffff8103c534>] _call_console_drivers+0xb5/0xd0
 [<ffffffff8103d02f>] console_unlock+0x16c/0x219
 [<ffffffff8103d6b9>] vprintk+0x3bc/0x405
 [<ffffffff814ba4b7>] printk+0x68/0x71
 [<ffffffff810891dc>] print_modules+0x6a/0xf3
 [<ffffffff81004164>] show_registers+0x48/0x214
 [<ffffffff814ba4b7>] ? printk+0x68/0x71
 [<ffffffff81009a18>] show_regs+0x16/0x2d
 [<ffffffff814bdcf1>] arch_trigger_all_cpu_backtrace_handler+0x60/0x79
 [<ffffffff814bd663>] default_do_nmi+0x66/0x1d4
 [<ffffffff814bd841>] do_nmi+0x70/0xbb
 [<ffffffff814bcd0c>] end_repeat_nmi+0x1a/0x1e
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 [<ffffffff812a24bd>] ? intel_idle+0xae/0x112
 <<EOE>>  [<ffffffff813e23d8>] ? menu_select+0x1ac/0x303
 [<ffffffff813e0d6c>] cpuidle_enter+0x12/0x14
 [<ffffffff813e1432>] cpuidle_idle_call+0xd1/0x19b
 [<ffffffff81009545>] cpu_idle+0xb6/0xff
 [<ffffffff814b3975>] start_secondary+0xc8/0xca
------------[ cut here ]------------


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-05  0:03   ` Eric W. Biederman
@ 2014-03-05  0:26     ` David Miller
  2014-03-05 19:24       ` Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-05  0:26 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 04 Mar 2014 16:03:43 -0800

> So I would like some clear guidance.  Will you accept patches to make
> it safe to call the napi poll routines from hard irq context, or should
> we simply defer messages printed with netconsole in hard irq context
> into another context where we can run the napi code?
> 
> If there is not a clear way to fix the problems that crop up we should
> just delete all of the netpoll code altogether, as it seems deadly in
> its current form.

Clearly to make netconsole most useful we should synchronously emit
log messages.

Because what if the system hangs right after this event, but before
we get back to a "safe" context.

That's one bug that will be a billion times harder to diagnose if
we defer.


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-04 21:08 ` David Miller
  2014-03-05  0:03   ` Eric W. Biederman
@ 2014-03-05 19:14   ` Eric W. Biederman
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-05 19:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Mon, 03 Mar 2014 12:40:05 -0800
>
>> <IRQ>  [<ffffffff8104934c>] warn_slowpath_common+0x85/0x9d
>> [<ffffffff8104937e>] warn_slowpath_null+0x1a/0x1c
>> [<ffffffff81429aa7>] skb_release_head_state+0x7b/0xe1
>> [<ffffffff814297e1>] __kfree_skb+0x16/0x81
>> [<ffffffff814298a0>] consume_skb+0x54/0x69
>> [<ffffffffa015925b>] bnx2_tx_int.clone.6+0x1b0/0x33e [bnx2]
>
> Other drivers, such as bnx2x, use dev_kfree_skb_any(), probably
> exactly to deal with this situation.
>
> If in_irq() is true or interrupts are disabled, __dev_kfree_skb_any
> will use __dev_kfree_skb_irq, which will queue up the SKB and schedule
> a software interrupt to do the actual work.
>
> For example, see this commit, which we probably just need to duplicate
> into other poll supporting drivers:

Fair enough.  I will cook up a patch to see about doing this.
Especially as this appears to be all that drivers need to do differently
in practice.

Looking at bnx2x I found at least one more case that needs to be
changed, so for netpoll supporting drivers I will plan on simply
changing all of the dev_kfree_skb instances to dev_kfree_skb_any
so I don't have to worry about missing one by not having audited the
code properly.

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 9d7419e0390b..e790d654f35f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3719,7 +3719,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
                        struct bnx2x_eth_q_stats *q_stats =
                                bnx2x_fp_qstats(bp, txdata->parent_fp);
                        q_stats->driver_filtered_tx_pkt++;
-                       dev_kfree_skb(skb);
+                       dev_kfree_skb_any(skb);
                        return NETDEV_TX_OK;
                }
                bnx2x_fp_qstats(bp, txdata->parent_fp)->driver_xoff++;

Eric



> commit 40955532bc9d865999dfc58b7896605d58650655
> Author: Vladislav Zolotarov <vladz@broadcom.com>
> Date:   Sun May 22 10:06:58 2011 +0000
>
>     bnx2x: call dev_kfree_skb_any instead of dev_kfree_skb
>     
>     replace function calls when possible call in both irq/non-irq contexts
>     
>     Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
>     Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
> index 64d01e7..d5bd35b 100644
> --- a/drivers/net/bnx2x/bnx2x_cmn.c
> +++ b/drivers/net/bnx2x/bnx2x_cmn.c
> @@ -131,7 +131,7 @@ static u16 bnx2x_free_tx_pkt(struct bnx2x *bp, struct bnx2x_fastpath *fp,
>  
>  	/* release skb */
>  	WARN_ON(!skb);
> -	dev_kfree_skb(skb);
> +	dev_kfree_skb_any(skb);
>  	tx_buf->first_bd = 0;
>  	tx_buf->skb = NULL;
>  
> @@ -465,7 +465,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
>  		} else {
>  			DP(NETIF_MSG_RX_STATUS, "Failed to allocate new pages"
>  			   " - dropping packet!\n");
> -			dev_kfree_skb(skb);
> +			dev_kfree_skb_any(skb);
>  		}
>  
>  
> diff --git a/drivers/net/bnx2x/bnx2x_cmn.h b/drivers/net/bnx2x/bnx2x_cmn.h
> index fab161e..1a3545b 100644
> --- a/drivers/net/bnx2x/bnx2x_cmn.h
> +++ b/drivers/net/bnx2x/bnx2x_cmn.h
> @@ -840,7 +840,7 @@ static inline int bnx2x_alloc_rx_skb(struct bnx2x *bp,
>  	mapping = dma_map_single(&bp->pdev->dev, skb->data, fp->rx_buf_size,
>  				 DMA_FROM_DEVICE);
>  	if (unlikely(dma_mapping_error(&bp->pdev->dev, mapping))) {
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  		return -ENOMEM;
>  	}
>  


* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-05  0:26     ` David Miller
@ 2014-03-05 19:24       ` Eric W. Biederman
  2014-03-07 19:30         ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-05 19:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 04 Mar 2014 16:03:43 -0800
>
>> So I would like some clear guidance.  Will you accept patches to make
>> it safe to call the napi poll routines from hard irq context, or should
>> we simply defer messages printed with netconsole in hard irq context
>> into another context where we can run the napi code?
>> 
>> If there is not a clear way to fix the problems that crop up we should
>> just delete all of the netpoll code altogether, as it seems deadly in
>> its current form.
>
> Clearly to make netconsole most useful we should synchronously emit
> log messages.
>
> Because what if the system hangs right after this event, but before
> we get back to a "safe" context.
>
> That's one bug that will be a billion times harder to diagnose if
> we defer.

In general I agree.  

The gripping hand for me is kernel/rcu/tree.c:print_cpu_stall() that
generates a warning from irq context on every cpu simultaneously.

Without netpoll I can debug that by just logging into the machine and
dumping dmesg, but with netpoll the machines die when the warning is
generated, because after the first few messages each additional
message generates a new message.

Now that I have looked closer the printk generating a printk problem
seems to be something that is best solved at the printk level.  So if
you will accept the patches I will proceed to shore up the existing
netpoll implementations.

I am thinking pretty seriously about forcing hard irq context during
netconsole's use of netpoll to ensure that the hard irq context case
actually gets tested.  I need to do some audits to see if that would
cause any side effects beyond leaving irqs disabled.

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ba2f5e710af1..aaa9062061c8 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -734,6 +734,7 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
        unsigned long flags;
        struct netconsole_target *nt;
        const char *tmp;
+       bool hard_irq;
 
        if (oops_only && !oops_in_progress)
                return;
@@ -742,6 +743,9 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                return;
 
        spin_lock_irqsave(&target_list_lock, flags);
+       hard_irq = in_irq();
+       if (!hard_irq)
+               irq_enter();
        list_for_each_entry(nt, &target_list, list) {
                netconsole_target_get(nt);
                if (nt->enabled && netif_running(nt->np.dev)) {
@@ -761,6 +765,8 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                }
                netconsole_target_put(nt);
        }
+       if (!hard_irq)
+               irq_exit();
        spin_unlock_irqrestore(&target_list_lock, flags);
 }


Eric

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-05 19:24       ` Eric W. Biederman
@ 2014-03-07 19:30         ` David Miller
  2014-03-08  5:13           ` Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-07 19:30 UTC (permalink / raw)
  To: ebiederm; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Wed, 05 Mar 2014 11:24:33 -0800

> Now that I have looked closer the printk generating a printk problem
> seems to be something that is best solved at the printk level.

I'm not so sure that disallowing printk recursion is necessary.

If you consider an error printk emitted from a device driver's
transmit function during netconsole output, netpoll handles this
transparently already.

Basically, what happens right now in this situation is that netpoll
queues it up when recursion is detected, and delayed work is scheduled
to process such pending packets.
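
The queue-on-recursion behaviour described above can be sketched in
userspace C.  This is a minimal model under stated assumptions, not the
netpoll code: every mock_* name and struct field below is hypothetical
and only stands in for the transmit path, the pending queue, and the
delayed work mentioned in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one netpoll target; not a kernel structure. */
struct mock_netpoll {
	bool in_xmit;            /* set while the driver transmit runs */
	bool work_scheduled;     /* models the scheduled delayed work */
	bool driver_logs_error;  /* driver emits one error printk while sending */
	unsigned int queued;     /* packets parked because of recursion */
	unsigned int sent;       /* packets handed to the driver */
};

static void mock_send(struct mock_netpoll *np);

/* Models the driver's transmit function, which may itself log an error. */
static void mock_driver_xmit(struct mock_netpoll *np)
{
	np->sent++;
	if (np->driver_logs_error) {
		np->driver_logs_error = false;  /* log only once */
		mock_send(np);                  /* error message re-enters the send path */
	}
}

/* Models the send path: transmit directly, or queue when recursion is detected. */
static void mock_send(struct mock_netpoll *np)
{
	assert(np != NULL);
	if (np->in_xmit) {
		np->queued++;
		np->work_scheduled = true;
		return;
	}
	np->in_xmit = true;
	mock_driver_xmit(np);
	np->in_xmit = false;
}

/* Models the delayed work that later flushes the parked packets. */
static void mock_delayed_work(struct mock_netpoll *np)
{
	assert(np != NULL);
	np->work_scheduled = false;
	while (np->queued) {
		np->queued--;
		mock_send(np);
	}
}
```

In this model a send that triggers a driver error ends with one packet
sent and one parked; running mock_delayed_work() afterwards sends the
parked packet from a non-recursive context.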

The only issue at hand is the IRQ context bit.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH] netpoll: Don't call driver methods from interrupt context.
  2014-03-07 19:30         ` David Miller
@ 2014-03-08  5:13           ` Eric W. Biederman
  0 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-08  5:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Wed, 05 Mar 2014 11:24:33 -0800
>
>> Now that I have looked closer the printk generating a printk problem
>> seems to be something that is best solved at the printk level.
>
> I'm not so sure that disallowing printk recursion is necessary.
>
> If you consider an error printk emitted from a device driver's
> transmit function during netconsole output, netpoll handles this
> transparently already.
>
> Basically, what happens right now in this situation is that netpoll
> queues it up when recursion is detected, and delayed work is scheduled
> to process such pending packets.

Except that printk does not recurse into netpoll again.  printk adds the
message to printk's ring buffer, and then the next time through the loop
in console_unlock that message is written out.
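
To illustrate why that loop may never finish, here is a small userspace
sketch.  It is only a model under assumptions: the mock_* names, the
counters, and the iteration cap are all hypothetical, and the real
console_unlock is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cap so the model terminates; the real loop has no such limit here. */
#define MOCK_MAX_ITERATIONS 1000

/* Hypothetical stand-in for the printk ring buffer state. */
struct mock_log {
	unsigned int pending;  /* messages logged but not yet written out */
	unsigned int written;  /* messages handed to the console driver */
};

/*
 * Models a console write.  If the driver warns while writing, the nested
 * printk does not recurse; it only appends one more message to the buffer.
 */
static void mock_console_write(struct mock_log *log, bool driver_warns)
{
	assert(log != NULL);
	log->written++;
	if (driver_warns)
		log->pending++;
}

/*
 * Models the drain loop: write pending messages until the buffer is empty.
 * Returns true if the buffer was drained within the iteration cap.
 */
static bool mock_console_unlock(struct mock_log *log, bool driver_warns)
{
	unsigned int i;

	assert(log != NULL);
	for (i = 0; i < MOCK_MAX_ITERATIONS && log->pending; i++) {
		log->pending--;
		mock_console_write(log, driver_warns);
	}
	return log->pending == 0;
}
```

With a quiet driver the buffer drains after one pass per message.  With
a driver that warns on every write, each message written leaves one new
message behind, so the pending count never drops and the loop only stops
at the artificial cap.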

I have had warnings from printk kill a couple of machines, which is
largely why I am anxious to fix netpoll.  Further I have experimentally
verified that I can still kill a machine that way in the 3.14-rcX.

> The only issue at hand is the IRQ context bit.

That is the only issue that is a networking stack issue, and I am happy to
focus there.  If we don't get printks generating warnings the machine
won't lock up.

I am slowly working my way through reading the code and verifying I
really understand what is going on so I can reasonably say the routines
in the appropriate drivers should be safe in hard irq context.

Hopefully I will have patches in the next couple of days.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts
  2014-03-04 21:08 ` David Miller
  2014-03-05  0:03   ` Eric W. Biederman
  2014-03-05 19:14   ` Eric W. Biederman
@ 2014-03-11  3:16   ` Eric W. Biederman
  2014-03-11  3:18     ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
                       ` (10 more replies)
  2 siblings, 11 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:16 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


This patchset should be an uncontroversial set of changes that replace
dev_kfree_skb with dev_kfree_skb_any in code paths that are called in
hard irq context in addition to other contexts.  netpoll is the reason
this code gets called in multiple contexts.

There is more coming but these changes are a good starting place, and
stand on their own.
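
For readers unfamiliar with the helper, the idea behind dev_kfree_skb_any
can be sketched in userspace C.  This is a model under assumptions, not
the kernel implementation: the mock_* names are hypothetical, a single
flag stands in for the in_irq() || irqs_disabled() test, and a plain
linked list stands in for the queue that is drained later from softirq
context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct mock_skb {
	struct mock_skb *next;
	bool freed;
};

/* Models "in_irq() || irqs_disabled()" for the current context. */
static bool mock_in_hard_irq;

/* Models the list of skbs waiting to be freed from softirq context. */
static struct mock_skb *mock_completion_queue;

/* Models dev_kfree_skb(): frees immediately, not safe in hard irq context. */
static void mock_free_now(struct mock_skb *skb)
{
	assert(!mock_in_hard_irq);
	skb->freed = true;
}

/* Models dev_kfree_skb_irq(): only queues the skb for a later free. */
static void mock_free_deferred(struct mock_skb *skb)
{
	skb->next = mock_completion_queue;
	mock_completion_queue = skb;
}

/* Models dev_kfree_skb_any(): picks the variant that is safe right now. */
static void mock_free_any(struct mock_skb *skb)
{
	if (mock_in_hard_irq)
		mock_free_deferred(skb);
	else
		mock_free_now(skb);
}

/* Models the later softirq pass; returns how many skbs it freed. */
static int mock_drain_completion_queue(void)
{
	int n = 0;

	while (mock_completion_queue) {
		struct mock_skb *skb = mock_completion_queue;

		mock_completion_queue = skb->next;
		skb->next = NULL;
		skb->freed = true;
		n++;
	}
	return n;
}
```

A function that netpoll may call from hard irq context can use the
"any" variant unconditionally: outside hard irq it behaves like an
immediate free, and inside hard irq the free is deferred.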

Eric W. Biederman (11):
      bonding: Call dev_kfree_skby_any instead of kfree_skb.
      bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
      bnx2x: Call dev_kfree_skby_any instead of dev_kfree_skb.
      tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
      bcm63xx_enet: Call dev_kfree_skby_any instead of dev_kfree_skb.
      e1000: Call dev_kfree_skby_any instead of dev_kfree_skb.
      igbvf: Call dev_kfree_skby_any instead of dev_kfree_skb.
      ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
      mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
      benet: Call dev_kfree_skby_any instead of kfree_skb.
      gianfar: Carefully free skbs in functions called by netpoll.

 drivers/net/bonding/bond_3ad.c                  |    2 +-
 drivers/net/bonding/bond_alb.c                  |    2 +-
 drivers/net/bonding/bond_main.c                 |   14 +++++++-------
 drivers/net/ethernet/broadcom/bcm63xx_enet.c    |    4 ++--
 drivers/net/ethernet/broadcom/bnx2.c            |   10 +++++-----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 +-
 drivers/net/ethernet/broadcom/tg3.c             |   14 +++++++-------
 drivers/net/ethernet/emulex/benet/be_main.c     |    2 +-
 drivers/net/ethernet/freescale/gianfar.c        |    6 +++---
 drivers/net/ethernet/intel/e1000/e1000_main.c   |   18 +++++++++---------
 drivers/net/ethernet/intel/igbvf/netdev.c       |    2 +-
 drivers/net/ethernet/intel/ixgb/ixgb_main.c     |    6 +++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c      |    2 +-
 13 files changed, 42 insertions(+), 42 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
@ 2014-03-11  3:18     ` Eric W. Biederman
  2014-03-11  3:44       ` Eric Dumazet
  2014-03-11  3:18     ` [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
                       ` (9 subsequent siblings)
  10 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Replace consume_skb with dev_consume_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/bonding/bond_3ad.c  |    2 +-
 drivers/net/bonding/bond_alb.c  |    2 +-
 drivers/net/bonding/bond_main.c |   14 +++++++-------
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index a2ef3f72de88..dee2a84a2929 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2479,7 +2479,7 @@ out:
 	return NETDEV_TX_OK;
 err_free:
 	/* no suitable interface, frame not sent */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	goto out;
 }
 
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index aaeeacf767f2..9cf836b67b15 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1464,7 +1464,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	}
 
 	/* no suitable interface, frame not sent */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 out:
 	return NETDEV_TX_OK;
 }
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 730d72c706c9..63f8df8af4d4 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1115,7 +1115,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 	if (recv_probe) {
 		ret = recv_probe(skb, bond, slave);
 		if (ret == RX_HANDLER_CONSUMED) {
-			consume_skb(skb);
+			dev_consume_skb_any(skb);
 			return ret;
 		}
 	}
@@ -1132,7 +1132,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 
 		if (unlikely(skb_cow_head(skb,
 					  skb->data - skb_mac_header(skb)))) {
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return RX_HANDLER_CONSUMED;
 		}
 		ether_addr_copy(eth_hdr(skb)->h_dest, bond->dev->dev_addr);
@@ -3548,7 +3548,7 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
 		}
 	}
 	/* no slave that can tx has been found */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 }
 
 /**
@@ -3624,7 +3624,7 @@ static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *bond_d
 	if (slave)
 		bond_dev_queue_xmit(bond, skb, slave->dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -3667,7 +3667,7 @@ static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *bond_dev)
 	if (slave && IS_UP(slave->dev) && slave->link == BOND_LINK_UP)
 		bond_dev_queue_xmit(bond, skb, slave->dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -3754,7 +3754,7 @@ static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev
 		pr_err("%s: Error: Unknown bonding mode %d\n",
 		       dev->name, bond->params.mode);
 		WARN_ON_ONCE(1);
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 }
@@ -3775,7 +3775,7 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (bond_has_slaves(bond))
 		ret = __bond_start_xmit(skb, dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	rcu_read_unlock();
 
 	return ret;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2014-03-11  3:18     ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11  3:18     ` Eric W. Biederman
  2014-03-11  3:47       ` Eric Dumazet
  2014-03-11  3:19     ` [PATCH 03/11] bnx2x: " Eric W. Biederman
                       ` (8 subsequent siblings)
  10 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index ca6b36220d94..c94735de808d 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2885,7 +2885,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 		sw_cons = BNX2_NEXT_TX_BD(sw_cons);
 
 		tx_bytes += skb->len;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		tx_pkt++;
 		if (tx_pkt == budget)
 			break;
@@ -2943,7 +2943,7 @@ bnx2_reuse_rx_skb_pages(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 		__skb_frag_set_page(&shinfo->frags[shinfo->nr_frags], NULL);
 
 		cons_rx_pg->page = page;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	}
 
 	hw_prod = rxr->rx_pg_prod;
@@ -3234,7 +3234,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
 			(ntohs(skb->protocol) != 0x8100)) {
 
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			goto next_rx;
 
 		}
@@ -6604,7 +6604,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	mapping = dma_map_single(&bp->pdev->dev, skb->data, len, PCI_DMA_TODEVICE);
 	if (dma_mapping_error(&bp->pdev->dev, mapping)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -6697,7 +6697,7 @@ dma_error:
 			       PCI_DMA_TODEVICE);
 	}
 
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 03/11] bnx2x: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2014-03-11  3:18     ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
  2014-03-11  3:18     ` [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-11  3:19     ` Eric W. Biederman
  2014-03-11  3:19     ` [PATCH 04/11] tg3: " Eric W. Biederman
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 117b5c7f8ac9..ebe5ae2b961e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3719,7 +3719,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			struct bnx2x_eth_q_stats *q_stats =
 				bnx2x_fp_qstats(bp, txdata->parent_fp);
 			q_stats->driver_filtered_tx_pkt++;
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return NETDEV_TX_OK;
 		}
 		bnx2x_fp_qstats(bp, txdata->parent_fp)->driver_xoff++;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 04/11] tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (2 preceding siblings ...)
  2014-03-11  3:19     ` [PATCH 03/11] bnx2x: " Eric W. Biederman
@ 2014-03-11  3:19     ` Eric W. Biederman
  2014-03-11  3:20     ` [PATCH 05/11] bcm63xx_enet: " Eric W. Biederman
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/tg3.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index e12735fbdcdb..bbbd2a4bc161 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6593,7 +6593,7 @@ static void tg3_tx(struct tg3_napi *tnapi)
 		pkts_compl++;
 		bytes_compl += skb->len;
 
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 		if (unlikely(tx_bug)) {
 			tg3_tx_recover(tp);
@@ -6924,7 +6924,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 
 		if (len > (tp->dev->mtu + ETH_HLEN) &&
 		    skb->protocol != htons(ETH_P_8021Q)) {
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			goto drop_it_no_recycle;
 		}
 
@@ -7807,7 +7807,7 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 					  PCI_DMA_TODEVICE);
 		/* Make sure the mapping succeeded */
 		if (pci_dma_mapping_error(tp->pdev, new_addr)) {
-			dev_kfree_skb(new_skb);
+			dev_kfree_skb_any(new_skb);
 			ret = -1;
 		} else {
 			u32 save_entry = *entry;
@@ -7822,13 +7822,13 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 					    new_skb->len, base_flags,
 					    mss, vlan)) {
 				tg3_tx_skb_unmap(tnapi, save_entry, -1);
-				dev_kfree_skb(new_skb);
+				dev_kfree_skb_any(new_skb);
 				ret = -1;
 			}
 		}
 	}
 
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	*pskb = new_skb;
 	return ret;
 }
@@ -7871,7 +7871,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct sk_buff *skb)
 	} while (segs);
 
 tg3_tso_bug_end:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -8093,7 +8093,7 @@ dma_error:
 	tg3_tx_skb_unmap(tnapi, tnapi->tx_prod, --i);
 	tnapi->tx_buffers[tnapi->tx_prod].skb = NULL;
 drop:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 drop_nofree:
 	tp->tx_dropped++;
 	return NETDEV_TX_OK;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 05/11] bcm63xx_enet: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (3 preceding siblings ...)
  2014-03-11  3:19     ` [PATCH 04/11] tg3: " Eric W. Biederman
@ 2014-03-11  3:20     ` Eric W. Biederman
  2014-03-11  3:21     ` [PATCH 06/11] e1000: " Eric W. Biederman
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bcm63xx_enet.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index b9a5fb6400d3..7cde07fee3c0 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -469,7 +469,7 @@ static int bcm_enet_tx_reclaim(struct net_device *dev, int force)
 		if (desc->len_stat & DMADESC_UNDER_MASK)
 			dev->stats.tx_errors++;
 
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		released++;
 	}
 
@@ -606,7 +606,7 @@ static int bcm_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				ret = NETDEV_TX_BUSY;
 				goto out_unlock;
 			}
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			skb = nskb;
 		}
 		data = skb_put(skb, needed);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 06/11] e1000: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (4 preceding siblings ...)
  2014-03-11  3:20     ` [PATCH 05/11] bcm63xx_enet: " Eric W. Biederman
@ 2014-03-11  3:21     ` Eric W. Biederman
  2014-03-11  3:22     ` [PATCH 07/11] igbvf: " Eric W. Biederman
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 46e6544ed1b7..64036d3ef62b 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -4066,7 +4066,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 				 * too
 				 */
 				if (rx_ring->rx_skb_top)
-					dev_kfree_skb(rx_ring->rx_skb_top);
+					dev_kfree_skb_any(rx_ring->rx_skb_top);
 				rx_ring->rx_skb_top = NULL;
 				goto next_desc;
 			}
@@ -4143,7 +4143,7 @@ process_skb:
 		/* eth type trans needs skb->data to point to something */
 		if (!pskb_may_pull(skb, ETH_HLEN)) {
 			e_err(drv, "pskb_may_pull failed.\n");
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			goto next_desc;
 		}
 
@@ -4394,7 +4394,7 @@ check_page:
 							DMA_FROM_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
 				put_page(buffer_info->page);
-				dev_kfree_skb(skb);
+				dev_kfree_skb_any(skb);
 				buffer_info->page = NULL;
 				buffer_info->skb = NULL;
 				buffer_info->dma = 0;
@@ -4469,21 +4469,21 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
 			/* Failed allocation, critical failure */
 			if (!skb) {
-				dev_kfree_skb(oldskb);
+				dev_kfree_skb_any(oldskb);
 				adapter->alloc_rx_buff_failed++;
 				break;
 			}
 
 			if (!e1000_check_64k_bound(adapter, skb->data, bufsz)) {
 				/* give up */
-				dev_kfree_skb(skb);
-				dev_kfree_skb(oldskb);
+				dev_kfree_skb_any(skb);
+				dev_kfree_skb_any(oldskb);
 				adapter->alloc_rx_buff_failed++;
 				break; /* while !buffer_info->skb */
 			}
 
 			/* Use new allocation */
-			dev_kfree_skb(oldskb);
+			dev_kfree_skb_any(oldskb);
 		}
 		buffer_info->skb = skb;
 		buffer_info->length = adapter->rx_buffer_len;
@@ -4493,7 +4493,7 @@ map_skb:
 						  buffer_info->length,
 						  DMA_FROM_DEVICE);
 		if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			buffer_info->skb = NULL;
 			buffer_info->dma = 0;
 			adapter->alloc_rx_buff_failed++;
@@ -4511,7 +4511,7 @@ map_skb:
 			e_err(rx_err, "dma align check failed: %u bytes at "
 			      "%p\n", adapter->rx_buffer_len,
 			      (void *)(unsigned long)buffer_info->dma);
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			buffer_info->skb = NULL;
 
 			dma_unmap_single(&pdev->dev, buffer_info->dma,
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 07/11] igbvf: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (5 preceding siblings ...)
  2014-03-11  3:21     ` [PATCH 06/11] e1000: " Eric W. Biederman
@ 2014-03-11  3:22     ` Eric W. Biederman
  2014-03-11  3:22     ` [PATCH 08/11] ixgb: " Eric W. Biederman
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/igbvf/netdev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index e2c6d8059b74..a1b480368aee 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -212,7 +212,7 @@ static void igbvf_alloc_rx_buffers(struct igbvf_ring *rx_ring,
 			                                  bufsz,
 							  DMA_FROM_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
-				dev_kfree_skb(buffer_info->skb);
+				dev_kfree_skb_any(buffer_info->skb);
 				buffer_info->skb = NULL;
 				dev_err(&pdev->dev, "RX DMA map failed\n");
 				goto no_buffers;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 08/11] ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (6 preceding siblings ...)
  2014-03-11  3:22     ` [PATCH 07/11] igbvf: " Eric W. Biederman
@ 2014-03-11  3:22     ` Eric W. Biederman
  2014-03-11  3:23     ` [PATCH 09/11] mlx4: " Eric W. Biederman
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/ixgb/ixgb_main.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
index 57e390cbe6d0..f42c201f727f 100644
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1521,12 +1521,12 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	int tso;
 
 	if (test_bit(__IXGB_DOWN, &adapter->flags)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
 	if (skb->len <= 0) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -1543,7 +1543,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 
 	tso = ixgb_tso(adapter, skb);
 	if (tso < 0) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 09/11] mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (7 preceding siblings ...)
  2014-03-11  3:22     ` [PATCH 08/11] ixgb: " Eric W. Biederman
@ 2014-03-11  3:23     ` Eric W. Biederman
  2014-03-11  3:23     ` [PATCH 10/11] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
  2014-03-11  3:24     ` [PATCH 11/11] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 69c2fcef9d4c..dd1f6d346459 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -314,7 +314,7 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return tx_info->nr_txbb;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 10/11] benet: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (8 preceding siblings ...)
  2014-03-11  3:23     ` [PATCH 09/11] mlx4: " Eric W. Biederman
@ 2014-03-11  3:23     ` Eric W. Biederman
  2014-03-11  3:24     ` [PATCH 11/11] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


Replace kfree_skb with dev_kfree_skb_any in be_tx_compl_process, which
can be called in hard irq context by netpoll, in softirq context by normal
napi polling, and in normal sleepable context by the network device close
method.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 6e10230a2ee0..2eee0b2577f8 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1897,7 +1897,7 @@ static u16 be_tx_compl_process(struct be_adapter *adapter,
 		queue_tail_inc(txq);
 	} while (cur_index != last_index);
 
-	kfree_skb(sent_skb);
+	dev_kfree_skb_any(sent_skb);
 	return num_wrbs;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 11/11] gianfar: Carefully free skbs in functions called by netpoll.
  2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                       ` (9 preceding siblings ...)
  2014-03-11  3:23     ` [PATCH 10/11] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11  3:24     ` Eric W. Biederman
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  3:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, xiyou.wangcong, mpm, satyam.sharma


netpoll can call functions in hard irq context that are ordinarily
called in lesser contexts.  For those functions use dev_kfree_skb_any
and dev_consume_skb_any so skbs are freed safely from hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/freescale/gianfar.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index c5b9320f7629..9ab00ba6b580 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2146,13 +2146,13 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		skb_new = skb_realloc_headroom(skb, fcb_len);
 		if (!skb_new) {
 			dev->stats.tx_errors++;
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return NETDEV_TX_OK;
 		}
 
 		if (skb->sk)
 			skb_set_owner_w(skb_new, skb->sk);
-		consume_skb(skb);
+		dev_consume_skb_any(skb);
 		skb = skb_new;
 	}
 
@@ -2744,7 +2744,7 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 			if (unlikely(!newskb))
 				newskb = skb;
 			else if (skb)
-				dev_kfree_skb(skb);
+				dev_kfree_skb_any(skb);
 		} else {
 			/* Increment the number of packets */
 			rx_queue->stats.rx_packets++;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  3:18     ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11  3:44       ` Eric Dumazet
  2014-03-11  4:00         ` Eric W. Biederman
  2014-03-11  4:42         ` David Miller
  0 siblings, 2 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11  3:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
> Replace kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Replace consume_skb with dev_consume_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/bonding/bond_3ad.c  |    2 +-
>  drivers/net/bonding/bond_alb.c  |    2 +-
>  drivers/net/bonding/bond_main.c |   14 +++++++-------
>  3 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index a2ef3f72de88..dee2a84a2929 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -2479,7 +2479,7 @@ out:
>  	return NETDEV_TX_OK;
>  err_free:
>  	/* no suitable interface, frame not sent */
> -	kfree_skb(skb);
> +	dev_kfree_skb_any(skb);
>  	goto out;
>  }
>  
> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
> index aaeeacf767f2..9cf836b67b15 100644
> --- a/drivers/net/bonding/bond_alb.c
> +++ b/drivers/net/bonding/bond_alb.c
> @@ -1464,7 +1464,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
>  	}
>  
>  	/* no suitable interface, frame not sent */
> -	kfree_skb(skb);
> +	dev_kfree_skb_any(skb);
>  out:
>  	return NETDEV_TX_OK;
>  }
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 730d72c706c9..63f8df8af4d4 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1115,7 +1115,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>  	if (recv_probe) {
>  		ret = recv_probe(skb, bond, slave);
>  		if (ret == RX_HANDLER_CONSUMED) {
> -			consume_skb(skb);
> +			dev_consume_skb_any(skb);

Why is this needed ? AFAIK we run in softirq here.

>  			return ret;
>  		}
>  	}
> @@ -1132,7 +1132,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>  
>  		if (unlikely(skb_cow_head(skb,
>  					  skb->data - skb_mac_header(skb)))) {
> -			kfree_skb(skb);
> +			dev_kfree_skb_any(skb);

same here.

>  			return RX_HANDLER_CONSUMED;
>  		}
>  		ether_addr_copy(eth_hdr(skb)->h_dest, bond->dev->dev_addr);
> @@ -3548,7 +3548,7 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
>  		}

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:18     ` [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-11  3:47       ` Eric Dumazet
  2014-03-11  4:10         ` Eric W. Biederman
  2014-03-11  4:43         ` David Miller
  0 siblings, 2 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11  3:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2.c |   10 +++++-----
>  1 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
> index ca6b36220d94..c94735de808d 100644
> --- a/drivers/net/ethernet/broadcom/bnx2.c
> +++ b/drivers/net/ethernet/broadcom/bnx2.c
> @@ -2885,7 +2885,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
>  		sw_cons = BNX2_NEXT_TX_BD(sw_cons);
>  
>  		tx_bytes += skb->len;
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);

This looks like a dev_consume_skb_any() candidate ?

Anyway, why can this be called from hard irq ?

I'll stop my review here, it seems either me or you are confused/tired.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  3:44       ` Eric Dumazet
@ 2014-03-11  4:00         ` Eric W. Biederman
  2014-03-11  4:56           ` Eric Dumazet
  2014-03-11  4:42         ` David Miller
  1 sibling, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  4:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
>> Replace kfree_skb with dev_kfree_skb_any in functions that can
>> be called in hard irq and other contexts.
>> 
>> Replace consume_skb with dev_consume_skb_any in functions that can
>> be called in hard irq and other contexts.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  drivers/net/bonding/bond_3ad.c  |    2 +-
>>  drivers/net/bonding/bond_alb.c  |    2 +-
>>  drivers/net/bonding/bond_main.c |   14 +++++++-------
>>  3 files changed, 9 insertions(+), 9 deletions(-)
>> 
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index a2ef3f72de88..dee2a84a2929 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -2479,7 +2479,7 @@ out:
>>  	return NETDEV_TX_OK;
>>  err_free:
>>  	/* no suitable interface, frame not sent */
>> -	kfree_skb(skb);
>> +	dev_kfree_skb_any(skb);
>>  	goto out;
>>  }
>>  
>> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>> index aaeeacf767f2..9cf836b67b15 100644
>> --- a/drivers/net/bonding/bond_alb.c
>> +++ b/drivers/net/bonding/bond_alb.c
>> @@ -1464,7 +1464,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
>>  	}
>>  
>>  	/* no suitable interface, frame not sent */
>> -	kfree_skb(skb);
>> +	dev_kfree_skb_any(skb);
>>  out:
>>  	return NETDEV_TX_OK;
>>  }
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 730d72c706c9..63f8df8af4d4 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -1115,7 +1115,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>>  	if (recv_probe) {
>>  		ret = recv_probe(skb, bond, slave);
>>  		if (ret == RX_HANDLER_CONSUMED) {
>> -			consume_skb(skb);
>> +			dev_consume_skb_any(skb);
>
> Why is this needed ? AFAIK we run in softirq here.

Except when we call printk in hard irq context.  Then we can easily have
a call trace like:

drivers/net/netconsole.c:write_msg
  netpoll_send_udp
    netpoll_send_skb_on_dev
      netpoll_poll_dev
        poll_napi
           poll_one_napi
         -------------------------
             tg3_poll
                tg3_poll_work   -- Or any other driver supporting netpoll
                   tg3_rx
         -------------------------
                     napi_gro_receive
                       napi_skb_finish
                         netif_receive_skb_internal
                           __netif_receive_skb
                             __netif_receive_skb_core
                                bond_handle_frame


>>  			return ret;
>>  		}
>>  	}
>> @@ -1132,7 +1132,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>>  
>>  		if (unlikely(skb_cow_head(skb,
>>  					  skb->data - skb_mac_header(skb)))) {
>> -			kfree_skb(skb);
>> +			dev_kfree_skb_any(skb);
>
> same here.

Same reason as above.

>>  			return RX_HANDLER_CONSUMED;
>>  		}
>>  		ether_addr_copy(eth_hdr(skb)->h_dest, bond->dev->dev_addr);
>> @@ -3548,7 +3548,7 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
>>  		}

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:47       ` Eric Dumazet
@ 2014-03-11  4:10         ` Eric W. Biederman
  2014-03-11  4:43         ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  4:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
>> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
>> be called in hard irq and other contexts.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  drivers/net/ethernet/broadcom/bnx2.c |   10 +++++-----
>>  1 files changed, 5 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
>> index ca6b36220d94..c94735de808d 100644
>> --- a/drivers/net/ethernet/broadcom/bnx2.c
>> +++ b/drivers/net/ethernet/broadcom/bnx2.c
>> @@ -2885,7 +2885,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
>>  		sw_cons = BNX2_NEXT_TX_BD(sw_cons);
>>  
>>  		tx_bytes += skb->len;
>> -		dev_kfree_skb(skb);
>> +		dev_kfree_skb_any(skb);
>
> This looks like a dev_consume_skb_any() candidate ?

That seems reasonable.  I am focusing on one dimension at a time.

> Anyway, why can this be called from hard irq ?

netpoll_poll_dev
   bnx2_poll
     bnx2_poll_work
       bnx2_tx_int

> I'll stop my review here, it seems either me or you are confused/tired.

I did my best to verify the code paths I am changing actually exist.  I
think I even have a stack backtrace from skb_release_head_state around
somewhere.  Transmitted packets frequently have dst cache entries,
conntrack entries, and destructors which make them actually problematic
to free in hard irq context.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  3:44       ` Eric Dumazet
  2014-03-11  4:00         ` Eric W. Biederman
@ 2014-03-11  4:42         ` David Miller
  2014-03-11  5:02           ` Eric Dumazet
  2014-03-11  5:31           ` Eric W. Biederman
  1 sibling, 2 replies; 288+ messages in thread
From: David Miller @ 2014-03-11  4:42 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 10 Mar 2014 20:44:27 -0700

> On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
>> -			consume_skb(skb);
>> +			dev_consume_skb_any(skb);
> 
> Why is this needed ? AFAIK we run in softirq here.
> 
>>  			return ret;
>>  		}
>>  	}
>> @@ -1132,7 +1132,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>>  
>>  		if (unlikely(skb_cow_head(skb,
>>  					  skb->data - skb_mac_header(skb)))) {
>> -			kfree_skb(skb);
>> +			dev_kfree_skb_any(skb);
> 
> same here.

These changes emanate from a recent discussion about netpoll, which can
call into the driver from hardware interrupts, particularly when netconsole
services a printk from hardware interrupt context.

bnx2x already makes similar amends.

I hope that Eric B. here audited to make sure he's only doing this
transformation in situations that actually need this treatment for
the above mentioned issue.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11  3:47       ` Eric Dumazet
  2014-03-11  4:10         ` Eric W. Biederman
@ 2014-03-11  4:43         ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-11  4:43 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 10 Mar 2014 20:47:57 -0700

> Anyway, why can this be called from hard irq ?

Eric B. explains this in his 0/11 posting, the culprit is netpoll.

I had a discussion about this with him on this list last week.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  4:00         ` Eric W. Biederman
@ 2014-03-11  4:56           ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11  4:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-10 at 21:00 -0700, Eric W. Biederman wrote:
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
> > On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
> >> Replace kfree_skb with dev_kfree_skb_any in functions that can
> >> be called in hard irq and other contexts.
> >> 
> >> Replace consume_skb with dev_consume_skb_any in functions that can
> >> be called in hard irq and other contexts.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >> ---
> >>  drivers/net/bonding/bond_3ad.c  |    2 +-
> >>  drivers/net/bonding/bond_alb.c  |    2 +-
> >>  drivers/net/bonding/bond_main.c |   14 +++++++-------
> >>  3 files changed, 9 insertions(+), 9 deletions(-)
> >> 
> >> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >> index a2ef3f72de88..dee2a84a2929 100644
> >> --- a/drivers/net/bonding/bond_3ad.c
> >> +++ b/drivers/net/bonding/bond_3ad.c
> >> @@ -2479,7 +2479,7 @@ out:
> >>  	return NETDEV_TX_OK;
> >>  err_free:
> >>  	/* no suitable interface, frame not sent */
> >> -	kfree_skb(skb);
> >> +	dev_kfree_skb_any(skb);
> >>  	goto out;
> >>  }
> >>  
> >> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
> >> index aaeeacf767f2..9cf836b67b15 100644
> >> --- a/drivers/net/bonding/bond_alb.c
> >> +++ b/drivers/net/bonding/bond_alb.c
> >> @@ -1464,7 +1464,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
> >>  	}
> >>  
> >>  	/* no suitable interface, frame not sent */
> >> -	kfree_skb(skb);
> >> +	dev_kfree_skb_any(skb);
> >>  out:
> >>  	return NETDEV_TX_OK;
> >>  }
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index 730d72c706c9..63f8df8af4d4 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -1115,7 +1115,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
> >>  	if (recv_probe) {
> >>  		ret = recv_probe(skb, bond, slave);
> >>  		if (ret == RX_HANDLER_CONSUMED) {
> >> -			consume_skb(skb);
> >> +			dev_consume_skb_any(skb);
> >
> > Why is this needed ? AFAIK we run in softirq here.
> 
> Except when we call printk in hard irq context.  Then we can easily have
> a call trace like:
> 
> drivers/net/netconsole.c:write_msg
>   netpoll_send_udp
>     netpoll_send_skb_on_dev
>       netpoll_poll_dev
>         poll_napi
>            poll_one_napi
>          -------------------------
>              tg3_poll
>                 tg3_poll_work   -- Or any other driver supporting netpoll
>                    tg3_rx
>          -------------------------
>                      napi_gro_receive
>                        napi_skb_finish
>                          netif_receive_skb_internal
>                            __netif_receive_skb
>                              __netif_receive_skb_core
>                                 bond_handle_frame

But RX path (napi_gro_receive) is not supposed to be called from hard
irq.

Many things will horribly break.

Sorry, I must be very tired.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  4:42         ` David Miller
@ 2014-03-11  5:02           ` Eric Dumazet
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
  2014-03-11 16:39             ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb David Miller
  2014-03-11  5:31           ` Eric W. Biederman
  1 sibling, 2 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11  5:02 UTC (permalink / raw)
  To: David Miller; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 00:42 -0400, David Miller wrote:

> These changes emanate from a recent discussion about netpoll, which can
> call into the driver from hardware interrupts, particularly when netconsole
> services a printk from hardware interrupt context.
> 
> bnx2x already makes similar amends.
> 
> I hope that Eric B. here audited to make sure he's only doing this
> transformation in situations that actually need this treatment for
> the above mentioned issue.

I totally understand the TX path, not the RX.

It seems netpoll should not drain rx queues.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  4:42         ` David Miller
  2014-03-11  5:02           ` Eric Dumazet
@ 2014-03-11  5:31           ` Eric W. Biederman
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  5:31 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 10 Mar 2014 20:44:27 -0700
>
>> On Mon, 2014-03-10 at 20:18 -0700, Eric W. Biederman wrote:
>>> -			consume_skb(skb);
>>> +			dev_consume_skb_any(skb);
>> 
>> Why is this needed ? AFAIK we run in softirq here.
>> 
>>>  			return ret;
>>>  		}
>>>  	}
>>> @@ -1132,7 +1132,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
>>>  
>>>  		if (unlikely(skb_cow_head(skb,
>>>  					  skb->data - skb_mac_header(skb)))) {
>>> -			kfree_skb(skb);
>>> +			dev_kfree_skb_any(skb);
>> 
>> same here.
>
> These changes emanate from a recent discussion about netpoll, which can
> call into the driver from hardware interrupts, particularly when netconsole
> services a printk from hardware interrupt context.
>
> bnx2x already makes similar amends.
>
> I hope that Eric B. here audited to make sure he's only doing this
> transformation in situations that actually need this treatment for
> the above mentioned issue.

It looks like our replies crossed.  Yes, every location that I changed is
a location that I can currently find a code path to from a printk in a
hard irq context.

Furthermore the difference between dev_kfree_skb/kfree_skb and
dev_kfree_skb_any is simply a test to see if you are in interrupt
context so there should be no functional differences except
when we make it to these code paths in interrupt context.
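
To make the point about the context test concrete, here is a minimal
user-space sketch of the idea.  It is not the kernel implementation:
every name below (model_skb, model_in_hardirq, model_free_any, and so
on) is hypothetical, the flag stands in for the kernel's interrupt
context check, and the list stands in for a queue that is drained
later from a safer context.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct model_skb {
	struct model_skb *next;
	bool freed;
};

/* Assumption: models "are we in hard irq context / are irqs disabled". */
static bool model_in_hardirq;

/* Assumption: models a list of skbs whose release has been deferred. */
static struct model_skb *model_completion_queue;

/* Models the ordinary free path, which may run destructors. */
static void model_free_now(struct model_skb *skb)
{
	skb->freed = true;
}

/* Models the "_any" variant: the only extra work is the context test. */
static void model_free_any(struct model_skb *skb)
{
	if (model_in_hardirq) {
		/* Unsafe to run destructors here, so defer the release. */
		skb->next = model_completion_queue;
		model_completion_queue = skb;
	} else {
		/* Outside interrupt context this is the ordinary free. */
		model_free_now(skb);
	}
}

/* Models the later drain of deferred skbs; returns how many were freed. */
static size_t model_drain_completion_queue(void)
{
	size_t n = 0;

	while (model_completion_queue) {
		struct model_skb *skb = model_completion_queue;

		model_completion_queue = skb->next;
		skb->next = NULL;
		model_free_now(skb);
		n++;
	}
	return n;
}
```

In this model, callers that never run in interrupt context see exactly
the ordinary free, which is the "no functional differences" property
described above.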

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [RFC PATCH 0/2] remove netpoll rx support
  2014-03-11  5:02           ` Eric Dumazet
@ 2014-03-11  8:43             ` Eric W. Biederman
  2014-03-11  8:44               ` [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code Eric W. Biederman
                                 ` (3 more replies)
  2014-03-11 16:39             ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb David Miller
  1 sibling, 4 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  8:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Tue, 2014-03-11 at 00:42 -0400, David Miller wrote:
>
>> These changes emanate from a recent discussion about netpoll, which can
>> call into the driver from hardware interrupts, particularly when netconsole
>> services a printk from hardware interrupt context.
>> 
>> bnx2x already makes similar amends.
>> 
>> I hope that Eric B. here audited to make sure he's only doing this
>> transformation in situations that actually need this treatment for
>> the above mentioned issue.
>
> I totally understand the TX path, not the RX.
>
> It seems netpoll should not drain rx queues.

It does seem desirable that netpoll should not drain the rx queues.
Unfortunately that is not how netpoll is built.  By my quick count
there are 132 drivers in the kernel that support netpoll.

Several of them such as the e1000e driver already call dev_kfree_skb_any
or dev_kfree_skb_irq in their rx paths.  What I am implementing seems to
be the pattern that the better drivers follow today.

Furthermore netpoll by its design depends on the ability to receive
packets in netpoll_poll_dev.  It is a capability I don't think we have
ever used in the mainline kernel but it is a capability that is there
deliberately.  Which means if we want netpoll to not mess with the rx
path we need to change netpoll.


If we are willing to change the definition of netpoll this is fixable.

The big enabler is the fact that calling the napi poll function with a
budget of 0 means don't perform any rx work.

Which leads to the following set of changes to netpoll if we are brave.

Eric W. Biederman (2):
      netpoll:  Remove dead netpoll_rx code
      netpoll: Don't poll for received packets

 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   59 ------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  499 +--------------------------------------------
 5 files changed, 10 insertions(+), 581 deletions(-)
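
As a rough illustration of the budget-of-0 convention mentioned above,
here is a small user-space model of a poll function.  It is only a
sketch under stated assumptions: model_ring and model_napi_poll are
hypothetical names, and real drivers differ in how they reclaim tx
descriptors and account for rx work.

```c
/* Hypothetical device state; not taken from any real driver. */
struct model_ring {
	int tx_done_pending;	/* transmitted packets awaiting cleanup */
	int rx_pending;		/* received packets awaiting processing */
};

/*
 * Models a NAPI-style poll callback.  Returns the amount of rx work
 * done, which never exceeds the budget.
 */
static int model_napi_poll(struct model_ring *ring, int budget)
{
	int work = 0;

	/* Assumption: tx completions are reclaimed regardless of budget. */
	ring->tx_done_pending = 0;

	/* rx work is bounded by the budget, so a budget of 0 does none. */
	while (work < budget && ring->rx_pending > 0) {
		ring->rx_pending--;
		work++;
	}
	return work;
}
```

Calling model_napi_poll(&ring, 0) in this model cleans up the tx side
and leaves rx_pending untouched, which is the behaviour the second
patch relies on.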

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [RFC PATCH 1/2] netpoll:  Remove dead netpoll_rx code
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
@ 2014-03-11  8:44               ` Eric W. Biederman
  2014-03-11 12:29                 ` Eric Dumazet
  2014-03-11  8:45               ` [RFC PATCH 2/2] netpoll: Don't poll for received packets Eric W. Biederman
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  8:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll_rx code only becomes active if the netpoll rx_skb_hook is
implemented.  There is not a single implementation of the netpoll
rx_skb_hook in the kernel.

There are problems with the netpoll packet receive code. Most
specifically every packet that makes it to netpoll_neigh_reply is
leaked.

Given that the netpoll packet receive code is buggy and has not been used
for a decade let's just remove it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   59 ------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  471 ---------------------------------------------
 5 files changed, 1 insertions(+), 562 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 494b888a6568..89402c3b64f8 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -177,11 +177,6 @@ config NETCONSOLE_DYNAMIC
 config NETPOLL
 	def_bool NETCONSOLE
 
-config NETPOLL_TRAP
-	bool "Netpoll traffic trapping"
-	default n
-	depends on NETPOLL
-
 config NET_POLL_CONTROLLER
 	def_bool NETPOLL
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1a869488b8ae..cd345f102926 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1979,9 +1979,6 @@ struct net_device *__dev_get_by_index(struct net *net, int ifindex);
 struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
 int netdev_get_name(struct net *net, char *name, int ifindex);
 int dev_restart(struct net_device *dev);
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void);
-#endif
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb);
 
 static inline unsigned int skb_gro_offset(const struct sk_buff *skb)
@@ -2186,12 +2183,6 @@ static inline void netif_tx_start_all_queues(struct net_device *dev)
 
 static inline void netif_tx_wake_queue(struct netdev_queue *dev_queue)
 {
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap()) {
-		netif_tx_start_queue(dev_queue);
-		return;
-	}
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state))
 		__netif_schedule(dev_queue->qdisc);
 }
@@ -2435,10 +2426,6 @@ static inline void netif_start_subqueue(struct net_device *dev, u16 queue_index)
 static inline void netif_stop_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	netif_tx_stop_queue(txq);
 }
 
@@ -2473,10 +2460,6 @@ static inline bool netif_subqueue_stopped(const struct net_device *dev,
 static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state))
 		__netif_schedule(txq->qdisc);
 }
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index fbfdb9d8d3a7..6c14f0d89bcb 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,27 +24,20 @@ struct netpoll {
 	struct net_device *dev;
 	char dev_name[IFNAMSIZ];
 	const char *name;
-	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
-			    int offset, int len);
 
 	union inet_addr local_ip, remote_ip;
 	bool ipv6;
 	u16 local_port, remote_port;
 	u8 remote_mac[ETH_ALEN];
 
-	struct list_head rx; /* rx_np list element */
 	struct work_struct cleanup_work;
 };
 
 struct netpoll_info {
 	atomic_t refcnt;
 
-	unsigned long rx_flags;
-	spinlock_t rx_lock;
 	struct semaphore dev_lock;
-	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
 
-	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
 	struct sk_buff_head txq;
 
 	struct delayed_work tx_work;
@@ -66,12 +59,9 @@ void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
 int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp);
 int netpoll_setup(struct netpoll *np);
-int netpoll_trap(void);
-void netpoll_set_trap(int trap);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
 void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 			     struct net_device *dev);
 static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
@@ -83,44 +73,7 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 }
 
 
-
 #ifdef CONFIG_NETPOLL
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
-
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
-}
-
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo;
-	unsigned long flags;
-	bool ret = false;
-
-	local_irq_save(flags);
-
-	if (!netpoll_rx_on(skb))
-		goto out;
-
-	npinfo = rcu_dereference_bh(skb->dev->npinfo);
-	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
-		ret = true;
-	spin_unlock(&npinfo->rx_lock);
-
-out:
-	local_irq_restore(flags);
-	return ret;
-}
-
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	if (!list_empty(&skb->dev->napi_list))
-		return netpoll_rx(skb);
-	return 0;
-}
 
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
@@ -150,18 +103,6 @@ static inline bool netpoll_tx_running(struct net_device *dev)
 }
 
 #else
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	return false;
-}
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	return false;
-}
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	return 0;
-}
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
 	return NULL;
diff --git a/net/core/dev.c b/net/core/dev.c
index b1b0c8d4d7df..3565c898a910 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3231,10 +3231,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	/* if netpoll wants it, pretend we never saw it */
-	if (netpoll_rx(skb))
-		return NET_RX_DROP;
-
 	net_timestamp_check(netdev_tstamp_prequeue, skb);
 
 	trace_netif_rx(skb);
@@ -3520,10 +3516,6 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 
 	trace_netif_receive_skb(skb);
 
-	/* if we've gotten here through NAPI, check netpoll */
-	if (netpoll_receive_skb(skb))
-		goto out;
-
 	orig_dev = skb->dev;
 
 	skb_reset_network_header(skb);
@@ -3650,7 +3642,6 @@ drop:
 
 unlock:
 	rcu_read_unlock();
-out:
 	return ret;
 }
 
@@ -3875,7 +3866,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	int same_flow;
 	enum gro_result ret;
 
-	if (!(skb->dev->features & NETIF_F_GRO) || netpoll_rx_on(skb))
+	if (!(skb->dev->features & NETIF_F_GRO))
 		goto normal;
 
 	if (skb_is_gso(skb) || skb_has_frag_list(skb))
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a664f7829a6d..e883eff6799e 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -46,13 +46,9 @@
 
 static struct sk_buff_head skb_pool;
 
-static atomic_t trapped;
-
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -61,7 +57,6 @@ DEFINE_STATIC_SRCU(netpoll_srcu);
 	 MAX_UDP_CHUNK)
 
 static void zap_completion_queue(void);
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
 static void netpoll_async_cleanup(struct work_struct *work);
 
 static unsigned int carrier_timeout = 4;
@@ -109,25 +104,6 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
-static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
-			    unsigned short ulen, __be32 saddr, __be32 daddr)
-{
-	__wsum psum;
-
-	if (uh->check == 0 || skb_csum_unnecessary(skb))
-		return 0;
-
-	psum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0);
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE &&
-	    !csum_fold(csum_add(psum, skb->csum)))
-		return 0;
-
-	skb->csum = psum;
-
-	return __skb_checksum_complete(skb);
-}
-
 /*
  * Check whether delayed processing was scheduled for our NIC. If so,
  * we attempt to grab the poll lock and use ->poll() to pump the card.
@@ -156,16 +132,12 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
 		return budget;
 
-	npinfo->rx_flags |= NETPOLL_RX_DROP;
-	atomic_inc(&trapped);
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
 	work = napi->poll(napi, budget);
 	trace_napi_poll(napi);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
-	atomic_dec(&trapped);
-	npinfo->rx_flags &= ~NETPOLL_RX_DROP;
 
 	return budget - work;
 }
@@ -188,16 +160,6 @@ static void poll_napi(struct net_device *dev)
 	}
 }
 
-static void service_neigh_queue(struct netpoll_info *npi)
-{
-	if (npi) {
-		struct sk_buff *skb;
-
-		while ((skb = skb_dequeue(&npi->neigh_tx)))
-			netpoll_neigh_reply(skb, npi);
-	}
-}
-
 static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
@@ -228,23 +190,6 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	up(&ni->dev_lock);
 
-	if (dev->flags & IFF_SLAVE) {
-		if (ni) {
-			struct net_device *bond_dev;
-			struct sk_buff *skb;
-			struct netpoll_info *bond_ni;
-
-			bond_dev = netdev_master_upper_dev_get_rcu(dev);
-			bond_ni = rcu_dereference_bh(bond_dev->npinfo);
-			while ((skb = skb_dequeue(&ni->neigh_tx))) {
-				skb->dev = bond_dev;
-				skb_queue_tail(&bond_ni->neigh_tx, skb);
-			}
-		}
-	}
-
-	service_neigh_queue(ni);
-
 	zap_completion_queue();
 }
 
@@ -529,384 +474,6 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 }
 EXPORT_SYMBOL(netpoll_send_udp);
 
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int size, type = ARPOP_REPLY;
-	__be32 sip, tip;
-	unsigned char *sha;
-	struct sk_buff *send_skb;
-	struct netpoll *np, *tmp;
-	unsigned long flags;
-	int hlen, tlen;
-	int hits = 0, proto;
-
-	if (list_empty(&npinfo->rx_np))
-		return;
-
-	/* Before checking the packet, we do some early
-	   inspection whether this is interesting at all */
-	spin_lock_irqsave(&npinfo->rx_lock, flags);
-	list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-		if (np->dev == skb->dev)
-			hits++;
-	}
-	spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-
-	/* No netpoll struct is using this dev */
-	if (!hits)
-		return;
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto == ETH_P_ARP) {
-		struct arphdr *arp;
-		unsigned char *arp_ptr;
-		/* No arp on this interface */
-		if (skb->dev->flags & IFF_NOARP)
-			return;
-
-		if (!pskb_may_pull(skb, arp_hdr_len(skb->dev)))
-			return;
-
-		skb_reset_network_header(skb);
-		skb_reset_transport_header(skb);
-		arp = arp_hdr(skb);
-
-		if ((arp->ar_hrd != htons(ARPHRD_ETHER) &&
-		     arp->ar_hrd != htons(ARPHRD_IEEE802)) ||
-		    arp->ar_pro != htons(ETH_P_IP) ||
-		    arp->ar_op != htons(ARPOP_REQUEST))
-			return;
-
-		arp_ptr = (unsigned char *)(arp+1);
-		/* save the location of the src hw addr */
-		sha = arp_ptr;
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&sip, arp_ptr, 4);
-		arp_ptr += 4;
-		/* If we actually cared about dst hw addr,
-		   it would get copied here */
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&tip, arp_ptr, 4);
-
-		/* Should we ignore arp? */
-		if (ipv4_is_loopback(tip) || ipv4_is_multicast(tip))
-			return;
-
-		size = arp_hdr_len(skb->dev);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (tip != np->local_ip.ip)
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			skb_reset_network_header(send_skb);
-			arp = (struct arphdr *) skb_put(send_skb, size);
-			send_skb->dev = skb->dev;
-			send_skb->protocol = htons(ETH_P_ARP);
-
-			/* Fill the device header for the ARP frame */
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_ARP,
-					    sha, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			/*
-			 * Fill out the arp protocol part.
-			 *
-			 * we only support ethernet device type,
-			 * which (according to RFC 1390) should
-			 * always equal 1 (Ethernet).
-			 */
-
-			arp->ar_hrd = htons(np->dev->type);
-			arp->ar_pro = htons(ETH_P_IP);
-			arp->ar_hln = np->dev->addr_len;
-			arp->ar_pln = 4;
-			arp->ar_op = htons(type);
-
-			arp_ptr = (unsigned char *)(arp + 1);
-			memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &tip, 4);
-			arp_ptr += 4;
-			memcpy(arp_ptr, sha, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &sip, 4);
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	} else if( proto == ETH_P_IPV6) {
-#if IS_ENABLED(CONFIG_IPV6)
-		struct nd_msg *msg;
-		u8 *lladdr = NULL;
-		struct ipv6hdr *hdr;
-		struct icmp6hdr *icmp6h;
-		const struct in6_addr *saddr;
-		const struct in6_addr *daddr;
-		struct inet6_dev *in6_dev = NULL;
-		struct in6_addr *target;
-
-		in6_dev = in6_dev_get(skb->dev);
-		if (!in6_dev || !in6_dev->cnf.accept_ra)
-			return;
-
-		if (!pskb_may_pull(skb, skb->len))
-			return;
-
-		msg = (struct nd_msg *)skb_transport_header(skb);
-
-		__skb_push(skb, skb->data - skb_transport_header(skb));
-
-		if (ipv6_hdr(skb)->hop_limit != 255)
-			return;
-		if (msg->icmph.icmp6_code != 0)
-			return;
-		if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-			return;
-
-		saddr = &ipv6_hdr(skb)->saddr;
-		daddr = &ipv6_hdr(skb)->daddr;
-
-		size = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(daddr, &np->local_ip.in6))
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			send_skb->protocol = htons(ETH_P_IPV6);
-			send_skb->dev = skb->dev;
-
-			skb_reset_network_header(send_skb);
-			hdr = (struct ipv6hdr *) skb_put(send_skb, sizeof(struct ipv6hdr));
-			*(__be32*)hdr = htonl(0x60000000);
-			hdr->payload_len = htons(size);
-			hdr->nexthdr = IPPROTO_ICMPV6;
-			hdr->hop_limit = 255;
-			hdr->saddr = *saddr;
-			hdr->daddr = *daddr;
-
-			icmp6h = (struct icmp6hdr *) skb_put(send_skb, sizeof(struct icmp6hdr));
-			icmp6h->icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT;
-			icmp6h->icmp6_router = 0;
-			icmp6h->icmp6_solicited = 1;
-
-			target = (struct in6_addr *) skb_put(send_skb, sizeof(struct in6_addr));
-			*target = msg->target;
-			icmp6h->icmp6_cksum = csum_ipv6_magic(saddr, daddr, size,
-							      IPPROTO_ICMPV6,
-							      csum_partial(icmp6h,
-									   size, 0));
-
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_IPV6,
-					    lladdr, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address, we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-#endif
-	}
-}
-
-static bool pkt_is_ns(struct sk_buff *skb)
-{
-	struct nd_msg *msg;
-	struct ipv6hdr *hdr;
-
-	if (skb->protocol != htons(ETH_P_ARP))
-		return false;
-	if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + sizeof(struct nd_msg)))
-		return false;
-
-	msg = (struct nd_msg *)skb_transport_header(skb);
-	__skb_push(skb, skb->data - skb_transport_header(skb));
-	hdr = ipv6_hdr(skb);
-
-	if (hdr->nexthdr != IPPROTO_ICMPV6)
-		return false;
-	if (hdr->hop_limit != 255)
-		return false;
-	if (msg->icmph.icmp6_code != 0)
-		return false;
-	if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-		return false;
-
-	return true;
-}
-
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int proto, len, ulen, data_len;
-	int hits = 0, offset;
-	const struct iphdr *iph;
-	struct udphdr *uh;
-	struct netpoll *np, *tmp;
-	uint16_t source;
-
-	if (list_empty(&npinfo->rx_np))
-		goto out;
-
-	if (skb->dev->type != ARPHRD_ETHER)
-		goto out;
-
-	/* check if netpoll clients need ARP */
-	if (skb->protocol == htons(ETH_P_ARP) && atomic_read(&trapped)) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	} else if (pkt_is_ns(skb) && atomic_read(&trapped)) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	}
-
-	if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) {
-		skb = vlan_untag(skb);
-		if (unlikely(!skb))
-			goto out;
-	}
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto != ETH_P_IP && proto != ETH_P_IPV6)
-		goto out;
-	if (skb->pkt_type == PACKET_OTHERHOST)
-		goto out;
-	if (skb_shared(skb))
-		goto out;
-
-	if (proto == ETH_P_IP) {
-		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (iph->ihl < 5 || iph->version != 4)
-			goto out;
-		if (!pskb_may_pull(skb, iph->ihl*4))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (ip_fast_csum((u8 *)iph, iph->ihl) != 0)
-			goto out;
-
-		len = ntohs(iph->tot_len);
-		if (skb->len < len || len < iph->ihl*4)
-			goto out;
-
-		/*
-		 * Our transport medium may have padded the buffer out.
-		 * Now We trim to the true length of the frame.
-		 */
-		if (pskb_trim_rcsum(skb, len))
-			goto out;
-
-		iph = (struct iphdr *)skb->data;
-		if (iph->protocol != IPPROTO_UDP)
-			goto out;
-
-		len -= iph->ihl*4;
-		uh = (struct udphdr *)(((char *)iph) + iph->ihl*4);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-
-		if (ulen != len)
-			goto out;
-		if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (np->local_ip.ip && np->local_ip.ip != iph->daddr)
-				continue;
-			if (np->remote_ip.ip && np->remote_ip.ip != iph->saddr)
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-	} else {
-#if IS_ENABLED(CONFIG_IPV6)
-		const struct ipv6hdr *ip6h;
-
-		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = (struct ipv6hdr *)skb->data;
-		if (ip6h->version != 6)
-			goto out;
-		len = ntohs(ip6h->payload_len);
-		if (!len)
-			goto out;
-		if (len + sizeof(struct ipv6hdr) > skb->len)
-			goto out;
-		if (pskb_trim_rcsum(skb, len + sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = ipv6_hdr(skb);
-		if (!pskb_may_pull(skb, sizeof(struct udphdr)))
-			goto out;
-		uh = udp_hdr(skb);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-		if (ulen != skb->len)
-			goto out;
-		if (udp6_csum_init(skb, uh, IPPROTO_UDP))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(&np->local_ip.in6, &ip6h->daddr))
-				continue;
-			if (!ipv6_addr_equal(&np->remote_ip.in6, &ip6h->saddr))
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-#endif
-	}
-
-	if (!hits)
-		goto out;
-
-	kfree_skb(skb);
-	return 1;
-
-out:
-	if (atomic_read(&trapped)) {
-		kfree_skb(skb);
-		return 1;
-	}
-
-	return 0;
-}
-
 void netpoll_print_options(struct netpoll *np)
 {
 	np_info(np, "local port %d\n", np->local_port);
@@ -1030,7 +597,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 {
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
-	unsigned long flags;
 	int err;
 
 	np->dev = ndev;
@@ -1052,12 +618,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
-		INIT_LIST_HEAD(&npinfo->rx_np);
-
-		spin_lock_init(&npinfo->rx_lock);
 		sema_init(&npinfo->dev_lock, 1);
-		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
 
@@ -1076,13 +637,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	npinfo->netpoll = np;
 
-	if (np->rx_skb_hook) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
-		list_add_tail(&np->rx, &npinfo->rx_np);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-
 	/* last thing to do is link it to the net device structure */
 	rcu_assign_pointer(ndev->npinfo, npinfo);
 
@@ -1231,7 +785,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 	struct netpoll_info *npinfo =
 			container_of(rcu_head, struct netpoll_info, rcu);
 
-	skb_queue_purge(&npinfo->neigh_tx);
 	skb_queue_purge(&npinfo->txq);
 
 	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
@@ -1247,7 +800,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 void __netpoll_cleanup(struct netpoll *np)
 {
 	struct netpoll_info *npinfo;
-	unsigned long flags;
 
 	/* rtnl_dereference would be preferable here but
 	 * rcu_cleanup_netpoll path can put us in here safely without
@@ -1257,14 +809,6 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!npinfo)
 		return;
 
-	if (!list_empty(&npinfo->rx_np)) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_del(&np->rx);
-		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-
 	synchronize_srcu(&netpoll_srcu);
 
 	if (atomic_dec_and_test(&npinfo->refcnt)) {
@@ -1308,18 +852,3 @@ out:
 	rtnl_unlock();
 }
 EXPORT_SYMBOL(netpoll_cleanup);
-
-int netpoll_trap(void)
-{
-	return atomic_read(&trapped);
-}
-EXPORT_SYMBOL(netpoll_trap);
-
-void netpoll_set_trap(int trap)
-{
-	if (trap)
-		atomic_inc(&trapped);
-	else
-		atomic_dec(&trapped);
-}
-EXPORT_SYMBOL(netpoll_set_trap);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [RFC PATCH 2/2] netpoll: Don't poll for received packets
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
  2014-03-11  8:44               ` [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code Eric W. Biederman
@ 2014-03-11  8:45               ` Eric W. Biederman
  2014-03-11 12:44                 ` Eric Dumazet
  2014-03-12 18:39                 ` Cong Wang
  2014-03-11 12:24               ` [RFC PATCH 0/2] remove netpoll rx support Eric Dumazet
  2014-03-11 16:49               ` David Miller
  3 siblings, 2 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11  8:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma


When calling a driver's napi poll function, pass it a budget of 0 to
request that no rx packets be processed, and warn if any rx packets
are actually processed.

Additionally remove the drop_mon tracepoint as nothing interesting
should be happening in netpoll, and running an arbitrary tracepoint
in irq context is probably a bad idea.
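
As a rough illustration of the contract described above, here is a
user-space model (not kernel code).  The names struct model_napi,
model_driver_poll and model_poll_one_napi are hypothetical stand-ins,
and the assumption that tx reclaim is not limited by the budget is mine
rather than something this patch states.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct napi_struct; not the kernel type. */
struct model_napi {
	int tx_pending;		/* completed transmits waiting to be reclaimed */
	int rx_pending;		/* received packets waiting in the ring */
	int (*poll)(struct model_napi *napi, int budget);
};

/* A well-behaved driver poll in this model: tx reclaim always runs
 * (assumption), rx work is bounded by budget.
 */
int model_driver_poll(struct model_napi *napi, int budget)
{
	int work = 0;

	napi->tx_pending = 0;

	while (work < budget && napi->rx_pending > 0) {
		napi->rx_pending--;
		work++;
	}
	return work;
}

/* Models the patched poll_one_napi(): poll with a budget of 0 and
 * complain if the driver reports rx work anyway.  Returns true when
 * the driver honoured the zero budget.
 */
bool model_poll_one_napi(struct model_napi *napi)
{
	int work = napi->poll(napi, 0);

	if (work != 0) {
		fprintf(stderr, "driver processed %d rx packets\n", work);
		return false;
	}
	return true;
}
```

With a budget of 0 the rx loop in the model never runs, so the pending
rx packets stay in the ring for the normal softirq path to pick up.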

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |   28 +++++++++-------------------
 1 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index e883eff6799e..f95ca7f9d246 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -114,48 +114,38 @@ static void queue_process(struct work_struct *work)
  * trylock here and interrupts are already disabled in the softirq
  * case. Further, we test the poll_owner to avoid recursion on UP
  * systems where the lock doesn't exist.
- *
- * In cases where there is bi-directional communications, reading only
- * one message at a time can lead to packets being dropped by the
- * network adapter, forcing superfluous retries and possibly timeouts.
- * Thus, we set our budget to greater than 1.
  */
-static int poll_one_napi(struct netpoll_info *npinfo,
-			 struct napi_struct *napi, int budget)
+static void poll_one_napi(struct netpoll_info *npinfo,
+			  struct napi_struct *napi)
 {
 	int work;
-
 	/* net_rx_action's ->poll() invocations and our's are
 	 * synchronized by this test which is only made while
 	 * holding the napi->poll_lock.
 	 */
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
-		return budget;
+		return;
 
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
-	work = napi->poll(napi, budget);
-	trace_napi_poll(napi);
+	/* Use a budget of 0 to request the drivers not process
+	 * their receive queue.  Warn when they do anyway.
+	 */
+	work = napi->poll(napi, 0);
+	WARN_ON_ONCE(work != 0);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
-
-	return budget - work;
 }
 
 static void poll_napi(struct net_device *dev)
 {
 	struct napi_struct *napi;
-	int budget = 16;
 
 	list_for_each_entry(napi, &dev->napi_list, dev_list) {
 		if (napi->poll_owner != smp_processor_id() &&
 		    spin_trylock(&napi->poll_lock)) {
-			budget = poll_one_napi(rcu_dereference_bh(dev->npinfo),
-					       napi, budget);
+			poll_one_napi(rcu_dereference_bh(dev->npinfo), napi);
 			spin_unlock(&napi->poll_lock);
-
-			if (!budget)
-				break;
 		}
 	}
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 0/2] remove netpoll rx support
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
  2014-03-11  8:44               ` [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code Eric W. Biederman
  2014-03-11  8:45               ` [RFC PATCH 2/2] netpoll: Don't poll for received packets Eric W. Biederman
@ 2014-03-11 12:24               ` Eric Dumazet
  2014-03-11 16:49               ` David Miller
  3 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11 12:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 01:43 -0700, Eric W. Biederman wrote:

> It does seem desirable that netpoll should not drain the rx queues.
> Unfortunately that is not how netpoll is built.  By my quick count
> there are 132 drivers in the kernel that support netpoll.
> 
> Several of them such as the e1000e driver already call dev_kfree_skb_any
> or dev_kfree_skb_irq in their rx paths.  What I am implementing seems to
> be the pattern that the better drivers follow today.
> 
> Furthermore netpoll by its design depends on the ability to receive
> packets in netpoll_poll_dev.  It is a capability I don't think we have
> ever used in the mainline kernel but it is a capability that is there
> deliberately.  Which means if we want netpoll to not mess with the rx
> path we need to change netpoll.
> 
> 
> If we are willing to change the definition of netpoll this is fixable.
> 
> The big enabler is the fact that calling the napi poll function with a
> budget of 0 means don't perform any rx work.
> 
> Which leads to the following set of changes to netpoll if we are brave.

Well, it cannot be worse than the current situation, right?

I never understood and never enabled CONFIG_NETPOLL_TRAP on any of my
builds, I can tell you.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll:  Remove dead netpoll_rx code
  2014-03-11  8:44               ` [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code Eric W. Biederman
@ 2014-03-11 12:29                 ` Eric Dumazet
  2014-03-11 15:23                   ` Stephen Hemminger
  0 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11 12:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 01:44 -0700, Eric W. Biederman wrote:
> The netpoll_rx code only becomes active if the netpoll rx_skb_hook is
> implemented.  There is not a single implementation of the netpoll
> rx_skb_hook in the kernel.
> 
> There are problems with the netpoll packet receive code. Most
> specifically every packet that makes it to netpoll_neigh_reply is
> leaked.
> 
> Given that the netpoll packet receive code is buggy and has not been used
> for a decade let's just remove it.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/Kconfig       |    5 -
>  include/linux/netdevice.h |   17 --
>  include/linux/netpoll.h   |   59 ------
>  net/core/dev.c            |   11 +-
>  net/core/netpoll.c        |  471 ---------------------------------------------
>  5 files changed, 1 insertions(+), 562 deletions(-)

I cannot agree more, thanks Eric.

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 2/2] netpoll: Don't poll for received packets
  2014-03-11  8:45               ` [RFC PATCH 2/2] netpoll: Don't poll for received packets Eric W. Biederman
@ 2014-03-11 12:44                 ` Eric Dumazet
  2014-03-12 18:39                 ` Cong Wang
  1 sibling, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-11 12:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 01:45 -0700, Eric W. Biederman wrote:
> When calling a driver's napi poll function, pass it a budget of 0 to
> request that no rx packets be processed, and warn if any rx packets
> are actually processed.
> 
> Additionally remove the drop_mon tracepoint as nothing interesting
> should be happening in netpoll, and running an arbitrary tracepoint
> in irq context is probably a bad idea.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  net/core/netpoll.c |   28 +++++++++-------------------
>  1 files changed, 9 insertions(+), 19 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll:  Remove dead netpoll_rx code
  2014-03-11 12:29                 ` Eric Dumazet
@ 2014-03-11 15:23                   ` Stephen Hemminger
  2014-03-11 15:34                     ` Hannes Frederic Sowa
  2014-03-11 20:48                     ` Eric W. Biederman
  0 siblings, 2 replies; 288+ messages in thread
From: Stephen Hemminger @ 2014-03-11 15:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric W. Biederman, David Miller, netdev, xiyou.wangcong, mpm,
	satyam.sharma

On Tue, 11 Mar 2014 05:29:21 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Tue, 2014-03-11 at 01:44 -0700, Eric W. Biederman wrote:
> > The netpoll_rx code only becomes active if the netpoll rx_skb_hook is
> > implemented.  There is not a single implementation of the netpoll
> > rx_skb_hook in the kernel.
> > 
> > There are problems with the netpoll packet receive code. Most
> > specifically every packet that makes it to netpoll_neigh_reply is
> > leaked.
> > 
> > Given that the netpoll packet receive code is buggy and has not been used
> > for a decade let's just remove it.
> > 
> > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> > ---
> >  drivers/net/Kconfig       |    5 -
> >  include/linux/netdevice.h |   17 --
> >  include/linux/netpoll.h   |   59 ------
> >  net/core/dev.c            |   11 +-
> >  net/core/netpoll.c        |  471 ---------------------------------------------
> >  5 files changed, 1 insertions(+), 562 deletions(-)
> 
> I cannot agree more, thanks Eric.
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

I agree but removing it breaks people trying to kgdb over network (kgdboe).
That code never made it upstream, was unreliable and fragile and should be
sent to the retirement home with IMQ.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll:  Remove dead netpoll_rx code
  2014-03-11 15:23                   ` Stephen Hemminger
@ 2014-03-11 15:34                     ` Hannes Frederic Sowa
  2014-03-11 20:48                     ` Eric W. Biederman
  1 sibling, 0 replies; 288+ messages in thread
From: Hannes Frederic Sowa @ 2014-03-11 15:34 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, Eric W. Biederman, David Miller, netdev,
	xiyou.wangcong, mpm, satyam.sharma

On Tue, Mar 11, 2014 at 08:23:12AM -0700, Stephen Hemminger wrote:
> On Tue, 11 Mar 2014 05:29:21 -0700
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > On Tue, 2014-03-11 at 01:44 -0700, Eric W. Biederman wrote:
> > > The netpoll_rx code only becomes active if the netpoll rx_skb_hook is
> > > implemented.  There is not a single implementation of the netpoll
> > > rx_skb_hook in the kernel.
> > > 
> > > There are problems with the netpoll packet receive code. Most
> > > specifically every packet that makes it to netpoll_neigh_reply is
> > > leaked.
> > > 
> > > Given that the netpoll packet receive code is buggy and has not been used
> > > for a decade let's just remove it.
> > > 
> > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> > > ---
> > >  drivers/net/Kconfig       |    5 -
> > >  include/linux/netdevice.h |   17 --
> > >  include/linux/netpoll.h   |   59 ------
> > >  net/core/dev.c            |   11 +-
> > >  net/core/netpoll.c        |  471 ---------------------------------------------
> > >  5 files changed, 1 insertions(+), 562 deletions(-)
> > 
> > I cannot agree more, thanks Eric.
> > 
> > Acked-by: Eric Dumazet <edumazet@google.com>
> 
> I agree but removing it breaks people trying to kgdb over network (kgdboe).
> That code never made it upstream, was unreliable and fragile and should be
> sent to the retirement home with IMQ.

:)

But there is still no replacement for IMQ if one wants to do ingress NAT-aware
traffic shaping, not even with ifb, I fear? So to be fair there still seems to
be a reason why IMQ is still around.

But that is a different topic...

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11  5:02           ` Eric Dumazet
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
@ 2014-03-11 16:39             ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-11 16:39 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 10 Mar 2014 22:02:38 -0700

> It seems netpoll should not drain rx queues.

There are netpoll users (admittedly, out of tree) that make use of
packet receive.

kdump is one.

That's what all of the __netpoll_rx() et al. stuff is for.

Therefore we'll probably just have to enforce that if a netpoll
user wants RX packets, he has to invoke netpoll from a reasonable
context.

But if we're called from hardware interrupts, yes we block RX
processing.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 0/2] remove netpoll rx support
  2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
                                 ` (2 preceding siblings ...)
  2014-03-11 12:24               ` [RFC PATCH 0/2] remove netpoll rx support Eric Dumazet
@ 2014-03-11 16:49               ` David Miller
  2014-03-11 19:48                 ` Eric W. Biederman
  3 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-11 16:49 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 11 Mar 2014 01:43:10 -0700

> Furthermore netpoll by its design depends on the ability to receive
> packets in netpoll_poll_dev.  It is a capability I don't think we have
> ever used in the mainline kernel but it is a capability that is there
> deliberately.  Which means if we want netpoll to not mess with the rx
> path we need to change netpoll.

This breaks kdump, and any other users of netpoll_rx() et al.

Make the zero budget depend upon us being invoked from hardware
irq context, or something like that.
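
A minimal sketch of that idea, as a user-space model only.  The function
name model_netpoll_budget and the in_hardirq parameter are hypothetical;
real kernel code would query the irq-context helpers rather than take a
flag.  The value 16 is the budget poll_napi() used before this series.

```c
#include <stdbool.h>

/* The rx budget poll_napi() passed down before this series. */
#define MODEL_NETPOLL_RX_BUDGET 16

/* Pick the napi budget from the calling context: no rx work from
 * hardware irq context, the old budget otherwise.  in_hardirq is
 * supplied by the caller in this model (assumption).
 */
int model_netpoll_budget(bool in_hardirq)
{
	return in_hardirq ? 0 : MODEL_NETPOLL_RX_BUDGET;
}
```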

Thanks.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 0/2] remove netpoll rx support
  2014-03-11 16:49               ` David Miller
@ 2014-03-11 19:48                 ` Eric W. Biederman
  2014-03-11 20:09                   ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 19:48 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 11 Mar 2014 01:43:10 -0700
>
>> Furthermore netpoll by its design depends on the ability to receive
>> packets in netpoll_poll_dev.  It is a capability I don't think we have
>> ever used in the mainline kernel but it is a capability that is there
>> deliberately.  Which means if we want netpoll to not mess with the rx
>> path we need to change netpoll.
>
> This breaks kdump, and any other users of netpoll_rx() et al.

It does not break kdump.  (kdump starts a new kernel to do its work.)

It does break the ancient lkcd netdump that was never merged, and has
been abandoned (to the best of my knowledge).  Crash dumps proved
entirely too fragile to perform from a broken kernel.

It does break kgdboe that was never merged.

> Make the zero budget depend upon us being invoked from hardware
> irq context, or something like that.

Good enough.  I will respin my driver patches based on the assumption
that netpoll will be changed in this way.  There are no dependencies
for the drivers, I just need to remove my rx path changes.

We can have the conversation about how to change netpoll in parallel.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 0/2] remove netpoll rx support
  2014-03-11 19:48                 ` Eric W. Biederman
@ 2014-03-11 20:09                   ` David Miller
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                       ` (2 more replies)
  0 siblings, 3 replies; 288+ messages in thread
From: David Miller @ 2014-03-11 20:09 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 11 Mar 2014 12:48:30 -0700

> David Miller <davem@davemloft.net> writes:
> 
>> From: ebiederm@xmission.com (Eric W. Biederman)
>> Date: Tue, 11 Mar 2014 01:43:10 -0700
>>
>>> Furthermore netpoll by its design depends on the ability to receive
>>> packets in netpoll_poll_dev.  It is a capability I don't think we have
>>> ever used in the mainline kernel but it is a capability that is there
>>> deliberately.  Which means if we want netpoll to not mess with the rx
>>> path we need to change netpoll.
>>
>> This breaks kdump, and any other users of netpoll_rx() et al.
> 
> It does not break kdump.  (kdump starts a new kernel to do its work.)
> 
> It does break the ancient lkcd netdump that was never merged, and has
> been abandoned (to the best of my knowledge).  Crash dumps proved
> entirely too fragile to perform from a broken kernel.

I feel like I've mixed this up in the past, multiple times, thanks
for the sanity check.

>> Make the zero budget depend upon us being invoked from hardware
>> irq context, or something like that.
> 
> Good enough.  I will respin my driver patches based on the assumption
> that netpoll will be changed in this way.  There are no dependencies
> for the drivers, I just need to remove my rx path changes.
> 
> We can have the conversation about how to change netpoll in parallel.

Sounds great.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll:  Remove dead netpoll_rx code
  2014-03-11 15:23                   ` Stephen Hemminger
  2014-03-11 15:34                     ` Hannes Frederic Sowa
@ 2014-03-11 20:48                     ` Eric W. Biederman
  2014-03-12 18:31                       ` Cong Wang
  2014-03-13 19:23                       ` David Miller
  1 sibling, 2 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 20:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Stephen Hemminger <stephen@networkplumber.org> writes:

> On Tue, 11 Mar 2014 05:29:21 -0700
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> On Tue, 2014-03-11 at 01:44 -0700, Eric W. Biederman wrote:
>> > The netpoll_rx code only becomes active if the netpoll rx_skb_hook is
>> > implemented.  There is not a single implementation of the netpoll
>> > rx_skb_hook in the kernel.
>> > 
>> > There are problems with the netpoll packet receive code. Most
>> > specifically every packet that makes it to netpoll_neigh_reply is
>> > leaked.
>> > 
>> > Given that the netpoll packet receive code is buggy and has not been used
>> > for a decade let's just remove it.
>> > 
>> > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> > ---
>> >  drivers/net/Kconfig       |    5 -
>> >  include/linux/netdevice.h |   17 --
>> >  include/linux/netpoll.h   |   59 ------
>> >  net/core/dev.c            |   11 +-
>> >  net/core/netpoll.c        |  471 ---------------------------------------------
>> >  5 files changed, 1 insertions(+), 562 deletions(-)
>> 
>> I cannot agree more, thanks Eric.
>> 
>> Acked-by: Eric Dumazet <edumazet@google.com>
>
> I agree but removing it breaks people trying to kgdb over network (kgdboe).
> That code never made it upstream, was unreliable and fragile and should be
> sent to the retirement home with IMQ.

To play devil's advocate to my own patch: does anyone know where the
kgdb over network (kgdboe) code lives today?

What little I could find in a quick google search strongly suggests that
kgdboe was abandoned in 2010 or so.

I am trying to figure out if there are any active out of tree projects
that need bidirectional netpoll.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts
  2014-03-11 20:09                   ` David Miller
@ 2014-03-11 21:13                     ` Eric W. Biederman
  2014-03-11 21:14                       ` [PATCH net-next 01/10] 8139cp: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
                                         ` (11 more replies)
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
  2014-03-11 21:33                     ` [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll Eric W. Biederman
  2 siblings, 12 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:13 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This patchset should be an uncontroversial set of changes that replace
dev_kfree_skb with dev_kfree_skb_any in code paths that are called in
hard irq contexts in addition to other contexts.  netpoll is the reason
this code gets called in multiple contexts.
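
To illustrate why the _any variant is the safe choice here, below is a
minimal userspace sketch of the idea, not the kernel implementation.
All names (model_skb, model_hardirq, model_free_any and so on) are
hypothetical stand-ins: the flag models "in hard irq or irqs disabled",
and the list models a queue that a later softirq pass drains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model; none of these names are the kernel's own symbols. */
struct model_skb {
	struct model_skb *next;	/* link used while parked on the deferred list */
	bool freed;		/* set once the buffer has really been released */
};

static bool model_hardirq;		 /* models "in hard irq or irqs disabled" */
static struct model_skb *model_deferred; /* models a softirq completion queue */

/* Release immediately: only legal outside hard irq context. */
static void model_free_now(struct model_skb *skb)
{
	assert(!model_hardirq);
	skb->freed = true;
}

/* Park the buffer so a later softirq pass can release it. */
static void model_free_irq(struct model_skb *skb)
{
	skb->next = model_deferred;
	model_deferred = skb;
}

/* The "_any" variant picks the safe path for the current context. */
static void model_free_any(struct model_skb *skb)
{
	if (model_hardirq)
		model_free_irq(skb);
	else
		model_free_now(skb);
}

/* Softirq-time drain of everything parked from hard irq context. */
static int model_drain_deferred(void)
{
	int n = 0;

	assert(!model_hardirq);
	while (model_deferred) {
		struct model_skb *skb = model_deferred;

		model_deferred = skb->next;
		skb->next = NULL;
		model_free_now(skb);
		n++;
	}
	return n;
}
```

In this model a caller that might run in either context only ever calls
model_free_any(); calling model_free_now() directly from the hard irq
case trips the assertion, which is roughly the class of problem these
patches avoid.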

There is more coming but these changes are a good starting place, and
stand on their own.

Since the last round, the changes to the rx path have been removed;
netpoll will be changed to avoid that path instead.

Eric W. Biederman (10):
      8139cp: Call dev_kfree_skby_any instead of kfree_skb.
      8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
      r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
      bonding: Call dev_kfree_skby_any instead of kfree_skb.
      bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
      tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
      ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
      mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
      benet: Call dev_kfree_skby_any instead of kfree_skb.
      gianfar: Carefully free skbs in functions called by netpoll.


 drivers/net/bonding/bond_3ad.c              |    2 +-
 drivers/net/bonding/bond_alb.c              |    2 +-
 drivers/net/bonding/bond_main.c             |   10 +++++-----
 drivers/net/ethernet/broadcom/bnx2.c        |    6 +++---
 drivers/net/ethernet/broadcom/tg3.c         |   14 +++++++-------
 drivers/net/ethernet/emulex/benet/be_main.c |    2 +-
 drivers/net/ethernet/freescale/gianfar.c    |    4 ++--
 drivers/net/ethernet/intel/ixgb/ixgb_main.c |    6 +++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c  |    2 +-
 drivers/net/ethernet/realtek/8139cp.c       |    2 +-
 drivers/net/ethernet/realtek/8139too.c      |    4 ++--
 drivers/net/ethernet/realtek/r8169.c        |    6 +++---
 12 files changed, 30 insertions(+), 30 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH net-next 01/10] 8139cp: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
@ 2014-03-11 21:14                       ` Eric W. Biederman
  2014-03-11 21:15                       ` [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
                                         ` (10 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:14 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace kfree_skb with dev_kfree_skb_any in cp_start_xmit
as it can be called in both hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/realtek/8139cp.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 737c1a881f78..a3c1daa7ad5c 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -899,7 +899,7 @@ out_unlock:
 
 	return NETDEV_TX_OK;
 out_dma_error:
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	cp->dev->stats.tx_dropped++;
 	goto out_unlock;
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2014-03-11 21:14                       ` [PATCH net-next 01/10] 8139cp: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11 21:15                       ` Eric W. Biederman
  2014-03-12  2:06                         ` Eric Dumazet
  2014-03-11 21:16                       ` [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
                                         ` (9 subsequent siblings)
  11 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:15 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/realtek/8139too.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index da5972eefdd2..8cb2f357026e 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -1717,9 +1717,9 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
 		if (len < ETH_ZLEN)
 			memset(tp->tx_buf[entry], 0, ETH_ZLEN);
 		skb_copy_and_csum_dev(skb, tp->tx_buf[entry]);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	} else {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		dev->stats.tx_dropped++;
 		return NETDEV_TX_OK;
 	}
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2014-03-11 21:14                       ` [PATCH net-next 01/10] 8139cp: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
  2014-03-11 21:15                       ` [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-11 21:16                       ` Eric W. Biederman
  2014-03-12  2:02                         ` Eric Dumazet
  2014-03-11 21:16                       ` [PATCH net-next 04/10] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
                                         ` (8 subsequent siblings)
  11 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:16 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/realtek/r8169.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index e9779653cd4c..cf947337e0d6 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5834,7 +5834,7 @@ static void rtl8169_tx_clear_range(struct rtl8169_private *tp, u32 start,
 					     tp->TxDescArray + entry);
 			if (skb) {
 				tp->dev->stats.tx_dropped++;
-				dev_kfree_skb(skb);
+				dev_kfree_skb_any(skb);
 				tx_skb->skb = NULL;
 			}
 		}
@@ -6059,7 +6059,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 err_dma_1:
 	rtl8169_unmap_tx_skb(d, tp->tx_skb + entry, txd);
 err_dma_0:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 err_update_stats:
 	dev->stats.tx_dropped++;
 	return NETDEV_TX_OK;
@@ -6142,7 +6142,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 			tp->tx_stats.packets++;
 			tp->tx_stats.bytes += tx_skb->skb->len;
 			u64_stats_update_end(&tp->tx_stats.syncp);
-			dev_kfree_skb(tx_skb->skb);
+			dev_kfree_skb_any(tx_skb->skb);
 			tx_skb->skb = NULL;
 		}
 		dirty_tx++;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 04/10] bonding: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (2 preceding siblings ...)
  2014-03-11 21:16                       ` [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-11 21:16                       ` Eric W. Biederman
  2014-03-11 21:17                       ` [PATCH net-next 05/10] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
                                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:16 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/bonding/bond_3ad.c  |    2 +-
 drivers/net/bonding/bond_alb.c  |    2 +-
 drivers/net/bonding/bond_main.c |   10 +++++-----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index a2ef3f72de88..dee2a84a2929 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2479,7 +2479,7 @@ out:
 	return NETDEV_TX_OK;
 err_free:
 	/* no suitable interface, frame not sent */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	goto out;
 }
 
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index aaeeacf767f2..9cf836b67b15 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1464,7 +1464,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	}
 
 	/* no suitable interface, frame not sent */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 out:
 	return NETDEV_TX_OK;
 }
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 730d72c706c9..60f35e5c7f74 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3548,7 +3548,7 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
 		}
 	}
 	/* no slave that can tx has been found */
-	kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 }
 
 /**
@@ -3624,7 +3624,7 @@ static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *bond_d
 	if (slave)
 		bond_dev_queue_xmit(bond, skb, slave->dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -3667,7 +3667,7 @@ static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *bond_dev)
 	if (slave && IS_UP(slave->dev) && slave->link == BOND_LINK_UP)
 		bond_dev_queue_xmit(bond, skb, slave->dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -3754,7 +3754,7 @@ static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev
 		pr_err("%s: Error: Unknown bonding mode %d\n",
 		       dev->name, bond->params.mode);
 		WARN_ON_ONCE(1);
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 }
@@ -3775,7 +3775,7 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (bond_has_slaves(bond))
 		ret = __bond_start_xmit(skb, dev);
 	else
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	rcu_read_unlock();
 
 	return ret;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 05/10] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (3 preceding siblings ...)
  2014-03-11 21:16                       ` [PATCH net-next 04/10] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11 21:17                       ` Eric W. Biederman
  2014-03-11 21:18                       ` [PATCH net-next 06/10] tg3: " Eric W. Biederman
                                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:17 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index ca6b36220d94..c251ca3056de 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2885,7 +2885,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 		sw_cons = BNX2_NEXT_TX_BD(sw_cons);
 
 		tx_bytes += skb->len;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		tx_pkt++;
 		if (tx_pkt == budget)
 			break;
@@ -6604,7 +6604,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	mapping = dma_map_single(&bp->pdev->dev, skb->data, len, PCI_DMA_TODEVICE);
 	if (dma_mapping_error(&bp->pdev->dev, mapping)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -6697,7 +6697,7 @@ dma_error:
 			       PCI_DMA_TODEVICE);
 	}
 
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 06/10] tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (4 preceding siblings ...)
  2014-03-11 21:17                       ` [PATCH net-next 05/10] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-11 21:18                       ` Eric W. Biederman
  2014-03-11 21:18                       ` [PATCH net-next 07/10] ixgb: " Eric W. Biederman
                                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:18 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/tg3.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index e12735fbdcdb..bbbd2a4bc161 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6593,7 +6593,7 @@ static void tg3_tx(struct tg3_napi *tnapi)
 		pkts_compl++;
 		bytes_compl += skb->len;
 
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 
 		if (unlikely(tx_bug)) {
 			tg3_tx_recover(tp);
@@ -6924,7 +6924,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 
 		if (len > (tp->dev->mtu + ETH_HLEN) &&
 		    skb->protocol != htons(ETH_P_8021Q)) {
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			goto drop_it_no_recycle;
 		}
 
@@ -7807,7 +7807,7 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 					  PCI_DMA_TODEVICE);
 		/* Make sure the mapping succeeded */
 		if (pci_dma_mapping_error(tp->pdev, new_addr)) {
-			dev_kfree_skb(new_skb);
+			dev_kfree_skb_any(new_skb);
 			ret = -1;
 		} else {
 			u32 save_entry = *entry;
@@ -7822,13 +7822,13 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 					    new_skb->len, base_flags,
 					    mss, vlan)) {
 				tg3_tx_skb_unmap(tnapi, save_entry, -1);
-				dev_kfree_skb(new_skb);
+				dev_kfree_skb_any(new_skb);
 				ret = -1;
 			}
 		}
 	}
 
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	*pskb = new_skb;
 	return ret;
 }
@@ -7871,7 +7871,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct sk_buff *skb)
 	} while (segs);
 
 tg3_tso_bug_end:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
@@ -8093,7 +8093,7 @@ dma_error:
 	tg3_tx_skb_unmap(tnapi, tnapi->tx_prod, --i);
 	tnapi->tx_buffers[tnapi->tx_prod].skb = NULL;
 drop:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 drop_nofree:
 	tp->tx_dropped++;
 	return NETDEV_TX_OK;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 07/10] ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (5 preceding siblings ...)
  2014-03-11 21:18                       ` [PATCH net-next 06/10] tg3: " Eric W. Biederman
@ 2014-03-11 21:18                       ` Eric W. Biederman
  2014-03-11 21:19                       ` [PATCH net-next 08/10] mlx4: " Eric W. Biederman
                                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:18 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/ixgb/ixgb_main.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
index 57e390cbe6d0..f42c201f727f 100644
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1521,12 +1521,12 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	int tso;
 
 	if (test_bit(__IXGB_DOWN, &adapter->flags)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
 	if (skb->len <= 0) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -1543,7 +1543,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 
 	tso = ixgb_tso(adapter, skb);
 	if (tso < 0) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 08/10] mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (6 preceding siblings ...)
  2014-03-11 21:18                       ` [PATCH net-next 07/10] ixgb: " Eric W. Biederman
@ 2014-03-11 21:19                       ` Eric W. Biederman
  2014-03-11 21:19                       ` [PATCH net-next 09/10] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
                                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:19 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 69c2fcef9d4c..dd1f6d346459 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -314,7 +314,7 @@ static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
 			}
 		}
 	}
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return tx_info->nr_txbb;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 09/10] benet: Call dev_kfree_skby_any instead of kfree_skb.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (7 preceding siblings ...)
  2014-03-11 21:19                       ` [PATCH net-next 08/10] mlx4: " Eric W. Biederman
@ 2014-03-11 21:19                       ` Eric W. Biederman
  2014-03-11 21:20                       ` [PATCH net-next 10/10] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
                                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:19 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Replace kfree_skb with dev_kfree_skb_any in be_tx_compl_process,
which can be called in hard irq context by netpoll, in softirq context
by normal napi polling, and in normal sleepable context
by the network device close method.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 6e10230a2ee0..2eee0b2577f8 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1897,7 +1897,7 @@ static u16 be_tx_compl_process(struct be_adapter *adapter,
 		queue_tail_inc(txq);
 	} while (cur_index != last_index);
 
-	kfree_skb(sent_skb);
+	dev_kfree_skb_any(sent_skb);
 	return num_wrbs;
 }
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 10/10] gianfar: Carefully free skbs in functions called by netpoll.
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (8 preceding siblings ...)
  2014-03-11 21:19                       ` [PATCH net-next 09/10] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
@ 2014-03-11 21:20                       ` Eric W. Biederman
  2014-03-12  2:54                       ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric Dumazet
  2014-03-25  5:58                       ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any " Eric W. Biederman
  11 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:20 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


netpoll can call functions in hard irq context that are ordinarily
called in lesser contexts.  For those functions use dev_kfree_skb_any
and dev_consume_skb_any so skbs are freed safely from hard irq
context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/freescale/gianfar.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index c5b9320f7629..4d000f844390 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2146,13 +2146,13 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		skb_new = skb_realloc_headroom(skb, fcb_len);
 		if (!skb_new) {
 			dev->stats.tx_errors++;
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return NETDEV_TX_OK;
 		}
 
 		if (skb->sk)
 			skb_set_owner_w(skb_new, skb->sk);
-		consume_skb(skb);
+		dev_consume_skb_any(skb);
 		skb = skb_new;
 	}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 0/2] Don't receive packets when the napi budget == 0
  2014-03-11 20:09                   ` David Miller
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
@ 2014-03-11 21:30                     ` Eric W. Biederman
  2014-03-11 21:31                       ` [PATCH net-next 1/2] bnx2: " Eric W. Biederman
                                         ` (3 more replies)
  2014-03-11 21:33                     ` [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll Eric W. Biederman
  2 siblings, 4 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:30 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


To the best of my understanding, processing any received packets when the
napi budget == 0 is broken driver behavior.  At the same time I don't
think we have ever cared before so there are a handful of drivers that
need fixes.

I care now as I will shortly be using this in netpoll to get the
tx queue processing without the rx queue processing.
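
As a rough sketch of the behavior being relied on, here is a userspace
model of a poll routine that always reclaims transmit completions but
does receive work only within the budget.  The structure and function
names (model_ring, model_poll, ...) are hypothetical, and treating tx
cleanup as unbounded by the budget is an assumption of the model, not
a statement about any particular driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring state for a userspace model of a NAPI-style driver. */
struct model_ring {
	int tx_done_pending;	/* transmitted buffers waiting to be reclaimed */
	int rx_pending;		/* received packets waiting to be processed */
};

/* Reclaim every completed transmit buffer; not limited by the budget here. */
static int model_tx_clean(struct model_ring *r)
{
	int n = r->tx_done_pending;

	r->tx_done_pending = 0;
	return n;
}

/* Process at most `budget` received packets; none at all when budget <= 0. */
static int model_rx_clean(struct model_ring *r, int budget)
{
	int done = 0;

	while (done < budget && r->rx_pending > 0) {
		r->rx_pending--;
		done++;
	}
	return done;
}

/* Poll routine: tx cleanup always runs, rx work is bounded by the budget. */
static int model_poll(struct model_ring *r, int budget)
{
	assert(r != NULL);
	model_tx_clean(r);
	return model_rx_clean(r, budget);
}
```

Calling model_poll() with a budget of 0 empties the tx completion count
and leaves rx_pending untouched, which is the tx-only polling that
netpoll would want from hard irq context.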

Drivers that need fixes are few and far between, and so far I have only
found two of them.  More similar patches will follow if I find more
drivers that need fixes.
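
One way a driver ends up receiving with a zero budget is a receive loop
that checks the budget only after handling a packet.  The following
userspace sketch, with hypothetical function names and a plain counter
in place of a real rx ring, contrasts that shape with a loop that
checks the budget first.

```c
#include <assert.h>

/* Bottom-tested loop: the budget is checked only after a packet is handled. */
static int model_rx_bottom_test(int pending, int budget)
{
	int rx = 0;

	assert(pending >= 0);
	while (1) {
		if (pending == 0)
			break;
		pending--;
		rx++;
		if (rx >= budget)
			break;
	}
	return rx;
}

/* Top-tested loop: the budget is checked before touching any packet. */
static int model_rx_top_test(int pending, int budget)
{
	int rx = 0;

	assert(pending >= 0);
	while (rx < budget) {
		if (pending == 0)
			break;
		pending--;
		rx++;
	}
	return rx;
}
```

With packets pending and a budget of 0, the bottom-tested form still
handles one packet while the top-tested form handles none; for any
positive budget the two behave the same in this model.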

Eric W. Biederman (2):
      bnx2: Don't receive packets when the napi budget == 0
      8139cp: Don't receive packets when the napi budget == 0

 drivers/net/ethernet/broadcom/bnx2.c  |    3 +++
 drivers/net/ethernet/realtek/8139cp.c |    5 +----
 2 files changed, 4 insertions(+), 4 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH net-next 1/2] bnx2: Don't receive packets when the napi budget == 0
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
@ 2014-03-11 21:31                       ` Eric W. Biederman
  2014-03-12  5:07                         ` Eric Dumazet
  2014-03-11 21:31                       ` [PATCH net-next 2/2] 8139cp: " Eric W. Biederman
                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:31 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index c251ca3056de..2e42de239798 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -3132,6 +3132,9 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 	struct l2_fhdr *rx_hdr;
 	int rx_pkt = 0, pg_ring_used = 0;
 
+	if (budget <= 0)
+		return rx_pkt;
+
 	hw_cons = bnx2_get_hw_rx_cons(bnapi);
 	sw_cons = rxr->rx_cons;
 	sw_prod = rxr->rx_prod;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 2/2] 8139cp: Don't receive packets when the napi budget == 0
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
  2014-03-11 21:31                       ` [PATCH net-next 1/2] bnx2: " Eric W. Biederman
@ 2014-03-11 21:31                       ` Eric W. Biederman
  2014-03-12  5:08                         ` Eric Dumazet
  2014-03-13 19:19                       ` [PATCH net-next 0/2] " David Miller
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
  3 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:31 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/realtek/8139cp.c |    5 +----
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index a3c1daa7ad5c..2bc728e65e24 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -476,7 +476,7 @@ rx_status_loop:
 	rx = 0;
 	cpw16(IntrStatus, cp_rx_intr_mask);
 
-	while (1) {
+	while (rx < budget) {
 		u32 status, len;
 		dma_addr_t mapping, new_mapping;
 		struct sk_buff *skb, *new_skb;
@@ -554,9 +554,6 @@ rx_next:
 		else
 			desc->opts1 = cpu_to_le32(DescOwn | cp->rx_buf_sz);
 		rx_tail = NEXT_RX(rx_tail);
-
-		if (rx >= budget)
-			break;
 	}
 
 	cp->rx_tail = rx_tail;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll
  2014-03-11 20:09                   ` David Miller
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
@ 2014-03-11 21:33                     ` Eric W. Biederman
  2014-03-13 19:26                       ` David Miller
  2 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-11 21:33 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


bcm_enet_netpoll does not exist, which causes
bcm63xx_enet to fail to build when CONFIG_NET_POLL_CONTROLLER
is defined.

Remove the bogus .ndo_poll_controller = bcm_enet_netpoll

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bcm63xx_enet.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index b9a5fb6400d3..a7d11f5565d6 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -1722,9 +1722,6 @@ static const struct net_device_ops bcm_enet_ops = {
 	.ndo_set_rx_mode	= bcm_enet_set_multicast_list,
 	.ndo_do_ioctl		= bcm_enet_ioctl,
 	.ndo_change_mtu		= bcm_enet_change_mtu,
-#ifdef CONFIG_NET_POLL_CONTROLLER
-	.ndo_poll_controller = bcm_enet_netpoll,
-#endif
 };
 
 /*
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:16                       ` [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-12  2:02                         ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12  2:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 14:16 -0700, Eric W. Biederman wrote:
> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/realtek/r8169.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index e9779653cd4c..cf947337e0d6 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -5834,7 +5834,7 @@ static void rtl8169_tx_clear_range(struct rtl8169_private *tp, u32 start,
>  					     tp->TxDescArray + entry);
>  			if (skb) {
>  				tp->dev->stats.tx_dropped++;
> -				dev_kfree_skb(skb);
> +				dev_kfree_skb_any(skb);
>  				tx_skb->skb = NULL;
>  			}
>  		}
> @@ -6059,7 +6059,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>  err_dma_1:
>  	rtl8169_unmap_tx_skb(d, tp->tx_skb + entry, txd);
>  err_dma_0:
> -	dev_kfree_skb(skb);
> +	dev_kfree_skb_any(skb);
>  err_update_stats:
>  	dev->stats.tx_dropped++;
>  	return NETDEV_TX_OK;
> @@ -6142,7 +6142,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
>  			tp->tx_stats.packets++;
>  			tp->tx_stats.bytes += tx_skb->skb->len;
>  			u64_stats_update_end(&tp->tx_stats.syncp);
> -			dev_kfree_skb(tx_skb->skb);
> +			dev_kfree_skb_any(tx_skb->skb);
>  			tx_skb->skb = NULL;
>  		}
>  		dirty_tx++;

If this code can either run from softirq or hardirq, then
rtl8169_get_stats64() should block hard irq, not only soft irq.

I.e., it should not use u64_stats_fetch_{begin|retry}_bh()

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-11 21:15                       ` [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-12  2:06                         ` Eric Dumazet
  2014-03-12 21:24                           ` Francois Romieu
  0 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12  2:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 14:15 -0700, Eric W. Biederman wrote:
> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/realtek/8139too.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
> index da5972eefdd2..8cb2f357026e 100644
> --- a/drivers/net/ethernet/realtek/8139too.c
> +++ b/drivers/net/ethernet/realtek/8139too.c
> @@ -1717,9 +1717,9 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
>  		if (len < ETH_ZLEN)
>  			memset(tp->tx_buf[entry], 0, ETH_ZLEN);
>  		skb_copy_and_csum_dev(skb, tp->tx_buf[entry]);
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  	} else {
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  		dev->stats.tx_dropped++;
>  		return NETDEV_TX_OK;
>  	}

Same u64_stats_fetch_begin_bh() problem for this driver to fetch TX
stats.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (9 preceding siblings ...)
  2014-03-11 21:20                       ` [PATCH net-next 10/10] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
@ 2014-03-12  2:54                       ` Eric Dumazet
  2014-03-12 20:22                         ` David Miller
  2014-03-25  5:58                       ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any " Eric W. Biederman
  11 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12  2:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 14:13 -0700, Eric W. Biederman wrote:
> This patchset should be an uncontroversial set of changes to change
> dev_kfree_skb to dev_kfree_skb_any for code paths that are called in
> hard irq contexts in addition to other contexts.  netpoll is the reason
> this code gets called in multiple contexts.
> 
> There is more coming but these changes are a good starting place, and
> stand on their own.
> 
> Since the last round, changes to the rx path have been removed; netpoll
> will be changed to avoid that.
> 
> Eric W. Biederman (10):
>       8139cp: Call dev_kfree_skby_any instead of kfree_skb.
>       8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       bonding: Call dev_kfree_skby_any instead of kfree_skb.
>       bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
>       benet: Call dev_kfree_skby_any instead of kfree_skb.
>       gianfar: Carefully free skbs in functions called by netpoll.

Acked-by: Eric Dumazet <edumazet@google.com>

We'll have some follow up, but these patches seem fine.
 

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 1/2] bnx2: Don't receive packets when the napi budget == 0
  2014-03-11 21:31                       ` [PATCH net-next 1/2] bnx2: " Eric W. Biederman
@ 2014-03-12  5:07                         ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12  5:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 14:31 -0700, Eric W. Biederman wrote:
> Processing any incoming packets with a napi budget of 0
> is incorrect driver behavior.
> 
> This matters as netpoll will shortly call drivers with a budget of 0
> to avoid receive packet processing happening in hard irq context.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/broadcom/bnx2.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 2/2] 8139cp: Don't receive packets when the napi budget == 0
  2014-03-11 21:31                       ` [PATCH net-next 2/2] 8139cp: " Eric W. Biederman
@ 2014-03-12  5:08                         ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12  5:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-11 at 14:31 -0700, Eric W. Biederman wrote:
> Processing any incoming packets with a napi budget of 0
> is incorrect driver behavior.
> 
> This matters as netpoll will shortly call drivers with a budget of 0
> to avoid receive packet processing happening in hard irq context.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/realtek/8139cp.c |    5 +----
>  1 files changed, 1 insertions(+), 4 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code
  2014-03-11 20:48                     ` Eric W. Biederman
@ 2014-03-12 18:31                       ` Cong Wang
  2014-03-13 19:23                       ` David Miller
  1 sibling, 0 replies; 288+ messages in thread
From: Cong Wang @ 2014-03-12 18:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Stephen Hemminger, Eric Dumazet, David Miller,
	Linux Kernel Network Developers, Matt Mackall, Satyam Sharma

On Tue, Mar 11, 2014 at 1:48 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> To play devil's advocate to my own patch.  Does anyone know where kgdb
> over network (kgdboe) code lives today?
>
> What little I could find in a quick google search strongly suggests that
> kgdboe was abandoned in 2010 or so.
>
> I am trying to figure out if there are any active out of tree projects
> that need bidirectional netpoll.
>

We simply don't care about out-of-tree modules, they just need to be
merged into upstream for us to care.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 2/2] netpoll: Don't poll for received packets
  2014-03-11  8:45               ` [RFC PATCH 2/2] netpoll: Don't poll for received packets Eric W. Biederman
  2014-03-11 12:44                 ` Eric Dumazet
@ 2014-03-12 18:39                 ` Cong Wang
  2014-03-13 20:48                   ` Eric W. Biederman
  1 sibling, 1 reply; 288+ messages in thread
From: Cong Wang @ 2014-03-12 18:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Matt Mackall, Satyam Sharma

On Tue, Mar 11, 2014 at 1:45 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> -       work = napi->poll(napi, budget);
> -       trace_napi_poll(napi);
> +       /* Use a budget of 0 to request the drivers not process
> +        * their receive queue.  Warn when they do anyway.
> +        */
> +       work = napi->poll(napi, 0);
> +       WARN_ON_ONCE(work != 0);
>

Adding more printks to the netpoll call path would only bring more trouble.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts
  2014-03-12  2:54                       ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric Dumazet
@ 2014-03-12 20:22                         ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-12 20:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 11 Mar 2014 19:54:05 -0700

> On Tue, 2014-03-11 at 14:13 -0700, Eric W. Biederman wrote:
>> This patchset should be an uncontroversial set of changes to change
>> dev_kfree_skb to dev_kfree_skb_any for code paths that are called in
>> hard irq contexts in addition to other contexts.  netpoll is the reason
>> this code gets called in multiple contexts.
>> 
>> There is more coming but these changes are a good starting place, and
>> stand on their own.
>> 
>> Since the last round, changes to the rx path have been removed; netpoll
>> will be changed to avoid that.
>> 
>> Eric W. Biederman (10):
>>       8139cp: Call dev_kfree_skby_any instead of kfree_skb.
>>       8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       bonding: Call dev_kfree_skby_any instead of kfree_skb.
>>       bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
>>       benet: Call dev_kfree_skby_any instead of kfree_skb.
>>       gianfar: Carefully free skbs in functions called by netpoll.
> 
> Acked-by: Eric Dumazet <edumazet@google.com>
> 
> We'll have some follow up, but these patches seem fine.

Series applied, thanks everyone.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-12  2:06                         ` Eric Dumazet
@ 2014-03-12 21:24                           ` Francois Romieu
  2014-03-12 22:01                             ` Eric Dumazet
  0 siblings, 1 reply; 288+ messages in thread
From: Francois Romieu @ 2014-03-12 21:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric W. Biederman, David Miller, netdev, xiyou.wangcong, mpm,
	satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> :
[...]
> Same u64_stats_fetch_begin_bh() problem for this driver to fetch TX
> stats.

Same problem for any NAPI context Tx completing driver that claims
netpoll support. For instance:
drivers/net/ethernet/emulex/benet
drivers/net/ethernet/intel/i40e
drivers/net/ethernet/intel/igb
drivers/net/ethernet/intel/ixgbe
drivers/net/ethernet/marvell/sky2.c
drivers/net/ethernet/neterion/vxge ?

Similar problem for the drivers below. They update Tx stats in start_xmit
and use u64_.*_bh:
drivers/net/ethernet/tile/tilepro.c
drivers/net/team/team.c
drivers/net/virtio_net.c

-- 
Ueimor

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-12 21:24                           ` Francois Romieu
@ 2014-03-12 22:01                             ` Eric Dumazet
  2014-03-13 21:08                               ` Eric W. Biederman
  2014-03-14  4:26                               ` [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq Eric W. Biederman
  0 siblings, 2 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-12 22:01 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Eric W. Biederman, David Miller, netdev, xiyou.wangcong, mpm,
	satyam.sharma

On Wed, 2014-03-12 at 22:24 +0100, Francois Romieu wrote:
> Eric Dumazet <eric.dumazet@gmail.com> :
> [...]
> > Same u64_stats_fetch_begin_bh() problem for this driver to fetch TX
> > stats.
> 
> Same problem for any NAPI context Tx completing driver that claims
> netpoll support. For instance:
> drivers/net/ethernet/emulex/benet
> drivers/net/ethernet/intel/i40e
> drivers/net/ethernet/intel/igb
> drivers/net/ethernet/intel/ixgbe
> drivers/net/ethernet/marvell/sky2.c
> drivers/net/ethernet/neterion/vxge ?
> 
> Similar problem for the drivers below. They update Tx stats in start_xmit
> and use u64_.*_bh:
> drivers/net/ethernet/tile/tilepro.c
> drivers/net/team/team.c
> drivers/net/virtio_net.c
> 

Yep, note that this issue is not caused by Eric's patches; we need to take
care of this by providing u64_stats_fetch_{begin|retry}_irq() regardless
of how skbs are freed.
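
For anyone following along, the reader loop under discussion can be
sketched in userspace as below.  This is only a minimal sketch, not the
kernel implementation: struct stats_sync and the stats_* helpers are
hypothetical stand-ins for struct u64_stats_sync and the u64_stats_*
helpers, there are no memory barriers, and it models only the
sequence-counter flavor of the scheme.  As I understand it, the _bh and
_irq variants differ only on 32bit uni-processor builds, where the
reader instead has to keep the writer's context from running at all.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct u64_stats_sync: a sequence counter
 * that is odd while a writer is in the middle of an update. */
struct stats_sync {
	unsigned int seq;
};

struct tx_stats {
	struct stats_sync syncp;
	uint64_t packets;
	uint64_t bytes;
};

static void stats_update_begin(struct stats_sync *s) { s->seq++; }
static void stats_update_end(struct stats_sync *s)   { s->seq++; }

static unsigned int stats_fetch_begin(const struct stats_sync *s)
{
	return s->seq;
}

/* Retry if a writer was active when we started (odd count) or if the
 * count moved while we were copying the fields. */
static int stats_fetch_retry(const struct stats_sync *s, unsigned int start)
{
	return (start & 1) || s->seq != start;
}

/* Writer side: what a tx completion path would do per packet. */
static void tx_stats_add(struct tx_stats *st, uint64_t len)
{
	stats_update_begin(&st->syncp);
	st->packets++;
	st->bytes += len;
	stats_update_end(&st->syncp);
}

/* Reader side: the do/while shape used by the get_stats64 methods. */
static void tx_stats_snapshot(const struct tx_stats *st,
			      uint64_t *packets, uint64_t *bytes)
{
	unsigned int start;

	assert(packets && bytes);
	do {
		start = stats_fetch_begin(&st->syncp);
		*packets = st->packets;
		*bytes = st->bytes;
	} while (stats_fetch_retry(&st->syncp, start));
}
```

The point of the retry is that a reader never reports a packets/bytes
pair taken from the middle of a writer's update; which contexts the
writer may run in decides which variant the reader needs.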

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 0/2] Don't receive packets when the napi budget == 0
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
  2014-03-11 21:31                       ` [PATCH net-next 1/2] bnx2: " Eric W. Biederman
  2014-03-11 21:31                       ` [PATCH net-next 2/2] 8139cp: " Eric W. Biederman
@ 2014-03-13 19:19                       ` David Miller
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
  3 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-13 19:19 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 11 Mar 2014 14:30:11 -0700

> To the best of my understanding processing any received packets when the
> napi budget == 0 is broken driver behavior.  At the same time I don't
> think we have ever cared before so there are a handful of drivers that
> need fixes.
> 
> I care now as I will shortly be using this in netpoll to get the
> tx queue processing without the rx queue processing.
> 
> Drivers that need fixes are few and far between, and so far I have only
> found two of them.  More similar patches later if I find more drivers
> that need fixes.

Series applied, thanks Eric.
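
The budget == 0 rule these two patches enforce can be sketched in
userspace as follows.  This is a minimal sketch under stated
assumptions: rx_poll_old and rx_poll_new are hypothetical names, and
"pending" stands in for the number of filled rx descriptors.  The loop
shapes mirror the 8139cp change (while (1) with a trailing
rx >= budget check versus while (rx < budget)).

```c
#include <assert.h>

/* Old shape: the budget is only checked after a packet has been
 * processed, so a budget of 0 still receives one packet. */
static int rx_poll_old(int pending, int budget)
{
	int rx = 0;

	assert(pending >= 0);
	while (1) {
		if (pending == 0)
			break;
		pending--;	/* "receive" one packet */
		rx++;
		if (rx >= budget)
			break;
	}
	return rx;
}

/* New shape: the budget is checked before touching the rx ring, so a
 * budget of 0 receives nothing. */
static int rx_poll_new(int pending, int budget)
{
	int rx = 0;

	assert(pending >= 0);
	while (rx < budget) {
		if (pending == 0)
			break;
		pending--;	/* "receive" one packet */
		rx++;
	}
	return rx;
}
```

With five packets pending and a budget of 0, the old shape returns 1 and
the new shape returns 0; for any positive budget the two agree.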

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code
  2014-03-11 20:48                     ` Eric W. Biederman
  2014-03-12 18:31                       ` Cong Wang
@ 2014-03-13 19:23                       ` David Miller
  2014-03-13 20:46                         ` Eric W. Biederman
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
  1 sibling, 2 replies; 288+ messages in thread
From: David Miller @ 2014-03-13 19:23 UTC (permalink / raw)
  To: ebiederm
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 11 Mar 2014 13:48:01 -0700

> To play devil's advocate to my own patch.  Does anyone know where kgdb
> over network (kgdboe) code lives today?
> 
> What little I could find in a quick google search strongly suggests that
> kgdboe was abandoned in 2010 or so.
> 
> I am trying to figure out if there are any active out of tree projects
> that need bidirectional netpoll.

Good questions.

I, perhaps mistakenly, kept the functionality around because there
were claims that we'd use it in-tree.

That of course never materialized.

The fact that people have a lot of trouble even finding the kgdboe
sources is quite telling, indeed.

Let's kill it, we can pull it back in (perhaps with a better design)
if something is proposed in-tree that will need it.

But I'm skeptical we ever will need it, and even if such a
reinstatement is proposed f.e. for the kgdboe use case it has holes.

Consider the case where kgdboe takes a breakpoint in a hardware
interrupt handler.  What happens?  We cannot allow it to perform a
full back-and-forth conversation with the remote gdb from such a
context.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll
  2014-03-11 21:33                     ` [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll Eric W. Biederman
@ 2014-03-13 19:26                       ` David Miller
  2014-03-13 19:42                         ` Florian Fainelli
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-13 19:26 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 11 Mar 2014 14:33:35 -0700

> bcm_enet_netpoll does not exist, which causes
> bcm63xx_enet to fail to build when CONFIG_NET_POLL_CONTROLLER
> is defined.
> 
> Remove the bogus .ndo_poll_controller = bcm_enet_netpoll
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Applied.

This driver doesn't get a lot of build testing, and that's due to the
platform BCM63XX kconfig option it depends upon.  Probably it should
be exposed more widely to perhaps a more broad dependency such as
CONFIG_OF.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll
  2014-03-13 19:26                       ` David Miller
@ 2014-03-13 19:42                         ` Florian Fainelli
  2014-03-13 19:58                           ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Florian Fainelli @ 2014-03-13 19:42 UTC (permalink / raw)
  To: David Miller
  Cc: ebiederm, Eric Dumazet, netdev, Cong Wang, mpm, satyam.sharma

2014-03-13 12:26 GMT-07:00 David Miller <davem@davemloft.net>:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 11 Mar 2014 14:33:35 -0700
>
>> bcm_enet_netpoll does not exist, which causes
>> bcm63xx_enet to fail to build when CONFIG_NET_POLL_CONTROLLER
>> is defined.
>>
>> Remove the bogus .ndo_poll_controller = bcm_enet_netpoll
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> Applied.
>
> This driver doesn't get a lot of build testing, and that's due to the
> platform BCM63XX kconfig option it depends upon.  Probably it should
> be exposed more widely to perhaps a more broad dependency such as
> CONFIG_OF.

There is no OF-aware platform using that driver. Part of the reason
why it does not get much build coverage is the large header
dependencies provided by arch/mips/include/asm/mach-bcm63xx. I will
work on improving that.
-- 
Florian

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll
  2014-03-13 19:42                         ` Florian Fainelli
@ 2014-03-13 19:58                           ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-13 19:58 UTC (permalink / raw)
  To: f.fainelli
  Cc: ebiederm, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 13 Mar 2014 12:42:22 -0700

> There is no OF-aware platform using that driver. Part of the reason
> why it does not get much build coverage is the large header
> dependencies provided by arch/mips/include/asm/mach-bcm63xx. I will
> work on improving that.

Thanks in advance.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code
  2014-03-13 19:23                       ` David Miller
@ 2014-03-13 20:46                         ` Eric W. Biederman
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-13 20:46 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 11 Mar 2014 13:48:01 -0700
>
>> To play devil's advocate to my own patch.  Does anyone know where kgdb
>> over network (kgdboe) code lives today?
>> 
>> What little I could find in a quick google search strongly suggests that
>> kgdboe was abandoned in 2010 or so.
>> 
>> I am trying to figure out if there are any active out of tree projects
>> that need bidirectional netpoll.
>
> Good questions.
>
> I, perhaps mistakenly, kept the functionality around because there
> were claims that we'd use it in-tree.
>
> That of course never materialized.
>
> The fact that people have a lot of trouble even finding the kgdboe
> sources is quite telling, indeed.

Also telling is that we actually broke kgdboe support in 2011 when
netpoll_poll was removed.

> Let's kill it, we can pull it back in (perhaps with a better design)
> if something is proposed in-tree that will need it.

Sounds good.  Patches to follow shortly.

> But I'm skeptical we ever will need it, and even if such a
> reinstatement is proposed f.e. for the kgdboe use case it has holes.
>
> Consider the case where kgdboe takes a breakpoint in a hardware
> interrupt handler.  What happens?  We cannot allow it to perform a
> full back-and-forth conversation with the remote gdb from such a
> context.

In the interests of full disclosure, I missed a subtle detail: what the
code does today is that while netpoll_poll_dev is running it intercepts
and drops all incoming packets.

So I think the current packet receive design might be salvaged, if
someone gets interested again.  The bitrot is pretty significant
currently.

Still, I should point out that the code that drops received skbs is
technically wrong today.  It uses kfree_skb instead of the
dev_kfree_skb_any that is needed to work in any context netpoll_rx
might be called in.
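
To illustrate why the "_any" variant matters, here is a minimal
userspace sketch of the idea: free immediately when the context allows
it, otherwise queue the buffer and free it later from a safe context.
Everything here is a hypothetical stand-in (struct buf for an skb,
in_hard_irq for the kernel's context test, the deferred list for the
per-CPU completion queue); it is not the kernel's code.

```c
#include <assert.h>
#include <stdlib.h>

struct buf {
	struct buf *next;
};

static int in_hard_irq;		/* hypothetical: set while "in hard irq" */
static struct buf *deferred;	/* buffers waiting for a safe context */
static int freed_now, freed_later;

/* Safe in any context: defer the real free when called from hard irq. */
static void buf_free_any(struct buf *b)
{
	assert(b != NULL);
	if (in_hard_irq) {
		b->next = deferred;
		deferred = b;
	} else {
		free(b);
		freed_now++;
	}
}

/* Runs later, outside hard irq, and performs the deferred frees. */
static void run_deferred_frees(void)
{
	while (deferred) {
		struct buf *b = deferred;

		deferred = b->next;
		free(b);
		freed_later++;
	}
}
```

A plain free that assumes a particular context is correct only when
every caller really runs in that context, which is exactly what netpoll
breaks.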

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [RFC PATCH 2/2] netpoll: Don't poll for received packets
  2014-03-12 18:39                 ` Cong Wang
@ 2014-03-13 20:48                   ` Eric W. Biederman
  0 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-13 20:48 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Matt Mackall, Satyam Sharma

Cong Wang <xiyou.wangcong@gmail.com> writes:

> On Tue, Mar 11, 2014 at 1:45 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> -       work = napi->poll(napi, budget);
>> -       trace_napi_poll(napi);
>> +       /* Use a budget of 0 to request the drivers not process
>> +        * their receive queue.  Warn when they do anyway.
>> +        */
>> +       work = napi->poll(napi, 0);
>> +       WARN_ON_ONCE(work != 0);
>>
>
> Adding more printks to the netpoll call path would only bring more
> trouble.

That is why I used WARN_ON_ONCE.  We are alerted to problems, but
because it prints only once, service won't be denied if nothing except
that warning cares.
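
As a rough userspace sketch of that behavior (hypothetical names
throughout, and a plain function with a per-call-site flag standing in
for the kernel macro): the poll routine is called with a budget of 0,
and a driver that reports rx work anyway triggers exactly one warning
no matter how often it misbehaves.

```c
#include <assert.h>
#include <stdio.h>

typedef int (*poll_fn)(int budget);

static int warnings_printed;	/* counts messages so the effect is visible */

/* Print the warning at most once per flag; always return the condition. */
static int warn_on_once(int cond, int *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = 1;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* netpoll side: ask for tx completion only by passing a budget of 0. */
static int poll_tx_only(poll_fn poll)
{
	static int warned;	/* one flag for this call site */
	int work;

	assert(poll != NULL);
	work = poll(0);
	warn_on_once(work != 0, &warned, "poll did rx work with budget 0");
	return work;
}

/* A driver that ignores the budget and always receives one packet. */
static int misbehaving_poll(int budget)
{
	(void)budget;
	return 1;
}

/* A driver that receives nothing when the budget is 0. */
static int wellbehaved_poll(int budget)
{
	return budget > 0 ? 1 : 0;
}
```

Calling poll_tx_only(misbehaving_poll) three times prints one warning,
not three.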

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.
  2014-03-12 22:01                             ` Eric Dumazet
@ 2014-03-13 21:08                               ` Eric W. Biederman
  2014-03-14  4:26                               ` [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq Eric W. Biederman
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-13 21:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Francois Romieu, David Miller, netdev, xiyou.wangcong, mpm,
	satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Wed, 2014-03-12 at 22:24 +0100, Francois Romieu wrote:
>> Eric Dumazet <eric.dumazet@gmail.com> :
>> [...]
>> > Same u64_stats_fetch_begin_bh() problem for this driver to fetch TX
>> > stats.
>> 
>> Same problem for any NAPI context Tx completing driver that claims
>> netpoll support. For instance:
>> drivers/net/ethernet/emulex/benet
>> drivers/net/ethernet/intel/i40e
>> drivers/net/ethernet/intel/igb
>> drivers/net/ethernet/intel/ixgbe
>> drivers/net/ethernet/marvell/sky2.c
>> drivers/net/ethernet/neterion/vxge ?
>> 
>> Similar problem for the drivers below. They update Tx stats in start_xmit
>> and use u64_.*_bh:
>> drivers/net/ethernet/tile/tilepro.c
>> drivers/net/team/team.c
>> drivers/net/virtio_net.c
>> 
>
> Yep, note that this issue is not caused by Eric's patches; we need to take
> care of this by providing u64_stats_fetch_{begin|retry}_irq() regardless
> of how skbs are freed.

By my read of the code this is actually only a problem on 32bit
uniprocessor, and at worst it scrambles the reported numbers if we
happen to have a printk in irq context while we are fetching the stats.

Given that the rest of the problems I am fixing can corrupt things and
can happen on any platform, I am going to ignore this problem for now.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq
  2014-03-12 22:01                             ` Eric Dumazet
  2014-03-13 21:08                               ` Eric W. Biederman
@ 2014-03-14  4:26                               ` Eric W. Biederman
  2014-03-15  2:41                                 ` David Miller
  1 sibling, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-14  4:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Francois Romieu, David Miller, netdev, xiyou.wangcong, mpm,
	satyam.sharma


Replace the bh safe variant with the hard irq safe variant.

We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.  

Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

It turned out this was easy, so here is the change to get it off of
my plate.

 block/blk-cgroup.h                                |    8 ++++----
 drivers/net/dummy.c                               |    4 ++--
 drivers/net/ethernet/broadcom/b44.c               |    8 ++++----
 drivers/net/ethernet/emulex/benet/be_ethtool.c    |   12 ++++++------
 drivers/net/ethernet/emulex/benet/be_main.c       |   16 ++++++++--------
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c    |    8 ++++----
 drivers/net/ethernet/intel/i40e/i40e_main.c       |   16 ++++++++--------
 drivers/net/ethernet/intel/igb/igb_ethtool.c      |   12 ++++++------
 drivers/net/ethernet/intel/igb/igb_main.c         |    8 ++++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c  |    8 ++++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |    8 ++++----
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |    8 ++++----
 drivers/net/ethernet/marvell/mvneta.c             |    4 ++--
 drivers/net/ethernet/marvell/sky2.c               |    8 ++++----
 drivers/net/ethernet/neterion/vxge/vxge-main.c    |    8 ++++----
 drivers/net/ethernet/nvidia/forcedeth.c           |    8 ++++----
 drivers/net/ethernet/realtek/8139too.c            |    8 ++++----
 drivers/net/ethernet/realtek/r8169.c              |    8 ++++----
 drivers/net/ethernet/tile/tilepro.c               |    4 ++--
 drivers/net/ethernet/via/via-rhine.c              |    8 ++++----
 drivers/net/ifb.c                                 |    8 ++++----
 drivers/net/loopback.c                            |    4 ++--
 drivers/net/macvlan.c                             |    4 ++--
 drivers/net/nlmon.c                               |    4 ++--
 drivers/net/team/team.c                           |    4 ++--
 drivers/net/team/team_mode_loadbalance.c          |    4 ++--
 drivers/net/veth.c                                |    4 ++--
 drivers/net/virtio_net.c                          |    8 ++++----
 drivers/net/xen-netfront.c                        |    4 ++--
 include/linux/u64_stats_sync.h                    |   16 ++++++++--------
 net/8021q/vlan_dev.c                              |    4 ++--
 net/bridge/br_device.c                            |    4 ++--
 net/ipv4/af_inet.c                                |    4 ++--
 net/ipv4/ip_tunnel_core.c                         |    4 ++--
 net/ipv6/ip6_tunnel.c                             |    4 ++--
 net/netfilter/ipvs/ip_vs_ctl.c                    |    4 ++--
 net/openvswitch/datapath.c                        |    4 ++--
 net/openvswitch/vport.c                           |    4 ++--
 38 files changed, 132 insertions(+), 132 deletions(-)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 86154eab9523..604f6d99ab92 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -435,9 +435,9 @@ static inline uint64_t blkg_stat_read(struct blkg_stat *stat)
 	uint64_t v;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&stat->syncp);
+		start = u64_stats_fetch_begin_irq(&stat->syncp);
 		v = stat->cnt;
-	} while (u64_stats_fetch_retry_bh(&stat->syncp, start));
+	} while (u64_stats_fetch_retry_irq(&stat->syncp, start));
 
 	return v;
 }
@@ -508,9 +508,9 @@ static inline struct blkg_rwstat blkg_rwstat_read(struct blkg_rwstat *rwstat)
 	struct blkg_rwstat tmp;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&rwstat->syncp);
+		start = u64_stats_fetch_begin_irq(&rwstat->syncp);
 		tmp = *rwstat;
-	} while (u64_stats_fetch_retry_bh(&rwstat->syncp, start));
+	} while (u64_stats_fetch_retry_irq(&rwstat->syncp, start));
 
 	return tmp;
 }
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 1656317c96f8..0932ffbf381b 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -63,10 +63,10 @@ static struct rtnl_link_stats64 *dummy_get_stats64(struct net_device *dev,
 
 		dstats = per_cpu_ptr(dev->dstats, i);
 		do {
-			start = u64_stats_fetch_begin_bh(&dstats->syncp);
+			start = u64_stats_fetch_begin_irq(&dstats->syncp);
 			tbytes = dstats->tx_bytes;
 			tpackets = dstats->tx_packets;
-		} while (u64_stats_fetch_retry_bh(&dstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&dstats->syncp, start));
 		stats->tx_bytes += tbytes;
 		stats->tx_packets += tpackets;
 	}
diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index 8a7bf7dad898..05ba62589017 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -1685,7 +1685,7 @@ static struct rtnl_link_stats64 *b44_get_stats64(struct net_device *dev,
 	unsigned int start;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&hwstat->syncp);
+		start = u64_stats_fetch_begin_irq(&hwstat->syncp);
 
 		/* Convert HW stats into rtnl_link_stats64 stats. */
 		nstat->rx_packets = hwstat->rx_pkts;
@@ -1719,7 +1719,7 @@ static struct rtnl_link_stats64 *b44_get_stats64(struct net_device *dev,
 		/* Carrier lost counter seems to be broken for some devices */
 		nstat->tx_carrier_errors = hwstat->tx_carrier_lost;
 #endif
-	} while (u64_stats_fetch_retry_bh(&hwstat->syncp, start));
+	} while (u64_stats_fetch_retry_irq(&hwstat->syncp, start));
 
 	return nstat;
 }
@@ -2073,12 +2073,12 @@ static void b44_get_ethtool_stats(struct net_device *dev,
 	do {
 		data_src = &hwstat->tx_good_octets;
 		data_dst = data;
-		start = u64_stats_fetch_begin_bh(&hwstat->syncp);
+		start = u64_stats_fetch_begin_irq(&hwstat->syncp);
 
 		for (i = 0; i < ARRAY_SIZE(b44_gstrings); i++)
 			*data_dst++ = *data_src++;
 
-	} while (u64_stats_fetch_retry_bh(&hwstat->syncp, start));
+	} while (u64_stats_fetch_retry_irq(&hwstat->syncp, start));
 }
 
 static void b44_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index cf09d8faca84..252fb59f65b3 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -357,10 +357,10 @@ be_get_ethtool_stats(struct net_device *netdev,
 		struct be_rx_stats *stats = rx_stats(rxo);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->sync);
+			start = u64_stats_fetch_begin_irq(&stats->sync);
 			data[base] = stats->rx_bytes;
 			data[base + 1] = stats->rx_pkts;
-		} while (u64_stats_fetch_retry_bh(&stats->sync, start));
+		} while (u64_stats_fetch_retry_irq(&stats->sync, start));
 
 		for (i = 2; i < ETHTOOL_RXSTATS_NUM; i++) {
 			p = (u8 *)stats + et_rx_stats[i].offset;
@@ -373,19 +373,19 @@ be_get_ethtool_stats(struct net_device *netdev,
 		struct be_tx_stats *stats = tx_stats(txo);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->sync_compl);
+			start = u64_stats_fetch_begin_irq(&stats->sync_compl);
 			data[base] = stats->tx_compl;
-		} while (u64_stats_fetch_retry_bh(&stats->sync_compl, start));
+		} while (u64_stats_fetch_retry_irq(&stats->sync_compl, start));
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->sync);
+			start = u64_stats_fetch_begin_irq(&stats->sync);
 			for (i = 1; i < ETHTOOL_TXSTATS_NUM; i++) {
 				p = (u8 *)stats + et_tx_stats[i].offset;
 				data[base + i] =
 					(et_tx_stats[i].size == sizeof(u64)) ?
 						*(u64 *)p : *(u32 *)p;
 			}
-		} while (u64_stats_fetch_retry_bh(&stats->sync, start));
+		} while (u64_stats_fetch_retry_irq(&stats->sync, start));
 		base += ETHTOOL_TXSTATS_NUM;
 	}
 }
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 2eee0b2577f8..768912dc6e2c 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -591,10 +591,10 @@ static struct rtnl_link_stats64 *be_get_stats64(struct net_device *netdev,
 	for_all_rx_queues(adapter, rxo, i) {
 		const struct be_rx_stats *rx_stats = rx_stats(rxo);
 		do {
-			start = u64_stats_fetch_begin_bh(&rx_stats->sync);
+			start = u64_stats_fetch_begin_irq(&rx_stats->sync);
 			pkts = rx_stats(rxo)->rx_pkts;
 			bytes = rx_stats(rxo)->rx_bytes;
-		} while (u64_stats_fetch_retry_bh(&rx_stats->sync, start));
+		} while (u64_stats_fetch_retry_irq(&rx_stats->sync, start));
 		stats->rx_packets += pkts;
 		stats->rx_bytes += bytes;
 		stats->multicast += rx_stats(rxo)->rx_mcast_pkts;
@@ -605,10 +605,10 @@ static struct rtnl_link_stats64 *be_get_stats64(struct net_device *netdev,
 	for_all_tx_queues(adapter, txo, i) {
 		const struct be_tx_stats *tx_stats = tx_stats(txo);
 		do {
-			start = u64_stats_fetch_begin_bh(&tx_stats->sync);
+			start = u64_stats_fetch_begin_irq(&tx_stats->sync);
 			pkts = tx_stats(txo)->tx_pkts;
 			bytes = tx_stats(txo)->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&tx_stats->sync, start));
+		} while (u64_stats_fetch_retry_irq(&tx_stats->sync, start));
 		stats->tx_packets += pkts;
 		stats->tx_bytes += bytes;
 	}
@@ -1386,15 +1386,15 @@ static void be_eqd_update(struct be_adapter *adapter)
 
 		rxo = &adapter->rx_obj[eqo->idx];
 		do {
-			start = u64_stats_fetch_begin_bh(&rxo->stats.sync);
+			start = u64_stats_fetch_begin_irq(&rxo->stats.sync);
 			rx_pkts = rxo->stats.rx_pkts;
-		} while (u64_stats_fetch_retry_bh(&rxo->stats.sync, start));
+		} while (u64_stats_fetch_retry_irq(&rxo->stats.sync, start));
 
 		txo = &adapter->tx_obj[eqo->idx];
 		do {
-			start = u64_stats_fetch_begin_bh(&txo->stats.sync);
+			start = u64_stats_fetch_begin_irq(&txo->stats.sync);
 			tx_pkts = txo->stats.tx_reqs;
-		} while (u64_stats_fetch_retry_bh(&txo->stats.sync, start));
+		} while (u64_stats_fetch_retry_irq(&txo->stats.sync, start));
 
 
 		/* Skip, if wrapped around or first calculation */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index b1d7d8c5cb9b..6e61691c3abe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -649,18 +649,18 @@ static void i40e_get_ethtool_stats(struct net_device *netdev,
 
 		/* process Tx ring statistics */
 		do {
-			start = u64_stats_fetch_begin_bh(&tx_ring->syncp);
+			start = u64_stats_fetch_begin_irq(&tx_ring->syncp);
 			data[i] = tx_ring->stats.packets;
 			data[i + 1] = tx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&tx_ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&tx_ring->syncp, start));
 
 		/* Rx ring is the 2nd half of the queue pair */
 		rx_ring = &tx_ring[1];
 		do {
-			start = u64_stats_fetch_begin_bh(&rx_ring->syncp);
+			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
 			data[i + 2] = rx_ring->stats.packets;
 			data[i + 3] = rx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&rx_ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
 	}
 	rcu_read_unlock();
 	if (vsi == pf->vsi[pf->lan_vsi]) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 53f3ed2df796..9cf8d5df1362 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -376,20 +376,20 @@ static struct rtnl_link_stats64 *i40e_get_netdev_stats_struct(
 			continue;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&tx_ring->syncp);
+			start = u64_stats_fetch_begin_irq(&tx_ring->syncp);
 			packets = tx_ring->stats.packets;
 			bytes   = tx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&tx_ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&tx_ring->syncp, start));
 
 		stats->tx_packets += packets;
 		stats->tx_bytes   += bytes;
 		rx_ring = &tx_ring[1];
 
 		do {
-			start = u64_stats_fetch_begin_bh(&rx_ring->syncp);
+			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
 			packets = rx_ring->stats.packets;
 			bytes   = rx_ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&rx_ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
 
 		stats->rx_packets += packets;
 		stats->rx_bytes   += bytes;
@@ -770,10 +770,10 @@ void i40e_update_stats(struct i40e_vsi *vsi)
 		p = ACCESS_ONCE(vsi->tx_rings[q]);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&p->syncp);
+			start = u64_stats_fetch_begin_irq(&p->syncp);
 			packets = p->stats.packets;
 			bytes = p->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&p->syncp, start));
 		tx_b += bytes;
 		tx_p += packets;
 		tx_restart += p->tx_stats.restart_queue;
@@ -782,10 +782,10 @@ void i40e_update_stats(struct i40e_vsi *vsi)
 		/* Rx queue is part of the same block as Tx queue */
 		p = &p[1];
 		do {
-			start = u64_stats_fetch_begin_bh(&p->syncp);
+			start = u64_stats_fetch_begin_irq(&p->syncp);
 			packets = p->stats.packets;
 			bytes = p->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&p->syncp, start));
 		rx_b += bytes;
 		rx_p += packets;
 		rx_buf += p->rx_stats.alloc_buff_failed;
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index c7f574165298..ffcc423a7353 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2273,15 +2273,15 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 
 		ring = adapter->tx_ring[j];
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			start = u64_stats_fetch_begin_irq(&ring->tx_syncp);
 			data[i]   = ring->tx_stats.packets;
 			data[i+1] = ring->tx_stats.bytes;
 			data[i+2] = ring->tx_stats.restart_queue;
-		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->tx_syncp, start));
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->tx_syncp2);
+			start = u64_stats_fetch_begin_irq(&ring->tx_syncp2);
 			restart2  = ring->tx_stats.restart_queue2;
-		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp2, start));
+		} while (u64_stats_fetch_retry_irq(&ring->tx_syncp2, start));
 		data[i+2] += restart2;
 
 		i += IGB_TX_QUEUE_STATS_LEN;
@@ -2289,13 +2289,13 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 	for (j = 0; j < adapter->num_rx_queues; j++) {
 		ring = adapter->rx_ring[j];
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			start = u64_stats_fetch_begin_irq(&ring->rx_syncp);
 			data[i]   = ring->rx_stats.packets;
 			data[i+1] = ring->rx_stats.bytes;
 			data[i+2] = ring->rx_stats.drops;
 			data[i+3] = ring->rx_stats.csum_err;
 			data[i+4] = ring->rx_stats.alloc_failed;
-		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->rx_syncp, start));
 		i += IGB_RX_QUEUE_STATS_LEN;
 	}
 	spin_unlock(&adapter->stats64_lock);
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 42cf29d94d62..8e14c1ba0a15 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -5127,10 +5127,10 @@ void igb_update_stats(struct igb_adapter *adapter,
 		}
 
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			start = u64_stats_fetch_begin_irq(&ring->rx_syncp);
 			_bytes = ring->rx_stats.bytes;
 			_packets = ring->rx_stats.packets;
-		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->rx_syncp, start));
 		bytes += _bytes;
 		packets += _packets;
 	}
@@ -5143,10 +5143,10 @@ void igb_update_stats(struct igb_adapter *adapter,
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *ring = adapter->tx_ring[i];
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			start = u64_stats_fetch_begin_irq(&ring->tx_syncp);
 			_bytes = ring->tx_stats.bytes;
 			_packets = ring->tx_stats.packets;
-		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->tx_syncp, start));
 		bytes += _bytes;
 		packets += _packets;
 	}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index f2d35c04159c..beada78dba63 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1127,10 +1127,10 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
 		}
 
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
 			data[i]   = ring->stats.packets;
 			data[i+1] = ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 		i += 2;
 #ifdef BP_EXTENDED_STATS
 		data[i] = ring->stats.yields;
@@ -1155,10 +1155,10 @@ static void ixgbe_get_ethtool_stats(struct net_device *netdev,
 		}
 
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
 			data[i]   = ring->stats.packets;
 			data[i+1] = ring->stats.bytes;
-		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 		i += 2;
 #ifdef BP_EXTENDED_STATS
 		data[i] = ring->stats.yields;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d6451c0e8b8d..7ad4466039da 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7293,10 +7293,10 @@ static struct rtnl_link_stats64 *ixgbe_get_stats64(struct net_device *netdev,
 
 		if (ring) {
 			do {
-				start = u64_stats_fetch_begin_bh(&ring->syncp);
+				start = u64_stats_fetch_begin_irq(&ring->syncp);
 				packets = ring->stats.packets;
 				bytes   = ring->stats.bytes;
-			} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+			} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 			stats->rx_packets += packets;
 			stats->rx_bytes   += bytes;
 		}
@@ -7309,10 +7309,10 @@ static struct rtnl_link_stats64 *ixgbe_get_stats64(struct net_device *netdev,
 
 		if (ring) {
 			do {
-				start = u64_stats_fetch_begin_bh(&ring->syncp);
+				start = u64_stats_fetch_begin_irq(&ring->syncp);
 				packets = ring->stats.packets;
 				bytes   = ring->stats.bytes;
-			} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+			} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 			stats->tx_packets += packets;
 			stats->tx_bytes   += bytes;
 		}
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 57e0cd89b3dc..a4c7c280c612 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3337,10 +3337,10 @@ static struct rtnl_link_stats64 *ixgbevf_get_stats(struct net_device *netdev,
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		ring = adapter->rx_ring[i];
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
 			bytes = ring->stats.bytes;
 			packets = ring->stats.packets;
-		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 		stats->rx_bytes += bytes;
 		stats->rx_packets += packets;
 	}
@@ -3348,10 +3348,10 @@ static struct rtnl_link_stats64 *ixgbevf_get_stats(struct net_device *netdev,
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		ring = adapter->tx_ring[i];
 		do {
-			start = u64_stats_fetch_begin_bh(&ring->syncp);
+			start = u64_stats_fetch_begin_irq(&ring->syncp);
 			bytes = ring->stats.bytes;
 			packets = ring->stats.packets;
-		} while (u64_stats_fetch_retry_bh(&ring->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&ring->syncp, start));
 		stats->tx_bytes += bytes;
 		stats->tx_packets += packets;
 	}
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 12c6a66e54d1..f3afcbdbb725 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -508,12 +508,12 @@ struct rtnl_link_stats64 *mvneta_get_stats64(struct net_device *dev,
 
 		cpu_stats = per_cpu_ptr(pp->stats, cpu);
 		do {
-			start = u64_stats_fetch_begin_bh(&cpu_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&cpu_stats->syncp);
 			rx_packets = cpu_stats->rx_packets;
 			rx_bytes   = cpu_stats->rx_bytes;
 			tx_packets = cpu_stats->tx_packets;
 			tx_bytes   = cpu_stats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&cpu_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, start));
 
 		stats->rx_packets += rx_packets;
 		stats->rx_bytes   += rx_bytes;
diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index 55a37ae11440..e8daafde5f23 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -3906,19 +3906,19 @@ static struct rtnl_link_stats64 *sky2_get_stats(struct net_device *dev,
 	u64 _bytes, _packets;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&sky2->rx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&sky2->rx_stats.syncp);
 		_bytes = sky2->rx_stats.bytes;
 		_packets = sky2->rx_stats.packets;
-	} while (u64_stats_fetch_retry_bh(&sky2->rx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&sky2->rx_stats.syncp, start));
 
 	stats->rx_packets = _packets;
 	stats->rx_bytes = _bytes;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&sky2->tx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&sky2->tx_stats.syncp);
 		_bytes = sky2->tx_stats.bytes;
 		_packets = sky2->tx_stats.packets;
-	} while (u64_stats_fetch_retry_bh(&sky2->tx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&sky2->tx_stats.syncp, start));
 
 	stats->tx_packets = _packets;
 	stats->tx_bytes = _bytes;
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index c83cedd26dec..c5bb1ace4a74 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -3134,12 +3134,12 @@ vxge_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *net_stats)
 		u64 packets, bytes, multicast;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&rxstats->syncp);
+			start = u64_stats_fetch_begin_irq(&rxstats->syncp);
 
 			packets   = rxstats->rx_frms;
 			multicast = rxstats->rx_mcast;
 			bytes     = rxstats->rx_bytes;
-		} while (u64_stats_fetch_retry_bh(&rxstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&rxstats->syncp, start));
 
 		net_stats->rx_packets += packets;
 		net_stats->rx_bytes += bytes;
@@ -3149,11 +3149,11 @@ vxge_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *net_stats)
 		net_stats->rx_dropped += rxstats->rx_dropped;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&txstats->syncp);
+			start = u64_stats_fetch_begin_irq(&txstats->syncp);
 
 			packets = txstats->tx_frms;
 			bytes   = txstats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&txstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&txstats->syncp, start));
 
 		net_stats->tx_packets += packets;
 		net_stats->tx_bytes += bytes;
diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index bad3c057ee8a..811be0bccd14 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -1753,19 +1753,19 @@ nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
 
 	/* software stats */
 	do {
-		syncp_start = u64_stats_fetch_begin_bh(&np->swstats_rx_syncp);
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_rx_syncp);
 		storage->rx_packets       = np->stat_rx_packets;
 		storage->rx_bytes         = np->stat_rx_bytes;
 		storage->rx_dropped       = np->stat_rx_dropped;
 		storage->rx_missed_errors = np->stat_rx_missed_errors;
-	} while (u64_stats_fetch_retry_bh(&np->swstats_rx_syncp, syncp_start));
+	} while (u64_stats_fetch_retry_irq(&np->swstats_rx_syncp, syncp_start));
 
 	do {
-		syncp_start = u64_stats_fetch_begin_bh(&np->swstats_tx_syncp);
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_tx_syncp);
 		storage->tx_packets = np->stat_tx_packets;
 		storage->tx_bytes   = np->stat_tx_bytes;
 		storage->tx_dropped = np->stat_tx_dropped;
-	} while (u64_stats_fetch_retry_bh(&np->swstats_tx_syncp, syncp_start));
+	} while (u64_stats_fetch_retry_irq(&np->swstats_tx_syncp, syncp_start));
 
 	/* If the nic supports hw counters then retrieve latest values */
 	if (np->driver_data & DEV_HAS_STATISTICS_V123) {
diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 8cb2f357026e..2e5df148af4c 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -2522,16 +2522,16 @@ rtl8139_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	netdev_stats_to_stats64(stats, &dev->stats);
 
 	do {
-		start = u64_stats_fetch_begin_bh(&tp->rx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&tp->rx_stats.syncp);
 		stats->rx_packets = tp->rx_stats.packets;
 		stats->rx_bytes = tp->rx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&tp->rx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&tp->rx_stats.syncp, start));
 
 	do {
-		start = u64_stats_fetch_begin_bh(&tp->tx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&tp->tx_stats.syncp);
 		stats->tx_packets = tp->tx_stats.packets;
 		stats->tx_bytes = tp->tx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&tp->tx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&tp->tx_stats.syncp, start));
 
 	return stats;
 }
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index cf947337e0d6..51b20a731a70 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6590,17 +6590,17 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 		rtl8169_rx_missed(dev, ioaddr);
 
 	do {
-		start = u64_stats_fetch_begin_bh(&tp->rx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&tp->rx_stats.syncp);
 		stats->rx_packets = tp->rx_stats.packets;
 		stats->rx_bytes	= tp->rx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&tp->rx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&tp->rx_stats.syncp, start));
 
 
 	do {
-		start = u64_stats_fetch_begin_bh(&tp->tx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&tp->tx_stats.syncp);
 		stats->tx_packets = tp->tx_stats.packets;
 		stats->tx_bytes	= tp->tx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&tp->tx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&tp->tx_stats.syncp, start));
 
 	stats->rx_dropped	= dev->stats.rx_dropped;
 	stats->tx_dropped	= dev->stats.tx_dropped;
diff --git a/drivers/net/ethernet/tile/tilepro.c b/drivers/net/ethernet/tile/tilepro.c
index edb2e12a0fe2..7e33973487ee 100644
--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -2068,14 +2068,14 @@ static struct rtnl_link_stats64 *tile_net_get_stats64(struct net_device *dev,
 		cpu_stats = &priv->cpu[i]->stats;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&cpu_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&cpu_stats->syncp);
 			trx_packets = cpu_stats->rx_packets;
 			ttx_packets = cpu_stats->tx_packets;
 			trx_bytes   = cpu_stats->rx_bytes;
 			ttx_bytes   = cpu_stats->tx_bytes;
 			trx_errors  = cpu_stats->rx_errors;
 			trx_dropped = cpu_stats->rx_dropped;
-		} while (u64_stats_fetch_retry_bh(&cpu_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, start));
 
 		rx_packets += trx_packets;
 		tx_packets += ttx_packets;
diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index ef312bc6b865..5bc1a2d02dc1 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -2070,16 +2070,16 @@ rhine_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	netdev_stats_to_stats64(stats, &dev->stats);
 
 	do {
-		start = u64_stats_fetch_begin_bh(&rp->rx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&rp->rx_stats.syncp);
 		stats->rx_packets = rp->rx_stats.packets;
 		stats->rx_bytes = rp->rx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&rp->rx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&rp->rx_stats.syncp, start));
 
 	do {
-		start = u64_stats_fetch_begin_bh(&rp->tx_stats.syncp);
+		start = u64_stats_fetch_begin_irq(&rp->tx_stats.syncp);
 		stats->tx_packets = rp->tx_stats.packets;
 		stats->tx_bytes = rp->tx_stats.bytes;
-	} while (u64_stats_fetch_retry_bh(&rp->tx_stats.syncp, start));
+	} while (u64_stats_fetch_retry_irq(&rp->tx_stats.syncp, start));
 
 	return stats;
 }
diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index c14d39bf32d0..1da36764b1a4 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -136,18 +136,18 @@ static struct rtnl_link_stats64 *ifb_stats64(struct net_device *dev,
 	unsigned int start;
 
 	do {
-		start = u64_stats_fetch_begin_bh(&dp->rsync);
+		start = u64_stats_fetch_begin_irq(&dp->rsync);
 		stats->rx_packets = dp->rx_packets;
 		stats->rx_bytes = dp->rx_bytes;
-	} while (u64_stats_fetch_retry_bh(&dp->rsync, start));
+	} while (u64_stats_fetch_retry_irq(&dp->rsync, start));
 
 	do {
-		start = u64_stats_fetch_begin_bh(&dp->tsync);
+		start = u64_stats_fetch_begin_irq(&dp->tsync);
 
 		stats->tx_packets = dp->tx_packets;
 		stats->tx_bytes = dp->tx_bytes;
 
-	} while (u64_stats_fetch_retry_bh(&dp->tsync, start));
+	} while (u64_stats_fetch_retry_irq(&dp->tsync, start));
 
 	stats->rx_dropped = dev->stats.rx_dropped;
 	stats->tx_dropped = dev->stats.tx_dropped;
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 282effee7e1c..bb96409f8c05 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -111,10 +111,10 @@ static struct rtnl_link_stats64 *loopback_get_stats64(struct net_device *dev,
 
 		lb_stats = per_cpu_ptr(dev->lstats, i);
 		do {
-			start = u64_stats_fetch_begin_bh(&lb_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&lb_stats->syncp);
 			tbytes = lb_stats->bytes;
 			tpackets = lb_stats->packets;
-		} while (u64_stats_fetch_retry_bh(&lb_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&lb_stats->syncp, start));
 		bytes   += tbytes;
 		packets += tpackets;
 	}
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index c683ac2c8c94..753a8c23d15d 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -582,13 +582,13 @@ static struct rtnl_link_stats64 *macvlan_dev_get_stats64(struct net_device *dev,
 		for_each_possible_cpu(i) {
 			p = per_cpu_ptr(vlan->pcpu_stats, i);
 			do {
-				start = u64_stats_fetch_begin_bh(&p->syncp);
+				start = u64_stats_fetch_begin_irq(&p->syncp);
 				rx_packets	= p->rx_packets;
 				rx_bytes	= p->rx_bytes;
 				rx_multicast	= p->rx_multicast;
 				tx_packets	= p->tx_packets;
 				tx_bytes	= p->tx_bytes;
-			} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+			} while (u64_stats_fetch_retry_irq(&p->syncp, start));
 
 			stats->rx_packets	+= rx_packets;
 			stats->rx_bytes		+= rx_bytes;
diff --git a/drivers/net/nlmon.c b/drivers/net/nlmon.c
index 14ce7de6a933..6929b03ec638 100644
--- a/drivers/net/nlmon.c
+++ b/drivers/net/nlmon.c
@@ -90,10 +90,10 @@ nlmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 		nl_stats = per_cpu_ptr(dev->lstats, i);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&nl_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&nl_stats->syncp);
 			tbytes = nl_stats->bytes;
 			tpackets = nl_stats->packets;
-		} while (u64_stats_fetch_retry_bh(&nl_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&nl_stats->syncp, start));
 
 		packets += tpackets;
 		bytes += tbytes;
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index dbc06ab3793d..33008c1d1d67 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1759,13 +1759,13 @@ team_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	for_each_possible_cpu(i) {
 		p = per_cpu_ptr(team->pcpu_stats, i);
 		do {
-			start = u64_stats_fetch_begin_bh(&p->syncp);
+			start = u64_stats_fetch_begin_irq(&p->syncp);
 			rx_packets	= p->rx_packets;
 			rx_bytes	= p->rx_bytes;
 			rx_multicast	= p->rx_multicast;
 			tx_packets	= p->tx_packets;
 			tx_bytes	= p->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&p->syncp, start));
 
 		stats->rx_packets	+= rx_packets;
 		stats->rx_bytes		+= rx_bytes;
diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c
index d671fc3ac5ac..dbde3412ee5e 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -432,9 +432,9 @@ static void __lb_one_cpu_stats_add(struct lb_stats *acc_stats,
 	struct lb_stats tmp;
 
 	do {
-		start = u64_stats_fetch_begin_bh(syncp);
+		start = u64_stats_fetch_begin_irq(syncp);
 		tmp.tx_bytes = cpu_stats->tx_bytes;
-	} while (u64_stats_fetch_retry_bh(syncp, start));
+	} while (u64_stats_fetch_retry_irq(syncp, start));
 	acc_stats->tx_bytes += tmp.tx_bytes;
 }
 
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 3aca92e80e1e..e1c77d4b80e4 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -156,10 +156,10 @@ static u64 veth_stats_one(struct pcpu_vstats *result, struct net_device *dev)
 		unsigned int start;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->syncp);
+			start = u64_stats_fetch_begin_irq(&stats->syncp);
 			packets = stats->packets;
 			bytes = stats->bytes;
-		} while (u64_stats_fetch_retry_bh(&stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&stats->syncp, start));
 		result->packets += packets;
 		result->bytes += bytes;
 	}
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5632a99cbbd2..80d84c446962 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1000,16 +1000,16 @@ static struct rtnl_link_stats64 *virtnet_stats(struct net_device *dev,
 		u64 tpackets, tbytes, rpackets, rbytes;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->tx_syncp);
+			start = u64_stats_fetch_begin_irq(&stats->tx_syncp);
 			tpackets = stats->tx_packets;
 			tbytes   = stats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&stats->tx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&stats->tx_syncp, start));
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->rx_syncp);
+			start = u64_stats_fetch_begin_irq(&stats->rx_syncp);
 			rpackets = stats->rx_packets;
 			rbytes   = stats->rx_bytes;
-		} while (u64_stats_fetch_retry_bh(&stats->rx_syncp, start));
+		} while (u64_stats_fetch_retry_irq(&stats->rx_syncp, start));
 
 		tot->rx_packets += rpackets;
 		tot->tx_packets += tpackets;
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index a38f03ded5a4..49f3b3dbbed8 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1060,13 +1060,13 @@ static struct rtnl_link_stats64 *xennet_get_stats64(struct net_device *dev,
 		unsigned int start;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&stats->syncp);
+			start = u64_stats_fetch_begin_irq(&stats->syncp);
 
 			rx_packets = stats->rx_packets;
 			tx_packets = stats->tx_packets;
 			rx_bytes = stats->rx_bytes;
 			tx_bytes = stats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&stats->syncp, start));
 
 		tot->rx_packets += rx_packets;
 		tot->tx_packets += tx_packets;
diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h
index 7bfabd20204c..4b4439e75f45 100644
--- a/include/linux/u64_stats_sync.h
+++ b/include/linux/u64_stats_sync.h
@@ -27,8 +27,8 @@
  *    (On UP, there is no seqcount_t protection, a reader allowing interrupts could
  *     read partial values)
  *
- * 7) For softirq uses, readers can use u64_stats_fetch_begin_bh() and
- *    u64_stats_fetch_retry_bh() helpers
+ * 7) For irq and softirq uses, readers can use u64_stats_fetch_begin_irq() and
+ *    u64_stats_fetch_retry_irq() helpers
  *
  * Usage :
  *
@@ -114,31 +114,31 @@ static inline bool u64_stats_fetch_retry(const struct u64_stats_sync *syncp,
 }
 
 /*
- * In case softirq handlers can update u64 counters, readers can use following helpers
+ * In case irq handlers can update u64 counters, readers can use following helpers
  * - SMP 32bit arches use seqcount protection, irq safe.
- * - UP 32bit must disable BH.
+ * - UP 32bit must disable irqs.
  * - 64bit have no problem atomically reading u64 values, irq safe.
  */
-static inline unsigned int u64_stats_fetch_begin_bh(const struct u64_stats_sync *syncp)
+static inline unsigned int u64_stats_fetch_begin_irq(const struct u64_stats_sync *syncp)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
 	return read_seqcount_begin(&syncp->seq);
 #else
 #if BITS_PER_LONG==32
-	local_bh_disable();
+	local_irq_disable();
 #endif
 	return 0;
 #endif
 }
 
-static inline bool u64_stats_fetch_retry_bh(const struct u64_stats_sync *syncp,
+static inline bool u64_stats_fetch_retry_irq(const struct u64_stats_sync *syncp,
 					 unsigned int start)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
 	return read_seqcount_retry(&syncp->seq, start);
 #else
 #if BITS_PER_LONG==32
-	local_bh_enable();
+	local_irq_enable();
 #endif
 	return false;
 #endif
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index b382b8e301fb..1cd88546b8ab 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -675,13 +675,13 @@ static struct rtnl_link_stats64 *vlan_dev_get_stats64(struct net_device *dev, st
 
 			p = per_cpu_ptr(vlan_dev_priv(dev)->vlan_pcpu_stats, i);
 			do {
-				start = u64_stats_fetch_begin_bh(&p->syncp);
+				start = u64_stats_fetch_begin_irq(&p->syncp);
 				rxpackets	= p->rx_packets;
 				rxbytes		= p->rx_bytes;
 				rxmulticast	= p->rx_multicast;
 				txpackets	= p->tx_packets;
 				txbytes		= p->tx_bytes;
-			} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+			} while (u64_stats_fetch_retry_irq(&p->syncp, start));
 
 			stats->rx_packets	+= rxpackets;
 			stats->rx_bytes		+= rxbytes;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index e529ae6bd79e..0dd01a05bd59 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -136,9 +136,9 @@ static struct rtnl_link_stats64 *br_get_stats64(struct net_device *dev,
 		const struct pcpu_sw_netstats *bstats
 			= per_cpu_ptr(br->stats, cpu);
 		do {
-			start = u64_stats_fetch_begin_bh(&bstats->syncp);
+			start = u64_stats_fetch_begin_irq(&bstats->syncp);
 			memcpy(&tmp, bstats, sizeof(tmp));
-		} while (u64_stats_fetch_retry_bh(&bstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&bstats->syncp, start));
 		sum.tx_bytes   += tmp.tx_bytes;
 		sum.tx_packets += tmp.tx_packets;
 		sum.rx_bytes   += tmp.rx_bytes;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 19ab78aca547..8c54870db792 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1505,9 +1505,9 @@ u64 snmp_fold_field64(void __percpu *mib[], int offt, size_t syncp_offset)
 		bhptr = per_cpu_ptr(mib[0], cpu);
 		syncp = (struct u64_stats_sync *)(bhptr + syncp_offset);
 		do {
-			start = u64_stats_fetch_begin_bh(syncp);
+			start = u64_stats_fetch_begin_irq(syncp);
 			v = *(((u64 *) bhptr) + offt);
-		} while (u64_stats_fetch_retry_bh(syncp, start));
+		} while (u64_stats_fetch_retry_irq(syncp, start));
 
 		res += v;
 	}
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 6f847dd56dbc..b86f0a37fa7c 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -161,12 +161,12 @@ struct rtnl_link_stats64 *ip_tunnel_get_stats64(struct net_device *dev,
 		unsigned int start;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&tstats->syncp);
+			start = u64_stats_fetch_begin_irq(&tstats->syncp);
 			rx_packets = tstats->rx_packets;
 			tx_packets = tstats->tx_packets;
 			rx_bytes = tstats->rx_bytes;
 			tx_bytes = tstats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&tstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&tstats->syncp, start));
 
 		tot->rx_packets += rx_packets;
 		tot->tx_packets += tx_packets;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 8ad59f4811df..e1df691d78be 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -108,12 +108,12 @@ static struct net_device_stats *ip6_get_stats(struct net_device *dev)
 						   per_cpu_ptr(dev->tstats, i);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&tstats->syncp);
+			start = u64_stats_fetch_begin_irq(&tstats->syncp);
 			tmp.rx_packets = tstats->rx_packets;
 			tmp.rx_bytes = tstats->rx_bytes;
 			tmp.tx_packets = tstats->tx_packets;
 			tmp.tx_bytes =  tstats->tx_bytes;
-		} while (u64_stats_fetch_retry_bh(&tstats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&tstats->syncp, start));
 
 		sum.rx_packets += tmp.rx_packets;
 		sum.rx_bytes   += tmp.rx_bytes;
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 35be035ee0ce..d6d75841352a 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2177,10 +2177,10 @@ static int ip_vs_stats_percpu_show(struct seq_file *seq, void *v)
 		__u64 inbytes, outbytes;
 
 		do {
-			start = u64_stats_fetch_begin_bh(&u->syncp);
+			start = u64_stats_fetch_begin_irq(&u->syncp);
 			inbytes = u->ustats.inbytes;
 			outbytes = u->ustats.outbytes;
-		} while (u64_stats_fetch_retry_bh(&u->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&u->syncp, start));
 
 		seq_printf(seq, "%3X %8X %8X %8X %16LX %16LX\n",
 			   i, u->ustats.conns, u->ustats.inpkts,
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 36f8872cb072..c53fe0c9697c 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -606,9 +606,9 @@ static void get_dp_stats(struct datapath *dp, struct ovs_dp_stats *stats,
 		percpu_stats = per_cpu_ptr(dp->stats_percpu, i);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&percpu_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&percpu_stats->syncp);
 			local_stats = *percpu_stats;
-		} while (u64_stats_fetch_retry_bh(&percpu_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&percpu_stats->syncp, start));
 
 		stats->n_hit += local_stats.n_hit;
 		stats->n_missed += local_stats.n_missed;
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 3b4db3220456..42c0f4a0b78c 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -277,9 +277,9 @@ void ovs_vport_get_stats(struct vport *vport, struct ovs_vport_stats *stats)
 		percpu_stats = per_cpu_ptr(vport->percpu_stats, i);
 
 		do {
-			start = u64_stats_fetch_begin_bh(&percpu_stats->syncp);
+			start = u64_stats_fetch_begin_irq(&percpu_stats->syncp);
 			local_stats = *percpu_stats;
-		} while (u64_stats_fetch_retry_bh(&percpu_stats->syncp, start));
+		} while (u64_stats_fetch_retry_irq(&percpu_stats->syncp, start));
 
 		stats->rx_bytes		+= local_stats.rx_bytes;
 		stats->rx_packets	+= local_stats.rx_packets;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 0/16] Don't receive packets when the napi budget == 0
  2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
                                         ` (2 preceding siblings ...)
  2014-03-13 19:19                       ` [PATCH net-next 0/2] " David Miller
@ 2014-03-15  0:56                       ` Eric W. Biederman
  2014-03-15  0:57                         ` [PATCH net-next 01/16] bnx2x: " Eric W. Biederman
                                           ` (16 more replies)
  3 siblings, 17 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  0:56 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


After reading through all 120 drivers supporting netpoll I have found 16
more that process at least one received packet when the napi budget == 0.

Processing more packets than the budget allows has always been a bug,
but we have not cared before, so it looks like these drivers slipped
through and need fixes.

As netpoll will shortly be using a budget of 0 to get the tx queue
processing without the rx queue processing we now care.
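
For illustration only (not part of the series): a minimal userspace sketch
of the behavior the fixes aim for, assuming a simplified ring model.  The
names demo_ring, demo_clean_tx, demo_clean_rx and demo_poll are
hypothetical and do not come from any driver.  The poll routine always
reclaims tx completions but receives nothing when the budget is 0.

```c
#include <assert.h>

/* Hypothetical ring state for illustration; not a real driver structure. */
struct demo_ring {
	int rx_pending;   /* packets waiting in the rx ring */
	int tx_completed; /* tx descriptors waiting to be reclaimed */
};

/* Reclaim every completed tx descriptor; tx cleanup is not budgeted here. */
static int demo_clean_tx(struct demo_ring *ring)
{
	int cleaned = ring->tx_completed;

	ring->tx_completed = 0;
	return cleaned;
}

/* Receive at most 'budget' packets; a budget of 0 receives nothing. */
static int demo_clean_rx(struct demo_ring *ring, int budget)
{
	int received = 0;

	if (budget <= 0)
		return received;

	while (received < budget && ring->rx_pending > 0) {
		ring->rx_pending--;
		received++;
	}
	return received;
}

/* Poll routine: always clean tx, then receive within the budget. */
static int demo_poll(struct demo_ring *ring, int budget)
{
	demo_clean_tx(ring);
	return demo_clean_rx(ring, budget);
}

/* Small self-check showing the budget == 0 case. */
void demo_poll_selfcheck(void)
{
	struct demo_ring ring = { .rx_pending = 5, .tx_completed = 3 };

	assert(demo_poll(&ring, 0) == 0); /* no packets received */
	assert(ring.rx_pending == 5);     /* rx ring untouched */
	assert(ring.tx_completed == 0);   /* tx still reclaimed */
}
```

With a budget of 0 the caller still gets its tx descriptors back, which is
what netpoll needs, while the rx ring is left for the normal napi path.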

Eric W. Biederman (16):
      bnx2x: Don't receive packets when the napi budget == 0
      i40e: Don't receive packets when the napi budget == 0
      igb: Don't receive packets when the napi budget == 0
      ixgbe: Don't receive packets when the napi budget == 0
      amd8111e: Don't receive packets when the napi budget == 0
      enic: Don't receive packets when the napi budget == 0
      fs_enet: Don't receive packets when the napi budget == 0
      ibmveth: Don't receive packets when the napi budget == 0
      sky2: Don't receive packets when the napi budget == 0
      mlx4: Don't receive packets when the napi budget == 0
      s2io: Don't receive packets when the napi budget == 0
      tilegx: Don't receive packets when the napi budget == 0
      tilepro: Don't receive packets when the napi budget == 0
      tc35815: Don't receive packets when the napi budget == 0
      vxge: Don't receive packets when the napi budget == 0
      sfc: Don't receive packets when the napi budget == 0

 drivers/net/ethernet/amd/amd8111e.c                |    3 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |    2 ++
 drivers/net/ethernet/cisco/enic/enic_main.c        |   14 ++++++++------
 .../net/ethernet/freescale/fs_enet/fs_enet-main.c  |    3 +++
 drivers/net/ethernet/ibm/ibmveth.c                 |    4 ++--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        |    3 +++
 drivers/net/ethernet/intel/igb/igb_main.c          |    4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      |    4 ++--
 drivers/net/ethernet/marvell/sky2.c                |    3 +++
 drivers/net/ethernet/mellanox/mlx4/en_rx.c         |    3 +++
 drivers/net/ethernet/neterion/s2io.c               |    3 +++
 drivers/net/ethernet/neterion/vxge/vxge-main.c     |    4 ++++
 drivers/net/ethernet/sfc/ef10.c                    |    3 +++
 drivers/net/ethernet/sfc/farch.c                   |    3 +++
 drivers/net/ethernet/tile/tilegx.c                 |    3 +++
 drivers/net/ethernet/tile/tilepro.c                |    3 +++
 drivers/net/ethernet/toshiba/tc35815.c             |    3 +++
 17 files changed, 53 insertions(+), 12 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH net-next 01/16] bnx2x: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
@ 2014-03-15  0:57                         ` Eric W. Biederman
  2014-03-15  0:59                         ` [PATCH net-next 02/16] i40e: " Eric W. Biederman
                                           ` (15 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  0:57 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 117b5c7f8ac9..acd494647f25 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -872,6 +872,8 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 	if (unlikely(bp->panic))
 		return 0;
 #endif
+	if (budget <= 0)
+		return rx_pkt;
 
 	bd_cons = fp->rx_bd_cons;
 	bd_prod = fp->rx_bd_prod;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 02/16] i40e: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
  2014-03-15  0:57                         ` [PATCH net-next 01/16] bnx2x: " Eric W. Biederman
@ 2014-03-15  0:59                         ` Eric W. Biederman
  2014-03-15  1:00                         ` [PATCH net-next 03/16] igb: " Eric W. Biederman
                                           ` (14 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  0:59 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 2081bdb214e5..2c21dd5ddcb6 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1264,6 +1264,9 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 	u8 rx_ptype;
 	u64 qword;
 
+	if (budget <= 0)
+		return 0;
+
 	rx_desc = I40E_RX_DESC(rx_ring, i);
 	qword = le64_to_cpu(rx_desc->wb.qword1.status_error_len);
 	rx_status = (qword & I40E_RXD_QW1_STATUS_MASK) >>
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 03/16] igb: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
  2014-03-15  0:57                         ` [PATCH net-next 01/16] bnx2x: " Eric W. Biederman
  2014-03-15  0:59                         ` [PATCH net-next 02/16] i40e: " Eric W. Biederman
@ 2014-03-15  1:00                         ` Eric W. Biederman
  2014-03-15  1:00                         ` [PATCH net-next 04/16] ixgbe: " Eric W. Biederman
                                           ` (13 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:00 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.
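
For illustration only (not part of the patch): a minimal userspace sketch
of why the loop shape matters.  The functions demo_rx_do_while and
demo_rx_while are hypothetical stand-ins for a receive loop, with
'pending' modeling the packets waiting in the ring.  A do/while loop runs
its body once before checking the budget, so it receives a packet even
when the budget is 0; a while loop checks first.

```c
#include <assert.h>

/* Old shape: the body runs before the budget is checked. */
static int demo_rx_do_while(int pending, int budget)
{
	int total_packets = 0;

	do {
		if (pending == 0)
			break;
		pending--;
		total_packets++;
	} while (total_packets < budget);

	return total_packets;
}

/* New shape: the budget is checked before any packet is touched. */
static int demo_rx_while(int pending, int budget)
{
	int total_packets = 0;

	while (total_packets < budget) {
		if (pending == 0)
			break;
		pending--;
		total_packets++;
	}

	return total_packets;
}

/* Small self-check contrasting the two loops. */
void demo_loop_selfcheck(void)
{
	assert(demo_rx_do_while(3, 0) == 1); /* one packet despite budget 0 */
	assert(demo_rx_while(3, 0) == 0);    /* nothing received */
	assert(demo_rx_do_while(3, 2) == demo_rx_while(3, 2));
}
```

For any budget greater than 0 the two loops receive the same number of
packets, so the conversion only changes the budget == 0 case.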

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 340a3449e1e9..973224e43b11 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6946,7 +6946,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 	unsigned int total_bytes = 0, total_packets = 0;
 	u16 cleaned_count = igb_desc_unused(rx_ring);
 
-	do {
+	while (likely(total_packets < budget)) {
 		union e1000_adv_rx_desc *rx_desc;
 
 		/* return some buffers to hardware, one at a time is too slow */
@@ -6998,7 +6998,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, const int budget)
 
 		/* update budget accounting */
 		total_packets++;
-	} while (likely(total_packets < budget));
+	}
 
 	/* place incomplete frames back on ring for completion */
 	rx_ring->skb = skb;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 04/16] ixgbe: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (2 preceding siblings ...)
  2014-03-15  1:00                         ` [PATCH net-next 03/16] igb: " Eric W. Biederman
@ 2014-03-15  1:00                         ` Eric W. Biederman
  2014-03-15  1:01                         ` [PATCH net-next 05/16] amd8111e: " Eric W. Biederman
                                           ` (12 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:00 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 851c41377b47..04b8f0a5b0ce 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2076,7 +2076,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 #endif /* IXGBE_FCOE */
 	u16 cleaned_count = ixgbe_desc_unused(rx_ring);
 
-	do {
+	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
 		struct sk_buff *skb;
 
@@ -2151,7 +2151,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 
 		/* update budget accounting */
 		total_rx_packets++;
-	} while (likely(total_rx_packets < budget));
+	}
 
 	u64_stats_update_begin(&rx_ring->syncp);
 	rx_ring->stats.packets += total_rx_packets;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 05/16] amd8111e: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (3 preceding siblings ...)
  2014-03-15  1:00                         ` [PATCH net-next 04/16] ixgbe: " Eric W. Biederman
@ 2014-03-15  1:01                         ` Eric W. Biederman
  2014-03-15  1:02                         ` [PATCH net-next 06/16] enic: " Eric W. Biederman
                                           ` (11 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:01 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/amd/amd8111e.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/amd/amd8111e.c b/drivers/net/ethernet/amd/amd8111e.c
index 2061b471fd16..26efaaa5e73f 100644
--- a/drivers/net/ethernet/amd/amd8111e.c
+++ b/drivers/net/ethernet/amd/amd8111e.c
@@ -720,6 +720,9 @@ static int amd8111e_rx_poll(struct napi_struct *napi, int budget)
 	int rx_pkt_limit = budget;
 	unsigned long flags;
 
+	if (rx_pkt_limit <= 0)
+		goto rx_not_empty;
+
 	do{
 		/* process receive packets until we use the quota*/
 		/* If we own the next entry, it's a new packet. Send it up. */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 06/16] enic: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (4 preceding siblings ...)
  2014-03-15  1:01                         ` [PATCH net-next 05/16] amd8111e: " Eric W. Biederman
@ 2014-03-15  1:02                         ` Eric W. Biederman
  2014-03-15  1:03                         ` [PATCH net-next 07/16] fs_enet: " Eric W. Biederman
                                           ` (10 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:02 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/cisco/enic/enic_main.c |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index dcd58f23834a..4c35fc8fad99 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1086,14 +1086,15 @@ static int enic_poll(struct napi_struct *napi, int budget)
 	unsigned int intr = enic_legacy_io_intr();
 	unsigned int rq_work_to_do = budget;
 	unsigned int wq_work_to_do = -1; /* no limit */
-	unsigned int  work_done, rq_work_done, wq_work_done;
+	unsigned int  work_done, rq_work_done = 0, wq_work_done;
 	int err;
 
 	/* Service RQ (first) and WQ
 	 */
 
-	rq_work_done = vnic_cq_service(&enic->cq[cq_rq],
-		rq_work_to_do, enic_rq_service, NULL);
+	if (budget > 0)
+		rq_work_done = vnic_cq_service(&enic->cq[cq_rq],
+			rq_work_to_do, enic_rq_service, NULL);
 
 	wq_work_done = vnic_cq_service(&enic->cq[cq_wq],
 		wq_work_to_do, enic_wq_service, NULL);
@@ -1141,14 +1142,15 @@ static int enic_poll_msix(struct napi_struct *napi, int budget)
 	unsigned int cq = enic_cq_rq(enic, rq);
 	unsigned int intr = enic_msix_rq_intr(enic, rq);
 	unsigned int work_to_do = budget;
-	unsigned int work_done;
+	unsigned int work_done = 0;
 	int err;
 
 	/* Service RQ
 	 */
 
-	work_done = vnic_cq_service(&enic->cq[cq],
-		work_to_do, enic_rq_service, NULL);
+	if (budget > 0)
+		work_done = vnic_cq_service(&enic->cq[cq],
+			work_to_do, enic_rq_service, NULL);
 
 	/* Return intr event credits for this polling
 	 * cycle.  An intr event is the completion of a
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 07/16] fs_enet: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (5 preceding siblings ...)
  2014-03-15  1:02                         ` [PATCH net-next 06/16] enic: " Eric W. Biederman
@ 2014-03-15  1:03                         ` Eric W. Biederman
  2014-03-15  1:03                         ` [PATCH net-next 08/16] ibmveth: " Eric W. Biederman
                                           ` (9 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:03 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 .../net/ethernet/freescale/fs_enet/fs_enet-main.c  |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 62f042d4aaa9..dc80db41d6b3 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -91,6 +91,9 @@ static int fs_enet_rx_napi(struct napi_struct *napi, int budget)
 	u16 pkt_len, sc;
 	int curidx;
 
+	if (budget <= 0)
+		return received;
+
 	/*
 	 * First, grab all of the stats for the incoming packet.
 	 * These get messed up if we get called due to a busy condition.
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 08/16] ibmveth: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (6 preceding siblings ...)
  2014-03-15  1:03                         ` [PATCH net-next 07/16] fs_enet: " Eric W. Biederman
@ 2014-03-15  1:03                         ` Eric W. Biederman
  2014-03-15  1:05                         ` [PATCH net-next 09/16] sky2: " Eric W. Biederman
                                           ` (8 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:03 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/ibm/ibmveth.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 4be971590461..abb95b21c4a7 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1062,7 +1062,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
 	unsigned long lpar_rc;
 
 restart_poll:
-	do {
+	while (frames_processed < budget) {
 		if (!ibmveth_rxq_pending_buffer(adapter))
 			break;
 
@@ -1111,7 +1111,7 @@ restart_poll:
 			netdev->stats.rx_bytes += length;
 			frames_processed++;
 		}
-	} while (frames_processed < budget);
+	}
 
 	ibmveth_replenish_task(adapter);
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 09/16] sky2: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (7 preceding siblings ...)
  2014-03-15  1:03                         ` [PATCH net-next 08/16] ibmveth: " Eric W. Biederman
@ 2014-03-15  1:05                         ` Eric W. Biederman
  2014-03-15  1:34                           ` Stephen Hemminger
  2014-03-15  1:05                         ` [PATCH net-next 10/16] mlx4: " Eric W. Biederman
                                           ` (7 subsequent siblings)
  16 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:05 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/marvell/sky2.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index 2434611d1b4e..0ddfc43069ba 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -2735,6 +2735,9 @@ static int sky2_status_intr(struct sky2_hw *hw, int to_do, u16 idx)
 	unsigned int total_bytes[2] = { 0 };
 	unsigned int total_packets[2] = { 0 };
 
+	if (to_do <= 0)
+		return work_done;
+
 	rmb();
 	do {
 		struct sky2_port *sky2;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 10/16] mlx4: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (8 preceding siblings ...)
  2014-03-15  1:05                         ` [PATCH net-next 09/16] sky2: " Eric W. Biederman
@ 2014-03-15  1:05                         ` Eric W. Biederman
  2014-03-15  1:06                         ` [PATCH net-next 11/16] s2io: " Eric W. Biederman
                                           ` (6 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:05 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 8afb72ec957d..ba049ae88749 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -661,6 +661,9 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	if (!priv->port_up)
 		return 0;
 
+	if (budget <= 0)
+		return polled;
+
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
 	 * descriptor offset can be deduced from the CQE index instead of
 	 * reading 'cqe->index' */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 11/16] s2io: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (9 preceding siblings ...)
  2014-03-15  1:05                         ` [PATCH net-next 10/16] mlx4: " Eric W. Biederman
@ 2014-03-15  1:06                         ` Eric W. Biederman
  2014-03-15  1:08                         ` [PATCH net-next 12/16] tilegx: " Eric W. Biederman
                                           ` (5 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:06 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/neterion/s2io.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
index 56e3a9d42bb2..d44fdb91808e 100644
--- a/drivers/net/ethernet/neterion/s2io.c
+++ b/drivers/net/ethernet/neterion/s2io.c
@@ -2914,6 +2914,9 @@ static int rx_intr_handler(struct ring_info *ring_data, int budget)
 	struct RxD1 *rxdp1;
 	struct RxD3 *rxdp3;
 
+	if (budget <= 0)
+		return napi_pkts;
+
 	get_info = ring_data->rx_curr_get_info;
 	get_block = get_info.block_index;
 	memcpy(&put_info, &ring_data->rx_curr_put_info, sizeof(put_info));
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 12/16] tilegx: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (10 preceding siblings ...)
  2014-03-15  1:06                         ` [PATCH net-next 11/16] s2io: " Eric W. Biederman
@ 2014-03-15  1:08                         ` Eric W. Biederman
  2014-03-15  1:09                         ` [PATCH net-next 13/16] tilepro: " Eric W. Biederman
                                           ` (4 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:08 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/tile/tilegx.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index 17503da9f7a5..b43f1b3b9632 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -659,6 +659,9 @@ static int tile_net_poll(struct napi_struct *napi, int budget)
 	struct info_mpipe *info_mpipe =
 		container_of(napi, struct info_mpipe, napi);
 
+	if (budget <= 0)
+		goto done;
+
 	instance = info_mpipe->instance;
 	while ((n = gxio_mpipe_iqueue_try_peek(
 			&info_mpipe->iqueue,
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 13/16] tilepro: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (11 preceding siblings ...)
  2014-03-15  1:08                         ` [PATCH net-next 12/16] tilegx: " Eric W. Biederman
@ 2014-03-15  1:09                         ` Eric W. Biederman
  2014-03-15  1:10                         ` [PATCH net-next-test 14/16] tc35815: " Eric W. Biederman
                                           ` (3 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:09 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/tile/tilepro.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/tile/tilepro.c b/drivers/net/ethernet/tile/tilepro.c
index edb2e12a0fe2..44b186aa932c 100644
--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -831,6 +831,9 @@ static int tile_net_poll(struct napi_struct *napi, int budget)
 
 	unsigned int work = 0;
 
+	if (budget <= 0)
+		goto done;
+
 	while (priv->active) {
 		int index = qup->__packet_receive_read;
 		if (index == qsp->__packet_receive_queue.__packet_write)
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next-test 14/16] tc35815: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (12 preceding siblings ...)
  2014-03-15  1:09                         ` [PATCH net-next 13/16] tilepro: " Eric W. Biederman
@ 2014-03-15  1:10                         ` Eric W. Biederman
  2014-03-15  1:10                         ` [PATCH net-next 15/16] vxge: " Eric W. Biederman
                                           ` (2 subsequent siblings)
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:10 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/toshiba/tc35815.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/tc35815.c b/drivers/net/ethernet/toshiba/tc35815.c
index 88e9c73cebc0..fef5573dbfca 100644
--- a/drivers/net/ethernet/toshiba/tc35815.c
+++ b/drivers/net/ethernet/toshiba/tc35815.c
@@ -1645,6 +1645,9 @@ static int tc35815_poll(struct napi_struct *napi, int budget)
 	int received = 0, handled;
 	u32 status;
 
+	if (budget <= 0)
+		return received;
+
 	spin_lock(&lp->rx_lock);
 	status = tc_readl(&tr->Int_Src);
 	do {
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 15/16] vxge: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (13 preceding siblings ...)
  2014-03-15  1:10                         ` [PATCH net-next-test 14/16] tc35815: " Eric W. Biederman
@ 2014-03-15  1:10                         ` Eric W. Biederman
  2014-03-15  1:11                         ` [PATCH net-next 16/16] sfc: " Eric W. Biederman
  2014-03-15  2:54                         ` [PATCH net-next 0/16] Don't receive packets when the napi budget == 0 David Miller
  16 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:10 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/neterion/vxge/vxge-main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index c83cedd26dec..9eb49edf4d5a 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -368,6 +368,9 @@ vxge_rx_1b_compl(struct __vxge_hw_ring *ringh, void *dtr,
 	vxge_debug_entryexit(VXGE_TRACE, "%s: %s:%d",
 		ring->ndev->name, __func__, __LINE__);
 
+	if (ring->budget <= 0)
+		goto out;
+
 	do {
 		prefetch((char *)dtr + L1_CACHE_BYTES);
 		rx_priv = vxge_hw_ring_rxd_private_get(dtr);
@@ -525,6 +528,7 @@ vxge_rx_1b_compl(struct __vxge_hw_ring *ringh, void *dtr,
 	if (first_dtr)
 		vxge_hw_ring_rxd_post_post_wmb(ringh, first_dtr);
 
+out:
 	vxge_debug_entryexit(VXGE_TRACE,
 				"%s:%d  Exiting...",
 				__func__, __LINE__);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (14 preceding siblings ...)
  2014-03-15  1:10                         ` [PATCH net-next 15/16] vxge: " Eric W. Biederman
@ 2014-03-15  1:11                         ` Eric W. Biederman
  2014-03-15 15:23                           ` Ben Hutchings
  2014-03-15  2:54                         ` [PATCH net-next 0/16] Don't receive packets when the napi budget == 0 David Miller
  16 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:11 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Processing any incoming packets with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/sfc/ef10.c  |    3 +++
 drivers/net/ethernet/sfc/farch.c |    3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index eb75675f6e32..651626e133f9 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1955,6 +1955,9 @@ static int efx_ef10_ev_process(struct efx_channel *channel, int quota)
 	int tx_descs = 0;
 	int spent = 0;
 
+	if (quota <= 0)
+		return spent;
+
 	read_ptr = channel->eventq_read_ptr;
 
 	for (;;) {
diff --git a/drivers/net/ethernet/sfc/farch.c b/drivers/net/ethernet/sfc/farch.c
index aa1b169f45ec..a08761360cdf 100644
--- a/drivers/net/ethernet/sfc/farch.c
+++ b/drivers/net/ethernet/sfc/farch.c
@@ -1248,6 +1248,9 @@ int efx_farch_ev_process(struct efx_channel *channel, int budget)
 	int tx_packets = 0;
 	int spent = 0;
 
+	if (budget <= 0)
+		return spent;
+
 	read_ptr = channel->eventq_read_ptr;
 
 	for (;;) {
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 0/9] netpoll: Cleanup received packet processing
  2014-03-13 19:23                       ` David Miller
  2014-03-13 20:46                         ` Eric W. Biederman
@ 2014-03-15  1:30                         ` Eric W. Biederman
  2014-03-15  1:31                           ` [PATCH 1/9] netpoll: Pass budget into poll_napi Eric W. Biederman
                                             ` (9 more replies)
  1 sibling, 10 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:30 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This is the long-winded, careful, and polite version of removing the netpoll
receive packet processing.

First I untangle the code in small steps.  Then I modify the code to not
force reception and dropping of packets when we are transmitting a packet
with netpoll.  Finally I move all of the packet reception under
CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.

If someone wants to do a stable backport it would take backporting
the first 18 patches that handle the budget == 0 in the networking
drivers, and the first 5 of these patches.

If anyone wants to resurrect netpoll packet reception someday it should
just be a matter of reverting the last patch.

Eric W. Biederman (9):
      netpoll: Pass budget into poll_napi
      netpoll: Visit all napi handlers in poll_napi
      netpoll: Warn if more packets are processed than are budgeted
      netpoll: Add netpoll_rx_processing
      netpoll: Don't drop all received packets.
      netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
      netpoll: Consolidate neigh_tx processing in service_neigh_queue
      netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
      netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)

 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   61 ------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  488 +--------------------------------------------
 5 files changed, 5 insertions(+), 577 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 1/9] netpoll: Pass budget into poll_napi
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
@ 2014-03-15  1:31                           ` Eric W. Biederman
  2014-03-15  1:32                           ` [PATCH 2/9] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
                                             ` (8 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:31 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This moves the control logic to the top level in netpoll_poll_dev
instead of having it dispersed throughout netpoll_poll_dev,
poll_napi and poll_one_napi.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ef4f45df539f..147c75855c9b 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -165,10 +165,9 @@ static int poll_one_napi(struct napi_struct *napi, int budget)
 	return budget - work;
 }
 
-static void poll_napi(struct net_device *dev)
+static void poll_napi(struct net_device *dev, int budget)
 {
 	struct napi_struct *napi;
-	int budget = 16;
 
 	list_for_each_entry(napi, &dev->napi_list, dev_list) {
 		if (napi->poll_owner != smp_processor_id() &&
@@ -196,6 +195,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
+	int budget = 16;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -221,7 +221,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	/* Process pending work on NIC */
 	ops->ndo_poll_controller(dev);
 
-	poll_napi(dev);
+	poll_napi(dev, budget);
 
 	atomic_dec(&trapped);
 	ni->rx_flags &= ~NETPOLL_RX_DROP;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 2/9] netpoll: Visit all napi handlers in poll_napi
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
  2014-03-15  1:31                           ` [PATCH 1/9] netpoll: Pass budget into poll_napi Eric W. Biederman
@ 2014-03-15  1:32                           ` Eric W. Biederman
  2014-03-15  1:33                           ` [PATCH 3/9] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
                                             ` (7 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:32 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


In poll_napi, loop through all of the napi handlers even when the
budget falls to 0 to ensure that we process all of the tx_queues, and
so that we continue to call into drivers when our initial budget is 0.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 147c75855c9b..d9e3d74ec9ac 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -174,9 +174,6 @@ static void poll_napi(struct net_device *dev, int budget)
 		    spin_trylock(&napi->poll_lock)) {
 			budget = poll_one_napi(napi, budget);
 			spin_unlock(&napi->poll_lock);
-
-			if (!budget)
-				break;
 		}
 	}
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
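
The effect of dropping the early `break` in patch 2/9 can be sketched in user space. This is an illustrative model only: `struct fake_napi`, `fake_poll_one` and `fake_poll_all` are hypothetical names, and the assumption that a poll routine cleans tx completions regardless of budget is a simplification of what real drivers do. As in poll_one_napi, the per-handler helper returns the remaining budget.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one napi context; not the kernel's napi_struct. */
struct fake_napi {
	int rx_pending;	/* packets waiting on the rx ring */
	int tx_pending;	/* tx completions waiting to be cleaned */
	int visited;	/* set once the handler has been polled */
};

/* Clean tx unconditionally, receive up to 'budget'; return what is left. */
int fake_poll_one(struct fake_napi *n, int budget)
{
	int work = 0;

	assert(n != NULL);
	n->visited = 1;
	n->tx_pending = 0;
	while (n->rx_pending > 0 && work < budget) {
		n->rx_pending--;
		work++;
	}
	return budget - work;
}

/* Visit every handler, even after the budget has fallen to 0. */
void fake_poll_all(struct fake_napi *napis, size_t count, int budget)
{
	size_t i;

	for (i = 0; i < count; i++)
		budget = fake_poll_one(&napis[i], budget);
}
```

Starting with a budget of 0, every handler is still visited and its tx completions are cleaned, but no rx packet is consumed.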

* [PATCH 3/9] netpoll: Warn if more packets are processed than are budgeted
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
  2014-03-15  1:31                           ` [PATCH 1/9] netpoll: Pass budget into poll_napi Eric W. Biederman
  2014-03-15  1:32                           ` [PATCH 2/9] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
@ 2014-03-15  1:33                           ` Eric W. Biederman
  2014-03-15  1:33                           ` [PATCH 4/9] netpoll: Add netpoll_rx_processing Eric W. Biederman
                                             ` (6 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:33 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


There is already a warning for this case in the normal netpoll path,
but put a copy here in case the way netpoll calls the poll functions
causes a different result.

netpoll will shortly call the napi poll routine with a budget of 0 to
avoid any rx packets being processed.  As nothing does that today,
we may encounter drivers that have problems, so a netpoll-specific
warning seems desirable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index d9e3d74ec9ac..2ad330e02967 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -158,6 +158,7 @@ static int poll_one_napi(struct napi_struct *napi, int budget)
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
 	work = napi->poll(napi, budget);
+	WARN_ONCE(work > budget, "%pF exceeded budget in poll\n", napi->poll);
 	trace_napi_poll(napi);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 4/9] netpoll: Add netpoll_rx_processing
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (2 preceding siblings ...)
  2014-03-15  1:33                           ` [PATCH 3/9] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
@ 2014-03-15  1:33                           ` Eric W. Biederman
  2014-03-15  1:34                           ` [PATCH 5/9] netpoll: Don't drop all received packets Eric W. Biederman
                                             ` (5 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:33 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Add a helper netpoll_rx_processing that reports when netpoll has
receive side processing to perform.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   18 ++++++++++++++----
 net/core/netpoll.c      |    4 ++--
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index fbfdb9d8d3a7..479d15c97770 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -82,14 +82,24 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 	local_irq_restore(flags);
 }
 
-
+#ifdef CONFIG_NETPOLL_TRAP
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return !list_empty(&npinfo->rx_np);
+}
+#else
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return false;
+}
+#endif
 
 #ifdef CONFIG_NETPOLL
 static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
+	return npinfo && (netpoll_rx_processing(npinfo) || npinfo->rx_flags);
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
@@ -105,8 +115,8 @@ static inline bool netpoll_rx(struct sk_buff *skb)
 
 	npinfo = rcu_dereference_bh(skb->dev->npinfo);
 	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
+	/* check rx_processing again with the lock held */
+	if (netpoll_rx_processing(npinfo) && __netpoll_rx(skb, npinfo))
 		ret = true;
 	spin_unlock(&npinfo->rx_lock);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 2ad330e02967..ef83a2530e98 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -538,7 +538,7 @@ static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo
 	int hlen, tlen;
 	int hits = 0, proto;
 
-	if (list_empty(&npinfo->rx_np))
+	if (!netpoll_rx_processing(npinfo))
 		return;
 
 	/* Before checking the packet, we do some early
@@ -770,7 +770,7 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 	struct netpoll *np, *tmp;
 	uint16_t source;
 
-	if (list_empty(&npinfo->rx_np))
+	if (!netpoll_rx_processing(npinfo))
 		goto out;
 
 	if (skb->dev->type != ARPHRD_ETHER)
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
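
The shape of the helper introduced in patch 4/9 can be sketched as a self-contained user-space program. Everything here is a stand-in: `struct fake_list`, `struct fake_npinfo`, `fake_rx_processing` and the `FAKE_NETPOLL_TRAP` switch are hypothetical names that only mimic the kernel's list_head, netpoll_info and CONFIG_NETPOLL_TRAP. The point is that callers ask one question ("is there receive-side work?") and the answer collapses to a constant `false` when the feature is compiled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct fake_list {
	struct fake_list *next, *prev;
};

void fake_list_init(struct fake_list *head)
{
	assert(head != NULL);
	head->next = head;
	head->prev = head;
}

bool fake_list_empty(const struct fake_list *head)
{
	return head->next == head;
}

void fake_list_add_tail(struct fake_list *item, struct fake_list *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical stand-in for struct netpoll_info. */
struct fake_npinfo {
	struct fake_list rx_np;	/* registered receive hooks */
};

/* Set to 0 to model a build without receive support. */
#define FAKE_NETPOLL_TRAP 1

#if FAKE_NETPOLL_TRAP
bool fake_rx_processing(const struct fake_npinfo *ni)
{
	return !fake_list_empty(&ni->rx_np);
}
#else
bool fake_rx_processing(const struct fake_npinfo *ni)
{
	(void)ni;
	return false;
}
#endif
```

With the switch enabled, the helper reports `false` for a freshly initialised list and `true` once a hook has been added.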

* Re: [PATCH net-next 09/16] sky2: Don't receive packets when the napi budget == 0
  2014-03-15  1:05                         ` [PATCH net-next 09/16] sky2: " Eric W. Biederman
@ 2014-03-15  1:34                           ` Stephen Hemminger
  0 siblings, 0 replies; 288+ messages in thread
From: Stephen Hemminger @ 2014-03-15  1:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

On Fri, 14 Mar 2014 18:05:26 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> 
> Processing any incoming packets with a napi budget of 0
> is incorrect driver behavior.
> 
> This matters as netpoll will shortly call drivers with a budget of 0
> to avoid receive packet processing happening in hard irq context.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/marvell/sky2.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
> index 2434611d1b4e..0ddfc43069ba 100644
> --- a/drivers/net/ethernet/marvell/sky2.c
> +++ b/drivers/net/ethernet/marvell/sky2.c
> @@ -2735,6 +2735,9 @@ static int sky2_status_intr(struct sky2_hw *hw, int to_do, u16 idx)
>  	unsigned int total_bytes[2] = { 0 };
>  	unsigned int total_packets[2] = { 0 };
>  
> +	if (to_do <= 0)
> +		return work_done;
> +
>  	rmb();
>  	do {
>  		struct sky2_port *sky2;


I am fine with this.

Really should change to_do to an unsigned but that is another
battle.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 5/9] netpoll: Don't drop all received packets.
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (3 preceding siblings ...)
  2014-03-15  1:33                           ` [PATCH 4/9] netpoll: Add netpoll_rx_processing Eric W. Biederman
@ 2014-03-15  1:34                           ` Eric W. Biederman
  2014-03-15  1:35                           ` [PATCH 6/9] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
                                             ` (4 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:34 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Change the strategy of netpoll from dropping all packets received
during netpoll_poll_dev to calling napi poll with a budget of 0
(to avoid processing the driver's rx queue), and to ignoring packets
received with netif_rx (those will safely be placed on the backlog queue).

All of the netpoll-supporting drivers have been reviewed to ensure
either they use netif_rx or that a budget of 0 is supported by their
napi poll routine and that a budget of 0 will not process the driver's
rx queues.

Not dropping packets makes NETPOLL_RX_DROP unnecessary, so it is removed.

npinfo->rx_flags is removed, as rx_flags with just the NETPOLL_RX_ENABLED
flag becomes a redundant mirror of !list_empty(&npinfo->rx_np).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |    3 +--
 net/core/netpoll.c      |   17 ++++++-----------
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 479d15c97770..154f9776056c 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -39,7 +39,6 @@ struct netpoll {
 struct netpoll_info {
 	atomic_t refcnt;
 
-	unsigned long rx_flags;
 	spinlock_t rx_lock;
 	struct semaphore dev_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
@@ -99,7 +98,7 @@ static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (netpoll_rx_processing(npinfo) || npinfo->rx_flags);
+	return npinfo && netpoll_rx_processing(npinfo);
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ef83a2530e98..793dc04d2f19 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -51,8 +51,6 @@ static atomic_t trapped;
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -193,7 +191,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
-	int budget = 16;
+	bool rx_processing = netpoll_rx_processing(ni);
+	int budget = rx_processing? 16 : 0;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -207,8 +206,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 	}
 
-	ni->rx_flags |= NETPOLL_RX_DROP;
-	atomic_inc(&trapped);
+	if (rx_processing)
+		atomic_inc(&trapped);
 
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
@@ -221,8 +220,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev, budget);
 
-	atomic_dec(&trapped);
-	ni->rx_flags &= ~NETPOLL_RX_DROP;
+	if (rx_processing)
+		atomic_dec(&trapped);
 
 	up(&ni->dev_lock);
 
@@ -1050,7 +1049,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
@@ -1076,7 +1074,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	if (np->rx_skb_hook) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
 		list_add_tail(&np->rx, &npinfo->rx_np);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
@@ -1258,8 +1255,6 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!list_empty(&npinfo->rx_np)) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
 		list_del(&np->rx);
-		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
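
The control flow that patch 5/9 gives netpoll_poll_dev can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: `struct fake_dev` and `fake_netpoll_poll_dev` are hypothetical, the `has_rx_hook` flag stands in for a non-empty rx_np list, and the poll itself is reduced to "clean tx, then receive up to budget". Only the budget selection (16 when a receive hook is registered, 0 otherwise) and the conditional trap accounting are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a polled device; not a kernel structure. */
struct fake_dev {
	bool has_rx_hook;	/* stands in for a non-empty rx_np list */
	int rx_pending;		/* packets waiting on the rx ring */
	int tx_pending;		/* tx completions waiting to be cleaned */
	int trapped;		/* models the 'trapped' counter */
};

/* Returns the number of rx packets processed during this poll. */
int fake_netpoll_poll_dev(struct fake_dev *dev)
{
	int budget, work = 0;

	assert(dev != NULL);
	budget = dev->has_rx_hook ? 16 : 0;

	if (dev->has_rx_hook)
		dev->trapped++;

	dev->tx_pending = 0;	/* tx cleanup happens regardless of budget */
	while (dev->rx_pending > 0 && work < budget) {
		dev->rx_pending--;
		work++;
	}

	if (dev->has_rx_hook)
		dev->trapped--;

	return work;
}
```

Without a receive hook the poll cleans tx and leaves all rx packets queued for the normal receive path; with a hook it processes at most 16 of them.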

* [PATCH 6/9] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (4 preceding siblings ...)
  2014-03-15  1:34                           ` [PATCH 5/9] netpoll: Don't drop all received packets Eric W. Biederman
@ 2014-03-15  1:35                           ` Eric W. Biederman
  2014-03-15  1:36                           ` [PATCH 7/9] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
                                             ` (3 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:35 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Now that we no longer need to receive packets to safely drain the
network driver's receive queue, move netpoll_trap and netpoll_set_trap
under CONFIG_NETPOLL_TRAP.

Make netpoll_trap and netpoll_set_trap no-op inline functions
when CONFIG_NETPOLL_TRAP is not set.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   11 +++++++++--
 net/core/netpoll.c      |   14 +++++++++-----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 154f9776056c..ab9aaaff8d04 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -65,8 +65,6 @@ void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
 int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp);
 int netpoll_setup(struct netpoll *np);
-int netpoll_trap(void);
-void netpoll_set_trap(int trap);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
@@ -82,11 +80,20 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 }
 
 #ifdef CONFIG_NETPOLL_TRAP
+int netpoll_trap(void);
+void netpoll_set_trap(int trap);
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return !list_empty(&npinfo->rx_np);
 }
 #else
+static inline int netpoll_trap(void)
+{
+	return 0;
+}
+static inline void netpoll_set_trap(int trap)
+{
+}
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return false;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 793dc04d2f19..0e45835f1737 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -46,7 +46,9 @@
 
 static struct sk_buff_head skb_pool;
 
+#ifdef CONFIG_NETPOLL_TRAP
 static atomic_t trapped;
+#endif
 
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
@@ -207,7 +209,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	}
 
 	if (rx_processing)
-		atomic_inc(&trapped);
+		netpoll_set_trap(1);
 
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
@@ -221,7 +223,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	poll_napi(dev, budget);
 
 	if (rx_processing)
-		atomic_dec(&trapped);
+		netpoll_set_trap(0);
 
 	up(&ni->dev_lock);
 
@@ -776,10 +778,10 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 		goto out;
 
 	/* check if netpoll clients need ARP */
-	if (skb->protocol == htons(ETH_P_ARP) && atomic_read(&trapped)) {
+	if (skb->protocol == htons(ETH_P_ARP) && netpoll_trap()) {
 		skb_queue_tail(&npinfo->neigh_tx, skb);
 		return 1;
-	} else if (pkt_is_ns(skb) && atomic_read(&trapped)) {
+	} else if (pkt_is_ns(skb) && netpoll_trap()) {
 		skb_queue_tail(&npinfo->neigh_tx, skb);
 		return 1;
 	}
@@ -896,7 +898,7 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 	return 1;
 
 out:
-	if (atomic_read(&trapped)) {
+	if (netpoll_trap()) {
 		kfree_skb(skb);
 		return 1;
 	}
@@ -1302,6 +1304,7 @@ out:
 }
 EXPORT_SYMBOL(netpoll_cleanup);
 
+#ifdef CONFIG_NETPOLL_TRAP
 int netpoll_trap(void)
 {
 	return atomic_read(&trapped);
@@ -1316,3 +1319,4 @@ void netpoll_set_trap(int trap)
 		atomic_dec(&trapped);
 }
 EXPORT_SYMBOL(netpoll_set_trap);
+#endif /* CONFIG_NETPOLL_TRAP */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
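
The stub pattern used in patch 6/9 can be tried out in user space. The sketch below is hypothetical throughout: `FAKE_NETPOLL_TRAP`, `fake_trapped`, `fake_netpoll_trap` and `fake_netpoll_set_trap` are illustrative names, and C11 `<stdatomic.h>` is swapped in for the kernel's atomic_t. It assumes the set function increments the counter for a non-zero argument and decrements it otherwise, so that nested callers balance out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Set to 0 to compile the no-op stubs instead of the real counter. */
#define FAKE_NETPOLL_TRAP 1

#if FAKE_NETPOLL_TRAP
/* File-scope atomic counter; zero-initialised like any static object. */
static atomic_int fake_trapped;

int fake_netpoll_trap(void)
{
	return atomic_load(&fake_trapped);
}

void fake_netpoll_set_trap(int trap)
{
	if (trap)
		atomic_fetch_add(&fake_trapped, 1);
	else
		atomic_fetch_sub(&fake_trapped, 1);
	assert(atomic_load(&fake_trapped) >= 0);	/* calls must balance */
}
#else
/* With the feature compiled out, callers need no #ifdef of their own. */
static inline int fake_netpoll_trap(void)
{
	return 0;
}

static inline void fake_netpoll_set_trap(int trap)
{
	(void)trap;
}
#endif
```

Because both variants expose the same two functions, call sites such as the ARP check in __netpoll_rx can call the accessor unconditionally and let the compiler discard the dead branch when the stub returns a constant 0.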

* [PATCH 7/9] netpoll: Consolidate neigh_tx processing in service_neigh_queue
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (5 preceding siblings ...)
  2014-03-15  1:35                           ` [PATCH 6/9] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
@ 2014-03-15  1:36                           ` Eric W. Biederman
  2014-03-15  1:37                           ` [PATCH 8/9] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
                                             ` (2 subsequent siblings)
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:36 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Move the bond slave device neigh_tx handling into service_neigh_queue.

In connection with neigh_tx processing, remove unnecessary tests of
a NULL netpoll_info, as netpoll_poll_dev has already used
and thus verified the existence of the netpoll_info.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |   38 ++++++++++++++++----------------------
 1 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 0e45835f1737..b69bb3f1ba3f 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -179,14 +179,23 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
-static void service_neigh_queue(struct netpoll_info *npi)
+static void service_neigh_queue(struct net_device *dev,
+				struct netpoll_info *npi)
 {
-	if (npi) {
-		struct sk_buff *skb;
-
-		while ((skb = skb_dequeue(&npi->neigh_tx)))
-			netpoll_neigh_reply(skb, npi);
+	struct sk_buff *skb;
+	if (dev->flags & IFF_SLAVE) {
+		struct net_device *bond_dev;
+		struct netpoll_info *bond_ni;
+
+		bond_dev = netdev_master_upper_dev_get_rcu(dev);
+		bond_ni = rcu_dereference_bh(bond_dev->npinfo);
+		while ((skb = skb_dequeue(&npi->neigh_tx))) {
+			skb->dev = bond_dev;
+			skb_queue_tail(&bond_ni->neigh_tx, skb);
+		}
 	}
+	while ((skb = skb_dequeue(&npi->neigh_tx)))
+		netpoll_neigh_reply(skb, npi);
 }
 
 static void netpoll_poll_dev(struct net_device *dev)
@@ -227,22 +236,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	up(&ni->dev_lock);
 
-	if (dev->flags & IFF_SLAVE) {
-		if (ni) {
-			struct net_device *bond_dev;
-			struct sk_buff *skb;
-			struct netpoll_info *bond_ni;
-
-			bond_dev = netdev_master_upper_dev_get_rcu(dev);
-			bond_ni = rcu_dereference_bh(bond_dev->npinfo);
-			while ((skb = skb_dequeue(&ni->neigh_tx))) {
-				skb->dev = bond_dev;
-				skb_queue_tail(&bond_ni->neigh_tx, skb);
-			}
-		}
-	}
-
-	service_neigh_queue(ni);
+	service_neigh_queue(dev, ni);
 
 	zap_completion_queue();
 }
-- 
1.7.5.4


* [PATCH 8/9] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (6 preceding siblings ...)
  2014-03-15  1:36                           ` [PATCH 7/9] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
@ 2014-03-15  1:37                           ` Eric W. Biederman
  2014-03-15  1:39                           ` [PATCH 9/9] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
  2014-03-15  2:59                           ` [PATCH 0/9] netpoll: Cleanup received packet processing David Miller
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:37 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Make rx_skb_hook and rx in struct netpoll depend on
CONFIG_NETPOLL_TRAP.  Make rx_lock, rx_np, and neigh_tx in struct
netpoll_info depend on CONFIG_NETPOLL_TRAP.

Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb
no-ops when CONFIG_NETPOLL_TRAP is not set.

Only build netpoll_neigh_reply, checksum_udp, service_neigh_queue,
pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined.

Add helper functions netpoll_trap_setup, netpoll_trap_setup_info,
netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize
and clean up the receive-specific fields of struct netpoll and
struct netpoll_info when CONFIG_NETPOLL_TRAP is enabled and do
nothing otherwise.
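
The helper-plus-stub arrangement described above can be sketched in
plain userspace C.  This is only an illustration of the pattern, not
the kernel code: DEMO_TRAP, struct demo_info, and the demo_* functions
are hypothetical names standing in for CONFIG_NETPOLL_TRAP, struct
netpoll_info, and the netpoll_trap_* helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * DEMO_TRAP is a hypothetical stand-in for CONFIG_NETPOLL_TRAP.
 * Build with -DDEMO_TRAP to compile in the receive-specific state.
 */
struct demo_info {
	int txq_len;		/* transmit state, always present */
#ifdef DEMO_TRAP
	int rx_clients;		/* receive-only state, compiled out otherwise */
#endif
};

#ifdef DEMO_TRAP
/* Real helpers: touch the receive-specific fields. */
static inline void demo_trap_setup_info(struct demo_info *info)
{
	info->rx_clients = 0;
}

static inline bool demo_rx_processing(const struct demo_info *info)
{
	return info->rx_clients > 0;
}
#else /* !DEMO_TRAP */
/* Stubs: same signatures, no work, so callers need no #ifdef. */
static inline void demo_trap_setup_info(struct demo_info *info)
{
	(void)info;
}

static inline bool demo_rx_processing(const struct demo_info *info)
{
	(void)info;
	return false;
}
#endif /* DEMO_TRAP */

/* The caller is identical in both configurations. */
static inline void demo_setup(struct demo_info *info)
{
	assert(info != NULL);
	info->txq_len = 0;
	demo_trap_setup_info(info);
}
```

Keeping the conditional compilation inside the helpers means the
setup and cleanup paths read the same whether or not the option is
enabled, and the compiler can drop the empty inline stubs entirely.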

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   73 +++++++++++++++++++++++-------------------
 net/core/netpoll.c      |   81 +++++++++++++++++++++++++++++++++++++----------
 2 files changed, 104 insertions(+), 50 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index ab9aaaff8d04..a0632af88d8b 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,32 +24,38 @@ struct netpoll {
 	struct net_device *dev;
 	char dev_name[IFNAMSIZ];
 	const char *name;
-	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
-			    int offset, int len);
 
 	union inet_addr local_ip, remote_ip;
 	bool ipv6;
 	u16 local_port, remote_port;
 	u8 remote_mac[ETH_ALEN];
 
-	struct list_head rx; /* rx_np list element */
 	struct work_struct cleanup_work;
+
+#ifdef CONFIG_NETPOLL_TRAP
+	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
+			    int offset, int len);
+	struct list_head rx; /* rx_np list element */
+#endif
 };
 
 struct netpoll_info {
 	atomic_t refcnt;
 
-	spinlock_t rx_lock;
 	struct semaphore dev_lock;
-	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
 
-	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
 	struct sk_buff_head txq;
 
 	struct delayed_work tx_work;
 
 	struct netpoll *netpoll;
 	struct rcu_head rcu;
+
+#ifdef CONFIG_NETPOLL_TRAP
+	spinlock_t rx_lock;
+	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
+	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
+#endif
 };
 
 #ifdef CONFIG_NETPOLL
@@ -68,7 +74,6 @@ int netpoll_setup(struct netpoll *np);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
 void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 			     struct net_device *dev);
 static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
@@ -82,25 +87,12 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 #ifdef CONFIG_NETPOLL_TRAP
 int netpoll_trap(void);
 void netpoll_set_trap(int trap);
+int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return !list_empty(&npinfo->rx_np);
 }
-#else
-static inline int netpoll_trap(void)
-{
-	return 0;
-}
-static inline void netpoll_set_trap(int trap)
-{
-}
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return false;
-}
-#endif
 
-#ifdef CONFIG_NETPOLL
 static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
@@ -138,6 +130,33 @@ static inline int netpoll_receive_skb(struct sk_buff *skb)
 	return 0;
 }
 
+#else
+static inline int netpoll_trap(void)
+{
+	return 0;
+}
+static inline void netpoll_set_trap(int trap)
+{
+}
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return false;
+}
+static inline bool netpoll_rx(struct sk_buff *skb)
+{
+	return false;
+}
+static inline bool netpoll_rx_on(struct sk_buff *skb)
+{
+	return false;
+}
+static inline int netpoll_receive_skb(struct sk_buff *skb)
+{
+	return 0;
+}
+#endif
+
+#ifdef CONFIG_NETPOLL
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
 	struct net_device *dev = napi->dev;
@@ -166,18 +185,6 @@ static inline bool netpoll_tx_running(struct net_device *dev)
 }
 
 #else
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	return false;
-}
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	return false;
-}
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	return 0;
-}
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
 	return NULL;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index b69bb3f1ba3f..adb5768be5a5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -48,6 +48,7 @@ static struct sk_buff_head skb_pool;
 
 #ifdef CONFIG_NETPOLL_TRAP
 static atomic_t trapped;
+static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
 #endif
 
 DEFINE_STATIC_SRCU(netpoll_srcu);
@@ -61,7 +62,6 @@ DEFINE_STATIC_SRCU(netpoll_srcu);
 	 MAX_UDP_CHUNK)
 
 static void zap_completion_queue(void);
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
 static void netpoll_async_cleanup(struct work_struct *work);
 
 static unsigned int carrier_timeout = 4;
@@ -109,6 +109,7 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
+#ifdef CONFIG_NETPOLL_TRAP
 static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 			    unsigned short ulen, __be32 saddr, __be32 daddr)
 {
@@ -127,6 +128,7 @@ static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 
 	return __skb_checksum_complete(skb);
 }
+#endif /* CONFIG_NETPOLL_TRAP */
 
 /*
  * Check whether delayed processing was scheduled for our NIC. If so,
@@ -179,6 +181,7 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
+#ifdef CONFIG_NETPOLL_TRAP
 static void service_neigh_queue(struct net_device *dev,
 				struct netpoll_info *npi)
 {
@@ -197,6 +200,12 @@ static void service_neigh_queue(struct net_device *dev,
 	while ((skb = skb_dequeue(&npi->neigh_tx)))
 		netpoll_neigh_reply(skb, npi);
 }
+#else /* !CONFIG_NETPOLL_TRAP */
+static inline void service_neigh_queue(struct net_device *dev,
+				struct netpoll_info *npi)
+{
+}
+#endif /* CONFIG_NETPOLL_TRAP */
 
 static void netpoll_poll_dev(struct net_device *dev)
 {
@@ -522,6 +531,7 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 }
 EXPORT_SYMBOL(netpoll_send_udp);
 
+#ifdef CONFIG_NETPOLL_TRAP
 static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo)
 {
 	int size, type = ARPOP_REPLY;
@@ -900,6 +910,55 @@ out:
 	return 0;
 }
 
+static void netpoll_trap_setup_info(struct netpoll_info *npinfo)
+{
+	INIT_LIST_HEAD(&npinfo->rx_np);
+	spin_lock_init(&npinfo->rx_lock);
+	skb_queue_head_init(&npinfo->neigh_tx);
+}
+
+static void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
+{
+	skb_queue_purge(&npinfo->neigh_tx);
+}
+
+static void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+	unsigned long flags;
+	if (np->rx_skb_hook) {
+		spin_lock_irqsave(&npinfo->rx_lock, flags);
+		list_add_tail(&np->rx, &npinfo->rx_np);
+		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+	}
+}
+
+static void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+	unsigned long flags;
+	if (!list_empty(&npinfo->rx_np)) {
+		spin_lock_irqsave(&npinfo->rx_lock, flags);
+		list_del(&np->rx);
+		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+	}
+}
+
+#else /* !CONFIG_NETPOLL_TRAP */
+static inline void netpoll_trap_setup_info(struct netpoll_info *npinfo)
+{
+}
+static inline void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
+{
+}
+static inline 
+void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+}
+static inline
+void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+}
+#endif /* CONFIG_NETPOLL_TRAP */
+
 void netpoll_print_options(struct netpoll *np)
 {
 	np_info(np, "local port %d\n", np->local_port);
@@ -1023,7 +1082,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 {
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
-	unsigned long flags;
 	int err;
 
 	np->dev = ndev;
@@ -1045,11 +1103,9 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		INIT_LIST_HEAD(&npinfo->rx_np);
+		netpoll_trap_setup_info(npinfo);
 
-		spin_lock_init(&npinfo->rx_lock);
 		sema_init(&npinfo->dev_lock, 1);
-		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
 
@@ -1068,11 +1124,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	npinfo->netpoll = np;
 
-	if (np->rx_skb_hook) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_add_tail(&np->rx, &npinfo->rx_np);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
+	netpoll_trap_setup(np, npinfo);
 
 	/* last thing to do is link it to the net device structure */
 	rcu_assign_pointer(ndev->npinfo, npinfo);
@@ -1222,7 +1274,7 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 	struct netpoll_info *npinfo =
 			container_of(rcu_head, struct netpoll_info, rcu);
 
-	skb_queue_purge(&npinfo->neigh_tx);
+	netpoll_trap_cleanup_info(npinfo);
 	skb_queue_purge(&npinfo->txq);
 
 	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
@@ -1238,7 +1290,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 void __netpoll_cleanup(struct netpoll *np)
 {
 	struct netpoll_info *npinfo;
-	unsigned long flags;
 
 	/* rtnl_dereference would be preferable here but
 	 * rcu_cleanup_netpoll path can put us in here safely without
@@ -1248,11 +1299,7 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!npinfo)
 		return;
 
-	if (!list_empty(&npinfo->rx_np)) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_del(&np->rx);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
+	netpoll_trap_cleanup(np, npinfo);
 
 	synchronize_srcu(&netpoll_srcu);
 
-- 
1.7.5.4


* [PATCH 9/9] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (7 preceding siblings ...)
  2014-03-15  1:37                           ` [PATCH 8/9] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
@ 2014-03-15  1:39                           ` Eric W. Biederman
  2014-03-15  2:59                           ` [PATCH 0/9] netpoll: Cleanup received packet processing David Miller
  9 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  1:39 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll packet receive code only becomes active if the netpoll
rx_skb_hook is implemented, and there is not a single implementation
of the netpoll rx_skb_hook in the kernel.

All of the out-of-tree implementations I have found call
netpoll_poll, which was removed from the kernel in 2011, so this
change should not add any additional breakage.

There are problems with the netpoll packet receive code.  __netpoll_rx
does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq
context.  netpoll_neigh_reply leaks every skb it receives.  Reception
of packets does not work successfully on stacked devices (aka bonding,
team, bridge, and vlans).

Given that the netpoll packet receive code is buggy, there are no
out-of-tree users that will be merged soon, and the code has
not been used in tree for a decade, let's just remove it.

Reverting this commit can serve as a starting point for anyone
who wants to resurrect netpoll packet reception support.

Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   84 --------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  520 +--------------------------------------------
 5 files changed, 2 insertions(+), 635 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 494b888a6568..89402c3b64f8 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -177,11 +177,6 @@ config NETCONSOLE_DYNAMIC
 config NETPOLL
 	def_bool NETCONSOLE
 
-config NETPOLL_TRAP
-	bool "Netpoll traffic trapping"
-	default n
-	depends on NETPOLL
-
 config NET_POLL_CONTROLLER
 	def_bool NETPOLL
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b8d8c805fd75..4b6d12c7b803 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1979,9 +1979,6 @@ struct net_device *__dev_get_by_index(struct net *net, int ifindex);
 struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
 int netdev_get_name(struct net *net, char *name, int ifindex);
 int dev_restart(struct net_device *dev);
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void);
-#endif
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb);
 
 static inline unsigned int skb_gro_offset(const struct sk_buff *skb)
@@ -2186,12 +2183,6 @@ static inline void netif_tx_start_all_queues(struct net_device *dev)
 
 static inline void netif_tx_wake_queue(struct netdev_queue *dev_queue)
 {
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap()) {
-		netif_tx_start_queue(dev_queue);
-		return;
-	}
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state))
 		__netif_schedule(dev_queue->qdisc);
 }
@@ -2435,10 +2426,6 @@ static inline void netif_start_subqueue(struct net_device *dev, u16 queue_index)
 static inline void netif_stop_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	netif_tx_stop_queue(txq);
 }
 
@@ -2473,10 +2460,6 @@ static inline bool netif_subqueue_stopped(const struct net_device *dev,
 static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state))
 		__netif_schedule(txq->qdisc);
 }
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index a0632af88d8b..1b475a5a7239 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -31,12 +31,6 @@ struct netpoll {
 	u8 remote_mac[ETH_ALEN];
 
 	struct work_struct cleanup_work;
-
-#ifdef CONFIG_NETPOLL_TRAP
-	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
-			    int offset, int len);
-	struct list_head rx; /* rx_np list element */
-#endif
 };
 
 struct netpoll_info {
@@ -50,12 +44,6 @@ struct netpoll_info {
 
 	struct netpoll *netpoll;
 	struct rcu_head rcu;
-
-#ifdef CONFIG_NETPOLL_TRAP
-	spinlock_t rx_lock;
-	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
-	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
-#endif
 };
 
 #ifdef CONFIG_NETPOLL
@@ -84,78 +72,6 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 	local_irq_restore(flags);
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void);
-void netpoll_set_trap(int trap);
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return !list_empty(&npinfo->rx_np);
-}
-
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
-
-	return npinfo && netpoll_rx_processing(npinfo);
-}
-
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo;
-	unsigned long flags;
-	bool ret = false;
-
-	local_irq_save(flags);
-
-	if (!netpoll_rx_on(skb))
-		goto out;
-
-	npinfo = rcu_dereference_bh(skb->dev->npinfo);
-	spin_lock(&npinfo->rx_lock);
-	/* check rx_processing again with the lock held */
-	if (netpoll_rx_processing(npinfo) && __netpoll_rx(skb, npinfo))
-		ret = true;
-	spin_unlock(&npinfo->rx_lock);
-
-out:
-	local_irq_restore(flags);
-	return ret;
-}
-
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	if (!list_empty(&skb->dev->napi_list))
-		return netpoll_rx(skb);
-	return 0;
-}
-
-#else
-static inline int netpoll_trap(void)
-{
-	return 0;
-}
-static inline void netpoll_set_trap(int trap)
-{
-}
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return false;
-}
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	return false;
-}
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	return false;
-}
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	return 0;
-}
-#endif
-
 #ifdef CONFIG_NETPOLL
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index 587f9fb85d73..55f8e64c03a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3231,10 +3231,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	/* if netpoll wants it, pretend we never saw it */
-	if (netpoll_rx(skb))
-		return NET_RX_DROP;
-
 	net_timestamp_check(netdev_tstamp_prequeue, skb);
 
 	trace_netif_rx(skb);
@@ -3520,10 +3516,6 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 
 	trace_netif_receive_skb(skb);
 
-	/* if we've gotten here through NAPI, check netpoll */
-	if (netpoll_receive_skb(skb))
-		goto out;
-
 	orig_dev = skb->dev;
 
 	skb_reset_network_header(skb);
@@ -3650,7 +3642,6 @@ drop:
 
 unlock:
 	rcu_read_unlock();
-out:
 	return ret;
 }
 
@@ -3875,7 +3866,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	int same_flow;
 	enum gro_result ret;
 
-	if (!(skb->dev->features & NETIF_F_GRO) || netpoll_rx_on(skb))
+	if (!(skb->dev->features & NETIF_F_GRO))
 		goto normal;
 
 	if (skb_is_gso(skb) || skb_has_frag_list(skb))
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index adb5768be5a5..7291dde93469 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -46,11 +46,6 @@
 
 static struct sk_buff_head skb_pool;
 
-#ifdef CONFIG_NETPOLL_TRAP
-static atomic_t trapped;
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
-#endif
-
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
 #define USEC_PER_POLL	50
@@ -109,27 +104,6 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
-			    unsigned short ulen, __be32 saddr, __be32 daddr)
-{
-	__wsum psum;
-
-	if (uh->check == 0 || skb_csum_unnecessary(skb))
-		return 0;
-
-	psum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0);
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE &&
-	    !csum_fold(csum_add(psum, skb->csum)))
-		return 0;
-
-	skb->csum = psum;
-
-	return __skb_checksum_complete(skb);
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 /*
  * Check whether delayed processing was scheduled for our NIC. If so,
  * we attempt to grab the poll lock and use ->poll() to pump the card.
@@ -140,11 +114,6 @@ static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
  * trylock here and interrupts are already disabled in the softirq
  * case. Further, we test the poll_owner to avoid recursion on UP
  * systems where the lock doesn't exist.
- *
- * In cases where there is bi-directional communications, reading only
- * one message at a time can lead to packets being dropped by the
- * network adapter, forcing superfluous retries and possibly timeouts.
- * Thus, we set our budget to greater than 1.
  */
 static int poll_one_napi(struct napi_struct *napi, int budget)
 {
@@ -181,38 +150,11 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-static void service_neigh_queue(struct net_device *dev,
-				struct netpoll_info *npi)
-{
-	struct sk_buff *skb;
-	if (dev->flags & IFF_SLAVE) {
-		struct net_device *bond_dev;
-		struct netpoll_info *bond_ni;
-
-		bond_dev = netdev_master_upper_dev_get_rcu(dev);
-		bond_ni = rcu_dereference_bh(bond_dev->npinfo);
-		while ((skb = skb_dequeue(&npi->neigh_tx))) {
-			skb->dev = bond_dev;
-			skb_queue_tail(&bond_ni->neigh_tx, skb);
-		}
-	}
-	while ((skb = skb_dequeue(&npi->neigh_tx)))
-		netpoll_neigh_reply(skb, npi);
-}
-#else /* !CONFIG_NETPOLL_TRAP */
-static inline void service_neigh_queue(struct net_device *dev,
-				struct netpoll_info *npi)
-{
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
-	bool rx_processing = netpoll_rx_processing(ni);
-	int budget = rx_processing? 16 : 0;
+	int budget = 0;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -226,9 +168,6 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 	}
 
-	if (rx_processing)
-		netpoll_set_trap(1);
-
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
 		up(&ni->dev_lock);
@@ -240,13 +179,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev, budget);
 
-	if (rx_processing)
-		netpoll_set_trap(0);
-
 	up(&ni->dev_lock);
 
-	service_neigh_queue(dev, ni);
-
 	zap_completion_queue();
 }
 
@@ -531,434 +465,6 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 }
 EXPORT_SYMBOL(netpoll_send_udp);
 
-#ifdef CONFIG_NETPOLL_TRAP
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int size, type = ARPOP_REPLY;
-	__be32 sip, tip;
-	unsigned char *sha;
-	struct sk_buff *send_skb;
-	struct netpoll *np, *tmp;
-	unsigned long flags;
-	int hlen, tlen;
-	int hits = 0, proto;
-
-	if (!netpoll_rx_processing(npinfo))
-		return;
-
-	/* Before checking the packet, we do some early
-	   inspection whether this is interesting at all */
-	spin_lock_irqsave(&npinfo->rx_lock, flags);
-	list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-		if (np->dev == skb->dev)
-			hits++;
-	}
-	spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-
-	/* No netpoll struct is using this dev */
-	if (!hits)
-		return;
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto == ETH_P_ARP) {
-		struct arphdr *arp;
-		unsigned char *arp_ptr;
-		/* No arp on this interface */
-		if (skb->dev->flags & IFF_NOARP)
-			return;
-
-		if (!pskb_may_pull(skb, arp_hdr_len(skb->dev)))
-			return;
-
-		skb_reset_network_header(skb);
-		skb_reset_transport_header(skb);
-		arp = arp_hdr(skb);
-
-		if ((arp->ar_hrd != htons(ARPHRD_ETHER) &&
-		     arp->ar_hrd != htons(ARPHRD_IEEE802)) ||
-		    arp->ar_pro != htons(ETH_P_IP) ||
-		    arp->ar_op != htons(ARPOP_REQUEST))
-			return;
-
-		arp_ptr = (unsigned char *)(arp+1);
-		/* save the location of the src hw addr */
-		sha = arp_ptr;
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&sip, arp_ptr, 4);
-		arp_ptr += 4;
-		/* If we actually cared about dst hw addr,
-		   it would get copied here */
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&tip, arp_ptr, 4);
-
-		/* Should we ignore arp? */
-		if (ipv4_is_loopback(tip) || ipv4_is_multicast(tip))
-			return;
-
-		size = arp_hdr_len(skb->dev);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (tip != np->local_ip.ip)
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			skb_reset_network_header(send_skb);
-			arp = (struct arphdr *) skb_put(send_skb, size);
-			send_skb->dev = skb->dev;
-			send_skb->protocol = htons(ETH_P_ARP);
-
-			/* Fill the device header for the ARP frame */
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_ARP,
-					    sha, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			/*
-			 * Fill out the arp protocol part.
-			 *
-			 * we only support ethernet device type,
-			 * which (according to RFC 1390) should
-			 * always equal 1 (Ethernet).
-			 */
-
-			arp->ar_hrd = htons(np->dev->type);
-			arp->ar_pro = htons(ETH_P_IP);
-			arp->ar_hln = np->dev->addr_len;
-			arp->ar_pln = 4;
-			arp->ar_op = htons(type);
-
-			arp_ptr = (unsigned char *)(arp + 1);
-			memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &tip, 4);
-			arp_ptr += 4;
-			memcpy(arp_ptr, sha, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &sip, 4);
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	} else if( proto == ETH_P_IPV6) {
-#if IS_ENABLED(CONFIG_IPV6)
-		struct nd_msg *msg;
-		u8 *lladdr = NULL;
-		struct ipv6hdr *hdr;
-		struct icmp6hdr *icmp6h;
-		const struct in6_addr *saddr;
-		const struct in6_addr *daddr;
-		struct inet6_dev *in6_dev = NULL;
-		struct in6_addr *target;
-
-		in6_dev = in6_dev_get(skb->dev);
-		if (!in6_dev || !in6_dev->cnf.accept_ra)
-			return;
-
-		if (!pskb_may_pull(skb, skb->len))
-			return;
-
-		msg = (struct nd_msg *)skb_transport_header(skb);
-
-		__skb_push(skb, skb->data - skb_transport_header(skb));
-
-		if (ipv6_hdr(skb)->hop_limit != 255)
-			return;
-		if (msg->icmph.icmp6_code != 0)
-			return;
-		if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-			return;
-
-		saddr = &ipv6_hdr(skb)->saddr;
-		daddr = &ipv6_hdr(skb)->daddr;
-
-		size = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(daddr, &np->local_ip.in6))
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			send_skb->protocol = htons(ETH_P_IPV6);
-			send_skb->dev = skb->dev;
-
-			skb_reset_network_header(send_skb);
-			hdr = (struct ipv6hdr *) skb_put(send_skb, sizeof(struct ipv6hdr));
-			*(__be32*)hdr = htonl(0x60000000);
-			hdr->payload_len = htons(size);
-			hdr->nexthdr = IPPROTO_ICMPV6;
-			hdr->hop_limit = 255;
-			hdr->saddr = *saddr;
-			hdr->daddr = *daddr;
-
-			icmp6h = (struct icmp6hdr *) skb_put(send_skb, sizeof(struct icmp6hdr));
-			icmp6h->icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT;
-			icmp6h->icmp6_router = 0;
-			icmp6h->icmp6_solicited = 1;
-
-			target = (struct in6_addr *) skb_put(send_skb, sizeof(struct in6_addr));
-			*target = msg->target;
-			icmp6h->icmp6_cksum = csum_ipv6_magic(saddr, daddr, size,
-							      IPPROTO_ICMPV6,
-							      csum_partial(icmp6h,
-									   size, 0));
-
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_IPV6,
-					    lladdr, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address, we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-#endif
-	}
-}
-
-static bool pkt_is_ns(struct sk_buff *skb)
-{
-	struct nd_msg *msg;
-	struct ipv6hdr *hdr;
-
-	if (skb->protocol != htons(ETH_P_ARP))
-		return false;
-	if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + sizeof(struct nd_msg)))
-		return false;
-
-	msg = (struct nd_msg *)skb_transport_header(skb);
-	__skb_push(skb, skb->data - skb_transport_header(skb));
-	hdr = ipv6_hdr(skb);
-
-	if (hdr->nexthdr != IPPROTO_ICMPV6)
-		return false;
-	if (hdr->hop_limit != 255)
-		return false;
-	if (msg->icmph.icmp6_code != 0)
-		return false;
-	if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-		return false;
-
-	return true;
-}
-
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int proto, len, ulen, data_len;
-	int hits = 0, offset;
-	const struct iphdr *iph;
-	struct udphdr *uh;
-	struct netpoll *np, *tmp;
-	uint16_t source;
-
-	if (!netpoll_rx_processing(npinfo))
-		goto out;
-
-	if (skb->dev->type != ARPHRD_ETHER)
-		goto out;
-
-	/* check if netpoll clients need ARP */
-	if (skb->protocol == htons(ETH_P_ARP) && netpoll_trap()) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	} else if (pkt_is_ns(skb) && netpoll_trap()) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	}
-
-	if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) {
-		skb = vlan_untag(skb);
-		if (unlikely(!skb))
-			goto out;
-	}
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto != ETH_P_IP && proto != ETH_P_IPV6)
-		goto out;
-	if (skb->pkt_type == PACKET_OTHERHOST)
-		goto out;
-	if (skb_shared(skb))
-		goto out;
-
-	if (proto == ETH_P_IP) {
-		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (iph->ihl < 5 || iph->version != 4)
-			goto out;
-		if (!pskb_may_pull(skb, iph->ihl*4))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (ip_fast_csum((u8 *)iph, iph->ihl) != 0)
-			goto out;
-
-		len = ntohs(iph->tot_len);
-		if (skb->len < len || len < iph->ihl*4)
-			goto out;
-
-		/*
-		 * Our transport medium may have padded the buffer out.
-		 * Now We trim to the true length of the frame.
-		 */
-		if (pskb_trim_rcsum(skb, len))
-			goto out;
-
-		iph = (struct iphdr *)skb->data;
-		if (iph->protocol != IPPROTO_UDP)
-			goto out;
-
-		len -= iph->ihl*4;
-		uh = (struct udphdr *)(((char *)iph) + iph->ihl*4);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-
-		if (ulen != len)
-			goto out;
-		if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (np->local_ip.ip && np->local_ip.ip != iph->daddr)
-				continue;
-			if (np->remote_ip.ip && np->remote_ip.ip != iph->saddr)
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-	} else {
-#if IS_ENABLED(CONFIG_IPV6)
-		const struct ipv6hdr *ip6h;
-
-		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = (struct ipv6hdr *)skb->data;
-		if (ip6h->version != 6)
-			goto out;
-		len = ntohs(ip6h->payload_len);
-		if (!len)
-			goto out;
-		if (len + sizeof(struct ipv6hdr) > skb->len)
-			goto out;
-		if (pskb_trim_rcsum(skb, len + sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = ipv6_hdr(skb);
-		if (!pskb_may_pull(skb, sizeof(struct udphdr)))
-			goto out;
-		uh = udp_hdr(skb);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-		if (ulen != skb->len)
-			goto out;
-		if (udp6_csum_init(skb, uh, IPPROTO_UDP))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(&np->local_ip.in6, &ip6h->daddr))
-				continue;
-			if (!ipv6_addr_equal(&np->remote_ip.in6, &ip6h->saddr))
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-#endif
-	}
-
-	if (!hits)
-		goto out;
-
-	kfree_skb(skb);
-	return 1;
-
-out:
-	if (netpoll_trap()) {
-		kfree_skb(skb);
-		return 1;
-	}
-
-	return 0;
-}
-
-static void netpoll_trap_setup_info(struct netpoll_info *npinfo)
-{
-	INIT_LIST_HEAD(&npinfo->rx_np);
-	spin_lock_init(&npinfo->rx_lock);
-	skb_queue_head_init(&npinfo->neigh_tx);
-}
-
-static void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
-{
-	skb_queue_purge(&npinfo->neigh_tx);
-}
-
-static void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-	unsigned long flags;
-	if (np->rx_skb_hook) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_add_tail(&np->rx, &npinfo->rx_np);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-}
-
-static void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-	unsigned long flags;
-	if (!list_empty(&npinfo->rx_np)) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_del(&np->rx);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-}
-
-#else /* !CONFIG_NETPOLL_TRAP */
-static inline void netpoll_trap_setup_info(struct netpoll_info *npinfo)
-{
-}
-static inline void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
-{
-}
-static inline 
-void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-}
-static inline
-void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 void netpoll_print_options(struct netpoll *np)
 {
 	np_info(np, "local port %d\n", np->local_port);
@@ -1103,8 +609,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		netpoll_trap_setup_info(npinfo);
-
 		sema_init(&npinfo->dev_lock, 1);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
@@ -1124,8 +628,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	npinfo->netpoll = np;
 
-	netpoll_trap_setup(np, npinfo);
-
 	/* last thing to do is link it to the net device structure */
 	rcu_assign_pointer(ndev->npinfo, npinfo);
 
@@ -1274,7 +776,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 	struct netpoll_info *npinfo =
 			container_of(rcu_head, struct netpoll_info, rcu);
 
-	netpoll_trap_cleanup_info(npinfo);
 	skb_queue_purge(&npinfo->txq);
 
 	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
@@ -1299,8 +800,6 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!npinfo)
 		return;
 
-	netpoll_trap_cleanup(np, npinfo);
-
 	synchronize_srcu(&netpoll_srcu);
 
 	if (atomic_dec_and_test(&npinfo->refcnt)) {
@@ -1344,20 +843,3 @@ out:
 	rtnl_unlock();
 }
 EXPORT_SYMBOL(netpoll_cleanup);
-
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void)
-{
-	return atomic_read(&trapped);
-}
-EXPORT_SYMBOL(netpoll_trap);
-
-void netpoll_set_trap(int trap)
-{
-	if (trap)
-		atomic_inc(&trapped);
-	else
-		atomic_dec(&trapped);
-}
-EXPORT_SYMBOL(netpoll_set_trap);
-#endif /* CONFIG_NETPOLL_TRAP */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq
  2014-03-14  4:26                               ` [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq Eric W. Biederman
@ 2014-03-15  2:41                                 ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-15  2:41 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, romieu, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Thu, 13 Mar 2014 21:26:42 -0700

> Replace the bh safe variant with the hard irq safe variant.
> 
> We need a hard irq safe variant to deal with netpoll transmitting
> packets from hard irq context, and we need it in most if not all of
> the places using the bh safe variant.  
> 
> Except on 32bit uni-processor the code is exactly the same so don't
> bother with a bh variant, just have a hard irq safe variant that
> everyone can use.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

This looks great.

I'm going to apply this to the net-next tree, I hope the block folks
don't mind.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 0/16] Don't receive packets when the napi budget == 0
  2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
                                           ` (15 preceding siblings ...)
  2014-03-15  1:11                         ` [PATCH net-next 16/16] sfc: " Eric W. Biederman
@ 2014-03-15  2:54                         ` David Miller
  16 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-15  2:54 UTC (permalink / raw)
  To: ebiederm; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 14 Mar 2014 17:56:47 -0700

> After reading through all 120 drivers supporting netpoll I have found 16
> more that process at least one received packet when the napi budget == 0.
> 
> Processing more packets than your budget has always been a bug but
> we haven't cared before so it looks like these drivers slipped through,
> and need fixes.
> 
> As netpoll will shortly be using a budget of 0 to get the tx queue
> processing without the rx queue processing, we now care.

Series applied, thanks Eric.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 0/9] netpoll: Cleanup received packet processing
  2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
                                             ` (8 preceding siblings ...)
  2014-03-15  1:39                           ` [PATCH 9/9] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
@ 2014-03-15  2:59                           ` David Miller
  2014-03-15  3:39                             ` Eric W. Biederman
  9 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-15  2:59 UTC (permalink / raw)
  To: ebiederm
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 14 Mar 2014 18:30:14 -0700

> This is the long-winded, careful, and polite version of removing the netpoll
> receive packet processing.
> 
> First I untangle the code in small steps.  Then I modify the code to not
> force reception and dropping of packets when we are transmitting a packet
> with netpoll.  Finally I move all of the packet reception under
> CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.
> 
> If someone wants to do a stable backport it would take backporting
> the first 18 patches that handle the budget == 0 in the networking
> drivers, and the first 5 of these patches.
> 
> If anyone wants to resurrect netpoll packet reception someday it should
> just be a matter of reverting the last patch.

This looks great, but it doesn't apply cleanly to net-next, please
respin.

Thanks!

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 0/9] netpoll: Cleanup received packet processing
  2014-03-15  2:59                           ` [PATCH 0/9] netpoll: Cleanup received packet processing David Miller
@ 2014-03-15  3:39                             ` Eric W. Biederman
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:39 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Fri, 14 Mar 2014 18:30:14 -0700
>
>> This is the long-winded, careful, and polite version of removing the netpoll
>> receive packet processing.
>> 
>> First I untangle the code in small steps.  Then I modify the code to not
>> force reception and dropping of packets when we are transmitting a packet
>> with netpoll.  Finally I move all of the packet reception under
>> CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.
>> 
>> If someone wants to do a stable backport it would take backporting
>> the first 18 patches that handle the budget == 0 in the networking
>> drivers, and the first 5 of these patches.
>> 
>> If anyone wants to resurrect netpoll packet reception someday it should
>> just be a matter of reverting the last patch.
>
> This looks great, but it doesn't apply cleanly to net-next, please
> respin.

Doh!  It looks like I dropped the first patch by accident.
Resend coming up.


Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 00/10] netpoll: Cleanup received packet processing
  2014-03-15  3:39                             ` Eric W. Biederman
@ 2014-03-15  3:43                               ` Eric W. Biederman
  2014-03-15  3:44                                 ` [PATCH 01/10] netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev Eric W. Biederman
                                                   ` (10 more replies)
  0 siblings, 11 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:43 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This is the long-winded, careful, and polite version of removing the netpoll
receive packet processing.

First I untangle the code in small steps.  Then I modify the code to not
force reception and dropping of packets when we are transmitting a packet
with netpoll.  Finally I move all of the packet reception under
CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.

If someone wants to do a stable backport of these patches, it would
require backporting the first 18 patches that handle the budget == 0 in
the networking drivers, and the first 6 of these patches.

If anyone wants to resurrect netpoll packet reception someday it should
just be a matter of reverting the last patch.

Eric W. Biederman (10):
      netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev
      netpoll: Pass budget into poll_napi
      netpoll: Visit all napi handlers in poll_napi
      netpoll: Warn if more packets are processed than are budgeted
      netpoll: Add netpoll_rx_processing
      netpoll: Don't drop all received packets.
      netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
      netpoll: Consolidate neigh_tx processing in service_neigh_queue
      netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
      netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)

 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   61 ------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  492 +--------------------------------------------
 5 files changed, 7 insertions(+), 579 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 01/10] netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
@ 2014-03-15  3:44                                 ` Eric W. Biederman
  2014-03-15  3:45                                 ` [PATCH 02/10] netpoll: Pass budget into poll_napi Eric W. Biederman
                                                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:44 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Today netpoll depends on setting NETPOLL_RX_DROP before networking
drivers receive packets in interrupt context so that the packets can
be dropped.  Move this setting into netpoll_poll_dev from
poll_one_napi so that if ndo_poll_controller happens to receive
packets we will drop the packets on the floor instead of letting the
packets bounce through the networking stack and potentially cause problems.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a664f7829a6d..ef4f45df539f 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -144,8 +144,7 @@ static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
  * network adapter, forcing superfluous retries and possibly timeouts.
  * Thus, we set our budget to greater than 1.
  */
-static int poll_one_napi(struct netpoll_info *npinfo,
-			 struct napi_struct *napi, int budget)
+static int poll_one_napi(struct napi_struct *napi, int budget)
 {
 	int work;
 
@@ -156,16 +155,12 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
 		return budget;
 
-	npinfo->rx_flags |= NETPOLL_RX_DROP;
-	atomic_inc(&trapped);
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
 	work = napi->poll(napi, budget);
 	trace_napi_poll(napi);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
-	atomic_dec(&trapped);
-	npinfo->rx_flags &= ~NETPOLL_RX_DROP;
 
 	return budget - work;
 }
@@ -178,8 +173,7 @@ static void poll_napi(struct net_device *dev)
 	list_for_each_entry(napi, &dev->napi_list, dev_list) {
 		if (napi->poll_owner != smp_processor_id() &&
 		    spin_trylock(&napi->poll_lock)) {
-			budget = poll_one_napi(rcu_dereference_bh(dev->npinfo),
-					       napi, budget);
+			budget = poll_one_napi(napi, budget);
 			spin_unlock(&napi->poll_lock);
 
 			if (!budget)
@@ -215,6 +209,9 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 	}
 
+	ni->rx_flags |= NETPOLL_RX_DROP;
+	atomic_inc(&trapped);
+
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
 		up(&ni->dev_lock);
@@ -226,6 +223,9 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	atomic_dec(&trapped);
+	ni->rx_flags &= ~NETPOLL_RX_DROP;
+
 	up(&ni->dev_lock);
 
 	if (dev->flags & IFF_SLAVE) {
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 02/10] netpoll: Pass budget into poll_napi
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
  2014-03-15  3:44                                 ` [PATCH 01/10] netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev Eric W. Biederman
@ 2014-03-15  3:45                                 ` Eric W. Biederman
  2014-03-15  3:45                                 ` [PATCH 03/10] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
                                                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:45 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This moves the control logic to the top level in netpoll_poll_dev
instead of having it dispersed throughout netpoll_poll_dev,
poll_napi and poll_one_napi.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ef4f45df539f..147c75855c9b 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -165,10 +165,9 @@ static int poll_one_napi(struct napi_struct *napi, int budget)
 	return budget - work;
 }
 
-static void poll_napi(struct net_device *dev)
+static void poll_napi(struct net_device *dev, int budget)
 {
 	struct napi_struct *napi;
-	int budget = 16;
 
 	list_for_each_entry(napi, &dev->napi_list, dev_list) {
 		if (napi->poll_owner != smp_processor_id() &&
@@ -196,6 +195,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
+	int budget = 16;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -221,7 +221,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	/* Process pending work on NIC */
 	ops->ndo_poll_controller(dev);
 
-	poll_napi(dev);
+	poll_napi(dev, budget);
 
 	atomic_dec(&trapped);
 	ni->rx_flags &= ~NETPOLL_RX_DROP;
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 03/10] netpoll: Visit all napi handlers in poll_napi
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
  2014-03-15  3:44                                 ` [PATCH 01/10] netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev Eric W. Biederman
  2014-03-15  3:45                                 ` [PATCH 02/10] netpoll: Pass budget into poll_napi Eric W. Biederman
@ 2014-03-15  3:45                                 ` Eric W. Biederman
  2014-03-15  3:47                                 ` [PATCH 04/10] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
                                                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:45 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


In poll_napi, loop through all of the napi handlers even when the
budget falls to 0 to ensure that we process all of the tx_queues, and
so that we continue to call into drivers when our initial budget is 0.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 147c75855c9b..d9e3d74ec9ac 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -174,9 +174,6 @@ static void poll_napi(struct net_device *dev, int budget)
 		    spin_trylock(&napi->poll_lock)) {
 			budget = poll_one_napi(napi, budget);
 			spin_unlock(&napi->poll_lock);
-
-			if (!budget)
-				break;
 		}
 	}
 }
-- 
1.7.5.4
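
The effect of dropping the early break can be sketched in userspace C.
This is a toy model, not kernel code: struct toy_napi, toy_poll and
toy_poll_all are hypothetical stand-ins for napi_struct, a driver's poll
routine and poll_napi.  It assumes a well-behaved driver that reclaims tx
completions regardless of budget and bounds its rx work by the budget.

```c
#include <stddef.h>

/* Toy stand-in for one napi handler; not the kernel's struct napi_struct. */
struct toy_napi {
	int rx_pending;	/* packets waiting on the rx ring */
	int tx_pending;	/* finished transmits waiting to be reclaimed */
	int visited;	/* number of times the poll routine ran */
};

/*
 * Model of a well-behaved poll routine: tx reclaim is not charged
 * against the budget, rx processing never exceeds it.
 */
static int toy_poll(struct toy_napi *napi, int budget)
{
	int work = 0;

	napi->visited++;
	napi->tx_pending = 0;

	while (work < budget && napi->rx_pending > 0) {
		napi->rx_pending--;
		work++;
	}
	return work;
}

/* Visit every handler, even once the remaining budget has reached 0. */
static void toy_poll_all(struct toy_napi *list, size_t count, int budget)
{
	size_t i;

	for (i = 0; i < count; i++)
		budget -= toy_poll(&list[i], budget);
}
```

In this model a budget of 0 still visits every handler and reclaims its
tx completions, while the rx rings are left untouched.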

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 04/10] netpoll: Warn if more packets are processed than are budgeted
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (2 preceding siblings ...)
  2014-03-15  3:45                                 ` [PATCH 03/10] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
@ 2014-03-15  3:47                                 ` Eric W. Biederman
  2014-03-15  3:47                                 ` [PATCH 05/10] netpoll: Add netpoll_rx_processing Eric W. Biederman
                                                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:47 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


There is already a warning for this case in the normal netpoll path,
but put a copy here in case the way netpoll calls the poll functions
causes a different result.

netpoll will shortly call the napi poll routine with a budget of 0 to
avoid any rx packets being processed.  As nothing does that today,
we may encounter drivers that have problems, so a netpoll-specific
warning seems desirable.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index d9e3d74ec9ac..2ad330e02967 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -158,6 +158,7 @@ static int poll_one_napi(struct napi_struct *napi, int budget)
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
 	work = napi->poll(napi, budget);
+	WARN_ONCE(work > budget, "%pF exceeded budget in poll\n", napi->poll);
 	trace_napi_poll(napi);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
-- 
1.7.5.4
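
The kind of driver bug this warning is meant to catch can be illustrated
with a userspace sketch.  Both poll routines below are hypothetical and
are not taken from any driver touched by this thread; toy_exceeded_budget
only mirrors the work > budget condition of the WARN_ONCE in the patch.

```c
/*
 * Hypothetical buggy poll loop: the do/while body runs before the budget
 * is checked, so one packet is received even when budget == 0.
 */
static int buggy_poll(int *rx_pending, int budget)
{
	int work = 0;

	do {
		if (*rx_pending == 0)
			break;
		(*rx_pending)--;
		work++;
	} while (work < budget);

	return work;
}

/* Fixed variant: the budget is checked before any packet is touched. */
static int fixed_poll(int *rx_pending, int budget)
{
	int work = 0;

	while (work < budget && *rx_pending > 0) {
		(*rx_pending)--;
		work++;
	}
	return work;
}

/* Same condition as the WARN_ONCE in poll_one_napi: work > budget. */
static int toy_exceeded_budget(int work, int budget)
{
	return work > budget;
}
```

With a budget of 0 the buggy loop reports one unit of work and would trip
the warning, while the fixed loop reports none.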

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 05/10] netpoll: Add netpoll_rx_processing
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (3 preceding siblings ...)
  2014-03-15  3:47                                 ` [PATCH 04/10] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
@ 2014-03-15  3:47                                 ` Eric W. Biederman
  2014-03-15  3:48                                 ` [PATCH 06/10] netpoll: Don't drop all received packets Eric W. Biederman
                                                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:47 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Add a helper netpoll_rx_processing that reports when netpoll has
receive side processing to perform.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   18 ++++++++++++++----
 net/core/netpoll.c      |    4 ++--
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index fbfdb9d8d3a7..479d15c97770 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -82,14 +82,24 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 	local_irq_restore(flags);
 }
 
-
+#ifdef CONFIG_NETPOLL_TRAP
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return !list_empty(&npinfo->rx_np);
+}
+#else
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return false;
+}
+#endif
 
 #ifdef CONFIG_NETPOLL
 static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
+	return npinfo && (netpoll_rx_processing(npinfo) || npinfo->rx_flags);
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
@@ -105,8 +115,8 @@ static inline bool netpoll_rx(struct sk_buff *skb)
 
 	npinfo = rcu_dereference_bh(skb->dev->npinfo);
 	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
+	/* check rx_processing again with the lock held */
+	if (netpoll_rx_processing(npinfo) && __netpoll_rx(skb, npinfo))
 		ret = true;
 	spin_unlock(&npinfo->rx_lock);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 2ad330e02967..ef83a2530e98 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -538,7 +538,7 @@ static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo
 	int hlen, tlen;
 	int hits = 0, proto;
 
-	if (list_empty(&npinfo->rx_np))
+	if (!netpoll_rx_processing(npinfo))
 		return;
 
 	/* Before checking the packet, we do some early
@@ -770,7 +770,7 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 	struct netpoll *np, *tmp;
 	uint16_t source;
 
-	if (list_empty(&npinfo->rx_np))
+	if (!netpoll_rx_processing(npinfo))
 		goto out;
 
 	if (skb->dev->type != ARPHRD_ETHER)
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 06/10] netpoll: Don't drop all received packets.
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (4 preceding siblings ...)
  2014-03-15  3:47                                 ` [PATCH 05/10] netpoll: Add netpoll_rx_processing Eric W. Biederman
@ 2014-03-15  3:48                                 ` Eric W. Biederman
  2014-03-15  3:49                                 ` [PATCH 07/10] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
                                                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:48 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Change the strategy of netpoll from dropping all packets received
during netpoll_poll_dev to calling napi poll with a budget of 0
(to avoid processing the driver's rx queue), and to ignoring packets
received with netif_rx (those will safely be placed on the backlog queue).

All of the netpoll supporting drivers have been reviewed to ensure
either that they use netif_rx or that a budget of 0 is supported by their
napi poll routine and that a budget of 0 will not process the driver's
rx queues.

Not dropping packets makes NETPOLL_RX_DROP unnecessary so it is removed.

npinfo->rx_flags is removed as rx_flags with just the NETPOLL_RX_ENABLED
flag becomes just a redundant mirror of list_empty(&npinfo->rx_np).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |    3 +--
 net/core/netpoll.c      |   17 ++++++-----------
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 479d15c97770..154f9776056c 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -39,7 +39,6 @@ struct netpoll {
 struct netpoll_info {
 	atomic_t refcnt;
 
-	unsigned long rx_flags;
 	spinlock_t rx_lock;
 	struct semaphore dev_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
@@ -99,7 +98,7 @@ static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (netpoll_rx_processing(npinfo) || npinfo->rx_flags);
+	return npinfo && netpoll_rx_processing(npinfo);
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ef83a2530e98..793dc04d2f19 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -51,8 +51,6 @@ static atomic_t trapped;
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -193,7 +191,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
-	int budget = 16;
+	bool rx_processing = netpoll_rx_processing(ni);
+	int budget = rx_processing? 16 : 0;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -207,8 +206,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 	}
 
-	ni->rx_flags |= NETPOLL_RX_DROP;
-	atomic_inc(&trapped);
+	if (rx_processing)
+		atomic_inc(&trapped);
 
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
@@ -221,8 +220,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev, budget);
 
-	atomic_dec(&trapped);
-	ni->rx_flags &= ~NETPOLL_RX_DROP;
+	if (rx_processing)
+		atomic_dec(&trapped);
 
 	up(&ni->dev_lock);
 
@@ -1050,7 +1049,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
@@ -1076,7 +1074,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	if (np->rx_skb_hook) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
 		list_add_tail(&np->rx, &npinfo->rx_np);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
@@ -1258,8 +1255,6 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!list_empty(&npinfo->rx_np)) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
 		list_del(&np->rx);
-		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
-- 
1.7.5.4
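
As a rough userspace illustration of why rx_flags became redundant and
how the budget is now chosen, the sketch below re-implements a minimal
circular list in the style of the kernel's list_head.  All toy_ names are
hypothetical and are not the kernel API; the model corresponds to the
CONFIG_NETPOLL_TRAP=y variant of netpoll_rx_processing and derives both
the "rx processing" state and the poll budget from list emptiness alone.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct toy_list {
	struct toy_list *next, *prev;
};

static void toy_list_init(struct toy_list *head)
{
	head->next = head;
	head->prev = head;
}

static bool toy_list_empty(const struct toy_list *head)
{
	return head->next == head;
}

static void toy_list_add_tail(struct toy_list *entry, struct toy_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void toy_list_del(struct toy_list *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Mirrors netpoll_rx_processing(): rx work exists iff a hook is registered. */
static bool toy_rx_processing(const struct toy_list *rx_np)
{
	return !toy_list_empty(rx_np);
}

/* Mirrors the budget choice in netpoll_poll_dev(): 16 with rx hooks, else 0. */
static int toy_budget(const struct toy_list *rx_np)
{
	return toy_rx_processing(rx_np) ? 16 : 0;
}
```

Because the list itself already records whether any rx_skb_hook is
registered, a separate enabled flag carries no extra information.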

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 07/10] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (5 preceding siblings ...)
  2014-03-15  3:48                                 ` [PATCH 06/10] netpoll: Don't drop all received packets Eric W. Biederman
@ 2014-03-15  3:49                                 ` Eric W. Biederman
  2014-03-15  3:50                                 ` [PATCH 08/10] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
                                                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:49 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Now that we no longer need to receive packets to safely drain the
network driver's receive queue, move netpoll_trap and netpoll_set_trap
under CONFIG_NETPOLL_TRAP.

Make netpoll_trap and netpoll_set_trap no-op inline functions
when CONFIG_NETPOLL_TRAP is not set.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   11 +++++++++--
 net/core/netpoll.c      |   14 +++++++++-----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 154f9776056c..ab9aaaff8d04 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -65,8 +65,6 @@ void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
 int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp);
 int netpoll_setup(struct netpoll *np);
-int netpoll_trap(void);
-void netpoll_set_trap(int trap);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
@@ -82,11 +80,20 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 }
 
 #ifdef CONFIG_NETPOLL_TRAP
+int netpoll_trap(void);
+void netpoll_set_trap(int trap);
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return !list_empty(&npinfo->rx_np);
 }
 #else
+static inline int netpoll_trap(void)
+{
+	return 0;
+}
+static inline void netpoll_set_trap(int trap)
+{
+}
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return false;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 793dc04d2f19..0e45835f1737 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -46,7 +46,9 @@
 
 static struct sk_buff_head skb_pool;
 
+#ifdef CONFIG_NETPOLL_TRAP
 static atomic_t trapped;
+#endif
 
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
@@ -207,7 +209,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	}
 
 	if (rx_processing)
-		atomic_inc(&trapped);
+		netpoll_set_trap(1);
 
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
@@ -221,7 +223,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	poll_napi(dev, budget);
 
 	if (rx_processing)
-		atomic_dec(&trapped);
+		netpoll_set_trap(0);
 
 	up(&ni->dev_lock);
 
@@ -776,10 +778,10 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 		goto out;
 
 	/* check if netpoll clients need ARP */
-	if (skb->protocol == htons(ETH_P_ARP) && atomic_read(&trapped)) {
+	if (skb->protocol == htons(ETH_P_ARP) && netpoll_trap()) {
 		skb_queue_tail(&npinfo->neigh_tx, skb);
 		return 1;
-	} else if (pkt_is_ns(skb) && atomic_read(&trapped)) {
+	} else if (pkt_is_ns(skb) && netpoll_trap()) {
 		skb_queue_tail(&npinfo->neigh_tx, skb);
 		return 1;
 	}
@@ -896,7 +898,7 @@ int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
 	return 1;
 
 out:
-	if (atomic_read(&trapped)) {
+	if (netpoll_trap()) {
 		kfree_skb(skb);
 		return 1;
 	}
@@ -1302,6 +1304,7 @@ out:
 }
 EXPORT_SYMBOL(netpoll_cleanup);
 
+#ifdef CONFIG_NETPOLL_TRAP
 int netpoll_trap(void)
 {
 	return atomic_read(&trapped);
@@ -1316,3 +1319,4 @@ void netpoll_set_trap(int trap)
 		atomic_dec(&trapped);
 }
 EXPORT_SYMBOL(netpoll_set_trap);
+#endif /* CONFIG_NETPOLL_TRAP */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 08/10] netpoll: Consolidate neigh_tx processing in service_neigh_queue
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (6 preceding siblings ...)
  2014-03-15  3:49                                 ` [PATCH 07/10] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
@ 2014-03-15  3:50                                 ` Eric W. Biederman
  2014-03-15  3:50                                 ` [PATCH 09/10] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
                                                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:50 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Move the bond slave device neigh_tx handling into service_neigh_queue.

In connection with neigh_tx processing, remove unnecessary tests for
a NULL netpoll_info, as netpoll_poll_dev has already used, and thus
verified the existence of, the netpoll_info.
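
The consolidated control flow can be illustrated with a small userspace
sketch.  It is not the kernel code: struct demo_queue, struct demo_dev
and the demo_* helpers are hypothetical stand-ins for sk_buff_head,
net_device and the skb queue helpers, and the NULL check on the master
is an addition for the sketch only.  A slave first hands its queued
requests to its master, then whatever is left on the device's own queue
is answered.

```c
#include <stddef.h>

#define DEMO_MAX 8

/* Hypothetical fixed-size FIFO standing in for an sk_buff_head. */
struct demo_queue {
	int items[DEMO_MAX];
	size_t head, len;
};

struct demo_dev {
	int is_slave;			/* stands in for dev->flags & IFF_SLAVE */
	struct demo_dev *master;	/* stands in for the bond device */
	struct demo_queue neigh_tx;	/* pending neighbour requests */
	int replies;			/* requests answered on this device */
};

/* Append an item; returns -1 if the queue is full. */
static int demo_enqueue(struct demo_queue *q, int item)
{
	if (q->len == DEMO_MAX)
		return -1;
	q->items[(q->head + q->len) % DEMO_MAX] = item;
	q->len++;
	return 0;
}

/* Pop the oldest item; returns 0 if the queue is empty. */
static int demo_dequeue(struct demo_queue *q, int *item)
{
	if (q->len == 0)
		return 0;
	*item = q->items[q->head];
	q->head = (q->head + 1) % DEMO_MAX;
	q->len--;
	return 1;
}

static void demo_service_neigh_queue(struct demo_dev *dev)
{
	int item;

	/* A slave forwards everything to its master's queue first. */
	if (dev->is_slave && dev->master) {
		while (demo_dequeue(&dev->neigh_tx, &item))
			demo_enqueue(&dev->master->neigh_tx, item);
	}
	/* Anything still queued on this device is answered here. */
	while (demo_dequeue(&dev->neigh_tx, &item))
		dev->replies++;
}
```

In this sketch a slave never answers directly: its queue is empty by the
time the second loop runs, so the replies happen when the master is
serviced.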

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |   38 ++++++++++++++++----------------------
 1 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 0e45835f1737..b69bb3f1ba3f 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -179,14 +179,23 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
-static void service_neigh_queue(struct netpoll_info *npi)
+static void service_neigh_queue(struct net_device *dev,
+				struct netpoll_info *npi)
 {
-	if (npi) {
-		struct sk_buff *skb;
-
-		while ((skb = skb_dequeue(&npi->neigh_tx)))
-			netpoll_neigh_reply(skb, npi);
+	struct sk_buff *skb;
+	if (dev->flags & IFF_SLAVE) {
+		struct net_device *bond_dev;
+		struct netpoll_info *bond_ni;
+
+		bond_dev = netdev_master_upper_dev_get_rcu(dev);
+		bond_ni = rcu_dereference_bh(bond_dev->npinfo);
+		while ((skb = skb_dequeue(&npi->neigh_tx))) {
+			skb->dev = bond_dev;
+			skb_queue_tail(&bond_ni->neigh_tx, skb);
+		}
 	}
+	while ((skb = skb_dequeue(&npi->neigh_tx)))
+		netpoll_neigh_reply(skb, npi);
 }
 
 static void netpoll_poll_dev(struct net_device *dev)
@@ -227,22 +236,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	up(&ni->dev_lock);
 
-	if (dev->flags & IFF_SLAVE) {
-		if (ni) {
-			struct net_device *bond_dev;
-			struct sk_buff *skb;
-			struct netpoll_info *bond_ni;
-
-			bond_dev = netdev_master_upper_dev_get_rcu(dev);
-			bond_ni = rcu_dereference_bh(bond_dev->npinfo);
-			while ((skb = skb_dequeue(&ni->neigh_tx))) {
-				skb->dev = bond_dev;
-				skb_queue_tail(&bond_ni->neigh_tx, skb);
-			}
-		}
-	}
-
-	service_neigh_queue(ni);
+	service_neigh_queue(dev, ni);
 
 	zap_completion_queue();
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 09/10] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (7 preceding siblings ...)
  2014-03-15  3:50                                 ` [PATCH 08/10] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
@ 2014-03-15  3:50                                 ` Eric W. Biederman
  2014-03-15  3:51                                 ` [PATCH 10/10] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
  2014-03-17 19:49                                 ` [PATCH 00/10] netpoll: Cleanup received packet processing David Miller
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:50 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma



Make rx_skb_hook and rx in struct netpoll depend on
CONFIG_NETPOLL_TRAP.  Make rx_lock, rx_np, and neigh_tx in struct
netpoll_info depend on CONFIG_NETPOLL_TRAP.

Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb
no-ops when CONFIG_NETPOLL_TRAP is not set.

Only build netpoll_neigh_reply, checksum_udp, service_neigh_queue,
pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined.

Add helper functions netpoll_trap_setup, netpoll_trap_setup_info,
netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize
and clean up the receive-specific fields of struct netpoll and struct
netpoll_info when CONFIG_NETPOLL_TRAP is enabled and do nothing
otherwise.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |   73 +++++++++++++++++++++++-------------------
 net/core/netpoll.c      |   81 +++++++++++++++++++++++++++++++++++++----------
 2 files changed, 104 insertions(+), 50 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index ab9aaaff8d04..a0632af88d8b 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,32 +24,38 @@ struct netpoll {
 	struct net_device *dev;
 	char dev_name[IFNAMSIZ];
 	const char *name;
-	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
-			    int offset, int len);
 
 	union inet_addr local_ip, remote_ip;
 	bool ipv6;
 	u16 local_port, remote_port;
 	u8 remote_mac[ETH_ALEN];
 
-	struct list_head rx; /* rx_np list element */
 	struct work_struct cleanup_work;
+
+#ifdef CONFIG_NETPOLL_TRAP
+	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
+			    int offset, int len);
+	struct list_head rx; /* rx_np list element */
+#endif
 };
 
 struct netpoll_info {
 	atomic_t refcnt;
 
-	spinlock_t rx_lock;
 	struct semaphore dev_lock;
-	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
 
-	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
 	struct sk_buff_head txq;
 
 	struct delayed_work tx_work;
 
 	struct netpoll *netpoll;
 	struct rcu_head rcu;
+
+#ifdef CONFIG_NETPOLL_TRAP
+	spinlock_t rx_lock;
+	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
+	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
+#endif
 };
 
 #ifdef CONFIG_NETPOLL
@@ -68,7 +74,6 @@ int netpoll_setup(struct netpoll *np);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
 void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 			     struct net_device *dev);
 static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
@@ -82,25 +87,12 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 #ifdef CONFIG_NETPOLL_TRAP
 int netpoll_trap(void);
 void netpoll_set_trap(int trap);
+int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
 static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
 {
 	return !list_empty(&npinfo->rx_np);
 }
-#else
-static inline int netpoll_trap(void)
-{
-	return 0;
-}
-static inline void netpoll_set_trap(int trap)
-{
-}
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return false;
-}
-#endif
 
-#ifdef CONFIG_NETPOLL
 static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
@@ -138,6 +130,33 @@ static inline int netpoll_receive_skb(struct sk_buff *skb)
 	return 0;
 }
 
+#else
+static inline int netpoll_trap(void)
+{
+	return 0;
+}
+static inline void netpoll_set_trap(int trap)
+{
+}
+static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
+{
+	return false;
+}
+static inline bool netpoll_rx(struct sk_buff *skb)
+{
+	return false;
+}
+static inline bool netpoll_rx_on(struct sk_buff *skb)
+{
+	return false;
+}
+static inline int netpoll_receive_skb(struct sk_buff *skb)
+{
+	return 0;
+}
+#endif
+
+#ifdef CONFIG_NETPOLL
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
 	struct net_device *dev = napi->dev;
@@ -166,18 +185,6 @@ static inline bool netpoll_tx_running(struct net_device *dev)
 }
 
 #else
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	return false;
-}
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	return false;
-}
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	return 0;
-}
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
 	return NULL;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index b69bb3f1ba3f..adb5768be5a5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -48,6 +48,7 @@ static struct sk_buff_head skb_pool;
 
 #ifdef CONFIG_NETPOLL_TRAP
 static atomic_t trapped;
+static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
 #endif
 
 DEFINE_STATIC_SRCU(netpoll_srcu);
@@ -61,7 +62,6 @@ DEFINE_STATIC_SRCU(netpoll_srcu);
 	 MAX_UDP_CHUNK)
 
 static void zap_completion_queue(void);
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
 static void netpoll_async_cleanup(struct work_struct *work);
 
 static unsigned int carrier_timeout = 4;
@@ -109,6 +109,7 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
+#ifdef CONFIG_NETPOLL_TRAP
 static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 			    unsigned short ulen, __be32 saddr, __be32 daddr)
 {
@@ -127,6 +128,7 @@ static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 
 	return __skb_checksum_complete(skb);
 }
+#endif /* CONFIG_NETPOLL_TRAP */
 
 /*
  * Check whether delayed processing was scheduled for our NIC. If so,
@@ -179,6 +181,7 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
+#ifdef CONFIG_NETPOLL_TRAP
 static void service_neigh_queue(struct net_device *dev,
 				struct netpoll_info *npi)
 {
@@ -197,6 +200,12 @@ static void service_neigh_queue(struct net_device *dev,
 	while ((skb = skb_dequeue(&npi->neigh_tx)))
 		netpoll_neigh_reply(skb, npi);
 }
+#else /* !CONFIG_NETPOLL_TRAP */
+static inline void service_neigh_queue(struct net_device *dev,
+				struct netpoll_info *npi)
+{
+}
+#endif /* CONFIG_NETPOLL_TRAP */
 
 static void netpoll_poll_dev(struct net_device *dev)
 {
@@ -522,6 +531,7 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 }
 EXPORT_SYMBOL(netpoll_send_udp);
 
+#ifdef CONFIG_NETPOLL_TRAP
 static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo)
 {
 	int size, type = ARPOP_REPLY;
@@ -900,6 +910,55 @@ out:
 	return 0;
 }
 
+static void netpoll_trap_setup_info(struct netpoll_info *npinfo)
+{
+	INIT_LIST_HEAD(&npinfo->rx_np);
+	spin_lock_init(&npinfo->rx_lock);
+	skb_queue_head_init(&npinfo->neigh_tx);
+}
+
+static void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
+{
+	skb_queue_purge(&npinfo->neigh_tx);
+}
+
+static void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+	unsigned long flags;
+	if (np->rx_skb_hook) {
+		spin_lock_irqsave(&npinfo->rx_lock, flags);
+		list_add_tail(&np->rx, &npinfo->rx_np);
+		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+	}
+}
+
+static void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+	unsigned long flags;
+	if (!list_empty(&npinfo->rx_np)) {
+		spin_lock_irqsave(&npinfo->rx_lock, flags);
+		list_del(&np->rx);
+		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
+	}
+}
+
+#else /* !CONFIG_NETPOLL_TRAP */
+static inline void netpoll_trap_setup_info(struct netpoll_info *npinfo)
+{
+}
+static inline void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
+{
+}
+static inline 
+void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+}
+static inline
+void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
+{
+}
+#endif /* CONFIG_NETPOLL_TRAP */
+
 void netpoll_print_options(struct netpoll *np)
 {
 	np_info(np, "local port %d\n", np->local_port);
@@ -1023,7 +1082,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 {
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
-	unsigned long flags;
 	int err;
 
 	np->dev = ndev;
@@ -1045,11 +1103,9 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		INIT_LIST_HEAD(&npinfo->rx_np);
+		netpoll_trap_setup_info(npinfo);
 
-		spin_lock_init(&npinfo->rx_lock);
 		sema_init(&npinfo->dev_lock, 1);
-		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
 
@@ -1068,11 +1124,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	npinfo->netpoll = np;
 
-	if (np->rx_skb_hook) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_add_tail(&np->rx, &npinfo->rx_np);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
+	netpoll_trap_setup(np, npinfo);
 
 	/* last thing to do is link it to the net device structure */
 	rcu_assign_pointer(ndev->npinfo, npinfo);
@@ -1222,7 +1274,7 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 	struct netpoll_info *npinfo =
 			container_of(rcu_head, struct netpoll_info, rcu);
 
-	skb_queue_purge(&npinfo->neigh_tx);
+	netpoll_trap_cleanup_info(npinfo);
 	skb_queue_purge(&npinfo->txq);
 
 	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
@@ -1238,7 +1290,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 void __netpoll_cleanup(struct netpoll *np)
 {
 	struct netpoll_info *npinfo;
-	unsigned long flags;
 
 	/* rtnl_dereference would be preferable here but
 	 * rcu_cleanup_netpoll path can put us in here safely without
@@ -1248,11 +1299,7 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!npinfo)
 		return;
 
-	if (!list_empty(&npinfo->rx_np)) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_del(&np->rx);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
+	netpoll_trap_cleanup(np, npinfo);
 
 	synchronize_srcu(&netpoll_srcu);
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 10/10] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP)
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (8 preceding siblings ...)
  2014-03-15  3:50                                 ` [PATCH 09/10] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
@ 2014-03-15  3:51                                 ` Eric W. Biederman
  2014-03-17 19:49                                 ` [PATCH 00/10] netpoll: Cleanup received packet processing David Miller
  10 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15  3:51 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll packet receive code only becomes active if the netpoll
rx_skb_hook is implemented, and there is not a single implementation
of the netpoll rx_skb_hook in the kernel.

All of the out-of-tree implementations I have found call
netpoll_poll, which was removed from the kernel in 2011, so this
change should not add any additional breakage.

There are problems with the netpoll packet receive code.  __netpoll_rx
does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq
context.  netpoll_neigh_reply leaks every skb it receives.  Reception
of packets does not work successfully on stacked devices (aka bonding,
team, bridge, and vlans).

Given that the netpoll packet receive code is buggy, there are no
out-of-tree users that will be merged soon, and the code has
not been used in tree for a decade, let's just remove it.

Reverting this commit can serve as a starting point for anyone
who wants to resurrect netpoll packet reception support.

Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/Kconfig       |    5 -
 include/linux/netdevice.h |   17 --
 include/linux/netpoll.h   |   84 --------
 net/core/dev.c            |   11 +-
 net/core/netpoll.c        |  520 +--------------------------------------------
 5 files changed, 2 insertions(+), 635 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 494b888a6568..89402c3b64f8 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -177,11 +177,6 @@ config NETCONSOLE_DYNAMIC
 config NETPOLL
 	def_bool NETCONSOLE
 
-config NETPOLL_TRAP
-	bool "Netpoll traffic trapping"
-	default n
-	depends on NETPOLL
-
 config NET_POLL_CONTROLLER
 	def_bool NETPOLL
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b8d8c805fd75..4b6d12c7b803 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1979,9 +1979,6 @@ struct net_device *__dev_get_by_index(struct net *net, int ifindex);
 struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
 int netdev_get_name(struct net *net, char *name, int ifindex);
 int dev_restart(struct net_device *dev);
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void);
-#endif
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb);
 
 static inline unsigned int skb_gro_offset(const struct sk_buff *skb)
@@ -2186,12 +2183,6 @@ static inline void netif_tx_start_all_queues(struct net_device *dev)
 
 static inline void netif_tx_wake_queue(struct netdev_queue *dev_queue)
 {
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap()) {
-		netif_tx_start_queue(dev_queue);
-		return;
-	}
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state))
 		__netif_schedule(dev_queue->qdisc);
 }
@@ -2435,10 +2426,6 @@ static inline void netif_start_subqueue(struct net_device *dev, u16 queue_index)
 static inline void netif_stop_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	netif_tx_stop_queue(txq);
 }
 
@@ -2473,10 +2460,6 @@ static inline bool netif_subqueue_stopped(const struct net_device *dev,
 static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
-#ifdef CONFIG_NETPOLL_TRAP
-	if (netpoll_trap())
-		return;
-#endif
 	if (test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state))
 		__netif_schedule(txq->qdisc);
 }
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index a0632af88d8b..1b475a5a7239 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -31,12 +31,6 @@ struct netpoll {
 	u8 remote_mac[ETH_ALEN];
 
 	struct work_struct cleanup_work;
-
-#ifdef CONFIG_NETPOLL_TRAP
-	void (*rx_skb_hook)(struct netpoll *np, int source, struct sk_buff *skb,
-			    int offset, int len);
-	struct list_head rx; /* rx_np list element */
-#endif
 };
 
 struct netpoll_info {
@@ -50,12 +44,6 @@ struct netpoll_info {
 
 	struct netpoll *netpoll;
 	struct rcu_head rcu;
-
-#ifdef CONFIG_NETPOLL_TRAP
-	spinlock_t rx_lock;
-	struct list_head rx_np; /* netpolls that registered an rx_skb_hook */
-	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
-#endif
 };
 
 #ifdef CONFIG_NETPOLL
@@ -84,78 +72,6 @@ static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 	local_irq_restore(flags);
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void);
-void netpoll_set_trap(int trap);
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo);
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return !list_empty(&npinfo->rx_np);
-}
-
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
-
-	return npinfo && netpoll_rx_processing(npinfo);
-}
-
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	struct netpoll_info *npinfo;
-	unsigned long flags;
-	bool ret = false;
-
-	local_irq_save(flags);
-
-	if (!netpoll_rx_on(skb))
-		goto out;
-
-	npinfo = rcu_dereference_bh(skb->dev->npinfo);
-	spin_lock(&npinfo->rx_lock);
-	/* check rx_processing again with the lock held */
-	if (netpoll_rx_processing(npinfo) && __netpoll_rx(skb, npinfo))
-		ret = true;
-	spin_unlock(&npinfo->rx_lock);
-
-out:
-	local_irq_restore(flags);
-	return ret;
-}
-
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	if (!list_empty(&skb->dev->napi_list))
-		return netpoll_rx(skb);
-	return 0;
-}
-
-#else
-static inline int netpoll_trap(void)
-{
-	return 0;
-}
-static inline void netpoll_set_trap(int trap)
-{
-}
-static inline bool netpoll_rx_processing(struct netpoll_info *npinfo)
-{
-	return false;
-}
-static inline bool netpoll_rx(struct sk_buff *skb)
-{
-	return false;
-}
-static inline bool netpoll_rx_on(struct sk_buff *skb)
-{
-	return false;
-}
-static inline int netpoll_receive_skb(struct sk_buff *skb)
-{
-	return 0;
-}
-#endif
-
 #ifdef CONFIG_NETPOLL
 static inline void *netpoll_poll_lock(struct napi_struct *napi)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index 587f9fb85d73..55f8e64c03a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3231,10 +3231,6 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	/* if netpoll wants it, pretend we never saw it */
-	if (netpoll_rx(skb))
-		return NET_RX_DROP;
-
 	net_timestamp_check(netdev_tstamp_prequeue, skb);
 
 	trace_netif_rx(skb);
@@ -3520,10 +3516,6 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
 
 	trace_netif_receive_skb(skb);
 
-	/* if we've gotten here through NAPI, check netpoll */
-	if (netpoll_receive_skb(skb))
-		goto out;
-
 	orig_dev = skb->dev;
 
 	skb_reset_network_header(skb);
@@ -3650,7 +3642,6 @@ drop:
 
 unlock:
 	rcu_read_unlock();
-out:
 	return ret;
 }
 
@@ -3875,7 +3866,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	int same_flow;
 	enum gro_result ret;
 
-	if (!(skb->dev->features & NETIF_F_GRO) || netpoll_rx_on(skb))
+	if (!(skb->dev->features & NETIF_F_GRO))
 		goto normal;
 
 	if (skb_is_gso(skb) || skb_has_frag_list(skb))
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index adb5768be5a5..7291dde93469 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -46,11 +46,6 @@
 
 static struct sk_buff_head skb_pool;
 
-#ifdef CONFIG_NETPOLL_TRAP
-static atomic_t trapped;
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo);
-#endif
-
 DEFINE_STATIC_SRCU(netpoll_srcu);
 
 #define USEC_PER_POLL	50
@@ -109,27 +104,6 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
-			    unsigned short ulen, __be32 saddr, __be32 daddr)
-{
-	__wsum psum;
-
-	if (uh->check == 0 || skb_csum_unnecessary(skb))
-		return 0;
-
-	psum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0);
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE &&
-	    !csum_fold(csum_add(psum, skb->csum)))
-		return 0;
-
-	skb->csum = psum;
-
-	return __skb_checksum_complete(skb);
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 /*
  * Check whether delayed processing was scheduled for our NIC. If so,
  * we attempt to grab the poll lock and use ->poll() to pump the card.
@@ -140,11 +114,6 @@ static __sum16 checksum_udp(struct sk_buff *skb, struct udphdr *uh,
  * trylock here and interrupts are already disabled in the softirq
  * case. Further, we test the poll_owner to avoid recursion on UP
  * systems where the lock doesn't exist.
- *
- * In cases where there is bi-directional communications, reading only
- * one message at a time can lead to packets being dropped by the
- * network adapter, forcing superfluous retries and possibly timeouts.
- * Thus, we set our budget to greater than 1.
  */
 static int poll_one_napi(struct napi_struct *napi, int budget)
 {
@@ -181,38 +150,11 @@ static void poll_napi(struct net_device *dev, int budget)
 	}
 }
 
-#ifdef CONFIG_NETPOLL_TRAP
-static void service_neigh_queue(struct net_device *dev,
-				struct netpoll_info *npi)
-{
-	struct sk_buff *skb;
-	if (dev->flags & IFF_SLAVE) {
-		struct net_device *bond_dev;
-		struct netpoll_info *bond_ni;
-
-		bond_dev = netdev_master_upper_dev_get_rcu(dev);
-		bond_ni = rcu_dereference_bh(bond_dev->npinfo);
-		while ((skb = skb_dequeue(&npi->neigh_tx))) {
-			skb->dev = bond_dev;
-			skb_queue_tail(&bond_ni->neigh_tx, skb);
-		}
-	}
-	while ((skb = skb_dequeue(&npi->neigh_tx)))
-		netpoll_neigh_reply(skb, npi);
-}
-#else /* !CONFIG_NETPOLL_TRAP */
-static inline void service_neigh_queue(struct net_device *dev,
-				struct netpoll_info *npi)
-{
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 static void netpoll_poll_dev(struct net_device *dev)
 {
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
-	bool rx_processing = netpoll_rx_processing(ni);
-	int budget = rx_processing? 16 : 0;
+	int budget = 0;
 
 	/* Don't do any rx activity if the dev_lock mutex is held
 	 * the dev_open/close paths use this to block netpoll activity
@@ -226,9 +168,6 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 	}
 
-	if (rx_processing)
-		netpoll_set_trap(1);
-
 	ops = dev->netdev_ops;
 	if (!ops->ndo_poll_controller) {
 		up(&ni->dev_lock);
@@ -240,13 +179,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev, budget);
 
-	if (rx_processing)
-		netpoll_set_trap(0);
-
 	up(&ni->dev_lock);
 
-	service_neigh_queue(dev, ni);
-
 	zap_completion_queue();
 }
 
@@ -531,434 +465,6 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 }
 EXPORT_SYMBOL(netpoll_send_udp);
 
-#ifdef CONFIG_NETPOLL_TRAP
-static void netpoll_neigh_reply(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int size, type = ARPOP_REPLY;
-	__be32 sip, tip;
-	unsigned char *sha;
-	struct sk_buff *send_skb;
-	struct netpoll *np, *tmp;
-	unsigned long flags;
-	int hlen, tlen;
-	int hits = 0, proto;
-
-	if (!netpoll_rx_processing(npinfo))
-		return;
-
-	/* Before checking the packet, we do some early
-	   inspection whether this is interesting at all */
-	spin_lock_irqsave(&npinfo->rx_lock, flags);
-	list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-		if (np->dev == skb->dev)
-			hits++;
-	}
-	spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-
-	/* No netpoll struct is using this dev */
-	if (!hits)
-		return;
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto == ETH_P_ARP) {
-		struct arphdr *arp;
-		unsigned char *arp_ptr;
-		/* No arp on this interface */
-		if (skb->dev->flags & IFF_NOARP)
-			return;
-
-		if (!pskb_may_pull(skb, arp_hdr_len(skb->dev)))
-			return;
-
-		skb_reset_network_header(skb);
-		skb_reset_transport_header(skb);
-		arp = arp_hdr(skb);
-
-		if ((arp->ar_hrd != htons(ARPHRD_ETHER) &&
-		     arp->ar_hrd != htons(ARPHRD_IEEE802)) ||
-		    arp->ar_pro != htons(ETH_P_IP) ||
-		    arp->ar_op != htons(ARPOP_REQUEST))
-			return;
-
-		arp_ptr = (unsigned char *)(arp+1);
-		/* save the location of the src hw addr */
-		sha = arp_ptr;
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&sip, arp_ptr, 4);
-		arp_ptr += 4;
-		/* If we actually cared about dst hw addr,
-		   it would get copied here */
-		arp_ptr += skb->dev->addr_len;
-		memcpy(&tip, arp_ptr, 4);
-
-		/* Should we ignore arp? */
-		if (ipv4_is_loopback(tip) || ipv4_is_multicast(tip))
-			return;
-
-		size = arp_hdr_len(skb->dev);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (tip != np->local_ip.ip)
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			skb_reset_network_header(send_skb);
-			arp = (struct arphdr *) skb_put(send_skb, size);
-			send_skb->dev = skb->dev;
-			send_skb->protocol = htons(ETH_P_ARP);
-
-			/* Fill the device header for the ARP frame */
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_ARP,
-					    sha, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			/*
-			 * Fill out the arp protocol part.
-			 *
-			 * we only support ethernet device type,
-			 * which (according to RFC 1390) should
-			 * always equal 1 (Ethernet).
-			 */
-
-			arp->ar_hrd = htons(np->dev->type);
-			arp->ar_pro = htons(ETH_P_IP);
-			arp->ar_hln = np->dev->addr_len;
-			arp->ar_pln = 4;
-			arp->ar_op = htons(type);
-
-			arp_ptr = (unsigned char *)(arp + 1);
-			memcpy(arp_ptr, np->dev->dev_addr, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &tip, 4);
-			arp_ptr += 4;
-			memcpy(arp_ptr, sha, np->dev->addr_len);
-			arp_ptr += np->dev->addr_len;
-			memcpy(arp_ptr, &sip, 4);
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	} else if( proto == ETH_P_IPV6) {
-#if IS_ENABLED(CONFIG_IPV6)
-		struct nd_msg *msg;
-		u8 *lladdr = NULL;
-		struct ipv6hdr *hdr;
-		struct icmp6hdr *icmp6h;
-		const struct in6_addr *saddr;
-		const struct in6_addr *daddr;
-		struct inet6_dev *in6_dev = NULL;
-		struct in6_addr *target;
-
-		in6_dev = in6_dev_get(skb->dev);
-		if (!in6_dev || !in6_dev->cnf.accept_ra)
-			return;
-
-		if (!pskb_may_pull(skb, skb->len))
-			return;
-
-		msg = (struct nd_msg *)skb_transport_header(skb);
-
-		__skb_push(skb, skb->data - skb_transport_header(skb));
-
-		if (ipv6_hdr(skb)->hop_limit != 255)
-			return;
-		if (msg->icmph.icmp6_code != 0)
-			return;
-		if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-			return;
-
-		saddr = &ipv6_hdr(skb)->saddr;
-		daddr = &ipv6_hdr(skb)->daddr;
-
-		size = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
-
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(daddr, &np->local_ip.in6))
-				continue;
-
-			hlen = LL_RESERVED_SPACE(np->dev);
-			tlen = np->dev->needed_tailroom;
-			send_skb = find_skb(np, size + hlen + tlen, hlen);
-			if (!send_skb)
-				continue;
-
-			send_skb->protocol = htons(ETH_P_IPV6);
-			send_skb->dev = skb->dev;
-
-			skb_reset_network_header(send_skb);
-			hdr = (struct ipv6hdr *) skb_put(send_skb, sizeof(struct ipv6hdr));
-			*(__be32*)hdr = htonl(0x60000000);
-			hdr->payload_len = htons(size);
-			hdr->nexthdr = IPPROTO_ICMPV6;
-			hdr->hop_limit = 255;
-			hdr->saddr = *saddr;
-			hdr->daddr = *daddr;
-
-			icmp6h = (struct icmp6hdr *) skb_put(send_skb, sizeof(struct icmp6hdr));
-			icmp6h->icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT;
-			icmp6h->icmp6_router = 0;
-			icmp6h->icmp6_solicited = 1;
-
-			target = (struct in6_addr *) skb_put(send_skb, sizeof(struct in6_addr));
-			*target = msg->target;
-			icmp6h->icmp6_cksum = csum_ipv6_magic(saddr, daddr, size,
-							      IPPROTO_ICMPV6,
-							      csum_partial(icmp6h,
-									   size, 0));
-
-			if (dev_hard_header(send_skb, skb->dev, ETH_P_IPV6,
-					    lladdr, np->dev->dev_addr,
-					    send_skb->len) < 0) {
-				kfree_skb(send_skb);
-				continue;
-			}
-
-			netpoll_send_skb(np, send_skb);
-
-			/* If there are several rx_skb_hooks for the same
-			 * address, we're fine by sending a single reply
-			 */
-			break;
-		}
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-#endif
-	}
-}
-
-static bool pkt_is_ns(struct sk_buff *skb)
-{
-	struct nd_msg *msg;
-	struct ipv6hdr *hdr;
-
-	if (skb->protocol != htons(ETH_P_ARP))
-		return false;
-	if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + sizeof(struct nd_msg)))
-		return false;
-
-	msg = (struct nd_msg *)skb_transport_header(skb);
-	__skb_push(skb, skb->data - skb_transport_header(skb));
-	hdr = ipv6_hdr(skb);
-
-	if (hdr->nexthdr != IPPROTO_ICMPV6)
-		return false;
-	if (hdr->hop_limit != 255)
-		return false;
-	if (msg->icmph.icmp6_code != 0)
-		return false;
-	if (msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
-		return false;
-
-	return true;
-}
-
-int __netpoll_rx(struct sk_buff *skb, struct netpoll_info *npinfo)
-{
-	int proto, len, ulen, data_len;
-	int hits = 0, offset;
-	const struct iphdr *iph;
-	struct udphdr *uh;
-	struct netpoll *np, *tmp;
-	uint16_t source;
-
-	if (!netpoll_rx_processing(npinfo))
-		goto out;
-
-	if (skb->dev->type != ARPHRD_ETHER)
-		goto out;
-
-	/* check if netpoll clients need ARP */
-	if (skb->protocol == htons(ETH_P_ARP) && netpoll_trap()) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	} else if (pkt_is_ns(skb) && netpoll_trap()) {
-		skb_queue_tail(&npinfo->neigh_tx, skb);
-		return 1;
-	}
-
-	if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) {
-		skb = vlan_untag(skb);
-		if (unlikely(!skb))
-			goto out;
-	}
-
-	proto = ntohs(eth_hdr(skb)->h_proto);
-	if (proto != ETH_P_IP && proto != ETH_P_IPV6)
-		goto out;
-	if (skb->pkt_type == PACKET_OTHERHOST)
-		goto out;
-	if (skb_shared(skb))
-		goto out;
-
-	if (proto == ETH_P_IP) {
-		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (iph->ihl < 5 || iph->version != 4)
-			goto out;
-		if (!pskb_may_pull(skb, iph->ihl*4))
-			goto out;
-		iph = (struct iphdr *)skb->data;
-		if (ip_fast_csum((u8 *)iph, iph->ihl) != 0)
-			goto out;
-
-		len = ntohs(iph->tot_len);
-		if (skb->len < len || len < iph->ihl*4)
-			goto out;
-
-		/*
-		 * Our transport medium may have padded the buffer out.
-		 * Now We trim to the true length of the frame.
-		 */
-		if (pskb_trim_rcsum(skb, len))
-			goto out;
-
-		iph = (struct iphdr *)skb->data;
-		if (iph->protocol != IPPROTO_UDP)
-			goto out;
-
-		len -= iph->ihl*4;
-		uh = (struct udphdr *)(((char *)iph) + iph->ihl*4);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-
-		if (ulen != len)
-			goto out;
-		if (checksum_udp(skb, uh, ulen, iph->saddr, iph->daddr))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (np->local_ip.ip && np->local_ip.ip != iph->daddr)
-				continue;
-			if (np->remote_ip.ip && np->remote_ip.ip != iph->saddr)
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-	} else {
-#if IS_ENABLED(CONFIG_IPV6)
-		const struct ipv6hdr *ip6h;
-
-		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = (struct ipv6hdr *)skb->data;
-		if (ip6h->version != 6)
-			goto out;
-		len = ntohs(ip6h->payload_len);
-		if (!len)
-			goto out;
-		if (len + sizeof(struct ipv6hdr) > skb->len)
-			goto out;
-		if (pskb_trim_rcsum(skb, len + sizeof(struct ipv6hdr)))
-			goto out;
-		ip6h = ipv6_hdr(skb);
-		if (!pskb_may_pull(skb, sizeof(struct udphdr)))
-			goto out;
-		uh = udp_hdr(skb);
-		offset = (unsigned char *)(uh + 1) - skb->data;
-		ulen = ntohs(uh->len);
-		data_len = skb->len - offset;
-		source = ntohs(uh->source);
-		if (ulen != skb->len)
-			goto out;
-		if (udp6_csum_init(skb, uh, IPPROTO_UDP))
-			goto out;
-		list_for_each_entry_safe(np, tmp, &npinfo->rx_np, rx) {
-			if (!ipv6_addr_equal(&np->local_ip.in6, &ip6h->daddr))
-				continue;
-			if (!ipv6_addr_equal(&np->remote_ip.in6, &ip6h->saddr))
-				continue;
-			if (np->local_port && np->local_port != ntohs(uh->dest))
-				continue;
-
-			np->rx_skb_hook(np, source, skb, offset, data_len);
-			hits++;
-		}
-#endif
-	}
-
-	if (!hits)
-		goto out;
-
-	kfree_skb(skb);
-	return 1;
-
-out:
-	if (netpoll_trap()) {
-		kfree_skb(skb);
-		return 1;
-	}
-
-	return 0;
-}
-
-static void netpoll_trap_setup_info(struct netpoll_info *npinfo)
-{
-	INIT_LIST_HEAD(&npinfo->rx_np);
-	spin_lock_init(&npinfo->rx_lock);
-	skb_queue_head_init(&npinfo->neigh_tx);
-}
-
-static void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
-{
-	skb_queue_purge(&npinfo->neigh_tx);
-}
-
-static void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-	unsigned long flags;
-	if (np->rx_skb_hook) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_add_tail(&np->rx, &npinfo->rx_np);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-}
-
-static void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-	unsigned long flags;
-	if (!list_empty(&npinfo->rx_np)) {
-		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		list_del(&np->rx);
-		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
-	}
-}
-
-#else /* !CONFIG_NETPOLL_TRAP */
-static inline void netpoll_trap_setup_info(struct netpoll_info *npinfo)
-{
-}
-static inline void netpoll_trap_cleanup_info(struct netpoll_info *npinfo)
-{
-}
-static inline 
-void netpoll_trap_setup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-}
-static inline
-void netpoll_trap_cleanup(struct netpoll *np, struct netpoll_info *npinfo)
-{
-}
-#endif /* CONFIG_NETPOLL_TRAP */
-
 void netpoll_print_options(struct netpoll *np)
 {
 	np_info(np, "local port %d\n", np->local_port);
@@ -1103,8 +609,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		netpoll_trap_setup_info(npinfo);
-
 		sema_init(&npinfo->dev_lock, 1);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
@@ -1124,8 +628,6 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	npinfo->netpoll = np;
 
-	netpoll_trap_setup(np, npinfo);
-
 	/* last thing to do is link it to the net device structure */
 	rcu_assign_pointer(ndev->npinfo, npinfo);
 
@@ -1274,7 +776,6 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head)
 	struct netpoll_info *npinfo =
 			container_of(rcu_head, struct netpoll_info, rcu);
 
-	netpoll_trap_cleanup_info(npinfo);
 	skb_queue_purge(&npinfo->txq);
 
 	/* we can't call cancel_delayed_work_sync here, as we are in softirq */
@@ -1299,8 +800,6 @@ void __netpoll_cleanup(struct netpoll *np)
 	if (!npinfo)
 		return;
 
-	netpoll_trap_cleanup(np, npinfo);
-
 	synchronize_srcu(&netpoll_srcu);
 
 	if (atomic_dec_and_test(&npinfo->refcnt)) {
@@ -1344,20 +843,3 @@ out:
 	rtnl_unlock();
 }
 EXPORT_SYMBOL(netpoll_cleanup);
-
-#ifdef CONFIG_NETPOLL_TRAP
-int netpoll_trap(void)
-{
-	return atomic_read(&trapped);
-}
-EXPORT_SYMBOL(netpoll_trap);
-
-void netpoll_set_trap(int trap)
-{
-	if (trap)
-		atomic_inc(&trapped);
-	else
-		atomic_dec(&trapped);
-}
-EXPORT_SYMBOL(netpoll_set_trap);
-#endif /* CONFIG_NETPOLL_TRAP */
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15  1:11                         ` [PATCH net-next 16/16] sfc: " Eric W. Biederman
@ 2014-03-15 15:23                           ` Ben Hutchings
  2014-03-15 16:29                             ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Ben Hutchings @ 2014-03-15 15:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

On Fri, 2014-03-14 at 18:11 -0700, Eric W. Biederman wrote:
> Processing any incoming packets with a napi budget of 0
> is incorrect driver behavior.
> 
> This matters as netpoll will shortly call drivers with a budget of 0
> to avoid receive packet processing happening in hard irq context.

But this also prevents handling TX completions, at which point you may
as well change efx_netpoll() to a no-op.  And then, does it make sense
to implement ndo_poll_controller at all?

Note that sfc does have a module parameter to enable separate RX and TX
completions so they could be polled separately, but it is disabled by
default.

Ben.

> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/sfc/ef10.c  |    3 +++
>  drivers/net/ethernet/sfc/farch.c |    3 +++
>  2 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
> index eb75675f6e32..651626e133f9 100644
> --- a/drivers/net/ethernet/sfc/ef10.c
> +++ b/drivers/net/ethernet/sfc/ef10.c
> @@ -1955,6 +1955,9 @@ static int efx_ef10_ev_process(struct efx_channel *channel, int quota)
>  	int tx_descs = 0;
>  	int spent = 0;
>  
> +	if (quota <= 0)
> +		return spent;
> +
>  	read_ptr = channel->eventq_read_ptr;
>  
>  	for (;;) {
> diff --git a/drivers/net/ethernet/sfc/farch.c b/drivers/net/ethernet/sfc/farch.c
> index aa1b169f45ec..a08761360cdf 100644
> --- a/drivers/net/ethernet/sfc/farch.c
> +++ b/drivers/net/ethernet/sfc/farch.c
> @@ -1248,6 +1248,9 @@ int efx_farch_ev_process(struct efx_channel *channel, int budget)
>  	int tx_packets = 0;
>  	int spent = 0;
>  
> +	if (budget <= 0)
> +		return spent;
> +
>  	read_ptr = channel->eventq_read_ptr;
>  
>  	for (;;) {

-- 
Ben Hutchings
When you say `I wrote a program that crashed Windows', people just stare ...
and say `Hey, I got those with the system, *for free*'. - Linus Torvalds


^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15 15:23                           ` Ben Hutchings
@ 2014-03-15 16:29                             ` David Miller
  2014-03-15 17:23                               ` Ben Hutchings
  2014-03-15 20:01                               ` mlx4 netpoll and rx/tx weirdness Eric W. Biederman
  0 siblings, 2 replies; 288+ messages in thread
From: David Miller @ 2014-03-15 16:29 UTC (permalink / raw)
  To: ben; +Cc: ebiederm, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Ben Hutchings <ben@decadent.org.uk>
Date: Sat, 15 Mar 2014 15:23:34 +0000

> On Fri, 2014-03-14 at 18:11 -0700, Eric W. Biederman wrote:
>> Processing any incoming packets with a napi budget of 0
>> is incorrect driver behavior.
>> 
>> This matters as netpoll will shortly call drivers with a budget of 0
>> to avoid receive packet processing happening in hard irq context.
> 
> But this also prevents handling TX completions, at which point you may
> as well change efx_netpoll() to a no-op.  And then, does it make sense
> to implement ndo_poll_controller at all?
> 
> Note that sfc does have a module parameter to enable separate RX and TX
> completions so they could be polled separately, but it is disabled by
> default.

TX completions should run unconditionally, regardless of the given
budget.

This is how I have coded all of my drivers, and how I tell others
to do so.
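
The rule above can be illustrated with a small user-space model.  This is
only a sketch under stated assumptions: struct toy_nic, toy_clean_tx(),
toy_clean_rx() and toy_poll() are hypothetical names invented for the
example and are not taken from any driver in this thread.  It shows a
poll routine that always reaps TX completions and touches RX only when
the budget allows.

```c
#include <assert.h>

/* Hypothetical model of a NIC with separate TX-completion and RX work. */
struct toy_nic {
	int tx_done_pending;	/* transmitted frames awaiting cleanup */
	int rx_pending;		/* received frames awaiting processing */
	int tx_cleaned;		/* total TX completions reaped so far */
	int rx_delivered;	/* total RX frames handed up the stack */
};

/* Reap every outstanding TX completion; deliberately ignores any budget. */
static void toy_clean_tx(struct toy_nic *nic)
{
	nic->tx_cleaned += nic->tx_done_pending;
	nic->tx_done_pending = 0;
}

/* Process at most 'budget' received frames and return how many were done. */
static int toy_clean_rx(struct toy_nic *nic, int budget)
{
	int done = 0;

	while (done < budget && nic->rx_pending > 0) {
		nic->rx_pending--;
		nic->rx_delivered++;
		done++;
	}
	return done;
}

/*
 * Shape of a NAPI-style poll routine in this model: TX cleanup runs
 * first and unconditionally, RX runs only when budget > 0.
 * Returns the amount of RX work done.
 */
static int toy_poll(struct toy_nic *nic, int budget)
{
	int work_done;

	toy_clean_tx(nic);
	if (budget <= 0)
		return 0;

	work_done = toy_clean_rx(nic, budget);
	assert(work_done <= budget);
	return work_done;
}
```

In this model, calling toy_poll() with a budget of 0 (as netpoll is about
to do) still empties the TX completion backlog while leaving every
pending RX frame untouched for a later call with a real budget.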

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15 16:29                             ` David Miller
@ 2014-03-15 17:23                               ` Ben Hutchings
  2014-03-15 18:54                                 ` Eric Dumazet
  2014-03-15 20:01                               ` mlx4 netpoll and rx/tx weirdness Eric W. Biederman
  1 sibling, 1 reply; 288+ messages in thread
From: Ben Hutchings @ 2014-03-15 17:23 UTC (permalink / raw)
  To: David Miller
  Cc: ebiederm, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

On Sat, 2014-03-15 at 12:29 -0400, David Miller wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Sat, 15 Mar 2014 15:23:34 +0000
> 
> > On Fri, 2014-03-14 at 18:11 -0700, Eric W. Biederman wrote:
> >> Processing any incoming packets with a napi budget of 0
> >> is incorrect driver behavior.
> >> 
> >> This matters as netpoll will shortly call drivers with a budget of 0
> >> to avoid receive packet processing happening in hard irq context.
> > 
> > But this also prevents handling TX completions, at which point you may
> > as well change efx_netpoll() to a no-op.  And then, does it make sense
> > to implement ndo_poll_controller at all?
> > 
> > Note that sfc does have a module parameter to enable separate RX and TX
> > completions so they could be polled separately, but it is disabled by
> > default.
> 
> TX completions should run unconditionally, regardless of the given
> budget.
> 
> This is how I have coded all of my drivers, and how I tell others
> to do so.

The Solarflare hardware provides generic event queues for RX and TX
completions, link changes, errors, etc.  The driver can't process TX
completions without going through all the other events mixed in with
them.

It is possible to allocate separate event queues for TX completions but
that isn't done by default because it also requires extra IRQs.  The
driver could be restructured to allocate some event queues without IRQs
of their own, but it probably requires a lot of work.

Ben.

-- 
Ben Hutchings
When you say `I wrote a program that crashed Windows', people just stare ...
and say `Hey, I got those with the system, *for free*'. - Linus Torvalds


^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15 17:23                               ` Ben Hutchings
@ 2014-03-15 18:54                                 ` Eric Dumazet
  2014-03-15 19:25                                   ` Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-15 18:54 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

On Sat, 2014-03-15 at 17:23 +0000, Ben Hutchings wrote:

> The Solarflare hardware provides generic event queues for RX and TX
> completions, link changes, errors, etc.  The driver can't process TX
> completions without going through all the other events mixed in with
> them.
> 

ndo_poll_controller() could process all events, and queue incoming
packets through netif_rx().

Nobody cares about performance in this path.
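
A rough user-space sketch of that idea follows.  Everything in it is an
assumption made for illustration: the event types, struct toy_channel and
toy_poll_controller() are hypothetical, and the fixed-size backlog array
merely stands in for the queue that netif_rx() would feed.  The point is
only that one pass over a mixed event queue can complete TX work
immediately while deferring received packets instead of processing them
in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds sharing one hardware event queue. */
enum toy_event_type { TOY_EV_TX_DONE, TOY_EV_RX, TOY_EV_LINK };

struct toy_event {
	enum toy_event_type type;
	int id;			/* packet or event identifier */
};

#define TOY_BACKLOG_MAX 16

struct toy_channel {
	int tx_completed;		/* TX completions handled in place */
	int link_changes;		/* link events handled in place */
	int backlog[TOY_BACKLOG_MAX];	/* stand-in for the netif_rx() queue */
	size_t backlog_len;
	int rx_dropped;			/* RX events lost to a full backlog */
};

/*
 * Walk every event once.  TX completions and link events are handled
 * immediately; RX packets are only queued for later processing.
 */
static void toy_poll_controller(struct toy_channel *ch,
				const struct toy_event *evq, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (evq[i].type) {
		case TOY_EV_TX_DONE:
			ch->tx_completed++;
			break;
		case TOY_EV_RX:
			if (ch->backlog_len < TOY_BACKLOG_MAX)
				ch->backlog[ch->backlog_len++] = evq[i].id;
			else
				ch->rx_dropped++;
			break;
		case TOY_EV_LINK:
			ch->link_changes++;
			break;
		}
	}
	assert(ch->backlog_len <= TOY_BACKLOG_MAX);
}
```

The sketch does not model NAPI instances that were already scheduled
before the poll controller ran, so it says nothing about how a real
driver would keep those from processing the same queue again.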

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH net-next 16/16] sfc: Don't receive packets when the napi budget == 0
  2014-03-15 18:54                                 ` Eric Dumazet
@ 2014-03-15 19:25                                   ` Eric W. Biederman
  0 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15 19:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ben Hutchings, David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Sat, 2014-03-15 at 17:23 +0000, Ben Hutchings wrote:
>
>> The Solarflare hardware provides generic event queues for RX and TX
>> completions, link changes, errors, etc.  The driver can't process TX
>> completions without going through all the other events mixed in with
>> them.
>> 
>
> ndo_poll_controller() could process all events, and queue incoming
> packets through netif_rx().
>
> Nobody cares about performance in this path.

I made this change so that we don't process rx packets in hard irq
context and cause things to break, and I made the most minimal, most
obvious change I could.

Queueing incoming packets through netif_rx sounds elegant.

The practical challenge is that there might be napi bottom halves
already scheduled to run before ndo_poll_controller is called.  So even
if ndo_poll_controller processes all of the events and calls netif_rx
to deal with the incoming packets, the napi bottom halves must still
avoid processing the queues when they are called.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* mlx4 netpoll and rx/tx weirdness
  2014-03-15 16:29                             ` David Miller
  2014-03-15 17:23                               ` Ben Hutchings
@ 2014-03-15 20:01                               ` Eric W. Biederman
  2014-03-16 16:17                                 ` Eric Dumazet
  1 sibling, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-15 20:01 UTC (permalink / raw)
  To: David Miller
  Cc: ben, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Amir Vadai, Or Gerlitz, Jack Morgenstein

David Miller <davem@davemloft.net> writes:

> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Sat, 15 Mar 2014 15:23:34 +0000
>
>> On Fri, 2014-03-14 at 18:11 -0700, Eric W. Biederman wrote:
>>> Processing any incoming packets with a napi budget of 0
>>> is incorrect driver behavior.
>>> 
>>> This matters as netpoll will shortly call drivers with a budget of 0
>>> to avoid receive packet processing happening in hard irq context.
>> 
>> But this also prevents handling TX completions, at which point you may
>> as well change efx_netpoll() to a no-op.  And then, does it make sense
>> to implement ndo_poll_controller at all?
>> 
>> Note that sfc does have a module parameter to enable separate RX and TX
>> completions so they could be polled separately, but it is disabled by
>> default.
>
> TX completions should run unconditionally, regardless of the given
> budget.
>
> This is how I have coded all of my drivers, and how I tell others
> to do so.

Given this comment I have to ask:  How insane is the Mellanox mlx4
driver that has separate rx and tx queues, separate rx and tx interrupts
and uses separate napi bottom halves to process each, and honors the
budget passed into its rx napi handler?

Right now the mlx4 code disables interrupts and calls napi_synchronize
in its netpoll routine instead of just scheduling the napi bottom
halves as all of the sane drivers do.  napi_synchronize trying to sleep
in hard irq context is pretty terrible, but I can see roughly how that
should be fixed.

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index fa5ee719e04b..2e6fded14e60 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1302,16 +1302,10 @@ out:
 static void mlx4_en_netpoll(struct net_device *dev)
 {
        struct mlx4_en_priv *priv = netdev_priv(dev);
-       struct mlx4_en_cq *cq;
-       unsigned long flags;
        int i;
 
        for (i = 0; i < priv->rx_ring_num; i++) {
-               cq = priv->rx_cq[i];
-               spin_lock_irqsave(&cq->lock, flags);
-               napi_synchronize(&cq->napi);
-               mlx4_en_process_rx_cq(dev, cq, 0);
-               spin_unlock_irqrestore(&cq->lock, flags);
+               napi_schedule(&priv->tx_cq[i]->napi);
        }
 }

What I can't see is a clean thing to do with the mlx4 tx napi bottom
half, as it won't process the tx cq when I pass in a budget of 0.

What I also can't see is how to prevent netpoll from stalling after
enough packets are transmitted from hard irq context, say with sysrq-l
on a machine with enough cpus to fill the tx queue.

If possible I would very much like to get this to work.

Eric

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: mlx4 netpoll and rx/tx weirdness
  2014-03-15 20:01                               ` mlx4 netpoll and rx/tx weirdness Eric W. Biederman
@ 2014-03-16 16:17                                 ` Eric Dumazet
  2014-03-17 21:22                                   ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-16 16:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, ben, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Amir Vadai, Or Gerlitz, Jack Morgenstein

On Sat, 2014-03-15 at 13:01 -0700, Eric W. Biederman wrote:

> Given this comment I have to ask:  How insane is the mellanox mlx4
> driver that has separate rx and tx queues, separate rx and tx interrupts
> and uses separate napi bottom halves to process each, and honors the
> budget passed into its rx napi handler?

I believe this driver had TX completion from hard irq, and NAPI polling
for RX.

This was changed quite recently to also do TX completion from softirq
handler.

> 
> Right now the mlx4 code disables interrupts and calls napi_synchronize
> in its netpoll routine instead of just scheduling the napi bottom
> halves as all of the sane drivers do.  napi_synchronize trying to sleep
> in hard irq context is pretty terrible, but I can see roughly how that
> should be fixed.
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index fa5ee719e04b..2e6fded14e60 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -1302,16 +1302,10 @@ out:
>  static void mlx4_en_netpoll(struct net_device *dev)
>  {
>         struct mlx4_en_priv *priv = netdev_priv(dev);
> -       struct mlx4_en_cq *cq;
> -       unsigned long flags;
>         int i;
>  
>         for (i = 0; i < priv->rx_ring_num; i++) {
> -               cq = priv->rx_cq[i];
> -               spin_lock_irqsave(&cq->lock, flags);
> -               napi_synchronize(&cq->napi);
> -               mlx4_en_process_rx_cq(dev, cq, 0);
> -               spin_unlock_irqrestore(&cq->lock, flags);
> +               napi_schedule(&priv->tx_cq[i]->napi);
>         }
>  }

Wait a minute, I thought ndo_poll_controller() had to be synchronous,
not schedule a napi ?

> 
> What I can't see is a clean thing to do with the mlx4 tx napi bottom
> half, as it won't process the tx cq when I pass in a budget of 0.

Just ignore the budget to drain tx queues, as David said.

> 
> What I also can't see is how to prevent netpoll from stalling after
> enough packets are transmitted from hard irq context, say with sysrq-l
> on a machine with enough cpus to fill the tx queue.

Is this specific to mlx4 ?

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 00/10] netpoll: Cleanup received packet processing
  2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
                                                   ` (9 preceding siblings ...)
  2014-03-15  3:51                                 ` [PATCH 10/10] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
@ 2014-03-17 19:49                                 ` David Miller
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  10 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-17 19:49 UTC (permalink / raw)
  To: ebiederm
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 14 Mar 2014 20:43:13 -0700

> This is the long-winded, careful, and polite version of removing the netpoll
> receive packet processing.
> 
> First I untangle the code in small steps.  Then I modify the code to not
> force reception and dropping of packets when we are transmitting a packet
> with netpoll.  Finally I move all of the packet reception under
> CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.
> 
> If someone wants to do a stable backport of these patches, it would
> require backporting the first 18 patches that handle the budget == 0 in
> the networking drivers, and the first 6 of these patches.
> 
> If anyone wants to resurrect netpoll packet reception someday it should
> just be a matter of reverting the last patch.

Series applied, thanks Eric.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: mlx4 netpoll and rx/tx weirdness
  2014-03-16 16:17                                 ` Eric Dumazet
@ 2014-03-17 21:22                                   ` David Miller
  2014-03-17 21:40                                     ` Eric Dumazet
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-17 21:22 UTC (permalink / raw)
  To: eric.dumazet
  Cc: ebiederm, ben, netdev, xiyou.wangcong, mpm, satyam.sharma, amirv,
	ogerlitz, jackm

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 16 Mar 2014 09:17:01 -0700

> Wait a minute, I thought ndo_poll_controller() had to be synchronous,
> not schedule a napi ?

ndo_poll_controller() basically runs the hardware interrupt handler,
which might schedule NAPI processing.  Then netpoll invokes
the NAPI poll in whatever state ndo_poll_controller() left it in.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: mlx4 netpoll and rx/tx weirdness
  2014-03-17 21:22                                   ` David Miller
@ 2014-03-17 21:40                                     ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-17 21:40 UTC (permalink / raw)
  To: David Miller
  Cc: ebiederm, ben, netdev, xiyou.wangcong, mpm, satyam.sharma, amirv,
	ogerlitz, jackm

On Mon, 2014-03-17 at 17:22 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Sun, 16 Mar 2014 09:17:01 -0700
> 
> > Wait a minute, I thought ndo_poll_controller() had to be synchronous,
> > not schedule a napi ?
> 
> ndo_poll_controller() basically runs the hardware interrupt handler,
> which might schedule NAPI processing.  Then netpoll invokes
> the NAPI poll in whatever state ndo_poll_controller() left it in.

Ah right, this makes sense, poll_napi() uses a spin_trylock()...
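
The two-step flow described above, plus the trylock, can be modeled in
user space roughly as follows.  This is a sketch and not the kernel's
code: struct toy_napi and the toy_*() helpers are hypothetical, and a
C11 atomic_flag stands in for whatever lock the real poll path takes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one NAPI instance. */
struct toy_napi {
	bool scheduled;		/* set by the "interrupt handler" */
	atomic_flag poll_lock;	/* stand-in for the per-instance poll lock */
	int polls_run;		/* how many times the driver poll ran */
};

static void toy_napi_init(struct toy_napi *napi)
{
	napi->scheduled = false;
	napi->polls_run = 0;
	atomic_flag_clear(&napi->poll_lock);
}

/* Step 1: the poll controller runs the irq handler, which only schedules. */
static void toy_irq_handler(struct toy_napi *napi, bool hw_has_work)
{
	if (hw_has_work)
		napi->scheduled = true;
}

/* The driver's poll routine; clears the scheduled state when it is done. */
static int toy_driver_poll(struct toy_napi *napi, int budget)
{
	assert(budget >= 0);
	napi->polls_run++;
	napi->scheduled = false;
	return 0;
}

/*
 * Step 2: poll the instance in whatever state step 1 left it in.
 * The lock is only tried, never waited for, so a poller that already
 * holds it elsewhere cannot make this path spin forever.
 * Returns true if the driver poll routine was run.
 */
static bool toy_poll_napi(struct toy_napi *napi, int budget)
{
	bool ran = false;

	if (atomic_flag_test_and_set(&napi->poll_lock))
		return false;	/* someone else is polling; back off */

	if (napi->scheduled) {
		toy_driver_poll(napi, budget);
		ran = true;
	}
	atomic_flag_clear(&napi->poll_lock);
	return ran;
}
```

With this model, toy_poll_napi() does nothing when the handler found no
work, runs the driver poll once when work was scheduled, and backs off
without polling when the lock is already held.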

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 0/6] netpoll: Cleanups and fixes
  2014-03-17 19:49                                 ` [PATCH 00/10] netpoll: Cleanup received packet processing David Miller
@ 2014-03-18  6:22                                   ` Eric W. Biederman
  2014-03-18  6:24                                     ` [PATCH 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
                                                       ` (6 more replies)
  0 siblings, 7 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:22 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This is a set of small cleanups and fixes for netpoll that makes the
code a little more penetrable and reliable.

The most significant is patch 6, which adds the function
skb_irq_freeable and uses it to simplify the logic in the network
stack: skbs are freed as soon as possible, which makes the buggy
function zap_completion_queue in netpoll completely pointless and
thus unnecessary.

Eric W. Biederman (6):
      netpoll: Remove gfp parameter from __netpoll_setup
      netpoll: Only call ndo_start_xmit from a single place
      netpoll: Don't allow on devices that perform their own xmit locking
      netpoll: Move rx enable/disable into __dev_close_many
      netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable
      net: Free skbs from irqs when possible.

 drivers/net/bonding/bond_main.c |    6 +-
 drivers/net/team/team.c         |   16 +++---
 include/linux/netdevice.h       |    3 +-
 include/linux/netpoll.h         |   10 ++--
 include/linux/skbuff.h          |   13 +++++
 net/8021q/vlan_dev.c            |    7 +--
 net/bridge/br_device.c          |   15 +++---
 net/bridge/br_if.c              |    2 +-
 net/bridge/br_private.h         |    4 +-
 net/core/dev.c                  |   31 +++++------
 net/core/netpoll.c              |  110 ++++++++++++++++-----------------------
 net/core/skbuff.c               |   13 ++++-
 12 files changed, 112 insertions(+), 118 deletions(-)

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 1/6] netpoll: Remove gfp parameter from __netpoll_setup
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
@ 2014-03-18  6:24                                     ` Eric W. Biederman
  2014-03-18  6:24                                     ` [PATCH 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
                                                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:24 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The gfp parameter was added in:
commit 47be03a28cc6c80e3aa2b3e8ed6d960ff0c5c0af
Author: Amerigo Wang <amwang@redhat.com>
Date:   Fri Aug 10 01:24:37 2012 +0000

    netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup()

    slave_enable_netpoll() and __netpoll_setup() may be called
    with read_lock() held, so should use GFP_ATOMIC to allocate
    memory. Eric suggested to pass gfp flags to __netpoll_setup().

    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Cong Wang <amwang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The reason for the gfp parameter was removed in:
commit c4cdef9b7183159c23c7302aaf270d64c549f557
Author: dingtianhong <dingtianhong@huawei.com>
Date:   Tue Jul 23 15:25:27 2013 +0800

    bonding: don't call slave_xxx_netpoll under spinlocks

    The slave_xxx_netpoll will call synchronize_rcu_bh(),
    so the function may schedule and sleep, it should't be
    called under spinlocks.

    bond_netpoll_setup() and bond_netpoll_cleanup() are always
    protected by rtnl lock, it is no need to take the read lock,
    as the slave list couldn't be changed outside rtnl lock.

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Nothing else that calls __netpoll_setup or ndo_netpoll_setup
requires a gfp parameter, so remove the gfp parameter from both
of these functions, making the code clearer.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/bonding/bond_main.c |    6 +++---
 drivers/net/team/team.c         |   16 +++++++---------
 include/linux/netdevice.h       |    3 +--
 include/linux/netpoll.h         |    2 +-
 net/8021q/vlan_dev.c            |    7 +++----
 net/bridge/br_device.c          |   15 +++++++--------
 net/bridge/br_if.c              |    2 +-
 net/bridge/br_private.h         |    4 ++--
 net/core/netpoll.c              |    8 ++++----
 9 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e717db301d46..76581971cf5f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -922,12 +922,12 @@ static inline int slave_enable_netpoll(struct slave *slave)
 	struct netpoll *np;
 	int err = 0;
 
-	np = kzalloc(sizeof(*np), GFP_ATOMIC);
+	np = kzalloc(sizeof(*np), GFP_KERNEL);
 	err = -ENOMEM;
 	if (!np)
 		goto out;
 
-	err = __netpoll_setup(np, slave->dev, GFP_ATOMIC);
+	err = __netpoll_setup(np, slave->dev);
 	if (err) {
 		kfree(np);
 		goto out;
@@ -962,7 +962,7 @@ static void bond_netpoll_cleanup(struct net_device *bond_dev)
 			slave_disable_netpoll(slave);
 }
 
-static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni, gfp_t gfp)
+static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
 {
 	struct bonding *bond = netdev_priv(dev);
 	struct list_head *iter;
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 2b1a1d61072c..33008c1d1d67 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1031,8 +1031,7 @@ static void team_port_leave(struct team *team, struct team_port *port)
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-static int team_port_enable_netpoll(struct team *team, struct team_port *port,
-				    gfp_t gfp)
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
 {
 	struct netpoll *np;
 	int err;
@@ -1040,11 +1039,11 @@ static int team_port_enable_netpoll(struct team *team, struct team_port *port,
 	if (!team->dev->npinfo)
 		return 0;
 
-	np = kzalloc(sizeof(*np), gfp);
+	np = kzalloc(sizeof(*np), GFP_KERNEL);
 	if (!np)
 		return -ENOMEM;
 
-	err = __netpoll_setup(np, port->dev, gfp);
+	err = __netpoll_setup(np, port->dev);
 	if (err) {
 		kfree(np);
 		return err;
@@ -1067,8 +1066,7 @@ static void team_port_disable_netpoll(struct team_port *port)
 	kfree(np);
 }
 #else
-static int team_port_enable_netpoll(struct team *team, struct team_port *port,
-				    gfp_t gfp)
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
 {
 	return 0;
 }
@@ -1156,7 +1154,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
 		goto err_vids_add;
 	}
 
-	err = team_port_enable_netpoll(team, port, GFP_KERNEL);
+	err = team_port_enable_netpoll(team, port);
 	if (err) {
 		netdev_err(dev, "Failed to enable netpoll on device %s\n",
 			   portname);
@@ -1850,7 +1848,7 @@ static void team_netpoll_cleanup(struct net_device *dev)
 }
 
 static int team_netpoll_setup(struct net_device *dev,
-			      struct netpoll_info *npifo, gfp_t gfp)
+			      struct netpoll_info *npifo)
 {
 	struct team *team = netdev_priv(dev);
 	struct team_port *port;
@@ -1858,7 +1856,7 @@ static int team_netpoll_setup(struct net_device *dev,
 
 	mutex_lock(&team->lock);
 	list_for_each_entry(port, &team->port_list, list) {
-		err = team_port_enable_netpoll(team, port, gfp);
+		err = team_port_enable_netpoll(team, port);
 		if (err) {
 			__team_netpoll_cleanup(team);
 			break;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4b6d12c7b803..77142a78c4d9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1037,8 +1037,7 @@ struct net_device_ops {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	void                    (*ndo_poll_controller)(struct net_device *dev);
 	int			(*ndo_netpoll_setup)(struct net_device *dev,
-						     struct netpoll_info *info,
-						     gfp_t gfp);
+						     struct netpoll_info *info);
 	void			(*ndo_netpoll_cleanup)(struct net_device *dev);
 #endif
 #ifdef CONFIG_NET_RX_BUSY_POLL
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 1b475a5a7239..893b9e66060e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -57,7 +57,7 @@ static inline void netpoll_rx_enable(struct net_device *dev) { return; }
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
-int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp);
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev);
 int netpoll_setup(struct netpoll *np);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 4f3e9073cb49..a78bebeca4d9 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -707,20 +707,19 @@ static void vlan_dev_poll_controller(struct net_device *dev)
 	return;
 }
 
-static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *npinfo,
-				  gfp_t gfp)
+static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *npinfo)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
 	struct net_device *real_dev = vlan->real_dev;
 	struct netpoll *netpoll;
 	int err = 0;
 
-	netpoll = kzalloc(sizeof(*netpoll), gfp);
+	netpoll = kzalloc(sizeof(*netpoll), GFP_KERNEL);
 	err = -ENOMEM;
 	if (!netpoll)
 		goto out;
 
-	err = __netpoll_setup(netpoll, real_dev, gfp);
+	err = __netpoll_setup(netpoll, real_dev);
 	if (err) {
 		kfree(netpoll);
 		goto out;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index f2a08477e0f5..0dd01a05bd59 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -218,16 +218,16 @@ static void br_netpoll_cleanup(struct net_device *dev)
 		br_netpoll_disable(p);
 }
 
-static int __br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+static int __br_netpoll_enable(struct net_bridge_port *p)
 {
 	struct netpoll *np;
 	int err;
 
-	np = kzalloc(sizeof(*p->np), gfp);
+	np = kzalloc(sizeof(*p->np), GFP_KERNEL);
 	if (!np)
 		return -ENOMEM;
 
-	err = __netpoll_setup(np, p->dev, gfp);
+	err = __netpoll_setup(np, p->dev);
 	if (err) {
 		kfree(np);
 		return err;
@@ -237,16 +237,15 @@ static int __br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
 	return err;
 }
 
-int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+int br_netpoll_enable(struct net_bridge_port *p)
 {
 	if (!p->br->dev->npinfo)
 		return 0;
 
-	return __br_netpoll_enable(p, gfp);
+	return __br_netpoll_enable(p);
 }
 
-static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni,
-			    gfp_t gfp)
+static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
 {
 	struct net_bridge *br = netdev_priv(dev);
 	struct net_bridge_port *p;
@@ -255,7 +254,7 @@ static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni,
 	list_for_each_entry(p, &br->port_list, list) {
 		if (!p->dev)
 			continue;
-		err = __br_netpoll_enable(p, gfp);
+		err = __br_netpoll_enable(p);
 		if (err)
 			goto fail;
 	}
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 54d207d3a31c..5262b8617eb9 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -366,7 +366,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
 	if (err)
 		goto err2;
 
-	err = br_netpoll_enable(p, GFP_KERNEL);
+	err = br_netpoll_enable(p);
 	if (err)
 		goto err3;
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e1ca1dc916a4..06811d79f89f 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -349,7 +349,7 @@ static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
 		netpoll_send_skb(np, skb);
 }
 
-int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp);
+int br_netpoll_enable(struct net_bridge_port *p);
 void br_netpoll_disable(struct net_bridge_port *p);
 #else
 static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
@@ -357,7 +357,7 @@ static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
 {
 }
 
-static inline int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+static inline int br_netpoll_enable(struct net_bridge_port *p)
 {
 	return 0;
 }
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 7291dde93469..4bccc78c5b58 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -584,7 +584,7 @@ int netpoll_parse_options(struct netpoll *np, char *opt)
 }
 EXPORT_SYMBOL(netpoll_parse_options);
 
-int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
 {
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
@@ -603,7 +603,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 	}
 
 	if (!ndev->npinfo) {
-		npinfo = kmalloc(sizeof(*npinfo), gfp);
+		npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
 		if (!npinfo) {
 			err = -ENOMEM;
 			goto out;
@@ -617,7 +617,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 		ops = np->dev->netdev_ops;
 		if (ops->ndo_netpoll_setup) {
-			err = ops->ndo_netpoll_setup(ndev, npinfo, gfp);
+			err = ops->ndo_netpoll_setup(ndev, npinfo);
 			if (err)
 				goto free_npinfo;
 		}
@@ -749,7 +749,7 @@ int netpoll_setup(struct netpoll *np)
 	/* fill up the skb queue */
 	refill_skbs();
 
-	err = __netpoll_setup(np, ndev, GFP_KERNEL);
+	err = __netpoll_setup(np, ndev);
 	if (err)
 		goto put;
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 2/6] netpoll: Only call ndo_start_xmit from a single place
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  2014-03-18  6:24                                     ` [PATCH 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
@ 2014-03-18  6:24                                     ` Eric W. Biederman
  2014-03-18  6:25                                     ` [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking Eric W. Biederman
                                                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:24 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Factor out the code that needs to surround ndo_start_xmit
from netpoll_send_skb_on_dev into netpoll_start_xmit.

It is an unfortunate fact that as the netpoll code has been maintained,
the primary call site of ndo_start_xmit learned how to handle vlans
and timestamps but the second call site of ndo_start_xmit, in
queue_process, did not.

With the introduction of netpoll_start_xmit this associated logic now
happens at both call sites of ndo_start_xmit and should make it easy
for that to continue into the future.
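
For illustration only, the shape of this refactoring can be sketched in
userspace C.  This is a simplified model under stated assumptions, not
the kernel code: every model_* name and the TX_* values are
hypothetical stand-ins, and the vlan handling is reduced to two flags.
The point it shows is that the immediate path and the deferred path
both go through one helper, so any per-packet preparation is applied
to both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum tx_status { TX_OK = 0, TX_BUSY = 1 };

struct model_skb {
	bool vlan_tag_present;	/* tag is carried out of band */
	bool tag_in_payload;	/* tag has been written into the frame */
};

struct model_dev {
	bool hw_vlan_offload;		/* device inserts the tag itself */
	bool ring_full;			/* driver would report busy */
	unsigned long trans_updates;	/* models txq_trans_update() */
};

/* Models the driver's transmit method (ndo_start_xmit in the patch). */
static enum tx_status model_driver_xmit(struct model_skb *skb,
					 struct model_dev *dev)
{
	(void)skb;
	return dev->ring_full ? TX_BUSY : TX_OK;
}

/* The single place that prepares a packet and hands it to the driver. */
static enum tx_status model_start_xmit(struct model_skb *skb,
					struct model_dev *dev)
{
	enum tx_status status;

	assert(skb && dev);

	/* Insert the tag in software when the device cannot do it. */
	if (skb->vlan_tag_present && !dev->hw_vlan_offload) {
		skb->tag_in_payload = true;
		skb->vlan_tag_present = false;
	}

	status = model_driver_xmit(skb, dev);
	if (status == TX_OK)
		dev->trans_updates++;
	return status;
}

/* Immediate path (netpoll_send_skb_on_dev in the patch). */
static enum tx_status model_send_direct(struct model_skb *skb,
					 struct model_dev *dev)
{
	return model_start_xmit(skb, dev);
}

/* Deferred path (queue_process in the patch). */
static enum tx_status model_send_deferred(struct model_skb *skb,
					   struct model_dev *dev)
{
	return model_start_xmit(skb, dev);
}
```

In this model a packet with an out-of-band tag ends up with the tag in
its payload regardless of which of the two paths sends it.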

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |   61 ++++++++++++++++++++++++++++++---------------------
 1 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 4bccc78c5b58..825200fcb0ff 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -69,6 +69,37 @@ module_param(carrier_timeout, uint, 0644);
 #define np_notice(np, fmt, ...)				\
 	pr_notice("%s: " fmt, np->name, ##__VA_ARGS__)
 
+static int netpoll_start_xmit(struct sk_buff *skb, struct net_device *dev,
+			      struct netdev_queue *txq)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int status = NETDEV_TX_OK;
+	netdev_features_t features;
+
+	features = netif_skb_features(skb);
+
+	if (vlan_tx_tag_present(skb) &&
+	    !vlan_hw_offload_capable(features, skb->vlan_proto)) {
+		skb = __vlan_put_tag(skb, skb->vlan_proto,
+				     vlan_tx_tag_get(skb));
+		if (unlikely(!skb)) {
+			/* This is actually a packet drop, but we
+			 * don't want the code that calls this
+			 * function to try and operate on a NULL skb.
+			 */
+			goto out;
+		}
+		skb->vlan_tci = 0;
+	}
+
+	status = ops->ndo_start_xmit(skb, dev);
+	if (status == NETDEV_TX_OK)
+		txq_trans_update(txq);
+
+out:
+	return status;
+}
+
 static void queue_process(struct work_struct *work)
 {
 	struct netpoll_info *npinfo =
@@ -78,7 +109,6 @@ static void queue_process(struct work_struct *work)
 
 	while ((skb = skb_dequeue(&npinfo->txq))) {
 		struct net_device *dev = skb->dev;
-		const struct net_device_ops *ops = dev->netdev_ops;
 		struct netdev_queue *txq;
 
 		if (!netif_device_present(dev) || !netif_running(dev)) {
@@ -91,7 +121,7 @@ static void queue_process(struct work_struct *work)
 		local_irq_save(flags);
 		__netif_tx_lock(txq, smp_processor_id());
 		if (netif_xmit_frozen_or_stopped(txq) ||
-		    ops->ndo_start_xmit(skb, dev) != NETDEV_TX_OK) {
+		    netpoll_start_xmit(skb, dev, txq) != NETDEV_TX_OK) {
 			skb_queue_head(&npinfo->txq, skb);
 			__netif_tx_unlock(txq);
 			local_irq_restore(flags);
@@ -295,7 +325,6 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 {
 	int status = NETDEV_TX_BUSY;
 	unsigned long tries;
-	const struct net_device_ops *ops = dev->netdev_ops;
 	/* It is up to the caller to keep npinfo alive. */
 	struct netpoll_info *npinfo;
 
@@ -317,27 +346,9 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
 		     tries > 0; --tries) {
 			if (__netif_tx_trylock(txq)) {
-				if (!netif_xmit_stopped(txq)) {
-					if (vlan_tx_tag_present(skb) &&
-					    !vlan_hw_offload_capable(netif_skb_features(skb),
-								     skb->vlan_proto)) {
-						skb = __vlan_put_tag(skb, skb->vlan_proto, vlan_tx_tag_get(skb));
-						if (unlikely(!skb)) {
-							/* This is actually a packet drop, but we
-							 * don't want the code at the end of this
-							 * function to try and re-queue a NULL skb.
-							 */
-							status = NETDEV_TX_OK;
-							goto unlock_txq;
-						}
-						skb->vlan_tci = 0;
-					}
-
-					status = ops->ndo_start_xmit(skb, dev);
-					if (status == NETDEV_TX_OK)
-						txq_trans_update(txq);
-				}
-			unlock_txq:
+				if (!netif_xmit_stopped(txq))
+					status = netpoll_start_xmit(skb, dev, txq);
+
 				__netif_tx_unlock(txq);
 
 				if (status == NETDEV_TX_OK)
@@ -353,7 +364,7 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 
 		WARN_ONCE(!irqs_disabled(),
 			"netpoll_send_skb_on_dev(): %s enabled interrupts in poll (%pF)\n",
-			dev->name, ops->ndo_start_xmit);
+			dev->name, dev->netdev_ops->ndo_start_xmit);
 
 	}
 
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  2014-03-18  6:24                                     ` [PATCH 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
  2014-03-18  6:24                                     ` [PATCH 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
@ 2014-03-18  6:25                                     ` Eric W. Biederman
  2014-03-18 18:26                                       ` Cong Wang
  2014-03-18  6:26                                     ` [PATCH 4/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
                                                       ` (3 subsequent siblings)
  6 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:25 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll code makes the strong and reasonable assumption that the
transmit code of a network device does not perform its own locking.
Violating that assumption can easily lead to deadlock.

Document that assumption by verifying that the network device on which
netpoll is enabled does not have NETIF_F_LLTX set in netdev->features.
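
As a rough userspace sketch of the resulting eligibility test (not the
kernel code): the struct, the model_* names, and the flag bit values
below are hypothetical, chosen only to mirror the three conditions
checked in __netpoll_setup after this change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the kernel's real flag values differ. */
#define MODEL_PRIV_DISABLE_NETPOLL (1u << 0)
#define MODEL_FEATURE_LLTX         (1u << 1)

struct model_netdev {
	uint32_t priv_flags;	/* models dev->priv_flags */
	uint32_t features;	/* models dev->features */
	void (*poll_controller)(struct model_netdev *dev);
};

/* A do-nothing poll method so a device can advertise polling support. */
static void model_poll(struct model_netdev *dev)
{
	(void)dev;
}

/* Returns true only when none of the three refusal conditions holds. */
static bool model_netpoll_supported(const struct model_netdev *dev)
{
	assert(dev);

	if (dev->priv_flags & MODEL_PRIV_DISABLE_NETPOLL)
		return false;	/* device opted out of netpoll */
	if (dev->features & MODEL_FEATURE_LLTX)
		return false;	/* device does its own xmit locking */
	return dev->poll_controller != NULL;
}
```

A device that sets the lockless-transmit flag is refused even when it
provides a poll method, which is the behaviour this patch adds.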

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 825200fcb0ff..a9abb195a2c3 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -606,6 +606,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
 	INIT_WORK(&np->cleanup_work, netpoll_async_cleanup);
 
 	if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
+	    (ndev->features & NETIF_F_LLTX) ||
 	    !ndev->netdev_ops->ndo_poll_controller) {
 		np_err(np, "%s doesn't support polling, aborting\n",
 		       np->dev_name);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 4/6] netpoll: Move rx enable/disable into __dev_close_many
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                       ` (2 preceding siblings ...)
  2014-03-18  6:25                                     ` [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking Eric W. Biederman
@ 2014-03-18  6:26                                     ` Eric W. Biederman
  2014-03-18  6:27                                     ` [PATCH 5/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
                                                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:26 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Today netpoll_rx_enable and netpoll_rx_disable are called
from dev_close and __dev_close, but not from dev_close_many.

That is too many call sites for a simple operation, so move the
calls into __dev_close_many, which all of these paths go through.
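
The idea can be sketched in userspace C as follows.  This is only a
model under stated assumptions: the model_* names and the counters are
hypothetical, and the real functions do considerably more work.  It
shows the disable/enable bracket living in the one function that
closes a list of devices, so a single-device close inherits it without
repeating the calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; not struct net_device. */
struct model_dev {
	bool up;
	int poll_disable_depth;		/* > 0 while polling is blocked */
	unsigned int disable_calls;	/* how often polling was blocked */
	bool closed_while_pollable;	/* set if closed without the bracket */
};

static void model_poll_disable(struct model_dev *d)
{
	d->poll_disable_depth++;
	d->disable_calls++;
}

static void model_poll_enable(struct model_dev *d)
{
	d->poll_disable_depth--;
}

/* The only place that brackets the close with disable/enable. */
static void model_close_many(struct model_dev **devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_dev *d = devs[i];

		assert(d);
		model_poll_disable(d);
		if (d->poll_disable_depth == 0)
			d->closed_while_pollable = true;
		d->up = false;
		model_poll_enable(d);
	}
}

/* Single-device close: builds a one-element list and delegates. */
static void model_close(struct model_dev *d)
{
	if (d->up) {
		struct model_dev *single[1] = { d };

		model_close_many(single, 1);
	}
}
```

In this model both the batch close and the single-device close block
polling exactly once per device.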

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/dev.c |   13 ++++---------
 1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 55f8e64c03a2..f660448d992d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1313,6 +1313,9 @@ static int __dev_close_many(struct list_head *head)
 	might_sleep();
 
 	list_for_each_entry(dev, head, close_list) {
+		/* Temporarily disable netpoll until the interface is down */
+		netpoll_rx_disable(dev);
+
 		call_netdevice_notifiers(NETDEV_GOING_DOWN, dev);
 
 		clear_bit(__LINK_STATE_START, &dev->state);
@@ -1343,6 +1346,7 @@ static int __dev_close_many(struct list_head *head)
 
 		dev->flags &= ~IFF_UP;
 		net_dmaengine_put();
+		netpoll_rx_enable(dev);
 	}
 
 	return 0;
@@ -1353,14 +1357,10 @@ static int __dev_close(struct net_device *dev)
 	int retval;
 	LIST_HEAD(single);
 
-	/* Temporarily disable netpoll until the interface is down */
-	netpoll_rx_disable(dev);
-
 	list_add(&dev->close_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
 
-	netpoll_rx_enable(dev);
 	return retval;
 }
 
@@ -1398,14 +1398,9 @@ int dev_close(struct net_device *dev)
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
 
-		/* Block netpoll rx while the interface is going down */
-		netpoll_rx_disable(dev);
-
 		list_add(&dev->close_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
-
-		netpoll_rx_enable(dev);
 	}
 	return 0;
 }
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 5/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                       ` (3 preceding siblings ...)
  2014-03-18  6:26                                     ` [PATCH 4/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
@ 2014-03-18  6:27                                     ` Eric W. Biederman
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:27 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll_rx_enable and netpoll_rx_disable functions have always
controlled polling of the network driver's transmit and receive queues.

Rename them to netpoll_poll_enable and netpoll_poll_disable to make
their functionality clear.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |    8 ++++----
 net/core/dev.c          |    8 ++++----
 net/core/netpoll.c      |    8 ++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 893b9e66060e..b25ee9ffdbe6 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -47,11 +47,11 @@ struct netpoll_info {
 };
 
 #ifdef CONFIG_NETPOLL
-extern void netpoll_rx_disable(struct net_device *dev);
-extern void netpoll_rx_enable(struct net_device *dev);
+extern void netpoll_poll_disable(struct net_device *dev);
+extern void netpoll_poll_enable(struct net_device *dev);
 #else
-static inline void netpoll_rx_disable(struct net_device *dev) { return; }
-static inline void netpoll_rx_enable(struct net_device *dev) { return; }
+static inline void netpoll_poll_disable(struct net_device *dev) { return; }
+static inline void netpoll_poll_enable(struct net_device *dev) { return; }
 #endif
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
diff --git a/net/core/dev.c b/net/core/dev.c
index f660448d992d..8b3ea4058a5e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1245,7 +1245,7 @@ static int __dev_open(struct net_device *dev)
 	 * If we don't do this there is a chance ndo_poll_controller
 	 * or ndo_poll may be running while we open the device
 	 */
-	netpoll_rx_disable(dev);
+	netpoll_poll_disable(dev);
 
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
@@ -1260,7 +1260,7 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
-	netpoll_rx_enable(dev);
+	netpoll_poll_enable(dev);
 
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
@@ -1314,7 +1314,7 @@ static int __dev_close_many(struct list_head *head)
 
 	list_for_each_entry(dev, head, close_list) {
 		/* Temporarily disable netpoll until the interface is down */
-		netpoll_rx_disable(dev);
+		netpoll_poll_disable(dev);
 
 		call_netdevice_notifiers(NETDEV_GOING_DOWN, dev);
 
@@ -1346,7 +1346,7 @@ static int __dev_close_many(struct list_head *head)
 
 		dev->flags &= ~IFF_UP;
 		net_dmaengine_put();
-		netpoll_rx_enable(dev);
+		netpoll_poll_enable(dev);
 	}
 
 	return 0;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a9abb195a2c3..dec929b71348 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -214,7 +214,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	zap_completion_queue();
 }
 
-void netpoll_rx_disable(struct net_device *dev)
+void netpoll_poll_disable(struct net_device *dev)
 {
 	struct netpoll_info *ni;
 	int idx;
@@ -225,9 +225,9 @@ void netpoll_rx_disable(struct net_device *dev)
 		down(&ni->dev_lock);
 	srcu_read_unlock(&netpoll_srcu, idx);
 }
-EXPORT_SYMBOL(netpoll_rx_disable);
+EXPORT_SYMBOL(netpoll_poll_disable);
 
-void netpoll_rx_enable(struct net_device *dev)
+void netpoll_poll_enable(struct net_device *dev)
 {
 	struct netpoll_info *ni;
 	rcu_read_lock();
@@ -236,7 +236,7 @@ void netpoll_rx_enable(struct net_device *dev)
 		up(&ni->dev_lock);
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(netpoll_rx_enable);
+EXPORT_SYMBOL(netpoll_poll_enable);
 
 static void refill_skbs(void)
 {
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                       ` (4 preceding siblings ...)
  2014-03-18  6:27                                     ` [PATCH 5/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
@ 2014-03-18  6:27                                     ` Eric W. Biederman
  2014-03-18  9:32                                       ` David Laight
                                                         ` (3 more replies)
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  6 siblings, 4 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18  6:27 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Add a test, skb_irq_freeable, to report when it is safe to free an skb
from irq context.

It is not safe to free an skb from irq context when:
- The skb has a destructor as some skb destructors call local_bh_disable
  or spin_lock_bh.
- There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
- There is netfilter conntrack state as destroy_conntrack calls
  spin_lock_bh.
- If there is a refcounted dst entry on the skb, as __dst_free
  calls spin_lock_bh.
- If there is a frag_list, which could be a list of any skbs.
Otherwise it appears safe to free a skb from interrupt context.

- Update the warning in skb_release_head_state to warn about freeing
  skbs in the wrong context.

- Update __dev_kfree_skb_irq to free all skbs that it can immediately

- Kill zap_completion_queue because there is no point going through
  a queue of packets that are not safe to free and looking for packets
  that are safe to free.
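
As a rough illustration, the conditions listed above can be modelled
in userspace C.  This is a sketch under stated assumptions, not the
kernel code: struct model_skb and every model_* name are hypothetical,
the config-dependent fields are always present here, and the "noref"
bit value is assumed for the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Low bit marks a dst that holds no reference; value assumed here. */
#define MODEL_DST_NOREF ((uintptr_t)1)

/* Hypothetical model of the fields the test looks at. */
struct model_skb {
	void (*destructor)(struct model_skb *skb);
	void *xfrm_state;		/* models skb->sp */
	void *conntrack;		/* models skb->nfct */
	uintptr_t refdst;		/* models skb->_skb_refdst */
	struct model_skb *frag_list;
};

/* A do-nothing destructor, used only to mark an skb as having one. */
static void model_noop_destructor(struct model_skb *skb)
{
	(void)skb;
}

/* True when none of the listed unsafe conditions applies. */
static bool model_irq_freeable(const struct model_skb *skb)
{
	assert(skb);

	return !skb->destructor &&
	       !skb->xfrm_state &&
	       !skb->conntrack &&
	       (!skb->refdst || (skb->refdst & MODEL_DST_NOREF)) &&
	       !skb->frag_list;
}

/*
 * Models the decision in __dev_kfree_skb_irq: returns true when the
 * skb would be freed immediately, otherwise counts it as deferred to
 * the completion queue and returns false.
 */
static bool model_free_from_irq(const struct model_skb *skb,
				unsigned int *deferred)
{
	if (model_irq_freeable(skb))
		return true;
	(*deferred)++;
	return false;
}
```

In this model an skb with no extra state, or with only a non-refcounted
dst, is freed at once, while one with a destructor or a refcounted dst
is deferred.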

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/skbuff.h |   13 +++++++++++++
 net/core/dev.c         |   14 +++++++++-----
 net/core/netpoll.c     |   32 --------------------------------
 net/core/skbuff.c      |   13 ++++++++++---
 4 files changed, 32 insertions(+), 40 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 03db95ab8a8c..53f72b53fd47 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
 { }
 #endif
 
+static inline bool skb_irq_freeable(struct sk_buff *skb)
+{
+	return !skb->destructor &&
+#if IS_ENABLED(CONFIG_XFRM)
+		!skb->sp &&
+#endif
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+		!skb->nfct &&
+#endif
+		(!skb->_skb_refdst || (skb->_skb_refdst & SKB_DST_NOREF)) &&
+		!skb_has_frag_list(skb);
+}
+
 static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
 {
 	skb->queue_mapping = queue_mapping;
diff --git a/net/core/dev.c b/net/core/dev.c
index 8b3ea4058a5e..99fd079488aa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2164,11 +2164,15 @@ void __dev_kfree_skb_irq(struct sk_buff *skb, enum skb_free_reason reason)
 		return;
 	}
 	get_kfree_skb_cb(skb)->reason = reason;
-	local_irq_save(flags);
-	skb->next = __this_cpu_read(softnet_data.completion_queue);
-	__this_cpu_write(softnet_data.completion_queue, skb);
-	raise_softirq_irqoff(NET_TX_SOFTIRQ);
-	local_irq_restore(flags);
+	if (unlikely(skb_irq_freeable(skb))) {
+		__kfree_skb(skb);
+	} else {
+		local_irq_save(flags);
+		skb->next = __this_cpu_read(softnet_data.completion_queue);
+		__this_cpu_write(softnet_data.completion_queue, skb);
+		raise_softirq_irqoff(NET_TX_SOFTIRQ);
+		local_irq_restore(flags);
+	}
 }
 EXPORT_SYMBOL(__dev_kfree_skb_irq);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index dec929b71348..0cd492508a88 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -56,7 +56,6 @@ DEFINE_STATIC_SRCU(netpoll_srcu);
 	 sizeof(struct udphdr) +					\
 	 MAX_UDP_CHUNK)
 
-static void zap_completion_queue(void);
 static void netpoll_async_cleanup(struct work_struct *work);
 
 static unsigned int carrier_timeout = 4;
@@ -210,8 +209,6 @@ static void netpoll_poll_dev(struct net_device *dev)
 	poll_napi(dev, budget);
 
 	up(&ni->dev_lock);
-
-	zap_completion_queue();
 }
 
 void netpoll_poll_disable(struct net_device *dev)
@@ -254,40 +251,11 @@ static void refill_skbs(void)
 	spin_unlock_irqrestore(&skb_pool.lock, flags);
 }
 
-static void zap_completion_queue(void)
-{
-	unsigned long flags;
-	struct softnet_data *sd = &get_cpu_var(softnet_data);
-
-	if (sd->completion_queue) {
-		struct sk_buff *clist;
-
-		local_irq_save(flags);
-		clist = sd->completion_queue;
-		sd->completion_queue = NULL;
-		local_irq_restore(flags);
-
-		while (clist != NULL) {
-			struct sk_buff *skb = clist;
-			clist = clist->next;
-			if (skb->destructor) {
-				atomic_inc(&skb->users);
-				dev_kfree_skb_any(skb); /* put this one back */
-			} else {
-				__kfree_skb(skb);
-			}
-		}
-	}
-
-	put_cpu_var(softnet_data);
-}
-
 static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
 {
 	int count = 0;
 	struct sk_buff *skb;
 
-	zap_completion_queue();
 	refill_skbs();
 repeat:
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3f14c638c2b1..5654e3eb4066 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff *skb)
 
 static void skb_release_head_state(struct sk_buff *skb)
 {
+	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
+		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
+		  __func__,
+		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
+		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
+		  !!skb_has_frag_list(skb),
+		  skb->destructor,
+		  skb->_skb_refdst);
+
 	skb_dst_drop(skb);
 #ifdef CONFIG_XFRM
 	secpath_put(skb->sp);
 #endif
-	if (skb->destructor) {
-		WARN_ON(in_irq());
+	if (skb->destructor)
 		skb->destructor(skb);
-	}
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	nf_conntrack_put(skb->nfct);
 #endif
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* RE: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
@ 2014-03-18  9:32                                       ` David Laight
  2014-03-18 13:22                                       ` Eric Dumazet
                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 288+ messages in thread
From: David Laight @ 2014-03-18  9:32 UTC (permalink / raw)
  To: 'Eric W. Biederman', David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric W. Biederman
> Add a test skb_irq_freeable to report when it is safe to free a skb
> from irq context.
> 
> It is not safe to free an skb from irq context when:
> - The skb has a destructor as some skb destructors call local_bh_disable
>   or spin_lock_bh.
> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
> - There is netfilter conntrack state as destroy_conntrack calls
>   spin_lock_bh.
> - If there is a refcounted dst entry on the skb, as __dst_free
>   calls spin_lock_bh.
> - If there is a frag_list, which could be a list of any skbs.

That is a lot of conditions to check....

> Otherwise it appears safe to free a skb from interrupt context.
> 
> - Update the warning in skb_release_head_state to warn about freeing
>   skb's in the wrong context.
> 
> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
> 
> - Kill zap_completion_queue because there is no point going through
>   a queue of packets that are not safe to free and looking for packets
>   that are safe to free.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/skbuff.h |   13 +++++++++++++
>  net/core/dev.c         |   14 +++++++++-----
>  net/core/netpoll.c     |   32 --------------------------------
>  net/core/skbuff.c      |   13 ++++++++++---
>  4 files changed, 32 insertions(+), 40 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 03db95ab8a8c..53f72b53fd47 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
>  { }
>  #endif
> 
> +static inline bool skb_irq_freeable(struct sk_buff *skb)
> +{
> +	return !skb->destructor &&
> +#if IS_ENABLED(CONFIG_XFRM)
> +		!skb->sp &&
> +#endif
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
> +		!skb->nfct &&
> +#endif
> +		(!skb->_skb_refdst || (skb->_skb_refdst & SKB_DST_NOREF)) &&
> +		!skb_has_frag_list(skb);
> +}
> +
>  static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
>  {
>  	skb->queue_mapping = queue_mapping;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8b3ea4058a5e..99fd079488aa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2164,11 +2164,15 @@ void __dev_kfree_skb_irq(struct sk_buff *skb, enum skb_free_reason reason)
>  		return;
>  	}
>  	get_kfree_skb_cb(skb)->reason = reason;
> -	local_irq_save(flags);
> -	skb->next = __this_cpu_read(softnet_data.completion_queue);
> -	__this_cpu_write(softnet_data.completion_queue, skb);
> -	raise_softirq_irqoff(NET_TX_SOFTIRQ);
> -	local_irq_restore(flags);
> +	if (unlikely(skb_irq_freeable(skb))) {
> +		__kfree_skb(skb);
> +	} else {
> +		local_irq_save(flags);
> +		skb->next = __this_cpu_read(softnet_data.completion_queue);
> +		__this_cpu_write(softnet_data.completion_queue, skb);
> +		raise_softirq_irqoff(NET_TX_SOFTIRQ);
> +		local_irq_restore(flags);
> +	}

You've even marked the condition with 'unlikely'.
So I wonder how much you gain from the direct free?

	David

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
  2014-03-18  9:32                                       ` David Laight
@ 2014-03-18 13:22                                       ` Eric Dumazet
  2014-03-18 17:51                                         ` Eric W. Biederman
  2014-03-18 13:30                                       ` Ben Hutchings
  2014-03-18 15:23                                       ` Stephen Hemminger
  3 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-18 13:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, stephen, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
> Add a test skb_irq_freeable to report when it is safe to free a skb
> from irq context.
> 
> It is not safe to free an skb from irq context when:
> - The skb has a destructor as some skb destructors call local_bh_disable
>   or spin_lock_bh.
> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
> - There is netfilter conntrack state as destroy_conntrack calls
>   spin_lock_bh.
> - If there is a refcounted dst entry on the skb, as __dst_free
>   calls spin_lock_bh.
> - If there is a frag_list, which could be a list of any skbs.
> Otherwise it appears safe to free a skb from interrupt context.
> 
> - Update the warning in skb_release_head_state to warn about freeing
>   skb's in the wrong context.
> 
> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
> 
> - Kill zap_completion_queue because there is no point going through
>   a queue of packets that are not safe to free and looking for packets
>   that are safe to free.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/skbuff.h |   13 +++++++++++++
>  net/core/dev.c         |   14 +++++++++-----
>  net/core/netpoll.c     |   32 --------------------------------
>  net/core/skbuff.c      |   13 ++++++++++---
>  4 files changed, 32 insertions(+), 40 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 03db95ab8a8c..53f72b53fd47 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
>  { }
>  #endif
>  
> +static inline bool skb_irq_freeable(struct sk_buff *skb)
> +{
> +	return !skb->destructor &&
> +#if IS_ENABLED(CONFIG_XFRM)
> +		!skb->sp &&
> +#endif
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
> +		!skb->nfct &&
> +#endif
> +		(!skb->_skb_refdst || (skb->_skb_refdst & SKB_DST_NOREF)) &&
> +		!skb_has_frag_list(skb);
> +}
> +

It would be a serious bug having (skb->_skb_refdst & SKB_DST_NOREF) at
this point. dst would be RCU protected, but this can not be true as the
packet was queued in TX ring buffer for a possibly long period. And even
before reaching the driver, skb might have been queued in qdisc layer
and escape rcu protection section anyway.

That's why we use skb_dst_force() from __dev_xmit_skb()

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
  2014-03-18  9:32                                       ` David Laight
  2014-03-18 13:22                                       ` Eric Dumazet
@ 2014-03-18 13:30                                       ` Ben Hutchings
  2014-03-18 14:24                                         ` Bjørn Mork
  2014-03-18 17:53                                         ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
  2014-03-18 15:23                                       ` Stephen Hemminger
  3 siblings, 2 replies; 288+ messages in thread
From: Ben Hutchings @ 2014-03-18 13:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, stephen, eric.dumazet, netdev, xiyou.wangcong, mpm,
	satyam.sharma

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
[...]
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff *skb)
>  
>  static void skb_release_head_state(struct sk_buff *skb)
>  {
> +	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
> +		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
> +		  __func__,
> +		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
> +		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
[...]

This is a syntax error if CONFIG_XFRM or CONFIG_NF_CONNTRACK is
disabled; you have to use #ifdef's.

Ben.

-- 
Ben Hutchings
Experience is directly proportional to the value of equipment destroyed.
                                                         - Carolyn Scheppner

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 13:30                                       ` Ben Hutchings
@ 2014-03-18 14:24                                         ` Bjørn Mork
  2014-03-18 15:23                                           ` Eric Dumazet
  2014-03-18 17:53                                         ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
  1 sibling, 1 reply; 288+ messages in thread
From: Bjørn Mork @ 2014-03-18 14:24 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Eric W. Biederman, David Miller, stephen, eric.dumazet, netdev,
	xiyou.wangcong, mpm, satyam.sharma

Ben Hutchings <ben@decadent.org.uk> writes:
> On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
> [...]
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff *skb)
>>  
>>  static void skb_release_head_state(struct sk_buff *skb)
>>  {
>> +	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
>> +		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
>> +		  __func__,
>> +		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
>> +		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
> [...]
>
> This is a syntax error if CONFIG_XFRM or CONFIG_NF_CONNTRACK is
> disabled; you have to use #ifdef's.

Are you sure?  I thought one of the ideas behind these macros was that
they would always evaluate to 0 or 1.  The docs say:

 * IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y' or 'm',
 * 0 otherwise.


See include/linux/kconfig.h for the macro magic making this
happen. Looks like fun figuring that out.


Bjørn

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 14:24                                         ` Bjørn Mork
@ 2014-03-18 15:23                                           ` Eric Dumazet
  2014-03-18 15:41                                             ` Bjørn Mork
  0 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-18 15:23 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Ben Hutchings, Eric W. Biederman, David Miller, stephen, netdev,
	xiyou.wangcong, mpm, satyam.sharma

On Tue, 2014-03-18 at 15:24 +0100, Bjørn Mork wrote:
> Ben Hutchings <ben@decadent.org.uk> writes:
> > On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
> > [...]
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff *skb)
> >>  
> >>  static void skb_release_head_state(struct sk_buff *skb)
> >>  {
> >> +	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
> >> +		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
> >> +		  __func__,
> >> +		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
> >> +		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
> > [...]
> >
> > This is a syntax error if CONFIG_XFRM or CONFIG_NF_CONNTRACK is
> > disabled; you have to use #ifdef's.
> 
> Are you sure?  I thought one of the ideas behind these macros was that
> they would always evaluate to 0 or 1.  The docs say:
> 
>  * IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y' or 'm',
>  * 0 otherwise.
> 
> 
> See include/linux/kconfig.h for the macro magic making this
> happen. Looks like fun figuring that out.

It has nothing to do with this.

Try the following code, and you'll get a compilation error.

unsigned int can_this_fly(struct sk_buff *skb)
{
	return IS_ENABLED(CONFIG_NOWAY_SIR) ? skb->unknown_field : 0;
}
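
The #ifdef approach requested earlier in the thread is usually kept out of
the call site by hiding the conditional field behind a small helper.  A
minimal userspace sketch, assuming hypothetical names (DEMO_CONFIG_XFRM,
struct demo_skb and demo_skb_has_sp() are illustrations, not kernel symbols):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol; remove it to build the other branch. */
#define DEMO_CONFIG_XFRM 1

struct demo_skb {
#ifdef DEMO_CONFIG_XFRM
	void *sp;		/* this field exists only when the option is enabled */
#endif
	void *destructor;
};

/*
 * The preprocessor removes the field access entirely when the option is
 * off, so the compiler never sees a reference to a missing member.  A
 * ternary on a constant cannot do that: both of its arms must compile.
 */
static inline bool demo_skb_has_sp(const struct demo_skb *skb)
{
#ifdef DEMO_CONFIG_XFRM
	return skb->sp != NULL;
#else
	(void)skb;
	return false;
#endif
}
```

A caller such as a warning message can then pass demo_skb_has_sp(skb)
unconditionally, whichever way the option is set.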

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
                                                         ` (2 preceding siblings ...)
  2014-03-18 13:30                                       ` Ben Hutchings
@ 2014-03-18 15:23                                       ` Stephen Hemminger
  2014-03-18 17:47                                         ` Eric W. Biederman
  3 siblings, 1 reply; 288+ messages in thread
From: Stephen Hemminger @ 2014-03-18 15:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 17 Mar 2014 23:27:52 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Add a test skb_irq_freeable to report when it is safe to free a skb
> from irq context.
> 
> It is not safe to free an skb from irq context when:
> - The skb has a destructor as some skb destructors call local_bh_disable
>   or spin_lock_bh.
> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
> - There is netfilter conntrack state as destroy_conntrack calls
>   spin_lock_bh.
> - If there is a refcounted dst entry on the skb, as __dst_free
>   calls spin_lock_bh.
> - If there is a frag_list, which could be a list of any skbs.
> Otherwise it appears safe to free a skb from interrupt context.
> 
> - Update the warning in skb_release_head_state to warn about freeing
>   skb's in the wrong context.
> 
> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
> 
> - Kill zap_completion_queue because there is no point going through
>   a queue of packets that are not safe to free and looking for packets
>   that are safe to free.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Why introduce the additional complexity for so little gain?
It looks like you are only optimizing for the corner case where netpoll
is cleaning up on Tx.

-1

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 15:23                                           ` Eric Dumazet
@ 2014-03-18 15:41                                             ` Bjørn Mork
  2014-03-18 15:52                                               ` David Laight
  0 siblings, 1 reply; 288+ messages in thread
From: Bjørn Mork @ 2014-03-18 15:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ben Hutchings, Eric W. Biederman, David Miller, stephen, netdev,
	xiyou.wangcong, mpm, satyam.sharma



On 18 March 2014 16:23:49 CET, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>On Tue, 2014-03-18 at 15:24 +0100, Bjørn Mork wrote:
>> Ben Hutchings <ben@decadent.org.uk> writes:
>> > On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
>> > [...]
>> >> --- a/net/core/skbuff.c
>> >> +++ b/net/core/skbuff.c
>> >> @@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff
>*skb)
>> >>  
>> >>  static void skb_release_head_state(struct sk_buff *skb)
>> >>  {
>> >> +	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
>> >> +		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
>> >> +		  __func__,
>> >> +		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
>> >> +		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
>> > [...]
>> >
>> > This is a syntax error if CONFIG_XFRM or CONFIG_NF_CONNTRACK is
>> > disabled; you have to use #ifdef's.
>> 
>> Are you sure?  I thought one of the ideas behind these macros was
>that
>> they would always evaluate to 0 or 1.  The docs say:
>> 
>>  * IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y'
>or 'm',
>>  * 0 otherwise.
>> 
>> 
>> See include/linux/kconfig.h for the macro magic making this
>> happen. Looks like fun figuring that out.
>
>It has nothing to do with this.
>
>Try the following code, and you'll get a compilation error.
>
>unsigned int can_this_fly(struct sk_buff *skb)
>{
>	return IS_ENABLED(CONFIG_NOWAY_SIR) ? skb->unknown_field : 0;
>}

Doh. Of course. Thanks for spoon feeding me that.



Bjørn

^ permalink raw reply	[flat|nested] 288+ messages in thread

* RE: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 15:41                                             ` Bjørn Mork
@ 2014-03-18 15:52                                               ` David Laight
  2014-03-28  1:14                                                 ` [PATCH 0/3] netpoll: Freeing skbs in hard irq context Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: David Laight @ 2014-03-18 15:52 UTC (permalink / raw)
  To: 'Bjørn Mork', Eric Dumazet
  Cc: Ben Hutchings, Eric W. Biederman, David Miller, stephen, netdev,
	xiyou.wangcong, mpm, satyam.sharma

From: Bjørn Mork
...
> >Try the following code, and you'll get a compilation error.
> >
> >unsigned int can_this_fly(struct sk_buff *skb)
> >{
> >	return IS_ENABLED(CONFIG_NOWAY_SIR) ? skb->unknown_field : 0;
> >}
> 
> Doh. Of course. Thanks for spoon feeding me that.

Of course, config_enabled(foo) could be implemented as
config_enabled_(foo, 1, 0)
Allowing you to define and test:
	IS_ENABLED_Q(CONFIG_NOWAY_SIR, skb->unknown_field, 0)

Although you'd probably need to paste the expansions of both 'foo' and
'foo_MODULE' onto __ARG_PLACEHOLDER_ to avoid the ||.

More useful might be IF_ENABLED(foo, x) which expands to either
x or nothing at all.
Then you could conditionally generate the chunks of the format string
as well as the parameter list.

OTOH you might want readable code!
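
A rough userspace sketch of the selection macro described above, using the
same placeholder trick as include/linux/kconfig.h.  All DEMO_* names are
hypothetical, and this version only recognises a symbol defined to 1 (the
'y' case); it makes no attempt at the foo_MODULE case.

```c
/* Hypothetical option symbols: DEMO_CONFIG_ON is set, DEMO_CONFIG_OFF is left undefined. */
#define DEMO_CONFIG_ON 1

/* Pasting "1" onto the prefix yields "0," which inserts an extra argument. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val

/*
 * Three levels: expand cfg, paste it onto the placeholder prefix, then pick.
 * Enabled:  DEMO_TAKE_SECOND(0, yes, no, 0)   -> yes
 * Disabled: DEMO_TAKE_SECOND(junk yes, no, 0) -> no
 * The unselected tokens are dropped by the preprocessor and never compiled.
 */
#define DEMO_IS_ENABLED_Q(cfg, yes, no) DEMO_IS_ENABLED_Q_(cfg, yes, no)
#define DEMO_IS_ENABLED_Q_(val, yes, no) \
	DEMO_IS_ENABLED_Q__(DEMO_ARG_PLACEHOLDER_##val, yes, no)
#define DEMO_IS_ENABLED_Q__(arg1_or_junk, yes, no) \
	DEMO_TAKE_SECOND(arg1_or_junk yes, no, 0)

struct demo_skb {
	int known_field;
};

static inline int demo_known(const struct demo_skb *skb)
{
	return DEMO_IS_ENABLED_Q(DEMO_CONFIG_ON, skb->known_field, 0);
}

static inline int demo_unknown(const struct demo_skb *skb)
{
	(void)skb;
	/* unknown_field does not exist, yet this compiles: the token is discarded. */
	return DEMO_IS_ENABLED_Q(DEMO_CONFIG_OFF, skb->unknown_field, -1);
}
```

Whether this is more readable than a pair of #ifdef'd helpers is a fair
question.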

	David


^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 15:23                                       ` Stephen Hemminger
@ 2014-03-18 17:47                                         ` Eric W. Biederman
  2014-03-18 18:37                                           ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18 17:47 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

Stephen Hemminger <stephen@networkplumber.org> writes:

> On Mon, 17 Mar 2014 23:27:52 -0700
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>> Add a test skb_irq_freeable to report when it is safe to free a skb
>> from irq context.
>> 
>> It is not safe to free an skb from irq context when:
>> - The skb has a destructor as some skb destructors call local_bh_disable
>>   or spin_lock_bh.
>> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
>> - There is netfilter conntrack state as destroy_conntrack calls
>>   spin_lock_bh.
>> - If there is a refcounted dst entry on the skb, as __dst_free
>>   calls spin_lock_bh.
>> - If there is a frag_list, which could be a list of any skbs.
>> Otherwise it appears safe to free a skb from interrupt context.
>> 
>> - Update the warning in skb_release_head_state to warn about freeing
>>   skb's in the wrong context.
>> 
>> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
>> 
>> - Kill zap_completion_queue because there is no point going through
>>   a queue of packets that are not safe to free and looking for packets
>>   that are safe to free.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> Why introduce the additional complexity for so little gain?
> It looks like you are only optimizing for the corner case where netpoll
> is cleaning up on Tx.
>
> -1

The netpoll cleanup is interesting.  And certainly something needs to be
done to fix/cleanup zap_completion_queue in netpoll.

However the deep reason for introducing skb_irq_freeable() is to fix the
warnings in skb_release_head_state aka __kfree_skb.  Only warning when
we have a destructor in irq context strongly suggests that it is only
when we have destructors that we have problems.

Most of the destructors today are fine (which doubly makes the warning
confusing).  Assuming we don't have a dst reference when a packet is
transmitted it is the presence of iptables in the system that makes tx
packets not freeable from hard irq context.

Received packets seem to be always freeable and freeing packets on the
error path of packet reception could actually be helped by this.
Especially in the drivers like the e1000 that have misplaced calls to
dev_kfree_skb_irq.

With respect to netpoll the only skbs that are likely to be freeable in
the netpoll transmit path are packets netpoll has transmitted itself.
So if this does not fit in the generic dev_kfree_skb_irq it definitely
makes sense to perform this test in netpoll's zap_completion_queue.

At a practical level, seeing a skb->destructor will typically be what
pushes the code into the delayed freeing scenario.  So I don't see this
slowing things down noticeably.

In addition freeing skbs in hard irq context (outside of netpoll) is
something only older, mostly pre-NAPI drivers do.  So frankly it seems
reasonable to me to optimize for the common case of freeing skbs in
hard irq context (i.e. netpoll).

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 13:22                                       ` Eric Dumazet
@ 2014-03-18 17:51                                         ` Eric W. Biederman
  0 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18 17:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, stephen, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
>> Add a test skb_irq_freeable to report when it is safe to free a skb
>> from irq context.
>> 
>> It is not safe to free an skb from irq context when:
>> - The skb has a destructor as some skb destructors call local_bh_disable
>>   or spin_lock_bh.
>> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
>> - There is netfilter conntrack state as destroy_conntrack calls
>>   spin_lock_bh.
>> - If there is a refcounted dst entry on the skb, as __dst_free
>>   calls spin_lock_bh.
>> - If there is a frag_list, which could be a list of any skbs.
>> Otherwise it appears safe to free a skb from interrupt context.
>> 
>> - Update the warning in skb_release_head_state to warn about freeing
>>   skb's in the wrong context.
>> 
>> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
>> 
>> - Kill zap_completion_queue because there is no point going through
>>   a queue of packets that are not safe to free and looking for packets
>>   that are safe to free.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  include/linux/skbuff.h |   13 +++++++++++++
>>  net/core/dev.c         |   14 +++++++++-----
>>  net/core/netpoll.c     |   32 --------------------------------
>>  net/core/skbuff.c      |   13 ++++++++++---
>>  4 files changed, 32 insertions(+), 40 deletions(-)
>> 
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 03db95ab8a8c..53f72b53fd47 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
>>  { }
>>  #endif
>>  
>> +static inline bool skb_irq_freeable(struct sk_buff *skb)
>> +{
>> +	return !skb->destructor &&
>> +#if IS_ENABLED(CONFIG_XFRM)
>> +		!skb->sp &&
>> +#endif
>> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
>> +		!skb->nfct &&
>> +#endif
>> +		(!skb->_skb_refdst || (skb->_skb_refdst & SKB_DST_NOREF)) &&
>> +		!skb_has_frag_list(skb);
>> +}
>> +
>
> It would be a serious bug having (skb->_skb_refdst & SKB_DST_NOREF) at
> this point. dst would be RCU protected, but this can not be true as the
> packet was queued in TX ring buffer for a possibly long period. And even
> before reaching the driver, skb might have been queued in qdisc layer
> and escape rcu protection section anyway.
>
> That's why we use skb_dst_force() from __dev_xmit_skb()

Interesting point.  I hadn't dug far enough to see that.  I have no
problems simplifying this check.

Eric
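
One possible shape for the simplified check, as a userspace model only.  It
assumes, per the argument quoted above, that a noref dst cannot appear at
this point, so any dst at all makes the skb unsafe to free.  struct demo_skb
and demo_irq_freeable() are hypothetical stand-ins, not the respun patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the sk_buff fields the check looks at. */
struct demo_skb {
	void (*destructor)(struct demo_skb *skb);
	void *sp;			/* xfrm state */
	void *nfct;			/* conntrack state */
	unsigned long refdst;		/* stands in for _skb_refdst */
	struct demo_skb *frag_list;
};

/* Freeable from hard irq only if nothing attached could take a _bh lock. */
static inline bool demo_irq_freeable(const struct demo_skb *skb)
{
	return !skb->destructor &&
	       !skb->sp &&
	       !skb->nfct &&
	       !skb->refdst &&		/* no special case for a noref dst */
	       !skb->frag_list;
}
```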

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 13:30                                       ` Ben Hutchings
  2014-03-18 14:24                                         ` Bjørn Mork
@ 2014-03-18 17:53                                         ` Eric W. Biederman
  1 sibling, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-18 17:53 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, stephen, eric.dumazet, netdev, xiyou.wangcong, mpm,
	satyam.sharma

Ben Hutchings <ben@decadent.org.uk> writes:

> On Mon, 2014-03-17 at 23:27 -0700, Eric W. Biederman wrote:
> [...]
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -554,14 +554,21 @@ static void kfree_skbmem(struct sk_buff *skb)
>>  
>>  static void skb_release_head_state(struct sk_buff *skb)
>>  {
>> +	WARN_ONCE(in_irq() && !skb_irq_freeable(skb),
>> +		  "%s called from irq! sp %d nfct %d frag_list %d %pF dst %lx",
>> +		  __func__,
>> +		  IS_ENABLED(CONFIG_XFRM) ? !!skb->sp : 0,
>> +		  IS_ENABLED(CONFIG_NF_CONNTRACK) ? !!skb->nfct : 0,
> [...]
>
> This is a syntax error if CONFIG_XFRM or CONFIG_NF_CONNTRACK is
> disabled; you have to use #ifdef's.

Doh!

That, if nothing else, definitely calls for a respin of this patch.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking
  2014-03-18  6:25                                     ` [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking Eric W. Biederman
@ 2014-03-18 18:26                                       ` Cong Wang
  2014-03-18 18:38                                         ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Cong Wang @ 2014-03-18 18:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, Stephen Hemminger, Eric Dumazet, netdev, Cong Wang,
	mpm, satyam.sharma

On Mon, Mar 17, 2014 at 11:25 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> There are strong and reasonable assumptions in the netpoll code that the
> transmit code for network devices will not perform their own locking,
> that can easily lead to deadlock if the assumptions are violated.
>
> Document those assumptions by verifying the network device on which
> netpoll is enabled does not have NETIF_F_LLTX set in netdev->features.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  net/core/netpoll.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index 825200fcb0ff..a9abb195a2c3 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -606,6 +606,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
>         INIT_WORK(&np->cleanup_work, netpoll_async_cleanup);
>
>         if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
> +           (ndev->features & NETIF_F_LLTX) ||
>             !ndev->netdev_ops->ndo_poll_controller) {
>                 np_err(np, "%s doesn't support polling, aborting\n",
>                        np->dev_name);

Hmm? This basically disables netpoll on a lot of devices, such as vlan.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 17:47                                         ` Eric W. Biederman
@ 2014-03-18 18:37                                           ` David Miller
  2014-03-27 23:02                                             ` Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-18 18:37 UTC (permalink / raw)
  To: ebiederm
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 18 Mar 2014 10:47:36 -0700

> Most of the destructors today are fine (which doubly makes the warning
> confusing).

Not true by my estimation.  We absolutely do not want socket state being
modified from hardware interrupts, and that's the most common destructor,
releasing socket memory.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking
  2014-03-18 18:26                                       ` Cong Wang
@ 2014-03-18 18:38                                         ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-18 18:38 UTC (permalink / raw)
  To: cwang
  Cc: ebiederm, stephen, eric.dumazet, netdev, xiyou.wangcong, mpm,
	satyam.sharma

From: Cong Wang <cwang@twopensource.com>
Date: Tue, 18 Mar 2014 11:26:54 -0700

> On Mon, Mar 17, 2014 at 11:25 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>>
>> There are strong and reasonable assumptions in the netpoll code that the
>> transmit code for network devices will not perform their own locking,
>> that can easily lead to deadlock if the assumptions are violated.
>>
>> Document those assumptions by verifying the network device on which
>> netpoll is enabled does not have NETIF_F_LLTX set in netdev->features.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  net/core/netpoll.c |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
>> index 825200fcb0ff..a9abb195a2c3 100644
>> --- a/net/core/netpoll.c
>> +++ b/net/core/netpoll.c
>> @@ -606,6 +606,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
>>         INIT_WORK(&np->cleanup_work, netpoll_async_cleanup);
>>
>>         if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
>> +           (ndev->features & NETIF_F_LLTX) ||
>>             !ndev->netdev_ops->ndo_poll_controller) {
>>                 np_err(np, "%s doesn't support polling, aborting\n",
>>                        np->dev_name);
> 
> Hmm? This basically disables netpoll on a lot of devices, such as vlan.

Right, this is bogus.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts
  2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
                                         ` (10 preceding siblings ...)
  2014-03-12  2:54                       ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric Dumazet
@ 2014-03-25  5:58                       ` Eric W. Biederman
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
  2014-03-25 20:49                         ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts Eric Dumazet
  11 siblings, 2 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


These changes are a result of walking through the network drivers
supporting netpoll and verifying that the code paths that netpoll can
cause to be called in hard irq context use an appropriate flavor of
kfree_skb: either dev_kfree_skb_any or dev_consume_skb_any.

Since my last pass at this I have become aware of the small differences
between dev_kfree_skb_any and dev_consume_skb_any.
net/core/drop_monitor.c reports the first as a drop while being quiet
about the second.  With the weird twist that
dev_kfree_skb is unintuitively consume_skb.
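
To make the distinction concrete, here is a simplified userspace model of the
"_any" idea: defer the free when called from hard irq context, free directly
otherwise, and carry a reason so that only drops get reported.  Every demo_*
name is hypothetical; the drop counter merely stands in for what a drop
monitor would observe, and the model is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_free_reason { DEMO_CONSUMED, DEMO_DROPPED };

struct demo_skb {
	struct demo_skb *next;
	enum demo_free_reason reason;
	bool freed;
};

/* Per-cpu state in the model: current context, deferred list, drop count. */
struct demo_cpu {
	bool in_hard_irq;
	struct demo_skb *completion_queue;
	unsigned int drops_reported;
};

/* Actually release the packet; only a drop is counted. */
static void demo_free_now(struct demo_cpu *cpu, struct demo_skb *skb)
{
	if (skb->reason == DEMO_DROPPED)
		cpu->drops_reported++;
	skb->freed = true;
}

/* Safe from any context: queue in hard irq, free immediately otherwise. */
static void demo_free_any(struct demo_cpu *cpu, struct demo_skb *skb,
			  enum demo_free_reason reason)
{
	skb->reason = reason;
	if (cpu->in_hard_irq) {
		skb->next = cpu->completion_queue;
		cpu->completion_queue = skb;
	} else {
		demo_free_now(cpu, skb);
	}
}

/* Models the later softirq pass that drains the deferred list. */
static void demo_run_softirq(struct demo_cpu *cpu)
{
	struct demo_skb *skb = cpu->completion_queue;

	cpu->completion_queue = NULL;
	while (skb) {
		struct demo_skb *next = skb->next;

		demo_free_now(cpu, skb);
		skb = next;
	}
}
```

In this model a successfully transmitted packet would be passed with
DEMO_CONSUMED and an error-path packet with DEMO_DROPPED, which is the
choice the per-driver patches make between the two helpers.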

As netpoll now calls the napi poll function with budget == 0, pieces of
a driver's napi poll function that don't run when budget == 0 have
been ignored.

The most interesting change is to the atl1c which tried unsuccessfully to
tell one of its functions which context it is called in so that it
could call dev_kfree_skb_irq or dev_kfree_skb as appropriate.  I have
just removed the extra parameter and called dev_consume_skb_any.

At 54 separate changes I will post each change as a separate patch (so
they can be reviewed) but for general sanity's sake I have gathered them
all into a git branch for easy access.

David, when you are satisfied with these changes please pull:

    git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next.git master

Hopefully this will allow me to forget this class of error when dealing
with netpoll.

Eric

Eric W. Biederman (54):
      uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb.
      3c509: Call dev_consume_skb_any instead of dev_kfree_skb.
      3c59x: Call dev_consume_skb_any instead of dev_kfree_skb.
      8390: Call dev_consume_skb_any instead of dev_kfree_skb.
      bfin_mac: Call dev_consume_skb_any instead of dev_kfree_skb.
      sun4i-emac: Call dev_consume_skb_any instead of dev_kfree_skb.
      am79c961a: Call dev_consume_skb_any instead of dev_kfree_skb.
      lance: Call dev_consume_skb_any instead of dev_kfree_skb.
      pcnet32: Call dev_kfree_skb_any instead of dev_kfree_skb.
      alx: Call dev_kfree_skb_any instead of dev_kfree_skb.
      atl1c: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      bnad: Call dev_kfree_skb_any instead of dev_kfree_skb.
      macb: Call dev_kfree_skb_any instead of kfree_skb.
      xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
      cxgb4: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
      cxgb4vf: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
      cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb.
      enic: Call dev_kfree_skb_any instead of dev_kfree_skb.
      dm9000: Call dev_consume_skb_any instead of dev_kfree_skb.
      dmfe: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      uli526x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      sundance: Call dev_kfree_skb_any instead of dev_kfree_skb.
      fec: Call dev_kfree_skb_any instead of kfree_skb.
      ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb.
      i825xx: Call dev_kfree_skb_any instead of dev_kfree_skb.
      ehea: Call dev_consume_skb_any instead of dev_kfree_skb.
      ibmveth: Call dev_consume_skb_any instead of dev_kfree_skb.
      jme: Call dev_kfree_skb_any instead of dev_kfree_skb.
      mv643xx_eth: Call dev_kfree_skb_any instead of dev_kfree_skb.
      skge: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      sky2: Call dev_kfree_skb_any instead of dev_kfree_skb.
      ksz884x: Call dev_consume_skb_any instead of dev_kfree_skb.
      s2io: Call dev_kfree_skb_any instead of dev_kfree_skb.
      vxge: Call dev_kfree_skb_any instead of dev_kfree_skb.
      forcedeth: Call dev_kfree_skb_any instead of kfree_skb.
      sc92031: Call dev_consume_skb_any instead of dev_kfree_skb.
      sis900: Call dev_kfree_skb_any instead of dev_kfree_skb.
      smc911x: Call dev_kfree_skb_any instead of dev_kfree_skb.
      smc91x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      smsc911x: Call dev_consume_skb_any instead of dev_kfree_skb.
      stmmac: Call dev_consume_skb_any instead of dev_kfree_skb.
      sungem: Call dev_consume_skb_any instead of dev_kfree_skb.
      tilepro: Call dev_consume_skb_any instead of kfree_skb.
      spider_net: Call dev_consume_skb_any instead of dev_kfree_skb.
      via-rhine: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      via-velocity: Call dev_kfree_skb_any instead of kfree_skb.
      xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb.
      vmxnet3: Call dev_kfree_skb_any instead of dev_kfree_skb.
      xen-netfront: Call dev_kfree_skb_any instead of dev_kfree_skb.
      wlags49_h2: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      staging/octeon-ethernet: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
      virtio_net: Call dev_kfree_skb_any instead of dev_kfree_skb.
      if_vlan: Call dev_kfree_skb_any instead of kfree_skb.

 arch/um/drivers/net_kern.c                        |  2 +-
 drivers/net/ethernet/3com/3c509.c                 |  2 +-
 drivers/net/ethernet/3com/3c59x.c                 |  2 +-
 drivers/net/ethernet/8390/lib8390.c               |  2 +-
 drivers/net/ethernet/adi/bfin_mac.c               |  2 +-
 drivers/net/ethernet/allwinner/sun4i-emac.c       |  2 +-
 drivers/net/ethernet/amd/7990.c                   |  2 +-
 drivers/net/ethernet/amd/am79c961a.c              |  2 +-
 drivers/net/ethernet/amd/pcnet32.c                |  2 +-
 drivers/net/ethernet/atheros/alx/main.c           |  2 +-
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c   | 20 ++++++++------------
 drivers/net/ethernet/brocade/bna/bnad.c           | 16 ++++++++--------
 drivers/net/ethernet/cadence/macb.c               |  2 +-
 drivers/net/ethernet/calxeda/xgmac.c              |  6 +++---
 drivers/net/ethernet/chelsio/cxgb3/sge.c          |  6 +++---
 drivers/net/ethernet/chelsio/cxgb4/sge.c          |  6 +++---
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c        |  6 +++---
 drivers/net/ethernet/cirrus/cs89x0.c              |  2 +-
 drivers/net/ethernet/cisco/enic/enic_main.c       |  4 ++--
 drivers/net/ethernet/davicom/dm9000.c             |  2 +-
 drivers/net/ethernet/dec/tulip/dmfe.c             |  4 ++--
 drivers/net/ethernet/dec/tulip/uli526x.c          |  4 ++--
 drivers/net/ethernet/dlink/sundance.c             |  2 +-
 drivers/net/ethernet/freescale/fec_main.c         |  2 +-
 drivers/net/ethernet/freescale/ucc_geth.c         |  2 +-
 drivers/net/ethernet/i825xx/lib82596.c            |  2 +-
 drivers/net/ethernet/ibm/ehea/ehea_main.c         |  6 +++---
 drivers/net/ethernet/ibm/ibmveth.c                |  2 +-
 drivers/net/ethernet/jme.c                        |  2 +-
 drivers/net/ethernet/marvell/mv643xx_eth.c        |  4 ++--
 drivers/net/ethernet/marvell/skge.c               |  4 ++--
 drivers/net/ethernet/marvell/sky2.c               |  2 +-
 drivers/net/ethernet/micrel/ksz884x.c             |  2 +-
 drivers/net/ethernet/neterion/s2io.c              |  6 +++---
 drivers/net/ethernet/neterion/vxge/vxge-main.c    |  8 ++++----
 drivers/net/ethernet/nvidia/forcedeth.c           |  8 ++++----
 drivers/net/ethernet/silan/sc92031.c              |  2 +-
 drivers/net/ethernet/sis/sis900.c                 |  2 +-
 drivers/net/ethernet/smsc/smc911x.c               |  2 +-
 drivers/net/ethernet/smsc/smc91x.c                |  4 ++--
 drivers/net/ethernet/smsc/smsc911x.c              |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |  2 +-
 drivers/net/ethernet/sun/sungem.c                 |  2 +-
 drivers/net/ethernet/tile/tilepro.c               |  4 ++--
 drivers/net/ethernet/toshiba/spider_net.c         |  2 +-
 drivers/net/ethernet/via/via-rhine.c              |  6 +++---
 drivers/net/ethernet/via/via-velocity.c           |  2 +-
 drivers/net/ethernet/xilinx/xilinx_emaclite.c     |  2 +-
 drivers/net/virtio_net.c                          |  2 +-
 drivers/net/vmxnet3/vmxnet3_drv.c                 |  2 +-
 drivers/net/xen-netfront.c                        |  2 +-
 drivers/staging/octeon/ethernet-tx.c              |  6 +++---
 drivers/staging/wlags49_h2/wl_netdev.c            |  6 +++---
 include/linux/if_vlan.h                           |  2 +-
 54 files changed, 99 insertions(+), 103 deletions(-)

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  5:58                       ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:04                         ` Eric W. Biederman
  2014-03-25  6:04                           ` [PATCH 02/54] 3c509: " Eric W. Biederman
                                             ` (53 more replies)
  2014-03-25 20:49                         ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts Eric Dumazet
  1 sibling, 54 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in uml_net_start_xmit
as it can be called in hard irq and other contexts.

dev_consume_skb_any is used as uml_net_start_xmit typically
consumes (not drops) packets.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/um/drivers/net_kern.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/um/drivers/net_kern.c b/arch/um/drivers/net_kern.c
index 39f186252e02..7d26d9c0b2fb 100644
--- a/arch/um/drivers/net_kern.c
+++ b/arch/um/drivers/net_kern.c
@@ -240,7 +240,7 @@ static int uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	spin_unlock_irqrestore(&lp->lock, flags);
 
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 02/54] 3c509: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:03                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 03/54] 3c59x: " Eric W. Biederman
                                             ` (52 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in el3_start_xmit
as it can be called in hard irq and other contexts.

dev_consume_skb_any is used as on this simple hardware the
skb is consumed directly by the start_xmit function.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/3com/3c509.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/3com/3c509.c b/drivers/net/ethernet/3com/3c509.c
index c53384d41c96..35df0b9e6848 100644
--- a/drivers/net/ethernet/3com/3c509.c
+++ b/drivers/net/ethernet/3com/3c509.c
@@ -749,7 +749,7 @@ el3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	spin_unlock_irqrestore(&lp->lock, flags);
 
-	dev_kfree_skb (skb);
+	dev_consume_skb_any (skb);
 
 	/* Clear the Tx status stack. */
 	{
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 03/54] 3c59x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
  2014-03-25  6:04                           ` [PATCH 02/54] 3c509: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:04                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 04/54] 8390: " Eric W. Biederman
                                             ` (51 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in vortex_start_xmit
as it can be called in hard irq and other contexts.

dev_consume_skb_any is used when vortex_start_xmit directly consumes
the packet instead of DMAing it to the device.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/3com/3c59x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index 238ccea965c8..61477b8e8d24 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -2086,7 +2086,7 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* ... and the packet rounded to a doubleword. */
 		skb_tx_timestamp(skb);
 		iowrite32_rep(ioaddr + TX_FIFO, skb->data, (skb->len + 3) >> 2);
-		dev_kfree_skb (skb);
+		dev_consume_skb_any (skb);
 		if (ioread16(ioaddr + TxFree) > 1536) {
 			netif_start_queue (dev);	/* AKPM: redundant? */
 		} else {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 04/54] 8390: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
  2014-03-25  6:04                           ` [PATCH 02/54] 3c509: " Eric W. Biederman
  2014-03-25  6:04                           ` [PATCH 03/54] 3c59x: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:06                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 05/54] bfin_mac: " Eric W. Biederman
                                             ` (50 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in __ei_start_xmit that
can be called in hard irq and other contexts.

dev_consume_skb_any is used as in this simple driver the skb is always
immediately consumed; there are no drops.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/8390/lib8390.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/8390/lib8390.c b/drivers/net/ethernet/8390/lib8390.c
index d2cd80444ade..599311f0e05c 100644
--- a/drivers/net/ethernet/8390/lib8390.c
+++ b/drivers/net/ethernet/8390/lib8390.c
@@ -404,7 +404,7 @@ static netdev_tx_t __ei_start_xmit(struct sk_buff *skb,
 	spin_unlock(&ei_local->page_lock);
 	enable_irq_lockdep_irqrestore(dev->irq, &flags);
 	skb_tx_timestamp(skb);
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 	dev->stats.tx_bytes += send_length;
 
 	return NETDEV_TX_OK;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 05/54] bfin_mac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (2 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 04/54] 8390: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:10                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 06/54] sun4i-emac: " Eric W. Biederman
                                             ` (49 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in _tx_reclaim_skb that
can be called in hard irq and other contexts.

dev_consume_skb_any is used as _tx_reclaim_skb is called after a packet
has been successfully transmitted.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/adi/bfin_mac.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/adi/bfin_mac.c b/drivers/net/ethernet/adi/bfin_mac.c
index 83a8cdbcd936..95779b6b7394 100644
--- a/drivers/net/ethernet/adi/bfin_mac.c
+++ b/drivers/net/ethernet/adi/bfin_mac.c
@@ -1087,7 +1087,7 @@ static inline void _tx_reclaim_skb(void)
 		tx_list_head->desc_a.config &= ~DMAEN;
 		tx_list_head->status.status_word = 0;
 		if (tx_list_head->skb) {
-			dev_kfree_skb(tx_list_head->skb);
+			dev_consume_skb_any(tx_list_head->skb);
 			tx_list_head->skb = NULL;
 		}
 		tx_list_head = tx_list_head->next;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 06/54] sun4i-emac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (3 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 05/54] bfin_mac: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:11                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 07/54] am79c961a: " Eric W. Biederman
                                             ` (48 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in emac_start_xmit
which can be called in hard irq and other contexts.

emac_start_xmit always transmits the packet, making dev_consume_skb_any
the appropriate function to call.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/allwinner/sun4i-emac.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/allwinner/sun4i-emac.c b/drivers/net/ethernet/allwinner/sun4i-emac.c
index 511f6eecd58b..fcaeeb8a4929 100644
--- a/drivers/net/ethernet/allwinner/sun4i-emac.c
+++ b/drivers/net/ethernet/allwinner/sun4i-emac.c
@@ -476,7 +476,7 @@ static int emac_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	spin_unlock_irqrestore(&db->lock, flags);
 
 	/* free this SKB */
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 07/54] am79c961a: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (4 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 06/54] sun4i-emac: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:13                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 08/54] lance: " Eric W. Biederman
                                             ` (47 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in am79c961_sendpacket
that can be called in hard irq and other contexts.

dev_consume_skb_any is used as am79c961_sendpacket always
immediately consumes the skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/amd/am79c961a.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/am79c961a.c b/drivers/net/ethernet/amd/am79c961a.c
index 9793767996a2..87e727b921dc 100644
--- a/drivers/net/ethernet/amd/am79c961a.c
+++ b/drivers/net/ethernet/amd/am79c961a.c
@@ -472,7 +472,7 @@ am79c961_sendpacket(struct sk_buff *skb, struct net_device *dev)
 	if (am_readword(dev, priv->txhdr + (priv->txhead << 3) + 2) & TMD_OWN)
 		netif_stop_queue(dev);
 
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 08/54] lance: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (5 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 07/54] am79c961a: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:14                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 09/54] pcnet32: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (46 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in lance_start_xmit
that can be called in hard irq and other contexts.

dev_consume_skb_any is used as lance_start_xmit always immediately
consumes the skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/amd/7990.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/7990.c b/drivers/net/ethernet/amd/7990.c
index 18e542f7853d..98a10d555b79 100644
--- a/drivers/net/ethernet/amd/7990.c
+++ b/drivers/net/ethernet/amd/7990.c
@@ -578,7 +578,7 @@ int lance_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	outs++;
 	/* Kick the lance: transmit now */
 	WRITERDP(lp, LE_C0_INEA | LE_C0_TDMD);
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	spin_lock_irqsave(&lp->devlock, flags);
 	if (TX_BUFFS_AVAIL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 09/54] pcnet32: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (6 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 08/54] lance: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:15                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 10/54] alx: " Eric W. Biederman
                                             ` (45 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in pcnet32_start_xmit
that can be called in hard irq and other contexts.

dev_kfree_skb_any is used as pcnet32_start_xmit only frees an
skb when it drops a packet during transmit.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/amd/pcnet32.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c
index 2ae00ed83afa..e7cc9174e364 100644
--- a/drivers/net/ethernet/amd/pcnet32.c
+++ b/drivers/net/ethernet/amd/pcnet32.c
@@ -2448,7 +2448,7 @@ static netdev_tx_t pcnet32_start_xmit(struct sk_buff *skb,
 	lp->tx_dma_addr[entry] =
 	    pci_map_single(lp->pci_dev, skb->data, skb->len, PCI_DMA_TODEVICE);
 	if (pci_dma_mapping_error(lp->pci_dev, lp->tx_dma_addr[entry])) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		dev->stats.tx_dropped++;
 		goto drop_packet;
 	}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 10/54] alx: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (7 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 09/54] pcnet32: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:16                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (44 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in alx_start_xmit that
can be called in hard irq and other contexts.

dev_kfree_skb_any is used as alx_start_xmit only frees skbs
when dropping them.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/atheros/alx/main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
index 2e45f6ec1bf0..238356108e65 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -1097,7 +1097,7 @@ static netdev_tx_t alx_start_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 
 drop:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (8 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 10/54] alx: " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:18                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 12/54] bnad: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (43 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

In the call path atl1c_xmit_frame, atl1c_tx_rollback,
atl1c_clean_buffer it cannot be told at compile time whether the code
will be invoked from hard irq or another context, as atl1c_xmit_frame
does not know.  So remove the logic that passes the compile time
knowledge into atl1c_clean_buffer and figure it out at runtime with
dev_consume_skb_any.
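
The problem with the flag can be sketched as a userspace model.  This is
an illustration under assumptions, not the driver code: all atl_model_*
names are hypothetical, and atl_model_hard_irq stands in for the real
runtime context check.

```c
#include <stdbool.h>

/* Stand-in for the context the code is actually running in. */
static bool atl_model_hard_irq;

enum atl_model_path { ATL_MODEL_FREE_INLINE, ATL_MODEL_FREE_DEFERRED };

/* Old shape: the caller claims the context through a flag. */
static enum atl_model_path atl_model_clean_with_flag(int in_irq)
{
	return in_irq ? ATL_MODEL_FREE_DEFERRED : ATL_MODEL_FREE_INLINE;
}

/* New shape: the free helper checks the context itself at runtime. */
static enum atl_model_path atl_model_clean_any(void)
{
	return atl_model_hard_irq ? ATL_MODEL_FREE_DEFERRED
				  : ATL_MODEL_FREE_INLINE;
}

/*
 * Rollback as reached from the transmit path.  The old shape hard-codes
 * 0 because the transmit path cannot know its caller's context.
 */
static enum atl_model_path atl_model_rollback_old(void)
{
	return atl_model_clean_with_flag(0);
}

static enum atl_model_path atl_model_rollback_new(void)
{
	return atl_model_clean_any();
}
```

When the transmit path is entered from hard irq context, the old shape
in this model still picks the inline free, while the new shape defers.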

Replace dev_kfree_skb with dev_kfree_skb_any in atl1c_xmit_frame that
can be called in hard irq and other contexts.

Replace dev_kfree_skb and dev_kfree_skb_irq with dev_consume_skb_any
in atl1c_clean_buffer that can be called in hard irq and other
contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c |   20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 4d3258dd0a88..31f262302128 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -832,7 +832,7 @@ static int atl1c_sw_init(struct atl1c_adapter *adapter)
 }
 
 static inline void atl1c_clean_buffer(struct pci_dev *pdev,
-				struct atl1c_buffer *buffer_info, int in_irq)
+				struct atl1c_buffer *buffer_info)
 {
 	u16 pci_driection;
 	if (buffer_info->flags & ATL1C_BUFFER_FREE)
@@ -850,12 +850,8 @@ static inline void atl1c_clean_buffer(struct pci_dev *pdev,
 			pci_unmap_page(pdev, buffer_info->dma,
 					buffer_info->length, pci_driection);
 	}
-	if (buffer_info->skb) {
-		if (in_irq)
-			dev_kfree_skb_irq(buffer_info->skb);
-		else
-			dev_kfree_skb(buffer_info->skb);
-	}
+	if (buffer_info->skb)
+		dev_consume_skb_any(buffer_info->skb);
 	buffer_info->dma = 0;
 	buffer_info->skb = NULL;
 	ATL1C_SET_BUFFER_STATE(buffer_info, ATL1C_BUFFER_FREE);
@@ -875,7 +871,7 @@ static void atl1c_clean_tx_ring(struct atl1c_adapter *adapter,
 	ring_count = tpd_ring->count;
 	for (index = 0; index < ring_count; index++) {
 		buffer_info = &tpd_ring->buffer_info[index];
-		atl1c_clean_buffer(pdev, buffer_info, 0);
+		atl1c_clean_buffer(pdev, buffer_info);
 	}
 
 	/* Zero out Tx-buffers */
@@ -899,7 +895,7 @@ static void atl1c_clean_rx_ring(struct atl1c_adapter *adapter)
 
 	for (j = 0; j < rfd_ring->count; j++) {
 		buffer_info = &rfd_ring->buffer_info[j];
-		atl1c_clean_buffer(pdev, buffer_info, 0);
+		atl1c_clean_buffer(pdev, buffer_info);
 	}
 	/* zero out the descriptor ring */
 	memset(rfd_ring->desc, 0, rfd_ring->size);
@@ -1562,7 +1558,7 @@ static bool atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
 
 	while (next_to_clean != hw_next_to_clean) {
 		buffer_info = &tpd_ring->buffer_info[next_to_clean];
-		atl1c_clean_buffer(pdev, buffer_info, 1);
+		atl1c_clean_buffer(pdev, buffer_info);
 		if (++next_to_clean == tpd_ring->count)
 			next_to_clean = 0;
 		atomic_set(&tpd_ring->next_to_clean, next_to_clean);
@@ -2085,7 +2081,7 @@ static void atl1c_tx_rollback(struct atl1c_adapter *adpt,
 	while (index != tpd_ring->next_to_use) {
 		tpd = ATL1C_TPD_DESC(tpd_ring, index);
 		buffer_info = &tpd_ring->buffer_info[index];
-		atl1c_clean_buffer(adpt->pdev, buffer_info, 0);
+		atl1c_clean_buffer(adpt->pdev, buffer_info);
 		memset(tpd, 0, sizeof(struct atl1c_tpd_desc));
 		if (++index == tpd_ring->count)
 			index = 0;
@@ -2258,7 +2254,7 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 		/* roll back tpd/buffer */
 		atl1c_tx_rollback(adapter, tpd, type);
 		spin_unlock_irqrestore(&adapter->tx_lock, flags);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	} else {
 		atl1c_tx_queue(adapter, skb, tpd, type);
 		spin_unlock_irqrestore(&adapter->tx_lock, flags);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 12/54] bnad: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (9 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:19                             ` Eric Dumazet
  2014-03-25  6:04                           ` [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
                                             ` (42 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in bnad_start_xmit that
can be called in hard irq and other contexts.

dev_kfree_skb_any is used as bnad_start_xmit only frees skbs when it
drops them; normally transmitted packets are handled elsewhere.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/brocade/bna/bnad.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c
index cb7625366ec2..a881e982a084 100644
--- a/drivers/net/ethernet/brocade/bna/bnad.c
+++ b/drivers/net/ethernet/brocade/bna/bnad.c
@@ -2946,17 +2946,17 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	/* Sanity checks for the skb */
 
 	if (unlikely(skb->len <= ETH_HLEN)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_too_short);
 		return NETDEV_TX_OK;
 	}
 	if (unlikely(len > BFI_TX_MAX_DATA_PER_VECTOR)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_headlen_zero);
 		return NETDEV_TX_OK;
 	}
 	if (unlikely(len == 0)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_headlen_zero);
 		return NETDEV_TX_OK;
 	}
@@ -2968,7 +2968,7 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	 * and the netif_tx_stop_all_queues() call.
 	 */
 	if (unlikely(!tcb || !test_bit(BNAD_TXQ_TX_STARTED, &tcb->flags))) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_stopping);
 		return NETDEV_TX_OK;
 	}
@@ -2981,7 +2981,7 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	wis = BNA_TXQ_WI_NEEDED(vectors);	/* 4 vectors per work item */
 
 	if (unlikely(vectors > BFI_TX_MAX_VECTORS_PER_PKT)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_max_vectors);
 		return NETDEV_TX_OK;
 	}
@@ -3021,7 +3021,7 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 
 	/* Program the opcode, flags, frame_len, num_vectors in WI */
 	if (bnad_txq_wi_prepare(bnad, tcb, skb, txqent)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 	txqent->hdr.wi.reserved = 0;
@@ -3047,7 +3047,7 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 			/* Undo the changes starting at tcb->producer_index */
 			bnad_tx_buff_unmap(bnad, unmap_q, q_depth,
 				tcb->producer_index);
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			BNAD_UPDATE_CTR(bnad, tx_skb_frag_zero);
 			return NETDEV_TX_OK;
 		}
@@ -3076,7 +3076,7 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	if (unlikely(len != skb->len)) {
 		/* Undo the changes starting at tcb->producer_index */
 		bnad_tx_buff_unmap(bnad, unmap_q, q_depth, tcb->producer_index);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		BNAD_UPDATE_CTR(bnad, tx_skb_len_mismatch);
 		return NETDEV_TX_OK;
 	}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (10 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 12/54] bnad: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:04                           ` Eric W. Biederman
  2014-03-25 13:21                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (41 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in macb_start_xmit that can
be called in hard irq and other contexts.

macb_start_xmit only frees skbs when dropping them so
dev_kfree_skb_any is used.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/cadence/macb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index d0c38e01e99f..6116887d2880 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1045,7 +1045,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	mapping = dma_map_single(&bp->pdev->dev, skb->data,
 				 len, DMA_TO_DEVICE);
 	if (dma_mapping_error(&bp->pdev->dev, mapping)) {
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		goto unlock;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (11 preceding siblings ...)
  2014-03-25  6:04                           ` [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:16                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb Eric W. Biederman
                                             ` (40 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in xgmac_tx_complete
that can be called in hard irq and other contexts.

Replace dev_kfree_skb with dev_kfree_skb_any in xgmac_xmit that can
be called in hard irq and other contexts.

dev_consume_skb_any is used in xgmac_tx_complete as skbs that reach
there have been successfully transmitted; dev_kfree_skb_any is used
in xgmac_xmit as skbs that are freed there are being dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/calxeda/xgmac.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
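
Note (not part of the commit): the consume/drop split described in the
commit message matters mostly for accounting, since a drop and a normal
completion release the same memory but mean different things to anyone
watching for packet loss. A minimal userspace sketch, assuming a simple
pair of counters in place of whatever the kernel actually records; all
fake_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Why an skb is being released (hypothetical names). */
enum fake_free_reason {
	FAKE_REASON_CONSUMED,	/* transmitted successfully */
	FAKE_REASON_DROPPED,	/* discarded without being sent */
};

struct fake_stats {
	unsigned long consumed;
	unsigned long dropped;
};

struct fake_skb {
	size_t len;
};

/* Common release path: same work either way, different bookkeeping. */
static void fake_free_skb(struct fake_stats *st, struct fake_skb *skb,
			  enum fake_free_reason reason)
{
	assert(st != NULL && skb != NULL);
	if (reason == FAKE_REASON_DROPPED)
		st->dropped++;
	else
		st->consumed++;
}

/* Completion path: the hardware has finished sending this skb. */
static void fake_tx_complete(struct fake_stats *st, struct fake_skb *skb)
{
	fake_free_skb(st, skb, FAKE_REASON_CONSUMED);
}

/*
 * Transmit path: a failed DMA mapping means the packet is dropped.
 * Returns 0 in both cases, mirroring a driver that reports the skb
 * as handled whether it was queued or discarded.
 */
static int fake_xmit(struct fake_stats *st, struct fake_skb *skb,
		     bool map_failed)
{
	if (map_failed) {
		fake_free_skb(st, skb, FAKE_REASON_DROPPED);
		return 0;
	}
	/* A real driver would queue the skb to hardware here. */
	return 0;
}
```

With this split, a burst of mapping failures shows up in the dropped
counter only, and routine completions never look like packet loss.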

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index d2a183c3a6ce..521dfea44b83 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -897,7 +897,7 @@ static void xgmac_tx_complete(struct xgmac_priv *priv)
 		/* Check tx error on the last segment */
 		if (desc_get_tx_ls(p)) {
 			desc_get_tx_status(priv, p);
-			dev_kfree_skb(skb);
+			dev_consume_skb_any(skb);
 		}
 
 		priv->tx_skbuff[entry] = NULL;
@@ -1105,7 +1105,7 @@ static netdev_tx_t xgmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	len = skb_headlen(skb);
 	paddr = dma_map_single(priv->device, skb->data, len, DMA_TO_DEVICE);
 	if (dma_mapping_error(priv->device, paddr)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 	priv->tx_skbuff[entry] = skb;
@@ -1169,7 +1169,7 @@ dma_err:
 	desc = first;
 	dma_unmap_single(priv->device, desc_get_buf_addr(desc),
 			 desc_get_buf_len(desc), DMA_TO_DEVICE);
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (12 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:18                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 16/54] cxgb4: " Eric W. Biederman
                                             ` (39 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_consume_skb_any in free_tx_desc and
write_tx_pkt_wr that can be called in hard irq and other contexts.

Replace dev_kfree_skb with dev_kfree_skb_any in t3_eth_xmit that can
be called in hard irq and other contexts.

dev_kfree_skb is replaced with dev_kfree_skb_any in t3_eth_xmit as
that location is a packet drop, while the kfree_skb calls in free_tx_desc
and write_tx_pkt_wr are places where packets are consumed normally, so
dev_consume_skb_any is used there.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/chelsio/cxgb3/sge.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/sge.c b/drivers/net/ethernet/chelsio/cxgb3/sge.c
index 632b318eb38a..8b069f96e920 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/sge.c
@@ -298,7 +298,7 @@ static void free_tx_desc(struct adapter *adapter, struct sge_txq *q,
 			if (need_unmap)
 				unmap_skb(d->skb, q, cidx, pdev);
 			if (d->eop) {
-				kfree_skb(d->skb);
+				dev_consume_skb_any(d->skb);
 				d->skb = NULL;
 			}
 		}
@@ -1188,7 +1188,7 @@ static void write_tx_pkt_wr(struct adapter *adap, struct sk_buff *skb,
 			cpl->wr.wr_lo = htonl(V_WR_LEN(flits) | V_WR_GEN(gen) |
 					      V_WR_TID(q->token));
 			wr_gen2(d, gen);
-			kfree_skb(skb);
+			dev_consume_skb_any(skb);
 			return;
 		}
 
@@ -1233,7 +1233,7 @@ netdev_tx_t t3_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * anything shorter than an Ethernet header.
 	 */
 	if (unlikely(skb->len < ETH_HLEN)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 16/54] cxgb4: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (13 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:19                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 17/54] cxgb4vf: " Eric W. Biederman
                                             ` (38 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_consume_skb_any in free_tx_desc that can be
called in hard irq and other contexts. dev_consume_skb_any is used
as this function consumes successfully transmitted skbs.

Replace dev_kfree_skb with dev_kfree_skb_any in t4_eth_xmit that can
be called in hard irq and other contexts, on paths that drop the skb.

Replace dev_kfree_skb with dev_consume_skb_any in t4_eth_xmit that can
be called in hard irq and other contexts, on paths that successfully
transmit the skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/chelsio/cxgb4/sge.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index d4db382ff8c7..ca95cf2954eb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -383,7 +383,7 @@ static void free_tx_desc(struct adapter *adap, struct sge_txq *q,
 		if (d->skb) {                       /* an SGL is present */
 			if (unmap)
 				unmap_sgl(dev, d->skb, d->sgl, q);
-			kfree_skb(d->skb);
+			dev_consume_skb_any(d->skb);
 			d->skb = NULL;
 		}
 		++d;
@@ -1009,7 +1009,7 @@ netdev_tx_t t4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * anything shorter than an Ethernet header.
 	 */
 	if (unlikely(skb->len < ETH_HLEN)) {
-out_free:	dev_kfree_skb(skb);
+out_free:	dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -1104,7 +1104,7 @@ out_free:	dev_kfree_skb(skb);
 
 	if (immediate) {
 		inline_tx_skb(skb, &q->q, cpl + 1);
-		dev_kfree_skb(skb);
+		dev_consume_skb_any(skb);
 	} else {
 		int last_desc;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 17/54] cxgb4vf: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (14 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 16/54] cxgb4: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:22                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (37 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_consume_skb_any in free_tx_desc that can be
called in hard irq and other contexts. dev_consume_skb_any is used
as this function consumes successfully transmitted skbs.

Replace dev_kfree_skb with dev_kfree_skb_any in t4vf_eth_xmit that can
be called in hard irq and other contexts, on paths that drop the skb.

Replace dev_kfree_skb with dev_consume_skb_any in t4vf_eth_xmit that can
be called in hard irq and other contexts, on paths that successfully
transmit the skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
index 0a89963c48ce..9cfa4b4bb089 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
@@ -401,7 +401,7 @@ static void free_tx_desc(struct adapter *adapter, struct sge_txq *tq,
 		if (sdesc->skb) {
 			if (need_unmap)
 				unmap_sgl(dev, sdesc->skb, sdesc->sgl, tq);
-			kfree_skb(sdesc->skb);
+			dev_consume_skb_any(sdesc->skb);
 			sdesc->skb = NULL;
 		}
 
@@ -1275,7 +1275,7 @@ int t4vf_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 		 * need it any longer.
 		 */
 		inline_tx_skb(skb, &txq->q, cpl + 1);
-		dev_kfree_skb(skb);
+		dev_consume_skb_any(skb);
 	} else {
 		/*
 		 * Write the skb's Scatter/Gather list into the TX Packet CPL
@@ -1354,7 +1354,7 @@ out_free:
 	 * An error of some sort happened.  Free the TX skb and tell the
 	 * OS that we've "dealt" with the packet ...
 	 */
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (15 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 17/54] cxgb4vf: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:23                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 19/54] enic: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (36 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in net_send_packet that
can be called in hard irq and other contexts.

net_send_packet consumes (not drops) the skb of interest.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/cirrus/cs89x0.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c
index 19f642a45f40..fe84fbabc0d4 100644
--- a/drivers/net/ethernet/cirrus/cs89x0.c
+++ b/drivers/net/ethernet/cirrus/cs89x0.c
@@ -1174,7 +1174,7 @@ static netdev_tx_t net_send_packet(struct sk_buff *skb, struct net_device *dev)
 	writewords(lp, TX_FRAME_PORT, skb->data, (skb->len + 1) >> 1);
 	spin_unlock_irqrestore(&lp->lock, flags);
 	dev->stats.tx_bytes += skb->len;
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	/* We DO NOT call netif_wake_queue() here.
 	 * We also DO NOT call netif_start_queue().
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 19/54] enic: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (16 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:24                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 20/54] dm9000: Call dev_consume_skb_any " Eric W. Biederman
                                             ` (35 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in enic_hard_start_xmit
that can be called in hard irq and other contexts.

enic_hard_start_xmit only frees the skb when dropping it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/cisco/enic/enic_main.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 4c35fc8fad99..2945718ce806 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -521,7 +521,7 @@ static netdev_tx_t enic_hard_start_xmit(struct sk_buff *skb,
 	unsigned int txq_map;
 
 	if (skb->len <= 0) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -536,7 +536,7 @@ static netdev_tx_t enic_hard_start_xmit(struct sk_buff *skb,
 	if (skb_shinfo(skb)->gso_size == 0 &&
 	    skb_shinfo(skb)->nr_frags + 1 > ENIC_NON_TSO_MAX_DESC &&
 	    skb_linearize(skb)) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 20/54] dm9000: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (17 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 19/54] enic: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:26                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (34 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in dm9000_start_xmit
that can be called in hard irq and other contexts, on the path
that successfully transmits the packet.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/davicom/dm9000.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index a1a2b4028a5c..8c4b93be333b 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -1033,7 +1033,7 @@ dm9000_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	spin_unlock_irqrestore(&db->lock, flags);
 
 	/* free this SKB */
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (18 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 20/54] dm9000: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:28                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 22/54] uli526x: " Eric W. Biederman
                                             ` (33 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in dmfe_start_xmit that
can be called in hard irq and other contexts, when the packet is
dropped.

Replace dev_kfree_skb with dev_consume_skb_any in dmfe_start_xmit that
can be called in hard irq and other contexts, when the packet is
transmitted.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/dec/tulip/dmfe.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/dec/tulip/dmfe.c b/drivers/net/ethernet/dec/tulip/dmfe.c
index 5ad9e3e3c0b8..53f0c618045c 100644
--- a/drivers/net/ethernet/dec/tulip/dmfe.c
+++ b/drivers/net/ethernet/dec/tulip/dmfe.c
@@ -696,7 +696,7 @@ static netdev_tx_t dmfe_start_xmit(struct sk_buff *skb,
 	/* Too large packet check */
 	if (skb->len > MAX_PACKET_SIZE) {
 		pr_err("big packet = %d\n", (u16)skb->len);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -743,7 +743,7 @@ static netdev_tx_t dmfe_start_xmit(struct sk_buff *skb,
 	dw32(DCR7, db->cr7_data);
 
 	/* free this SKB */
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 22/54] uli526x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (19 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:29                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 23/54] sundance: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (32 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in uli526x_start_xmit
that can be called in hard irq and other contexts, when the packet is
dropped.

Replace dev_kfree_skb with dev_consume_skb_any in uli526x_start_xmit
that can be called in hard irq and other contexts, when the packet is
transmitted.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/dec/tulip/uli526x.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/dec/tulip/uli526x.c b/drivers/net/ethernet/dec/tulip/uli526x.c
index aa4ee385091f..aa801a6af7b9 100644
--- a/drivers/net/ethernet/dec/tulip/uli526x.c
+++ b/drivers/net/ethernet/dec/tulip/uli526x.c
@@ -607,7 +607,7 @@ static netdev_tx_t uli526x_start_xmit(struct sk_buff *skb,
 	/* Too large packet check */
 	if (skb->len > MAX_PACKET_SIZE) {
 		netdev_err(dev, "big packet = %d\n", (u16)skb->len);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -648,7 +648,7 @@ static netdev_tx_t uli526x_start_xmit(struct sk_buff *skb,
 	uw32(DCR7, db->cr7_data);
 
 	/* free this SKB */
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 23/54] sundance: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (20 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 22/54] uli526x: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:29                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
                                             ` (31 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in start_tx that can
be called in hard irq and other contexts, when the skb is dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/dlink/sundance.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/dlink/sundance.c b/drivers/net/ethernet/dlink/sundance.c
index 113cd799a131..d9e5ca0d48c1 100644
--- a/drivers/net/ethernet/dlink/sundance.c
+++ b/drivers/net/ethernet/dlink/sundance.c
@@ -1137,7 +1137,7 @@ start_tx (struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 
 drop_frame:
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	np->tx_skbuff[entry] = NULL;
 	dev->stats.tx_dropped++;
 	return NETDEV_TX_OK;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (21 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 23/54] sundance: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:30                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (30 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in fec_enet_start_xmit that
can be called in hard irq and other contexts, when the packet is
dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/freescale/fec_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 03a351300013..f9f8a589cdef 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -338,7 +338,7 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 
 	/* Protocol checksum off-load for TCP and UDP. */
 	if (fec_enet_clear_csum(skb, ndev)) {
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (22 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:30                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 26/54] i825xx: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (29 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in ucc_geth_tx that can
be called in hard irq and other contexts, when processing the
tx completion event.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/freescale/ucc_geth.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index 72291a8904a9..c8299c31b21f 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -3261,7 +3261,7 @@ static int ucc_geth_tx(struct net_device *dev, u8 txQ)
 
 		dev->stats.tx_packets++;
 
-		dev_kfree_skb(skb);
+		dev_consume_skb_any(skb);
 
 		ugeth->tx_skbuff[txQ][ugeth->skb_dirtytx[txQ]] = NULL;
 		ugeth->skb_dirtytx[txQ] =
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 26/54] i825xx: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (23 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:31                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 27/54] ehea: Call dev_consume_skb_any " Eric W. Biederman
                                             ` (28 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in i596_start_xmit that
can be called in hard irq and other contexts, when the skb is dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/i825xx/lib82596.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/i825xx/lib82596.c b/drivers/net/ethernet/i825xx/lib82596.c
index 17fca323c143..c984998b34a0 100644
--- a/drivers/net/ethernet/i825xx/lib82596.c
+++ b/drivers/net/ethernet/i825xx/lib82596.c
@@ -993,7 +993,7 @@ static int i596_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				       dev->name));
 		dev->stats.tx_dropped++;
 
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 	} else {
 		if (++lp->next_tx_cmd == TX_RING_SIZE)
 			lp->next_tx_cmd = 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 27/54] ehea: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (24 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 26/54] i825xx: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:39                             ` Eric Dumazet
  2014-03-25 15:39                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 28/54] ibmveth: " Eric W. Biederman
                                             ` (27 subsequent siblings)
  53 siblings, 2 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in functions that can
be called in hard irq and other contexts.

None of the locations was a packet drop so dev_kfree_skb_any is
inappropriate.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/ibm/ehea/ehea_main.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
index 7628e0fd8455..538903bf13bc 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -490,7 +490,7 @@ static int ehea_refill_rq_def(struct ehea_port_res *pr,
 		skb_arr[index] = skb;
 		tmp_addr = ehea_map_vaddr(skb->data);
 		if (tmp_addr == -1) {
-			dev_kfree_skb(skb);
+			dev_consume_skb_any(skb);
 			q_skba->os_skbs = fill_wqes - i;
 			ret = 0;
 			break;
@@ -856,7 +856,7 @@ static struct ehea_cqe *ehea_proc_cqes(struct ehea_port_res *pr, int my_quota)
 
 			index = EHEA_BMASK_GET(EHEA_WR_ID_INDEX, cqe->wr_id);
 			skb = pr->sq_skba.arr[index];
-			dev_kfree_skb(skb);
+			dev_consume_skb_any(skb);
 			pr->sq_skba.arr[index] = NULL;
 		}
 
@@ -2044,7 +2044,7 @@ static void ehea_xmit3(struct sk_buff *skb, struct net_device *dev,
 		skb_copy_bits(skb, 0, imm_data, skb->len);
 
 	swqe->immediate_data_length = skb->len;
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 }
 
 static int ehea_start_xmit(struct sk_buff *skb, struct net_device *dev)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 28/54] ibmveth: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (25 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 27/54] ehea: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25  6:05                           ` [PATCH 29/54] jme: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (26 subsequent siblings)
  53 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in ibmveth_start_xmit
that can be called in hard irq and other contexts.

In this code path the packet can have either been transmitted
or dropped.  dev_consume_skb_any was chosen because that preserves
the existing semantics of the code, and a transmitted packet is
more likely.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/ibm/ibmveth.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index e75bdfcd1374..c9127562bd22 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1044,7 +1044,7 @@ retry_bounce:
 			       DMA_TO_DEVICE);
 
 out:
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 	return NETDEV_TX_OK;
 
 map_failed_frags:
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 29/54] jme: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (26 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 28/54] ibmveth: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:45                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 30/54] mv643xx_eth: " Eric W. Biederman
                                             ` (25 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in jme_expand_header that
can be called in hard irq and other contexts, on the failure
path where the skb is dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/jme.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index f5685c0d0579..14ff8d64257d 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -2059,7 +2059,7 @@ jme_expand_header(struct jme_adapter *jme, struct sk_buff *skb)
 	if (unlikely(skb_shinfo(skb)->gso_size &&
 			skb_header_cloned(skb) &&
 			pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) {
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return -1;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 30/54] mv643xx_eth: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (27 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 29/54] jme: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:46                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 31/54] skge: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (24 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in mv643xx_eth_xmit and
txq_submit_skb that can be called in hard irq and other contexts,
on paths where the skbs are dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/marvell/mv643xx_eth.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index a2565ce22b7c..b7b8d74c22d9 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -730,7 +730,7 @@ static int txq_submit_skb(struct tx_queue *txq, struct sk_buff *skb)
 		    unlikely(tag_bytes & ~12)) {
 			if (skb_checksum_help(skb) == 0)
 				goto no_csum;
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return 1;
 		}
 
@@ -819,7 +819,7 @@ static netdev_tx_t mv643xx_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (txq->tx_ring_size - txq->tx_desc_count < MAX_SKB_FRAGS + 1) {
 		if (net_ratelimit())
 			netdev_err(dev, "tx queue full?!\n");
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 31/54] skge: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (28 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 30/54] mv643xx_eth: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 15:47                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 32/54] sky2: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (23 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in skge_xmit_frame that
can be called in hard irq and other contexts, on the path that
handles dropped packets.

Replace dev_kfree_skb with dev_consume_skb_any in skge_tx_done that can
be called in hard irq and other contexts, on the path that handles
successfully transmitted skbs.
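
As a rough illustration of the split described above, the following
user-space sketch records a drop on the mapping-error path and a
consume on the completion path.  Every name in it (sketch_reason,
sketch_release, sketch_xmit, sketch_tx_done) is hypothetical and is not
part of skge or the kernel; it only models why the two paths call
different helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
enum sketch_reason { SKETCH_CONSUMED, SKETCH_DROPPED };

struct sketch_counters {
	unsigned int consumed; /* releases after a successful transmit */
	unsigned int dropped;  /* releases a drop monitor should report */
};

/* Record why a buffer was released. */
void sketch_release(struct sketch_counters *c, enum sketch_reason why)
{
	assert(c != NULL);
	if (why == SKETCH_DROPPED)
		c->dropped++;
	else
		c->consumed++;
}

/*
 * Transmit path: a mapping failure is a drop.  On success the buffer
 * stays queued until the completion path releases it.
 * Returns 1 if the buffer was queued, 0 if it was dropped.
 */
int sketch_xmit(struct sketch_counters *c, int mapping_ok)
{
	if (!mapping_ok) {
		sketch_release(c, SKETCH_DROPPED);
		return 0;
	}
	return 1;
}

/* Completion path: the hardware is done with n queued buffers. */
void sketch_tx_done(struct sketch_counters *c, unsigned int n)
{
	while (n--)
		sketch_release(c, SKETCH_CONSUMED);
}
```

Keeping the reason at the call site means the error path and the
completion path stay distinguishable to anything that counts drops.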

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/marvell/skge.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/skge.c b/drivers/net/ethernet/marvell/skge.c
index 597846193869..7f81ae66cc89 100644
--- a/drivers/net/ethernet/marvell/skge.c
+++ b/drivers/net/ethernet/marvell/skge.c
@@ -2845,7 +2845,7 @@ mapping_unwind:
 mapping_error:
 	if (net_ratelimit())
 		dev_warn(&hw->pdev->dev, "%s: tx mapping error\n", dev->name);
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
@@ -3172,7 +3172,7 @@ static void skge_tx_done(struct net_device *dev)
 			pkts_compl++;
 			bytes_compl += e->skb->len;
 
-			dev_kfree_skb(e->skb);
+			dev_consume_skb_any(e->skb);
 		}
 	}
 	netdev_completed_queue(dev, pkts_compl, bytes_compl);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 32/54] sky2: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (29 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 31/54] skge: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 16:23                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 33/54] ksz884x: Call dev_consume_skb_any " Eric W. Biederman
                                             ` (22 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in sky2_xmit_frame that
can be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/marvell/sky2.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index d524676fdff4..b81106451a0a 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -2002,7 +2002,7 @@ mapping_unwind:
 mapping_error:
 	if (net_ratelimit())
 		dev_warn(&hw->pdev->dev, "%s: tx mapping error\n", dev->name);
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 33/54] ksz884x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (30 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 32/54] sky2: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 16:23                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 34/54] s2io: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (21 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in copy_old_skb that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/micrel/ksz884x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c
index ce84dc289c8f..14ac0e2bc09f 100644
--- a/drivers/net/ethernet/micrel/ksz884x.c
+++ b/drivers/net/ethernet/micrel/ksz884x.c
@@ -4832,7 +4832,7 @@ static inline void copy_old_skb(struct sk_buff *old, struct sk_buff *skb)
 	skb->csum = old->csum;
 	skb_set_network_header(skb, ETH_HLEN);
 
-	dev_kfree_skb(old);
+	dev_consume_skb_any(old);
 }
 
 /**
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 34/54] s2io: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (31 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 33/54] ksz884x: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 16:25                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 35/54] vxge: " Eric W. Biederman
                                             ` (20 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in s2io_xmit that can
be called in hard irq and other contexts.

All instances that are changed are packet drops.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/neterion/s2io.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
index d44fdb91808e..a2844ff322c4 100644
--- a/drivers/net/ethernet/neterion/s2io.c
+++ b/drivers/net/ethernet/neterion/s2io.c
@@ -4049,7 +4049,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (!is_s2io_card_up(sp)) {
 		DBG_PRINT(TX_DBG, "%s: Card going down for reset\n",
 			  dev->name);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -4122,7 +4122,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
 	    ((put_off+1) == queue_len ? 0 : (put_off+1)) == get_off) {
 		DBG_PRINT(TX_DBG, "Error in xmit, No free TXDs.\n");
 		s2io_stop_tx_queue(sp, fifo->fifo_no);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		spin_unlock_irqrestore(&fifo->tx_lock, flags);
 		return NETDEV_TX_OK;
 	}
@@ -4244,7 +4244,7 @@ pci_map_failed:
 	swstats->pci_map_fail_cnt++;
 	s2io_stop_tx_queue(sp, fifo->fifo_no);
 	swstats->mem_freed += skb->truesize;
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	spin_unlock_irqrestore(&fifo->tx_lock, flags);
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 35/54] vxge: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (32 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 34/54] s2io: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 16:26                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
                                             ` (19 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in vxge_xmit that can
be called in hard irq and other contexts.

vxge_xmit only calls dev_kfree_skb_any when errors result in dropping
skbs.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/neterion/vxge/vxge-main.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index 11adc89959c1..d107bcbb8543 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -824,7 +824,7 @@ vxge_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (unlikely(skb->len <= 0)) {
 		vxge_debug_tx(VXGE_ERR,
 			"%s: Buffer has no data..", dev->name);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -833,7 +833,7 @@ vxge_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (unlikely(!is_vxge_card_up(vdev))) {
 		vxge_debug_tx(VXGE_ERR,
 			"%s: vdev not initialized", dev->name);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
@@ -843,7 +843,7 @@ vxge_xmit(struct sk_buff *skb, struct net_device *dev)
 			vxge_debug_tx(VXGE_ERR,
 				"%s: Failed to store the mac address",
 				dev->name);
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			return NETDEV_TX_OK;
 		}
 	}
@@ -990,7 +990,7 @@ _exit1:
 	vxge_hw_fifo_txdl_free(fifo_hw, dtr);
 _exit0:
 	netif_tx_stop_queue(fifo->txq);
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (33 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 35/54] vxge: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 16:27                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (18 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Every location changed is a drop, making dev_kfree_skb_any appropriate.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index 811be0bccd14..fddb464aeab3 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -2231,7 +2231,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (pci_dma_mapping_error(np->pci_dev,
 					  np->put_tx_ctx->dma)) {
 			/* on DMA mapping error - drop the packet */
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
 			np->stat_tx_dropped++;
 			u64_stats_update_end(&np->swstats_tx_syncp);
@@ -2277,7 +2277,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 					if (unlikely(tmp_tx_ctx++ == np->last_tx_ctx))
 						tmp_tx_ctx = np->first_tx_ctx;
 				} while (tmp_tx_ctx != np->put_tx_ctx);
-				kfree_skb(skb);
+				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
 				np->stat_tx_dropped++;
@@ -2380,7 +2380,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 		if (pci_dma_mapping_error(np->pci_dev,
 					  np->put_tx_ctx->dma)) {
 			/* on DMA mapping error - drop the packet */
-			kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
 			np->stat_tx_dropped++;
 			u64_stats_update_end(&np->swstats_tx_syncp);
@@ -2427,7 +2427,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 					if (unlikely(tmp_tx_ctx++ == np->last_tx_ctx))
 						tmp_tx_ctx = np->first_tx_ctx;
 				} while (tmp_tx_ctx != np->put_tx_ctx);
-				kfree_skb(skb);
+				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
 				np->stat_tx_dropped++;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (34 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:39                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 38/54] sis900: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (17 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in sc92031_start_xmit
that can be called in hard irq and other contexts.

Using dev_consume_skb_any preserves the current semantics (as
dev_kfree_skb is just consume_skb) and, since packet drops are
rare, is usually accurate.
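
To illustrate what the _any suffix buys, here is a minimal user-space
model, assuming the helper releases the buffer immediately in process
context and defers the release when called from hard irq context.  The
names (model_skb, model_in_hard_irq, model_free_any,
model_run_deferred) are made up for this sketch, and the real kernel
helpers are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
struct model_skb {
	bool freed;    /* released on the spot */
	bool deferred; /* queued so it can be released later */
};

/* Stands in for the kernel's "are we in hard irq?" check. */
bool model_in_hard_irq;

/* Safe in any context: defer from hard irq, free directly otherwise. */
void model_free_any(struct model_skb *skb)
{
	assert(skb != NULL);
	if (model_in_hard_irq)
		skb->deferred = true;
	else
		skb->freed = true;
}

/* Runs later, outside hard irq, and finishes any deferred release. */
void model_run_deferred(struct model_skb *skb)
{
	assert(skb != NULL);
	if (skb->deferred) {
		skb->deferred = false;
		skb->freed = true;
	}
}
```

The caller never has to know which context it is in, which is the
property a transmit routine reachable from netpoll needs.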

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/silan/sc92031.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/silan/sc92031.c b/drivers/net/ethernet/silan/sc92031.c
index 5eb933c97bba..7daa7d433099 100644
--- a/drivers/net/ethernet/silan/sc92031.c
+++ b/drivers/net/ethernet/silan/sc92031.c
@@ -987,7 +987,7 @@ out_unlock:
 	spin_unlock(&priv->lock);
 
 out:
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	return NETDEV_TX_OK;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 38/54] sis900: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (35 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:39                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 39/54] smc911x: " Eric W. Biederman
                                             ` (16 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/sis/sis900.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sis/sis900.c b/drivers/net/ethernet/sis/sis900.c
index ff57a46388ee..6072f093e6b4 100644
--- a/drivers/net/ethernet/sis/sis900.c
+++ b/drivers/net/ethernet/sis/sis900.c
@@ -1614,7 +1614,7 @@ sis900_start_xmit(struct sk_buff *skb, struct net_device *net_dev)
 		skb->data, skb->len, PCI_DMA_TODEVICE);
 	if (unlikely(pci_dma_mapping_error(sis_priv->pci_dev,
 		sis_priv->tx_ring[entry].bufptr))) {
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			sis_priv->tx_skbuff[entry] = NULL;
 			net_dev->stats.tx_dropped++;
 			spin_unlock_irqrestore(&sis_priv->lock, flags);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 39/54] smc911x: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (36 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 38/54] sis900: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:40                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (15 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/smsc/smc911x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/smsc/smc911x.c b/drivers/net/ethernet/smsc/smc911x.c
index c50fb08c9905..66b05e62f70a 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -551,7 +551,7 @@ static int smc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		dev->stats.tx_errors++;
 		dev->stats.tx_dropped++;
 		spin_unlock_irqrestore(&lp->lock, flags);
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (37 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 39/54] smc911x: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:40                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 41/54] smsc911x: Call dev_consume_skb_any " Eric W. Biederman
                                             ` (14 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in
smc_hardware_send_pkt that can be called in hard irq and other
contexts, and handles successfully transmitted packets.

Replace dev_kfree_skb with dev_kfree_skb_any in smc_hard_start_xmit which
can be called in hard irq and other contexts, and only frees skbs
when dropping them.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/smsc/smc91x.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c
index 839c0e6cca01..d1b4dca53a9d 100644
--- a/drivers/net/ethernet/smsc/smc91x.c
+++ b/drivers/net/ethernet/smsc/smc91x.c
@@ -621,7 +621,7 @@ static void smc_hardware_send_pkt(unsigned long data)
 done:	if (!THROTTLE_TX_PKTS)
 		netif_wake_queue(dev);
 
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 }
 
 /*
@@ -657,7 +657,7 @@ static int smc_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		netdev_warn(dev, "Far too big packet error.\n");
 		dev->stats.tx_errors++;
 		dev->stats.tx_dropped++;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 41/54] smsc911x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (38 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:41                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 42/54] stmmac: " Eric W. Biederman
                                             ` (13 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in smsc911x_hard_start_xmit
which can be called in hard irq and other contexts.
smsc911x_hard_start_xmit always transmits and consumes the specified skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/smsc/smsc911x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index 95e2b9a20d40..ed36ff48af57 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -1672,7 +1672,7 @@ static int smsc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	pdata->ops->tx_writefifo(pdata, (unsigned int *)bufp, wrsz);
 	freespace -= (skb->len + 32);
 	skb_tx_timestamp(skb);
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 
 	if (unlikely(smsc911x_tx_get_txstatcount(pdata) >= 30))
 		smsc911x_tx_update_txcounters(dev);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 42/54] stmmac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (39 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 41/54] smsc911x: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:42                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 43/54] sungem: " Eric W. Biederman
                                             ` (12 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in stmmac_tx_clean that can
be called in hard irq and other contexts.  stmmac_tx_clean handles
freeing successfully transmitted packets.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 8543e1cfd55e..d940034acdd4 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1303,7 +1303,7 @@ static void stmmac_tx_clean(struct stmmac_priv *priv)
 		priv->hw->mode->clean_desc3(priv, p);
 
 		if (likely(skb != NULL)) {
-			dev_kfree_skb(skb);
+			dev_consume_skb_any(skb);
 			priv->tx_skbuff[entry] = NULL;
 		}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 43/54] sungem: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (40 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 42/54] stmmac: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:42                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb Eric W. Biederman
                                             ` (11 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in gem_tx which can be
called in hard irq and other contexts.  gem_tx handles successfully
transmitted packets.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/sun/sungem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index c2799dc46325..102a66fc54a2 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -688,7 +688,7 @@ static __inline__ void gem_tx(struct net_device *dev, struct gem *gp, u32 gem_st
 		}
 
 		dev->stats.tx_packets++;
-		dev_kfree_skb(skb);
+		dev_consume_skb_any(skb);
 	}
 	gp->tx_old = entry;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (41 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 43/54] sungem: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:43                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (10 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_consume_skb_any in tile_net_tx and
tile_net_tx_tso which can be called in hard irq and other contexts.

At the point where the skbs are freed a packet has been successfully
transmitted so dev_consume_skb_any is the appropriate variant to use.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/tile/tilepro.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/tile/tilepro.c b/drivers/net/ethernet/tile/tilepro.c
index b94449b4bd34..e5a5c5d4ce0c 100644
--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -1824,7 +1824,7 @@ busy:
 
 	/* Handle completions. */
 	for (i = 0; i < nolds; i++)
-		kfree_skb(olds[i]);
+		dev_consume_skb_any(olds[i]);
 
 	/* Update stats. */
 	u64_stats_update_begin(&stats->syncp);
@@ -2008,7 +2008,7 @@ busy:
 
 	/* Handle completions. */
 	for (i = 0; i < nolds; i++)
-		kfree_skb(olds[i]);
+		dev_consume_skb_any(olds[i]);
 
 	/* HACK: Track "expanded" size for short packets (e.g. 42 < 60). */
 	u64_stats_update_begin(&stats->syncp);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (42 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:44                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (9 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in
spider_net_release_tx_chain which can be called in hard irq and other
contexts.

dev_consume_skb_any was chosen as it preserves the current
dev_kfree_skb semantics (dev_kfree_skb is consume_skb) and
because it is correct most of the time, as most packets
will have been successfully transmitted, not dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/toshiba/spider_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/toshiba/spider_net.c b/drivers/net/ethernet/toshiba/spider_net.c
index 3f4a32e39d27..0282d0161859 100644
--- a/drivers/net/ethernet/toshiba/spider_net.c
+++ b/drivers/net/ethernet/toshiba/spider_net.c
@@ -860,7 +860,7 @@ spider_net_release_tx_chain(struct spider_net_card *card, int brutal)
 		if (skb) {
 			pci_unmap_single(card->pdev, buf_addr, skb->len,
 					PCI_DMA_TODEVICE);
-			dev_kfree_skb(skb);
+			dev_consume_skb_any(skb);
 		}
 	}
 	return 0;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (43 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:44                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
                                             ` (8 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in rhine_start_tx which
can be called in hard irq and other contexts.  Packets are only freed
in rhine_start_tx if they are dropped.

Replace dev_kfree_skb with dev_consume_skb_any in rhine_tx that can be
called in hard irq and other contexts.  rhine_tx handles successfully
transmitted packets.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/via/via-rhine.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index 9d93fa120578..ce2e4d14ab31 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -1676,7 +1676,7 @@ static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
 		/* Must use alignment buffer. */
 		if (skb->len > PKT_BUF_SZ) {
 			/* packet too long, drop it */
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			rp->tx_skbuff[entry] = NULL;
 			dev->stats.tx_dropped++;
 			return NETDEV_TX_OK;
@@ -1696,7 +1696,7 @@ static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
 			pci_map_single(rp->pdev, skb->data, skb->len,
 				       PCI_DMA_TODEVICE);
 		if (dma_mapping_error(&rp->pdev->dev, rp->tx_skbuff_dma[entry])) {
-			dev_kfree_skb(skb);
+			dev_kfree_skb_any(skb);
 			rp->tx_skbuff_dma[entry] = 0;
 			dev->stats.tx_dropped++;
 			return NETDEV_TX_OK;
@@ -1834,7 +1834,7 @@ static void rhine_tx(struct net_device *dev)
 					 rp->tx_skbuff[entry]->len,
 					 PCI_DMA_TODEVICE);
 		}
-		dev_kfree_skb(rp->tx_skbuff[entry]);
+		dev_consume_skb_any(rp->tx_skbuff[entry]);
 		rp->tx_skbuff[entry] = NULL;
 		entry = (++rp->dirty_tx) % TX_RING_SIZE;
 	}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
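The rhine_tx hunk above releases each completed skb while walking the transmit ring with entry = (++rp->dirty_tx) % TX_RING_SIZE. A minimal userspace sketch of that reclaim walk follows, assuming free-running cur_tx and dirty_tx counters. The names model_ring and model_tx_reclaim are hypothetical, the ring size is arbitrary, and the descriptor status checks a real driver performs before reclaiming a slot are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TX_RING_SIZE 16  /* arbitrary size for the sketch */

/* Hypothetical model of a driver's transmit ring bookkeeping. */
struct model_ring {
	void *tx_skbuff[TX_RING_SIZE]; /* buffers awaiting completion */
	unsigned int cur_tx;           /* next slot to fill, free-running */
	unsigned int dirty_tx;         /* oldest unreclaimed slot, free-running */
};

/*
 * Reclaim every slot between dirty_tx and cur_tx.  The release callback
 * plays the role of dev_consume_skb_any and may be NULL.  Returns the
 * number of slots reclaimed.
 */
unsigned int model_tx_reclaim(struct model_ring *r, void (*release)(void *))
{
	unsigned int entry;
	unsigned int reclaimed = 0;

	assert(r != NULL);
	entry = r->dirty_tx % TX_RING_SIZE;

	while (r->dirty_tx != r->cur_tx) {
		if (release)
			release(r->tx_skbuff[entry]);
		r->tx_skbuff[entry] = NULL;
		/* Advance the counter first, then wrap it into the ring. */
		entry = (++r->dirty_tx) % TX_RING_SIZE;
		reclaimed++;
	}
	return reclaimed;
}
```

Because the counters are free-running, the walk crosses the end of the array without special casing; only the index derived from the counter wraps.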

* [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (44 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:45                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (7 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in velocity_xmit that can
be called in hard irq and other contexts.  Packets are freed and
dropped in velocity_xmit when they are too fragmented and cannot
be linearized.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/via/via-velocity.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c
index ad61d26a44f3..de08e86db209 100644
--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -2565,7 +2565,7 @@ static netdev_tx_t velocity_xmit(struct sk_buff *skb,
 	/* The hardware can handle at most 7 memory segments, so merge
 	 * the skb if there are more */
 	if (skb_shinfo(skb)->nr_frags > 6 && __skb_linearize(skb)) {
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (45 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:46                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (6 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in xemaclite_send which
can be called in hard irq and other contexts.  xemaclite_send only
frees skbs that it has successfully transmitted.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 36052b98b3fc..58756617644f 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1037,7 +1037,7 @@ static int xemaclite_send(struct sk_buff *orig_skb, struct net_device *dev)
 	skb_tx_timestamp(new_skb);
 
 	dev->stats.tx_bytes += len;
-	dev_kfree_skb(new_skb);
+	dev_consume_skb_any(new_skb);
 
 	return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (46 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:46                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 50/54] xen-netfront: " Eric W. Biederman
                                             ` (5 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in vmxnet3_tq_xmit which
can be called in hard irq and other contexts.  vmxnet3_tq_xmit only
frees skbs that it has dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/vmxnet3/vmxnet3_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 28965adfeebd..97394345e5dd 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1078,7 +1078,7 @@ unlock_drop_pkt:
 	spin_unlock_irqrestore(&tq->tx_lock, flags);
 drop_pkt:
 	tq->stats.drop_total++;
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 50/54] xen-netfront: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (47 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:46                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any " Eric W. Biederman
                                             ` (4 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in xennet_start_xmit
which can be called in hard irq and other contexts.  xennet_start_xmit
only frees skbs which it drops.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/xen-netfront.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 49f3b3dbbed8..057b05700f8b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -658,7 +658,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
  drop:
 	dev->stats.tx_dropped++;
-	dev_kfree_skb(skb);
+	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (48 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 50/54] xen-netfront: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:47                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 52/54] staging/octeon-ethernet: " Eric W. Biederman
                                             ` (3 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_consume_skb_any in wl_send and
wl_send_dma which can be called in hard irq and other contexts,
on the code paths where the skb was transmitted successfully.

Replace dev_kfree_skb with dev_kfree_skb_any in wl_send_dma which can
be called in hard irq and other contexts, on the code path where an
skb is dropped.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/staging/wlags49_h2/wl_netdev.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/wlags49_h2/wl_netdev.c b/drivers/staging/wlags49_h2/wl_netdev.c
index 965b1c0a4753..69bc0a01ae14 100644
--- a/drivers/staging/wlags49_h2/wl_netdev.c
+++ b/drivers/staging/wlags49_h2/wl_netdev.c
@@ -715,7 +715,7 @@ int wl_send( struct wl_private *lp )
 
         /* Free the skb and perform queue cleanup, as the buffer was
             transmitted successfully */
-        dev_kfree_skb( lp->txF.skb );
+        dev_consume_skb_any( lp->txF.skb );
 
         lp->txF.skb = NULL;
         lp->txF.port = 0;
@@ -1730,7 +1730,7 @@ int wl_send_dma( struct wl_private *lp, struct sk_buff *skb, int port )
             WL_WDS_NETIF_STOP_QUEUE( lp );
             lp->netif_queue_on = FALSE;
 
-            dev_kfree_skb( skb );
+            dev_kfree_skb_any( skb );
             return 0;
         }
     }
@@ -1755,7 +1755,7 @@ int wl_send_dma( struct wl_private *lp, struct sk_buff *skb, int port )
 
     /* Free the skb and perform queue cleanup, as the buffer was
             transmitted successfully */
-    dev_kfree_skb( skb );
+    dev_consume_skb_any( skb );
 
     return TRUE;
 } // wl_send_dma
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 52/54] staging/octeon-ethernet: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (49 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:47                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 53/54] virtio_net: Call dev_kfree_skb_any " Eric W. Biederman
                                             ` (2 subsequent siblings)
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace dev_kfree_skb with dev_kfree_skb_any in cvm_oct_xmit_pow which
can be called in hard irq and other contexts, on the code paths that
drop packets.

Replace dev_kfree_skb with dev_consume_skb_any in cvm_oct_xmit_pow which
can be called in hard irq and other contexts, on the code path where
the packet is transmitted successfully.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/staging/octeon/ethernet-tx.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-tx.c b/drivers/staging/octeon/ethernet-tx.c
index 47541e1608f3..ebb3ebc7176b 100644
--- a/drivers/staging/octeon/ethernet-tx.c
+++ b/drivers/staging/octeon/ethernet-tx.c
@@ -554,7 +554,7 @@ int cvm_oct_xmit_pow(struct sk_buff *skb, struct net_device *dev)
 		printk_ratelimited("%s: Failed to allocate a work queue entry\n",
 				   dev->name);
 		priv->stats.tx_dropped++;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return 0;
 	}
 
@@ -565,7 +565,7 @@ int cvm_oct_xmit_pow(struct sk_buff *skb, struct net_device *dev)
 				   dev->name);
 		cvmx_fpa_free(work, CVMX_FPA_WQE_POOL, DONT_WRITEBACK(1));
 		priv->stats.tx_dropped++;
-		dev_kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return 0;
 	}
 
@@ -682,7 +682,7 @@ int cvm_oct_xmit_pow(struct sk_buff *skb, struct net_device *dev)
 			     work->grp);
 	priv->stats.tx_packets++;
 	priv->stats.tx_bytes += skb->len;
-	dev_kfree_skb(skb);
+	dev_consume_skb_any(skb);
 	return 0;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 53/54] virtio_net: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (50 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 52/54] staging/octeon-ethernet: " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:48                             ` Eric Dumazet
  2014-03-25  6:05                           ` [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
  2014-03-25 13:01                           ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric Dumazet
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in start_xmit which can
be called in hard irq and other contexts.

start_xmit only frees skbs that it is dropping.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/virtio_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 80d84c446962..99fa48c941c6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -883,7 +883,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 			dev_warn(&dev->dev,
 				 "Unexpected TXQ (%d) queue failure: %d\n", qnum, err);
 		dev->stats.tx_dropped++;
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (51 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 53/54] virtio_net: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25  6:05                           ` Eric W. Biederman
  2014-03-25 20:48                             ` Eric Dumazet
  2014-03-25 13:01                           ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric Dumazet
  53 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25  6:05 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma,
	Eric W. Biederman

From: "Eric W. Biederman" <ebiederm@xmission.com>

Replace kfree_skb with dev_kfree_skb_any in vlan_insert_tag as
vlan_insert_tag can be called from hard irq context (netpoll)
and from other contexts.

dev_kfree_skb_any is used as vlan_insert_tag only frees the skb if the
skb can not be modified to insert a tag, in which case vlan_insert_tag
drops the skb.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/if_vlan.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index bbedfb56bd66..d3d2306f00bf 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -288,7 +288,7 @@ static inline struct sk_buff *vlan_insert_tag(struct sk_buff *skb,
 	struct vlan_ethhdr *veth;
 
 	if (skb_cow_head(skb, VLAN_HLEN) < 0) {
-		kfree_skb(skb);
+		dev_kfree_skb_any(skb);
 		return NULL;
 	}
 	veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread
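The if_vlan patch above depends on an ownership convention: when vlan_insert_tag cannot make room for the tag it frees the skb itself and returns NULL, so the caller must not touch the buffer afterwards. A minimal userspace sketch of that convention follows. The names model_buf, model_buf_alloc and model_insert_tag are hypothetical, and the model fails whenever headroom is short, whereas skb_cow_head can reallocate and fails only if that reallocation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN  6  /* bytes in one MAC address */
#define VLAN_HLEN 4  /* bytes added by an 802.1Q tag */

/* Hypothetical stand-in for an skb: fixed storage plus a data offset. */
struct model_buf {
	unsigned char storage[64];
	size_t data;  /* offset of the first frame byte; equals the headroom */
	size_t len;   /* number of valid frame bytes */
};

/* Allocate a buffer holding the frame after the requested headroom. */
struct model_buf *model_buf_alloc(size_t headroom, const unsigned char *frame,
				  size_t len)
{
	struct model_buf *b;

	if (headroom + len > sizeof(b->storage))
		return NULL;
	b = calloc(1, sizeof(*b));
	if (!b)
		return NULL;
	b->data = headroom;
	b->len = len;
	memcpy(b->storage + headroom, frame, len);
	return b;
}

/*
 * Insert a tag after the two MAC addresses.  On failure the buffer is
 * freed here and NULL is returned, so the caller no longer owns it.
 */
struct model_buf *model_insert_tag(struct model_buf *b, unsigned short tpid,
				   unsigned short tci)
{
	unsigned char *p;

	assert(b != NULL && b->len >= 2 * ETH_ALEN);
	if (b->data < VLAN_HLEN) {
		free(b);  /* the drop path: release and report failure */
		return NULL;
	}
	b->data -= VLAN_HLEN;
	b->len += VLAN_HLEN;
	p = b->storage + b->data;
	/* Slide both MAC addresses forward to open a gap for the tag. */
	memmove(p, p + VLAN_HLEN, 2 * ETH_ALEN);
	p[12] = (unsigned char)(tpid >> 8);
	p[13] = (unsigned char)(tpid & 0xff);
	p[14] = (unsigned char)(tci >> 8);
	p[15] = (unsigned char)(tci & 0xff);
	return b;
}
```

A caller would write b = model_insert_tag(b, 0x8100, tci); and return early when b is NULL, without freeing it a second time.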

* Re: [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
                                             ` (52 preceding siblings ...)
  2014-03-25  6:05                           ` [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 13:01                           ` Eric Dumazet
  2014-03-25 18:05                             ` Eric W. Biederman
  53 siblings, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in uml_net_start_xmit
> as it can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as uml_net_start_xmit typically
> consumes (not drops) packets.

Well, this is not exactly true. This driver certainly can drop packets.

Here is an untested/not even compiled patch.

diff --git a/arch/um/drivers/net_kern.c b/arch/um/drivers/net_kern.c
index 39f186252e02..8d1df7ed759e 100644
--- a/arch/um/drivers/net_kern.c
+++ b/arch/um/drivers/net_kern.c
@@ -212,6 +212,7 @@ static int uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct uml_net_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	int len;
+	enum skb_free_reason reason = SKB_REASON_CONSUMED;
 
 	netif_stop_queue(dev);
 
@@ -228,19 +229,18 @@ static int uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		/* this is normally done in the interrupt when tx finishes */
 		netif_wake_queue(dev);
-	}
-	else if (len == 0) {
-		netif_start_queue(dev);
-		dev->stats.tx_dropped++;
-	}
-	else {
+	} else {
+		reason = SKB_REASON_DROPPED;
 		netif_start_queue(dev);
-		printk(KERN_ERR "uml_net_start_xmit: failed(%d)\n", len);
+		if (len == 0)
+			dev->stats.tx_dropped++;
+		else
+			pr_err("uml_net_start_xmit: failed(%d)\n", len);
 	}
 
 	spin_unlock_irqrestore(&lp->lock, flags);
 
-	dev_kfree_skb(skb);
+	__dev_kfree_skb_any(skb, reason);
 
 	return NETDEV_TX_OK;
 }

^ permalink raw reply related	[flat|nested] 288+ messages in thread
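The untested patch above keeps a single free call at the end of uml_net_start_xmit and lets each branch set the reason. A minimal userspace sketch of that single-exit pattern follows. The names model_stats and model_xmit_reason are hypothetical, and the success test (bytes written equal to the skb length) and the tx_packets update are assumptions, since the quoted hunk does not show that branch.

```c
#include <assert.h>
#include <stddef.h>

/* Names mirror the kernel's enum skb_free_reason. */
enum skb_free_reason { SKB_REASON_CONSUMED, SKB_REASON_DROPPED };

/* Hypothetical counters standing in for dev->stats and the error log. */
struct model_stats {
	unsigned long tx_packets;
	unsigned long tx_dropped;
	unsigned long failures_logged;
};

/*
 * Decide how the skb should be released after a write attempt.
 * Assumption: a write of exactly skb_len bytes means success.
 */
enum skb_free_reason model_xmit_reason(int written, int skb_len,
				       struct model_stats *stats)
{
	enum skb_free_reason reason = SKB_REASON_CONSUMED;

	assert(stats != NULL);
	if (written == skb_len) {
		stats->tx_packets++;
	} else {
		reason = SKB_REASON_DROPPED;
		if (written == 0)
			stats->tx_dropped++;       /* nothing was sent */
		else
			stats->failures_logged++;  /* error or short write */
	}
	/* The caller then makes one __dev_kfree_skb_any(skb, reason) call. */
	return reason;
}
```

Keeping one release call and varying only the reason avoids duplicating the unlock and free sequence in every branch.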

* Re: [PATCH 02/54] 3c509: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 02/54] 3c509: " Eric W. Biederman
@ 2014-03-25 13:03                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in el3_start_xmit
> as it can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as on this simple hardware the
> skb is consumed directly by the start_xmit function.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/3com/3c509.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/3com/3c509.c b/drivers/net/ethernet/3com/3c509.c
> index c53384d41c96..35df0b9e6848 100644
> --- a/drivers/net/ethernet/3com/3c509.c
> +++ b/drivers/net/ethernet/3com/3c509.c
> @@ -749,7 +749,7 @@ el3_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  	spin_unlock_irqrestore(&lp->lock, flags);
>  
> -	dev_kfree_skb (skb);
> +	dev_consume_skb_any (skb);
>  

Please remove the space ?

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 03/54] 3c59x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 03/54] 3c59x: " Eric W. Biederman
@ 2014-03-25 13:04                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in vortex_start_xmit
> as it can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used when vortex_start_xmit directly consumes
> the packet instead of DMAing it to the device.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/3com/3c59x.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
> index 238ccea965c8..61477b8e8d24 100644
> --- a/drivers/net/ethernet/3com/3c59x.c
> +++ b/drivers/net/ethernet/3com/3c59x.c
> @@ -2086,7 +2086,7 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  		/* ... and the packet rounded to a doubleword. */
>  		skb_tx_timestamp(skb);
>  		iowrite32_rep(ioaddr + TX_FIFO, skb->data, (skb->len + 3) >> 2);
> -		dev_kfree_skb (skb);
> +		dev_consume_skb_any (skb);
>  		if (ioread16(ioaddr + TxFree) > 1536) {
>  			netif_start_queue (dev);	/* AKPM: redundant? */
>  		} else {

remove the extra space ?

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 04/54] 8390: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 04/54] 8390: " Eric W. Biederman
@ 2014-03-25 13:06                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in __ei_start_xmit that
> can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as in this simple driver the skb is always
> immediately consumed; there are no drops.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 05/54] bfin_mac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 05/54] bfin_mac: " Eric W. Biederman
@ 2014-03-25 13:10                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in _tx_reclaim_skb that
> can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as _tx_reclaim_skb is called after a packet
> has been successfully transmitted.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/adi/bfin_mac.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/adi/bfin_mac.c b/drivers/net/ethernet/adi/bfin_mac.c
> index 83a8cdbcd936..95779b6b7394 100644
> --- a/drivers/net/ethernet/adi/bfin_mac.c
> +++ b/drivers/net/ethernet/adi/bfin_mac.c
> @@ -1087,7 +1087,7 @@ static inline void _tx_reclaim_skb(void)
>  		tx_list_head->desc_a.config &= ~DMAEN;
>  		tx_list_head->status.status_word = 0;
>  		if (tx_list_head->skb) {
> -			dev_kfree_skb(tx_list_head->skb);
> +			dev_consume_skb_any(tx_list_head->skb);
>  			tx_list_head->skb = NULL;
>  		}
>  		tx_list_head = tx_list_head->next;

Acked-by: Eric Dumazet <edumazet@google.com>

Note that this driver has a race in tx_reclaim_skb_timeout(), which calls
tx_reclaim_skb() without any lock (under timer interrupt, that's all).

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 06/54] sun4i-emac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 06/54] sun4i-emac: " Eric W. Biederman
@ 2014-03-25 13:11                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in emac_start_xmit
> which can be called in hard irq and other contexts.
> 
> emac_start_xmit always transmits the packet making dev_consume_skb_any
> the appropriate function to call.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 07/54] am79c961a: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 07/54] am79c961a: " Eric W. Biederman
@ 2014-03-25 13:13                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in am79c961 that can
> be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as am79c961_sendpacket always
> immediately consumes the skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 08/54] lance: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 08/54] lance: " Eric W. Biederman
@ 2014-03-25 13:14                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:14 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in lance_start_xmit
> that can be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used as lance_start_xmit always immediately
> consumes the skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 09/54] pcnet32: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 09/54] pcnet32: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 13:15                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in pcnet32_start_xmit
> that can be called in hard irq and other contexts.
> 
> dev_kfree_skb_any is used as pcnet32_start_xmit only frees an
> skb when it drops a packet during transmit.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 10/54] alx: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 10/54] alx: " Eric W. Biederman
@ 2014-03-25 13:16                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in alx_start_xmit that
> can be called in hard irq and other contexts.
> 
> dev_kfree_skb_any is used as alx_start_xmit only frees skbs
> when dropping them.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 13:18                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> The call path: atl1c_xmit_frame, atl1c_tx_rollback, atl1c_clean_buffer
> cannot tell at compile time whether it will be invoked from hard irq
> or other context, as atl1c_xmit_frame does not know.  So remove
> the logic that passes the compile time knowledge into atl1c_clean_buffer
> and figure it out at runtime with dev_consume_skb_any.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in atl1c_xmit_frame that
> can be called in hard irq and other contexts.
> 
> Replace dev_kfree_skb and dev_kfree_skb_irq with dev_consume_skb_any
> in atl1c_clean_buffer that can be called in hard irq and other
> contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Nice ;)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 12/54] bnad: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:04                           ` [PATCH 12/54] bnad: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 13:19                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:19 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in bnad_start_xmit that
> can be called in hard irq and other contexts.
> 
> dev_kfree_skb_any is used as bnad_start_xmit only frees skbs when
> dropping them; normally transmitted packets are handled elsewhere.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:04                           ` [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 13:21                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 13:21 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_kfree_skb_any in macb_start_xmit that can
> be called in hard irq and other contexts.
> 
> macb_start_xmit only frees skbs when dropping them so
> dev_kfree_skb_any is used.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 15:16                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in xgmac_tx_complete
> that can be called in hard irq and other contexts.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in xgmac_xmit that can
> be called in hard irq and other contexts.
> 
> dev_consume_skb_any is used in xgmac_tx_complete as skbs that reach
> there have been successfully transmitted; dev_kfree_skb_any is used
> in xgmac_xmit as skbs that are freed there are being dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/calxeda/xgmac.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:05                           ` [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb Eric W. Biederman
@ 2014-03-25 15:18                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_consume_skb_any in free_tx_desc, and
> write_tx_pkt_wr that can be called in hard irq and other contexts.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in t3_eth_xmit that can
> be called in hard irq and other contexts.
> 
> dev_kfree_skb is replaced with dev_kfree_skb_any in t3_eth_xmit as
> that location is a packet drop, while the kfree_skb calls in
> free_tx_desc and write_tx_pkt_wr are places where packets are
> consumed in a healthy manner.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 16/54] cxgb4: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:05                           ` [PATCH 16/54] cxgb4: " Eric W. Biederman
@ 2014-03-25 15:19                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:19 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_consume_skb_any in free_tx_desc that can be
> called in hard irq and other contexts. dev_consume_skb_any is used
> as this function consumes successfully transmitted skbs.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in t4_eth_xmit that can
> be called in hard irq and other contexts, on paths that drop the skb.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in t4_eth_xmit that can
> be called in hard irq and other contexts, on paths that successfully
> transmit the skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 17/54] cxgb4vf: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb.
  2014-03-25  6:05                           ` [PATCH 17/54] cxgb4vf: " Eric W. Biederman
@ 2014-03-25 15:22                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_consume_skb_any in free_tx_desc that can be
> called in hard irq and other contexts. dev_consume_skb_any is used
> as this function consumes successfully transmitted skbs.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in t4vf_eth_xmit that can
> be called in hard irq and other contexts, on paths that drop the skb.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in t4vf_eth_xmit that can
> be called in hard irq and other contexts, on paths that successfully
> transmit the skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 15:23                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in net_send_packet that
> can be called in hard irq and other contexts.
> 
> net_send_packet consumes (not drops) the skb of interest.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 19/54] enic: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 19/54] enic: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 15:24                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in enic_hard_start_xmit
> that can be called in hard irq and other contexts.
> 
> enic_hard_start_xmit only frees the skb when dropping it.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/cisco/enic/enic_main.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
> index 4c35fc8fad99..2945718ce806 100644
> --- a/drivers/net/ethernet/cisco/enic/enic_main.c
> +++ b/drivers/net/ethernet/cisco/enic/enic_main.c
> @@ -521,7 +521,7 @@ static netdev_tx_t enic_hard_start_xmit(struct sk_buff *skb,
>  	unsigned int txq_map;
>  
>  	if (skb->len <= 0) {
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  		return NETDEV_TX_OK;
>  	}
>  
> @@ -536,7 +536,7 @@ static netdev_tx_t enic_hard_start_xmit(struct sk_buff *skb,
>  	if (skb_shinfo(skb)->gso_size == 0 &&
>  	    skb_shinfo(skb)->nr_frags + 1 > ENIC_NON_TSO_MAX_DESC &&
>  	    skb_linearize(skb)) {
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  		return NETDEV_TX_OK;
>  	}
>  

Yep, apparently this driver does not care about incrementing tx_errors...

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 20/54] dm9000: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 20/54] dm9000: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25 15:26                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in dm9000_start_xmit
> that can be called in hard irq and other contexts, on the path
> that successfully transmits the packet.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/davicom/dm9000.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 15:28                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:28 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in dmfe_start_xmit that
> can be called in hard irq and other contexts, when the packet is
> dropped.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in dmfe_start_xmit that
> can be called in hard irq and other contexts, when the packet is
> transmitted.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 22/54] uli526x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 22/54] uli526x: " Eric W. Biederman
@ 2014-03-25 15:29                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in uli526x_start_xmit
> that can be called in hard irq and other contexts, when the packet is
> dropped.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in uli526x_start_xmit
> that can be called in hard irq and other contexts, when the packet is
> transmitted.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/dec/tulip/uli526x.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 23/54] sundance: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 23/54] sundance: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 15:29                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in start_tx that can
> be called in hard irq and other contexts, when the skb is dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/dlink/sundance.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:05                           ` [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 15:30                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_kfree_skb_any in fec_enet_start_xmit that
> can be called in hard irq and other contexts, when the packet is
> dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 15:30                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in ucc_geth_tx that can
> be called in hard irq and other contexts, when processing the
> tx completion event.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/freescale/ucc_geth.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 26/54] i825xx: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 26/54] i825xx: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 15:31                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in i596_start_xmit that
> can be called in hard irq and other contexts, when the skb is dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/i825xx/lib82596.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 27/54] ehea: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 27/54] ehea: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25 15:39                             ` Eric Dumazet
  2014-03-25 15:39                             ` Eric Dumazet
  1 sibling, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> None of the locations was a packet drop so dev_kfree_skb_any is
> inappropriate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/ibm/ehea/ehea_main.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 27/54] ehea: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 27/54] ehea: Call dev_consume_skb_any " Eric W. Biederman
  2014-03-25 15:39                             ` Eric Dumazet
@ 2014-03-25 15:39                             ` Eric Dumazet
  1 sibling, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> None of the locations was a packet drop so dev_kfree_skb_any is
> inappropriate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 29/54] jme: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 29/54] jme: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 15:45                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:45 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in jme_expand_header that
> can be called in hard irq and other contexts, on the failure
> path where the skb is dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 30/54] mv643xx_eth: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 30/54] mv643xx_eth: " Eric W. Biederman
@ 2014-03-25 15:46                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in mv643xx_eth_xmit and
> txq_submit_skb that can be called in hard irq and other contexts,
> on paths where the skbs are dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 31/54] skge: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 31/54] skge: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 15:47                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 15:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in skge_xmit_free that can
> be called in hard irq and other contexts, on the path that
> handles dropped packets.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in skge_tx_done that can
> be called in hard irq and other contexts, on the path that handles
> successfully transmitted skbs.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 32/54] sky2: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 32/54] sky2: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 16:23                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 16:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in sky2_xmit_frame that
> can be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/marvell/sky2.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 33/54] ksz884x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 33/54] ksz884x: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25 16:23                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 16:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in copy_old_skb that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/micrel/ksz884x.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 34/54] s2io: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 34/54] s2io: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 16:25                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 16:25 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in s2io_xmit that can
> be called in hard irq and other contexts.
> 
> All instances that are changed are packet drops.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/neterion/s2io.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
> index d44fdb91808e..a2844ff322c4 100644
> --- a/drivers/net/ethernet/neterion/s2io.c
> +++ b/drivers/net/ethernet/neterion/s2io.c
> @@ -4049,7 +4049,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
>  	if (!is_s2io_card_up(sp)) {
>  		DBG_PRINT(TX_DBG, "%s: Card going down for reset\n",
>  			  dev->name);
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);
>  		return NETDEV_TX_OK;
>  	}
>  
> @@ -4122,7 +4122,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
>  	    ((put_off+1) == queue_len ? 0 : (put_off+1)) == get_off) {
>  		DBG_PRINT(TX_DBG, "Error in xmit, No free TXDs.\n");
>  		s2io_stop_tx_queue(sp, fifo->fifo_no);
> -		dev_kfree_skb(skb);
> +		dev_kfree_skb_any(skb);

minor nit : This could be done after spin_unlock_irqrestore()...

>  		spin_unlock_irqrestore(&fifo->tx_lock, flags);
>  		return NETDEV_TX_OK;
>  	}
> @@ -4244,7 +4244,7 @@ pci_map_failed:
>  	swstats->pci_map_fail_cnt++;
>  	s2io_stop_tx_queue(sp, fifo->fifo_no);
>  	swstats->mem_freed += skb->truesize;
> -	dev_kfree_skb(skb);
> +	dev_kfree_skb_any(skb);
>  	spin_unlock_irqrestore(&fifo->tx_lock, flags);
>  	return NETDEV_TX_OK;
>  }
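
The ordering nit above can be sketched in userspace, assuming a toy lock
flag in place of fifo->tx_lock.  Every model_* name is hypothetical and
exists only for this illustration; none of it is s2io or kernel code.
The sketch only counts how many frees happen while the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for fifo->tx_lock and the skb free helper. */
static bool model_tx_lock_held;
static unsigned int model_frees_under_lock;

static void model_lock(void)   { model_tx_lock_held = true; }
static void model_unlock(void) { model_tx_lock_held = false; }

static void model_free(void *buf)
{
	assert(buf != NULL);
	if (model_tx_lock_held)
		model_frees_under_lock++;	/* work done inside the critical section */
	free(buf);
}

/* Ordering as in the quoted hunk: free first, then unlock. */
static void model_drop_locked(void *buf)
{
	model_lock();
	model_free(buf);
	model_unlock();
}

/* Ordering suggested by the nit: unlock first, then free. */
static void model_drop_unlocked(void *buf)
{
	model_lock();
	/* ring state would be updated here, still under the lock */
	model_unlock();
	model_free(buf);
}
```

Both orderings free the buffer exactly once; the second merely keeps the
free out of the critical section, which is why it was only a minor nit.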

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 35/54] vxge: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 35/54] vxge: " Eric W. Biederman
@ 2014-03-25 16:26                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 16:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in vxge_xmit that can
> be called in hard irq and other contexts.
> 
> vxge_xmit only calls dev_kfree_skb_any when errors result in dropping
> skbs.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:05                           ` [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 16:27                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 16:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Every location changed is a drop, making dev_kfree_skb_any appropriate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25 13:01                           ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric Dumazet
@ 2014-03-25 18:05                             ` Eric W. Biederman
  2014-03-26  9:49                               ` David Laight
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-25 18:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Mon, 2014-03-24 at 23:04 -0700, Eric W. Biederman wrote:
>> From: "Eric W. Biederman" <ebiederm@xmission.com>
>> 
>> Replace dev_kfree_skb with dev_consume_skb_any in uml_net_start_xmit
>> as it can be called in hard irq and other contexts.
>> 
>> dev_consume_skb_any is used as uml_net_start_xmit typically
>> consumes (not drops) packets.
>
> Well, this is not exactly true. This driver certainly can drop packets.
>
> Here is an untested/not even compiled patch.

I said typically which does make it true :P.

I care because I am trying to keep us from calling kfree_skb or consume_skb
aka dev_kfree_skb in hard irq context, as that can result in nasty
issues.
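
A minimal userspace sketch of the idea behind the _any variants, assuming
a simplified model: outside hard irq context the buffer is freed on the
spot, while in hard irq context it is only queued and a later softirq
does the actual free.  Every model_* name is hypothetical and merely
stands in for the kernel's helpers; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_skb {
	struct model_skb *next;	/* link on the deferred-free list */
	bool dropped;		/* true: dropped, false: consumed */
};

/* Stand-in for "in hard irq or irqs disabled"; set by the caller. */
static bool model_in_hardirq;

/* Stand-in for the queue that the TX softirq drains later. */
static struct model_skb *model_completion_queue;
static unsigned int model_freed_now;
static unsigned int model_deferred;

/* Free immediately when that is safe, otherwise defer the free. */
static void model_free_any(struct model_skb *skb, bool dropped)
{
	assert(skb != NULL);
	skb->dropped = dropped;
	if (model_in_hardirq) {
		skb->next = model_completion_queue;
		model_completion_queue = skb;
		model_deferred++;
	} else {
		free(skb);
		model_freed_now++;
	}
}

/* Stand-in for the softirq that frees the queued buffers. */
static unsigned int model_run_softirq(void)
{
	unsigned int n = 0;

	while (model_completion_queue) {
		struct model_skb *skb = model_completion_queue;

		model_completion_queue = skb->next;
		free(skb);
		n++;
	}
	return n;
}
```

A caller never needs to know which context it runs in: the same
model_free_any() call is valid from both, which is the property the
driver conversions in this series rely on.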

Since I am touching those places I am doing my best to pick the
correct consume or kfree variant that matches what the code does,
and there isn't always one.  But that is all at a best effort so I can
preserve the code logic.

These patches are deliberately very conservative so I can successfully
make and test them with simple code inspection.

I really don't think using enum skb_free_reason makes any sense
whatsoever.  Not in the implementation of dev_kfree_skb_any and
dev_kfree_skb_irq and certainly not in a driver.  What
net/core/drop_monitor.c wants is the address of the function where drops
occur (so we can track down and debug why the kernel is dropping
packets) and the existing implementation of dev_kfree_skb_any and
dev_kfree_skb_irq lose that information.  The use of enum
skb_free_reason is a big part of the reason why we lose that
information.  (We should be using a (void *) so that we can capture
__builtin_return_address(0) instead.)  Your expanded use of enum
skb_free_reason below seems to encourage the bad implementation and make
it even harder to fix dev_kfree_skb_any and dev_kfree_skb_irq.
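
The (void *) idea can be sketched in userspace, assuming GCC or Clang
(__builtin_return_address and the noinline attribute are compiler
extensions).  Every model_* name is hypothetical.  The point of the
sketch is that a stored caller address survives the deferral, whereas a
bare enum would only say that some drop happened somewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_buf {
	struct model_buf *next;
	const void *drop_location;	/* NULL means consumed, not dropped */
};

static struct model_buf *model_queue;
static const void *model_last_reported;
static unsigned int model_drops_reported;

/* Queue the buffer and remember where the drop was requested. */
static void model_defer(struct model_buf *buf, const void *drop_location)
{
	assert(buf != NULL);
	buf->drop_location = drop_location;
	buf->next = model_queue;
	model_queue = buf;
}

/* noinline so that the return address identifies the real caller. */
static __attribute__((noinline)) void model_drop_deferred(struct model_buf *buf)
{
	model_defer(buf, __builtin_return_address(0));
}

static void model_consume_deferred(struct model_buf *buf)
{
	model_defer(buf, NULL);
}

/* Deferred free: report each drop with the location saved earlier. */
static void model_flush(void)
{
	while (model_queue) {
		struct model_buf *buf = model_queue;

		model_queue = buf->next;
		if (buf->drop_location) {
			model_last_reported = buf->drop_location;
			model_drops_reported++;
		}
		free(buf);
	}
}
```

In this model a drop monitor reading model_last_reported sees the
address inside the function that dropped the buffer, not the address of
the flush routine that eventually freed it.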

In other locations with the same logic I justified the change as not
changing semantics when going from consume_skb (aka dev_kfree_skb) to
dev_consume_skb_any.  My apologies for not mentioning that in uml/net_kern.
I think not causing a regression in the kfree/consume distinction is
more important than getting the code exactly right.

If you are really interested in seeing us get the consume_skb vs
kfree_skb difference correct in drivers I recommend an audit of drivers
yourself.  A few weeks ago when I started looking at this there
was exactly one instance of dev_consume_skb_any or was it just
consume_skb in the entire driver tree.

So I really think getting the drop vs consume distinction perfect right
now is silly when it is hard to test and we have so much low hanging
fruit where the distinction was not even recognized.

So please ack my patches (especially this one) on the basis that they
are correct with respect to execution context, and that they either do
not regress the drop/consume distinction or get it right on a best
effort basis.

Eric

> diff --git a/arch/um/drivers/net_kern.c b/arch/um/drivers/net_kern.c
> index 39f186252e02..8d1df7ed759e 100644
> --- a/arch/um/drivers/net_kern.c
> +++ b/arch/um/drivers/net_kern.c
> @@ -212,6 +212,7 @@ static int uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	struct uml_net_private *lp = netdev_priv(dev);
>  	unsigned long flags;
>  	int len;
> +	enum skb_free_reason reason = SKB_REASON_CONSUMED;
>  
>  	netif_stop_queue(dev);
>  
> @@ -228,19 +229,18 @@ static int uml_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  		/* this is normally done in the interrupt when tx finishes */
>  		netif_wake_queue(dev);
> -	}
> -	else if (len == 0) {
> -		netif_start_queue(dev);
> -		dev->stats.tx_dropped++;
> -	}
> -	else {
> +	} else {
> +		reason = SKB_REASON_DROPPED;
>  		netif_start_queue(dev);
> -		printk(KERN_ERR "uml_net_start_xmit: failed(%d)\n", len);
> +		if (len == 0)
> +			dev->stats.tx_dropped++;
> +		else
> +			pr_err("uml_net_start_xmit: failed(%d)\n", len);
>  	}
>  
>  	spin_unlock_irqrestore(&lp->lock, flags);
>  
> -	dev_kfree_skb(skb);
> +	__dev_kfree_skb_any(skb, reason);
>  
>  	return NETDEV_TX_OK;
>  }

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 20:39                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in sc92031_start_xmit
> that can be called in hard irq and other contexts.
> 
> Using dev_consume_skb_any preserves the current semantics (as
> dev_kfree_skb is just consume_skb) and, since packet drops
> are rare, is usually accurate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 38/54] sis900: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 38/54] sis900: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 20:39                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/sis/sis900.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 39/54] smc911x: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 39/54] smc911x: " Eric W. Biederman
@ 2014-03-25 20:40                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
> be called in hard irq and other contexts.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/smsc/smc911x.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 20:40                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in
> smc_hardware_send_pkt that can be called in hard irq and other
> contexts, and handles successfully transmitted packets.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in smc_hard_start_xmit which
> can be called in hard irq and other contexts, and only frees skbs
> when dropping them.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 41/54] smsc911x: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 41/54] smsc911x: Call dev_consume_skb_any " Eric W. Biederman
@ 2014-03-25 20:41                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in smsc911x_hard_xmit
> which can be called in hard irq and other contexts. smsc911x_hard_xmit
> always transmits and consumes the specified skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 42/54] stmmac: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 42/54] stmmac: " Eric W. Biederman
@ 2014-03-25 20:42                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in stmmac_tx_clean that can
> be called in hard irq and other contexts.  stmmac_tx_clean handles
> freeing successfully transmitted packets.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 43/54] sungem: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 43/54] sungem: " Eric W. Biederman
@ 2014-03-25 20:42                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in gem_tx which can be
> called in hard irq and other contexts.  gem_tx handles successfully
> transmitted packets.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/sun/sungem.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb.
  2014-03-25  6:05                           ` [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 20:43                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_consume_skb_any in tile_net_tx and
> tile_net_tx_tso which can be called in hard irq and other contexts.
> 
> At the point where the skbs are freed a packet has been successfully
> transmitted so dev_consume_skb_any is the appropriate variant to use.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/tile/tilepro.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 20:44                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in
> spider_net_release_tx_chain which can be called in hard irq and other
> contexts.
> 
> dev_consume_skb_any was chosen as it preserves the current
> dev_kfree_skb semantics (dev_kfree_skb is consume_skb) and
> because it is correct most of the time, as most packets
> will have been successfully transmitted, not dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 20:44                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in rhine_start_tx which
> can be called in hard irq and other contexts.  Packets are only freed
> in rhine_start_tx if they are dropped.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in rhine_tx that can be
> called in hard irq and other contexts.  rhine_tx handles successfully
> transmitted packets.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:05                           ` [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 20:45                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:45 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in velocity_xmit that can
> be called in hard irq and other contexts.  Packets are freed and
> dropped in velocity_xmit when they are too fragmented and can
> not be linearized.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 20:46                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in xemaclite_send which
> can be called in hard irq and other contexts.  xemaclite_send only
> frees skbs that it has successfully transmitted.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  drivers/net/ethernet/xilinx/xilinx_emaclite.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 20:46                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in vmnet3_tx_xmit which
> can be called in hard irq and other contexts.  vmnet3_tx_xmit only
> frees skbs that it has dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 50/54] xen-netfront: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 50/54] xen-netfront: " Eric W. Biederman
@ 2014-03-25 20:46                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in xennet_start_xmit
> which can be called in hard irq and other contexts.  xennet_start_xmit
> only frees skbs which it drops.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any " Eric W. Biederman
@ 2014-03-25 20:47                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_consume_skb_any in wl_send and
> wl_send_dma which can be called in hard irq and other contexts,
> on the code paths where the skb was transmitted successfully.
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in wl_send_dma which can
> be called in hard irq and other contexts, on the code path where a
> skb is dropped.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 52/54] staging/octeon-ethernet: Call dev_kfree/consume_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 52/54] staging/octeon-ethernet: " Eric W. Biederman
@ 2014-03-25 20:47                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in cvm_oct_xmit_pow which
> can be called in hard irq and other contexts, on the code paths that
> drop packets.
> 
> Replace dev_kfree_skb with dev_consume_skb_any in cvm_oct_xmit_pow which
> can be called in hard irq and other contexts, on the code path where
> the packet is transmitted successfully.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 53/54] virtio_net: Call dev_kfree_skb_any instead of dev_kfree_skb.
  2014-03-25  6:05                           ` [PATCH 53/54] virtio_net: Call dev_kfree_skb_any " Eric W. Biederman
@ 2014-03-25 20:48                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace dev_kfree_skb with dev_kfree_skb_any in start_xmit which can
> be called in hard irq and other contexts.
> 
> start_xmit only frees skbs that it is dropping.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb.
  2014-03-25  6:05                           ` [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
@ 2014-03-25 20:48                             ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 23:05 -0700, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Replace kfree_skb with dev_kfree_skb_any in vlan_insert_tag as
> vlan_insert_tag can be called from hard irq context (netpoll)
> and from other contexts.
> 
> dev_kfree_skb_any is used as vlan_insert_tag only frees the skb if the
> skb can not be modified to insert a tag, in which case vlan_insert_tag
> drops the skb.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts
  2014-03-25  5:58                       ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any " Eric W. Biederman
  2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
@ 2014-03-25 20:49                         ` Eric Dumazet
  2014-03-25 22:54                           ` David Miller
  1 sibling, 1 reply; 288+ messages in thread
From: Eric Dumazet @ 2014-03-25 20:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

On Mon, 2014-03-24 at 22:58 -0700, Eric W. Biederman wrote:
> These changes are a result of walking through the network drivers
> supporting netpoll and verifying the code paths that netpoll can cause
> to be called in hard irq context use an appropriate flavor of
> kfree_skb.  Either dev_kfree_skb_any or dev_consume_skb_any.
> 
> Since my last pass at this I have become aware of the small differences
> between dev_kfree_skb_any and dev_consume_skb_any.
> net/core/drop_monitor.c reports dev_kfree_skb_any as a drop
> while being quiet about dev_consume_skb_any.  With the weird twist that
> dev_kfree_skb is unintuitively consume_skb.
> 
> As netpoll now calls the napi poll function with budget == 0, pieces of
> a driver's napi poll function that don't run when budget == 0 have
> been ignored.
> 
> The most interesting change is to the atl1c which tried unsuccessfully to
> tell one of its functions which context it is called in so that it
> could call dev_kfree_skb_irq or dev_kfree_skb as appropriate.  I have
> just removed the extra parameter and called dev_consume_skb_any.
> 
> At 54 separate changes I will post each change as a separate patch (so
> they can be reviewed) but for general sanity sake I have gathered them
> all into a git branch for easy acces.
> 
> David when you are satisified with these changes please pull:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next.git master
> 
> Hopefully this will allow me to forget this class of error when dealing
> with netpoll.


Perfect, thanks Eric.

(the minor issues can be addressed later in a single followup)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts
  2014-03-25 20:49                         ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts Eric Dumazet
@ 2014-03-25 22:54                           ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-25 22:54 UTC (permalink / raw)
  To: eric.dumazet; +Cc: ebiederm, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 25 Mar 2014 13:49:51 -0700

> On Mon, 2014-03-24 at 22:58 -0700, Eric W. Biederman wrote:
>> These changes are a result of walking through the network drivers
>> supporting netpoll and verifying the code paths that netpoll can cause
>> to be called in hard irq context use an appropriate flavor of
>> kfree_skb.  Either dev_kfree_skb_any or dev_consume_skb_any.
>> 
>> Since my last pass at this I have become aware of the small differences
>> between dev_kfree_skb_any and dev_consume_skb_any.
>> net/core/drop_monitor.c reports dev_kfree_skb_any as a drop while
>> being quiet about the second.  With the weird twist that
>> dev_kfree_skb is unintuitively consume_skb.
>> 
>> As netpoll now calls the napi poll function with budget == 0, pieces of
>> a driver's napi poll function that don't run when budget == 0 have
>> been ignored.
>> 
>> The most interesting change is to the atl1c which tried unsuccessfully to
>> tell one of its functions which context it is called in so that it
>> could call dev_kfree_skb_irq or dev_kfree_skb as appropriate.  I have
>> just removed the extra parameter and called dev_consume_skb_any.
>> 
>> At 54 separate changes I will post each change as a separate patch (so
>> they can be reviewed) but for general sanity's sake I have gathered them
>> all into a git branch for easy access.
>> 
>> David when you are satisfied with these changes please pull:
>> 
>>     git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next.git master
>> 
>> Hopefully this will allow me to forget this class of error when dealing
>> with netpoll.
> 
> 
> Perfect, thanks Eric.
> 
> (the minor issues can be addressed later in a single followup)
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Pulled, thanks everyone.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* RE: [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb.
  2014-03-25 18:05                             ` Eric W. Biederman
@ 2014-03-26  9:49                               ` David Laight
  0 siblings, 0 replies; 288+ messages in thread
From: David Laight @ 2014-03-26  9:49 UTC (permalink / raw)
  To: 'Eric W. Biederman', Eric Dumazet
  Cc: David Miller, netdev, xiyou.wangcong, mpm, satyam.sharma

From: Of Eric W. Biederman
...
> I really don't think using enum skb_free_reason makes any sense
> whatsoever.  Not in the implementation of dev_kfree_skb_any and
> dev_kfree_skb_irq and certainly not in a driver.  What
> net/core/drop_monitor.c wants is the address of the function where drops
> occur (so we can track down and debug why the kernel is dropping
> packets) and the existing implementation of dev_kfree_skb_any and
> dev_kfree_skb_irq lose that information.  The use of enum
> skb_free_reason is a big part of the reason why we lose that
> information.  (We should be using a (void *) so that we can capture
> __builtin_return_address(0) instead...

Maybe more useful to allow a literal string to be given.
That is easier to find in the source tree than the return address.

Or (OTT) create a linkset data item containing info about the
call site and a counter....

	David
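
For readers of the archive, the two options discussed above (a return
address versus a literal string naming the call site) can be compared
with a small userspace sketch.  This is only an illustration, assuming
GCC or Clang for __builtin_return_address and the noinline attribute;
model_drop, MODEL_DROP and struct drop_record are hypothetical names and
are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct drop_record {
	const void *caller;	/* return address of the code that dropped */
	const char *site;	/* literal "file:line" string for the call site */
};

static struct drop_record last_drop;

#define MODEL_STR2(x) #x
#define MODEL_STR(x) MODEL_STR2(x)

/* noinline so that __builtin_return_address(0) points into the caller. */
__attribute__((noinline))
void model_drop(const char *site)
{
	assert(site != NULL);
	last_drop.caller = __builtin_return_address(0);
	last_drop.site = site;
}

/* The literal string is built at compile time, so it can be grepped for. */
#define MODEL_DROP() model_drop(__FILE__ ":" MODEL_STR(__LINE__))

const struct drop_record *model_last_drop(void)
{
	return &last_drop;
}

/* Returns 1 when both kinds of call site information were recorded. */
int model_demo(void)
{
	MODEL_DROP();
	return last_drop.caller != NULL &&
	       strchr(last_drop.site, ':') != NULL;
}
```

The return address costs one pointer per record but needs symbol lookup
to interpret; the string is readable as is but has to be supplied at
every call site, here through a macro.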

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH v2 0/6] netpoll: Cleanups and fixes
  2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                       ` (5 preceding siblings ...)
  2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
@ 2014-03-27 22:35                                     ` Eric W. Biederman
  2014-03-27 22:36                                       ` [PATCH v2 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
                                                         ` (6 more replies)
  6 siblings, 7 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:35 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


This should be a small set of safe cleanups and fixes to netpoll.

The fixes: vlan headers are now always inserted when needed, and
napi polling is always avoided when network devices are closed.

There are a bunch of little cleanups removing unnecessary code, fixing
function naming, not taking unnecessary locks and removing general
silliness.

Eric W. Biederman (6):
      netpoll: Remove gfp parameter from __netpoll_setup
      netpoll: Only call ndo_start_xmit from a single place
      netpoll: Move rx enable/disable into __dev_close_many
      netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable
      netpoll: Remove strong unnecessary assumptions about skbs
      netpoll: Respect NETIF_F_LLTX

 drivers/net/bonding/bond_main.c |  6 +--
 drivers/net/team/team.c         | 16 ++++----
 include/linux/netdevice.h       |  8 +++-
 include/linux/netpoll.h         | 10 ++---
 net/8021q/vlan_dev.c            |  7 ++--
 net/bridge/br_device.c          | 15 ++++---
 net/bridge/br_if.c              |  2 +-
 net/bridge/br_private.h         |  4 +-
 net/core/dev.c                  | 17 +++-----
 net/core/netpoll.c              | 91 +++++++++++++++++++++++------------------
 10 files changed, 91 insertions(+), 85 deletions(-)

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH v2 1/6] netpoll: Remove gfp parameter from __netpoll_setup
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
@ 2014-03-27 22:36                                       ` Eric W. Biederman
  2014-03-27 22:37                                       ` [PATCH v2 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
                                                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:36 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The gfp parameter was added in:
commit 47be03a28cc6c80e3aa2b3e8ed6d960ff0c5c0af
Author: Amerigo Wang <amwang@redhat.com>
Date:   Fri Aug 10 01:24:37 2012 +0000

    netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup()

    slave_enable_netpoll() and __netpoll_setup() may be called
    with read_lock() held, so should use GFP_ATOMIC to allocate
    memory. Eric suggested to pass gfp flags to __netpoll_setup().

    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Cong Wang <amwang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The reason for the gfp parameter was removed in:
commit c4cdef9b7183159c23c7302aaf270d64c549f557
Author: dingtianhong <dingtianhong@huawei.com>
Date:   Tue Jul 23 15:25:27 2013 +0800

    bonding: don't call slave_xxx_netpoll under spinlocks

    The slave_xxx_netpoll will call synchronize_rcu_bh(),
    so the function may schedule and sleep, it shouldn't be
    called under spinlocks.

    bond_netpoll_setup() and bond_netpoll_cleanup() are always
    protected by rtnl lock, it is no need to take the read lock,
    as the slave list couldn't be changed outside rtnl lock.

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Nothing else that calls __netpoll_setup or ndo_netpoll_setup
requires a gfp parameter, so remove the gfp parameter from both
of these functions, making the code clearer.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/net/bonding/bond_main.c |    6 +++---
 drivers/net/team/team.c         |   16 +++++++---------
 include/linux/netdevice.h       |    3 +--
 include/linux/netpoll.h         |    2 +-
 net/8021q/vlan_dev.c            |    7 +++----
 net/bridge/br_device.c          |   15 +++++++--------
 net/bridge/br_if.c              |    2 +-
 net/bridge/br_private.h         |    4 ++--
 net/core/netpoll.c              |    8 ++++----
 9 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 5be34b72a048..95a6ca7d9e51 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -922,12 +922,12 @@ static inline int slave_enable_netpoll(struct slave *slave)
 	struct netpoll *np;
 	int err = 0;
 
-	np = kzalloc(sizeof(*np), GFP_ATOMIC);
+	np = kzalloc(sizeof(*np), GFP_KERNEL);
 	err = -ENOMEM;
 	if (!np)
 		goto out;
 
-	err = __netpoll_setup(np, slave->dev, GFP_ATOMIC);
+	err = __netpoll_setup(np, slave->dev);
 	if (err) {
 		kfree(np);
 		goto out;
@@ -962,7 +962,7 @@ static void bond_netpoll_cleanup(struct net_device *bond_dev)
 			slave_disable_netpoll(slave);
 }
 
-static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni, gfp_t gfp)
+static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
 {
 	struct bonding *bond = netdev_priv(dev);
 	struct list_head *iter;
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 2b1a1d61072c..33008c1d1d67 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1031,8 +1031,7 @@ static void team_port_leave(struct team *team, struct team_port *port)
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-static int team_port_enable_netpoll(struct team *team, struct team_port *port,
-				    gfp_t gfp)
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
 {
 	struct netpoll *np;
 	int err;
@@ -1040,11 +1039,11 @@ static int team_port_enable_netpoll(struct team *team, struct team_port *port,
 	if (!team->dev->npinfo)
 		return 0;
 
-	np = kzalloc(sizeof(*np), gfp);
+	np = kzalloc(sizeof(*np), GFP_KERNEL);
 	if (!np)
 		return -ENOMEM;
 
-	err = __netpoll_setup(np, port->dev, gfp);
+	err = __netpoll_setup(np, port->dev);
 	if (err) {
 		kfree(np);
 		return err;
@@ -1067,8 +1066,7 @@ static void team_port_disable_netpoll(struct team_port *port)
 	kfree(np);
 }
 #else
-static int team_port_enable_netpoll(struct team *team, struct team_port *port,
-				    gfp_t gfp)
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
 {
 	return 0;
 }
@@ -1156,7 +1154,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
 		goto err_vids_add;
 	}
 
-	err = team_port_enable_netpoll(team, port, GFP_KERNEL);
+	err = team_port_enable_netpoll(team, port);
 	if (err) {
 		netdev_err(dev, "Failed to enable netpoll on device %s\n",
 			   portname);
@@ -1850,7 +1848,7 @@ static void team_netpoll_cleanup(struct net_device *dev)
 }
 
 static int team_netpoll_setup(struct net_device *dev,
-			      struct netpoll_info *npifo, gfp_t gfp)
+			      struct netpoll_info *npifo)
 {
 	struct team *team = netdev_priv(dev);
 	struct team_port *port;
@@ -1858,7 +1856,7 @@ static int team_netpoll_setup(struct net_device *dev,
 
 	mutex_lock(&team->lock);
 	list_for_each_entry(port, &team->port_list, list) {
-		err = team_port_enable_netpoll(team, port, gfp);
+		err = team_port_enable_netpoll(team, port);
 		if (err) {
 			__team_netpoll_cleanup(team);
 			break;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4b6d12c7b803..77142a78c4d9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1037,8 +1037,7 @@ struct net_device_ops {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	void                    (*ndo_poll_controller)(struct net_device *dev);
 	int			(*ndo_netpoll_setup)(struct net_device *dev,
-						     struct netpoll_info *info,
-						     gfp_t gfp);
+						     struct netpoll_info *info);
 	void			(*ndo_netpoll_cleanup)(struct net_device *dev);
 #endif
 #ifdef CONFIG_NET_RX_BUSY_POLL
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 1b475a5a7239..893b9e66060e 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -57,7 +57,7 @@ static inline void netpoll_rx_enable(struct net_device *dev) { return; }
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
-int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp);
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev);
 int netpoll_setup(struct netpoll *np);
 void __netpoll_cleanup(struct netpoll *np);
 void __netpoll_free_async(struct netpoll *np);
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 4f3e9073cb49..a78bebeca4d9 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -707,20 +707,19 @@ static void vlan_dev_poll_controller(struct net_device *dev)
 	return;
 }
 
-static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *npinfo,
-				  gfp_t gfp)
+static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *npinfo)
 {
 	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
 	struct net_device *real_dev = vlan->real_dev;
 	struct netpoll *netpoll;
 	int err = 0;
 
-	netpoll = kzalloc(sizeof(*netpoll), gfp);
+	netpoll = kzalloc(sizeof(*netpoll), GFP_KERNEL);
 	err = -ENOMEM;
 	if (!netpoll)
 		goto out;
 
-	err = __netpoll_setup(netpoll, real_dev, gfp);
+	err = __netpoll_setup(netpoll, real_dev);
 	if (err) {
 		kfree(netpoll);
 		goto out;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index f2a08477e0f5..0dd01a05bd59 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -218,16 +218,16 @@ static void br_netpoll_cleanup(struct net_device *dev)
 		br_netpoll_disable(p);
 }
 
-static int __br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+static int __br_netpoll_enable(struct net_bridge_port *p)
 {
 	struct netpoll *np;
 	int err;
 
-	np = kzalloc(sizeof(*p->np), gfp);
+	np = kzalloc(sizeof(*p->np), GFP_KERNEL);
 	if (!np)
 		return -ENOMEM;
 
-	err = __netpoll_setup(np, p->dev, gfp);
+	err = __netpoll_setup(np, p->dev);
 	if (err) {
 		kfree(np);
 		return err;
@@ -237,16 +237,15 @@ static int __br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
 	return err;
 }
 
-int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+int br_netpoll_enable(struct net_bridge_port *p)
 {
 	if (!p->br->dev->npinfo)
 		return 0;
 
-	return __br_netpoll_enable(p, gfp);
+	return __br_netpoll_enable(p);
 }
 
-static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni,
-			    gfp_t gfp)
+static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
 {
 	struct net_bridge *br = netdev_priv(dev);
 	struct net_bridge_port *p;
@@ -255,7 +254,7 @@ static int br_netpoll_setup(struct net_device *dev, struct netpoll_info *ni,
 	list_for_each_entry(p, &br->port_list, list) {
 		if (!p->dev)
 			continue;
-		err = __br_netpoll_enable(p, gfp);
+		err = __br_netpoll_enable(p);
 		if (err)
 			goto fail;
 	}
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 54d207d3a31c..5262b8617eb9 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -366,7 +366,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
 	if (err)
 		goto err2;
 
-	err = br_netpoll_enable(p, GFP_KERNEL);
+	err = br_netpoll_enable(p);
 	if (err)
 		goto err3;
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e1ca1dc916a4..06811d79f89f 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -349,7 +349,7 @@ static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
 		netpoll_send_skb(np, skb);
 }
 
-int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp);
+int br_netpoll_enable(struct net_bridge_port *p);
 void br_netpoll_disable(struct net_bridge_port *p);
 #else
 static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
@@ -357,7 +357,7 @@ static inline void br_netpoll_send_skb(const struct net_bridge_port *p,
 {
 }
 
-static inline int br_netpoll_enable(struct net_bridge_port *p, gfp_t gfp)
+static inline int br_netpoll_enable(struct net_bridge_port *p)
 {
 	return 0;
 }
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 7291dde93469..4bccc78c5b58 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -584,7 +584,7 @@ int netpoll_parse_options(struct netpoll *np, char *opt)
 }
 EXPORT_SYMBOL(netpoll_parse_options);
 
-int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
 {
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
@@ -603,7 +603,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 	}
 
 	if (!ndev->npinfo) {
-		npinfo = kmalloc(sizeof(*npinfo), gfp);
+		npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
 		if (!npinfo) {
 			err = -ENOMEM;
 			goto out;
@@ -617,7 +617,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 		ops = np->dev->netdev_ops;
 		if (ops->ndo_netpoll_setup) {
-			err = ops->ndo_netpoll_setup(ndev, npinfo, gfp);
+			err = ops->ndo_netpoll_setup(ndev, npinfo);
 			if (err)
 				goto free_npinfo;
 		}
@@ -749,7 +749,7 @@ int netpoll_setup(struct netpoll *np)
 	/* fill up the skb queue */
 	refill_skbs();
 
-	err = __netpoll_setup(np, ndev, GFP_KERNEL);
+	err = __netpoll_setup(np, ndev);
 	if (err)
 		goto put;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 2/6] netpoll: Only call ndo_start_xmit from a single place
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  2014-03-27 22:36                                       ` [PATCH v2 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
@ 2014-03-27 22:37                                       ` Eric W. Biederman
  2014-03-27 22:38                                       ` [PATCH v2 3/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
                                                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:37 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Factor out the code that needs to surround ndo_start_xmit
from netpoll_send_skb_on_dev into netpoll_start_xmit.

It is an unfortunate fact that as the netpoll code has been maintained
the primary call site of ndo_start_xmit learned how to handle vlans
and timestamps, but the second call of ndo_start_xmit in queue_process
did not.

With the introduction of netpoll_start_xmit this associated logic now
happens at both call sites of ndo_start_xmit and should make it easy
for that to continue into the future.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
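
A rough userspace sketch of the idea in this patch, not kernel code:
both send paths go through one wrapper, so the vlan handling cannot be
forgotten at either call site.  All model_* names and the two structs
are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_TX_OK = 0, MODEL_TX_BUSY = 1 };

struct model_dev;

struct model_pkt {
	bool vlan_tag_pending;	/* tag still carried out of band */
	bool tag_in_payload;	/* tag has been written into the packet */
};

struct model_dev {
	bool hw_vlan_offload;	/* device can insert the tag itself */
	int (*start_xmit)(struct model_pkt *pkt, struct model_dev *dev);
	unsigned long tx_count;
	unsigned long untagged_leaks;	/* packets sent with the tag lost */
};

/* Stand-in for a driver's transmit method. */
int model_hw_xmit(struct model_pkt *pkt, struct model_dev *dev)
{
	if (pkt->vlan_tag_pending && !dev->hw_vlan_offload)
		dev->untagged_leaks++;
	dev->tx_count++;
	return MODEL_TX_OK;
}

/* The single place that calls the driver: insert the tag when needed. */
int model_start_xmit(struct model_pkt *pkt, struct model_dev *dev)
{
	assert(pkt != NULL && dev != NULL && dev->start_xmit != NULL);

	if (pkt->vlan_tag_pending && !dev->hw_vlan_offload) {
		pkt->tag_in_payload = true;
		pkt->vlan_tag_pending = false;
	}
	return dev->start_xmit(pkt, dev);
}

/* Two call sites, modelling the direct send and the deferred queue. */
int model_send_now(struct model_pkt *pkt, struct model_dev *dev)
{
	return model_start_xmit(pkt, dev);
}

int model_send_deferred(struct model_pkt *pkt, struct model_dev *dev)
{
	return model_start_xmit(pkt, dev);
}
```

With a device that lacks hardware vlan offload, a tagged packet sent
through either path ends up with tag_in_payload set and untagged_leaks
stays at zero.
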
 net/core/netpoll.c |   61 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 4bccc78c5b58..825200fcb0ff 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -69,6 +69,37 @@ module_param(carrier_timeout, uint, 0644);
 #define np_notice(np, fmt, ...)				\
 	pr_notice("%s: " fmt, np->name, ##__VA_ARGS__)
 
+static int netpoll_start_xmit(struct sk_buff *skb, struct net_device *dev,
+			      struct netdev_queue *txq)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int status = NETDEV_TX_OK;
+	netdev_features_t features;
+
+	features = netif_skb_features(skb);
+
+	if (vlan_tx_tag_present(skb) &&
+	    !vlan_hw_offload_capable(features, skb->vlan_proto)) {
+		skb = __vlan_put_tag(skb, skb->vlan_proto,
+				     vlan_tx_tag_get(skb));
+		if (unlikely(!skb)) {
+			/* This is actually a packet drop, but we
+			 * don't want the code that calls this
+			 * function to try and operate on a NULL skb.
+			 */
+			goto out;
+		}
+		skb->vlan_tci = 0;
+	}
+
+	status = ops->ndo_start_xmit(skb, dev);
+	if (status == NETDEV_TX_OK)
+		txq_trans_update(txq);
+
+out:
+	return status;
+}
+
 static void queue_process(struct work_struct *work)
 {
 	struct netpoll_info *npinfo =
@@ -78,7 +109,6 @@ static void queue_process(struct work_struct *work)
 
 	while ((skb = skb_dequeue(&npinfo->txq))) {
 		struct net_device *dev = skb->dev;
-		const struct net_device_ops *ops = dev->netdev_ops;
 		struct netdev_queue *txq;
 
 		if (!netif_device_present(dev) || !netif_running(dev)) {
@@ -91,7 +121,7 @@ static void queue_process(struct work_struct *work)
 		local_irq_save(flags);
 		__netif_tx_lock(txq, smp_processor_id());
 		if (netif_xmit_frozen_or_stopped(txq) ||
-		    ops->ndo_start_xmit(skb, dev) != NETDEV_TX_OK) {
+		    netpoll_start_xmit(skb, dev, txq) != NETDEV_TX_OK) {
 			skb_queue_head(&npinfo->txq, skb);
 			__netif_tx_unlock(txq);
 			local_irq_restore(flags);
@@ -295,7 +325,6 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 {
 	int status = NETDEV_TX_BUSY;
 	unsigned long tries;
-	const struct net_device_ops *ops = dev->netdev_ops;
 	/* It is up to the caller to keep npinfo alive. */
 	struct netpoll_info *npinfo;
 
@@ -317,27 +346,9 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
 		     tries > 0; --tries) {
 			if (__netif_tx_trylock(txq)) {
-				if (!netif_xmit_stopped(txq)) {
-					if (vlan_tx_tag_present(skb) &&
-					    !vlan_hw_offload_capable(netif_skb_features(skb),
-								     skb->vlan_proto)) {
-						skb = __vlan_put_tag(skb, skb->vlan_proto, vlan_tx_tag_get(skb));
-						if (unlikely(!skb)) {
-							/* This is actually a packet drop, but we
-							 * don't want the code at the end of this
-							 * function to try and re-queue a NULL skb.
-							 */
-							status = NETDEV_TX_OK;
-							goto unlock_txq;
-						}
-						skb->vlan_tci = 0;
-					}
-
-					status = ops->ndo_start_xmit(skb, dev);
-					if (status == NETDEV_TX_OK)
-						txq_trans_update(txq);
-				}
-			unlock_txq:
+				if (!netif_xmit_stopped(txq))
+					status = netpoll_start_xmit(skb, dev, txq);
+
 				__netif_tx_unlock(txq);
 
 				if (status == NETDEV_TX_OK)
@@ -353,7 +364,7 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 
 		WARN_ONCE(!irqs_disabled(),
 			"netpoll_send_skb_on_dev(): %s enabled interrupts in poll (%pF)\n",
-			dev->name, ops->ndo_start_xmit);
+			dev->name, dev->netdev_ops->ndo_start_xmit);
 
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 3/6] netpoll: Move rx enable/disable into __dev_close_many
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
  2014-03-27 22:36                                       ` [PATCH v2 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
  2014-03-27 22:37                                       ` [PATCH v2 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
@ 2014-03-27 22:38                                       ` Eric W. Biederman
  2014-03-27 22:39                                       ` [PATCH v2 4/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
                                                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:38 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Today netpoll_rx_enable and netpoll_rx_disable are called from
dev_close and __dev_close, and not from dev_close_many.

Move the calls into __dev_close_many so that we have a single call
site to maintain, and so that dev_close_many gains this protection as
well.  Importantly, this makes batched network device deletes safe.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
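
A small userspace model of why the calls belong in the batched helper,
offered only as a sketch: when the single-device close is built on the
batched close, placing the disable/enable pair inside the loop covers
both.  The model_* names and struct model_dev are hypothetical and only
loosely mirror the kernel functions named in this patch.

```c
#include <assert.h>
#include <stddef.h>

struct model_dev {
	int poll_disabled;		/* >0 means polling is blocked */
	int up;
	int polled_while_closing;	/* polls that slipped through */
};

void model_poll_disable(struct model_dev *d)
{
	d->poll_disabled++;
}

void model_poll_enable(struct model_dev *d)
{
	assert(d->poll_disabled > 0);
	d->poll_disabled--;
}

/* Stand-in for a poll attempt that races with the close. */
void model_try_poll(struct model_dev *d)
{
	if (!d->poll_disabled)
		d->polled_while_closing++;
}

/* Batched close: every device is protected inside the loop. */
void model_close_many(struct model_dev **devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_dev *d = devs[i];

		model_poll_disable(d);
		model_try_poll(d);	/* simulated poll arriving mid-close */
		d->up = 0;
		model_poll_enable(d);
	}
}

/* Single close reuses the batched path, so it needs no extra calls. */
void model_close(struct model_dev *d)
{
	struct model_dev *single[1] = { d };

	model_close_many(single, 1);
}
```

In this model no device ever records a poll while it is being closed,
whether it is closed alone or as part of a batch.
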
 net/core/dev.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 48dd323d5918..f33dd7382498 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1313,6 +1313,9 @@ static int __dev_close_many(struct list_head *head)
 	might_sleep();
 
 	list_for_each_entry(dev, head, close_list) {
+		/* Temporarily disable netpoll until the interface is down */
+		netpoll_rx_disable(dev);
+
 		call_netdevice_notifiers(NETDEV_GOING_DOWN, dev);
 
 		clear_bit(__LINK_STATE_START, &dev->state);
@@ -1343,6 +1346,7 @@ static int __dev_close_many(struct list_head *head)
 
 		dev->flags &= ~IFF_UP;
 		net_dmaengine_put();
+		netpoll_rx_enable(dev);
 	}
 
 	return 0;
@@ -1353,14 +1357,10 @@ static int __dev_close(struct net_device *dev)
 	int retval;
 	LIST_HEAD(single);
 
-	/* Temporarily disable netpoll until the interface is down */
-	netpoll_rx_disable(dev);
-
 	list_add(&dev->close_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
 
-	netpoll_rx_enable(dev);
 	return retval;
 }
 
@@ -1398,14 +1398,9 @@ int dev_close(struct net_device *dev)
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
 
-		/* Block netpoll rx while the interface is going down */
-		netpoll_rx_disable(dev);
-
 		list_add(&dev->close_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
-
-		netpoll_rx_enable(dev);
 	}
 	return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 4/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                         ` (2 preceding siblings ...)
  2014-03-27 22:38                                       ` [PATCH v2 3/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
@ 2014-03-27 22:39                                       ` Eric W. Biederman
  2014-03-27 22:41                                       ` [PATCH v2 5/6] netpoll: Remove strong unnecessary assumptions about skbs Eric W. Biederman
                                                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:39 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


The netpoll_rx_enable and netpoll_rx_disable functions have always
controlled polling of the network driver's transmit and receive queues.

Rename them to netpoll_poll_enable and netpoll_poll_disable to make
their functionality clear.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netpoll.h |    8 ++++----
 net/core/dev.c          |    8 ++++----
 net/core/netpoll.c      |    8 ++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 893b9e66060e..b25ee9ffdbe6 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -47,11 +47,11 @@ struct netpoll_info {
 };
 
 #ifdef CONFIG_NETPOLL
-extern void netpoll_rx_disable(struct net_device *dev);
-extern void netpoll_rx_enable(struct net_device *dev);
+extern void netpoll_poll_disable(struct net_device *dev);
+extern void netpoll_poll_enable(struct net_device *dev);
 #else
-static inline void netpoll_rx_disable(struct net_device *dev) { return; }
-static inline void netpoll_rx_enable(struct net_device *dev) { return; }
+static inline void netpoll_poll_disable(struct net_device *dev) { return; }
+static inline void netpoll_poll_enable(struct net_device *dev) { return; }
 #endif
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
diff --git a/net/core/dev.c b/net/core/dev.c
index f33dd7382498..38e8a2b96d4b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1245,7 +1245,7 @@ static int __dev_open(struct net_device *dev)
 	 * If we don't do this there is a chance ndo_poll_controller
 	 * or ndo_poll may be running while we open the device
 	 */
-	netpoll_rx_disable(dev);
+	netpoll_poll_disable(dev);
 
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
@@ -1260,7 +1260,7 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
-	netpoll_rx_enable(dev);
+	netpoll_poll_enable(dev);
 
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
@@ -1314,7 +1314,7 @@ static int __dev_close_many(struct list_head *head)
 
 	list_for_each_entry(dev, head, close_list) {
 		/* Temporarily disable netpoll until the interface is down */
-		netpoll_rx_disable(dev);
+		netpoll_poll_disable(dev);
 
 		call_netdevice_notifiers(NETDEV_GOING_DOWN, dev);
 
@@ -1346,7 +1346,7 @@ static int __dev_close_many(struct list_head *head)
 
 		dev->flags &= ~IFF_UP;
 		net_dmaengine_put();
-		netpoll_rx_enable(dev);
+		netpoll_poll_enable(dev);
 	}
 
 	return 0;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 825200fcb0ff..f3012800a81b 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -214,7 +214,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	zap_completion_queue();
 }
 
-void netpoll_rx_disable(struct net_device *dev)
+void netpoll_poll_disable(struct net_device *dev)
 {
 	struct netpoll_info *ni;
 	int idx;
@@ -225,9 +225,9 @@ void netpoll_rx_disable(struct net_device *dev)
 		down(&ni->dev_lock);
 	srcu_read_unlock(&netpoll_srcu, idx);
 }
-EXPORT_SYMBOL(netpoll_rx_disable);
+EXPORT_SYMBOL(netpoll_poll_disable);
 
-void netpoll_rx_enable(struct net_device *dev)
+void netpoll_poll_enable(struct net_device *dev)
 {
 	struct netpoll_info *ni;
 	rcu_read_lock();
@@ -236,7 +236,7 @@ void netpoll_rx_enable(struct net_device *dev)
 		up(&ni->dev_lock);
 	rcu_read_unlock();
 }
-EXPORT_SYMBOL(netpoll_rx_enable);
+EXPORT_SYMBOL(netpoll_poll_enable);
 
 static void refill_skbs(void)
 {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 5/6] netpoll: Remove strong unnecessary assumptions about skbs
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                         ` (3 preceding siblings ...)
  2014-03-27 22:39                                       ` [PATCH v2 4/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
@ 2014-03-27 22:41                                       ` Eric W. Biederman
  2014-03-27 22:42                                       ` [PATCH v2 6/6] netpoll: Respect NETIF_F_LLTX Eric W. Biederman
  2014-03-29 22:01                                       ` [PATCH v2 0/6] netpoll: Cleanups and fixes David Miller
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:41 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Remove the assumption that the skbs that make it to
netpoll_send_skb_on_dev are allocated with find_skb, such that
skb->users == 1 and nothing is attached that would prevent the skbs from
being freed from hard irq context.

Remove this assumption by replacing __kfree_skb on error paths with
dev_kfree_skb_irq (in hard irq context) and kfree_skb (in process
context).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index f3012800a81b..2136c9aacdbf 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -112,7 +112,7 @@ static void queue_process(struct work_struct *work)
 		struct netdev_queue *txq;
 
 		if (!netif_device_present(dev) || !netif_running(dev)) {
-			__kfree_skb(skb);
+			kfree_skb(skb);
 			continue;
 		}
 
@@ -332,7 +332,7 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 
 	npinfo = rcu_dereference_bh(np->dev->npinfo);
 	if (!npinfo || !netif_running(dev) || !netif_device_present(dev)) {
-		__kfree_skb(skb);
+		dev_kfree_skb_irq(skb);
 		return;
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 6/6] netpoll: Respect NETIF_F_LLTX
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                         ` (4 preceding siblings ...)
  2014-03-27 22:41                                       ` [PATCH v2 5/6] netpoll: Remove strong unnecessary assumptions about skbs Eric W. Biederman
@ 2014-03-27 22:42                                       ` Eric W. Biederman
  2014-03-29 22:01                                       ` [PATCH v2 0/6] netpoll: Cleanups and fixes David Miller
  6 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 22:42 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma


Stop taking the transmit lock when a network device has specified
NETIF_F_LLTX.

If no locks are needed to transmit a packet, this is the ideal scenario
for netpoll, as all packets can be transmitted immediately.

Even if some locks are needed in ndo_start_xmit, skipping any unnecessary
serialization is desirable for netpoll, as it makes it more likely that a
debugging packet can be transmitted immediately instead of being
deferred until later.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/netdevice.h |    5 +++++
 net/core/netpoll.c        |   10 +++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 77142a78c4d9..aa06633fc966 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2908,6 +2908,11 @@ static inline void netif_tx_unlock_bh(struct net_device *dev)
 	}						\
 }
 
+#define HARD_TX_TRYLOCK(dev, txq)			\
+	(((dev->features & NETIF_F_LLTX) == 0) ?	\
+		__netif_tx_trylock(txq) :		\
+		true )
+
 #define HARD_TX_UNLOCK(dev, txq) {			\
 	if ((dev->features & NETIF_F_LLTX) == 0) {	\
 		__netif_tx_unlock(txq);			\
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 2136c9aacdbf..af8dc13d69d6 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -119,17 +119,17 @@ static void queue_process(struct work_struct *work)
 		txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
 
 		local_irq_save(flags);
-		__netif_tx_lock(txq, smp_processor_id());
+		HARD_TX_LOCK(dev, txq, smp_processor_id());
 		if (netif_xmit_frozen_or_stopped(txq) ||
 		    netpoll_start_xmit(skb, dev, txq) != NETDEV_TX_OK) {
 			skb_queue_head(&npinfo->txq, skb);
-			__netif_tx_unlock(txq);
+			HARD_TX_UNLOCK(dev, txq);
 			local_irq_restore(flags);
 
 			schedule_delayed_work(&npinfo->tx_work, HZ/10);
 			return;
 		}
-		__netif_tx_unlock(txq);
+		HARD_TX_UNLOCK(dev, txq);
 		local_irq_restore(flags);
 	}
 }
@@ -345,11 +345,11 @@ void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
 		/* try until next clock tick */
 		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
 		     tries > 0; --tries) {
-			if (__netif_tx_trylock(txq)) {
+			if (HARD_TX_TRYLOCK(dev, txq)) {
 				if (!netif_xmit_stopped(txq))
 					status = netpoll_start_xmit(skb, dev, txq);
 
-				__netif_tx_unlock(txq);
+				HARD_TX_UNLOCK(dev, txq);
 
 				if (status == NETDEV_TX_OK)
 					break;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH 6/6] net: Free skbs from irqs when possible.
  2014-03-18 18:37                                           ` David Miller
@ 2014-03-27 23:02                                             ` Eric W. Biederman
  0 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-27 23:02 UTC (permalink / raw)
  To: David Miller
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 18 Mar 2014 10:47:36 -0700
>
>> Most of the destructors today are fine (which doubly makes the warning
>> confusing).
>
> Not true by my estimation.  We absolutely do not want socket state being
> modified from hardware interrupts, and that's the most common destructor,
> releasing socket memory.

I definitely was not and am not suggesting that we change this.


I was just pointing out the difference between hard irq and soft irq
state for code is very slight, and I do not see anything that stands out
as hard irq unsafe.


Here is what I see when I read the destructors:

sock_rmem_free
    is an atomic operation which is safe in all contexts.

sock_wfree
    appears safe in hard irq context assuming the comment about
    sk_free being safe in hard irq context is correct.

sock_rfree
    except for changing sk_forward_alloc without any kind of
    apparent serialization in sk_mem_uncharge appears safe.

sock_edemux
    This just calls sock_put and inet_twsk_put
    sock_put just calls sk_free (which is documented as hard irq safe)
    Nothing in inet_twsk_put appears unsafe in hard irq context.

There are other destructors out there that definitely do things such
as call local_bh_disable that are unambiguously unsafe in hard irq
context but I had to look hard to find them.

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 0/3] netpoll: Freeing skbs in hard irq context
  2014-03-18 15:52                                               ` David Laight
@ 2014-03-28  1:14                                                 ` Eric W. Biederman
  2014-03-28  1:15                                                   ` [PATCH 1/3] net: Add a test to see if a skb is freeable in " Eric W. Biederman
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-28  1:14 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight


In some cases, such as trigger_all_cpu_backtrace, netpoll can wind up
generating a lot of packets in hard irq context.  My rough estimate is
perhaps 1500 packets.  That is larger than any driver tx ring, which
makes netpoll_poll_dev necessary to transmit all of the netconsole
packets immediately.  Those 1500+ packets can take up a couple megabytes
of memory if we aren't careful.  On some machines that is enough to
start depleting the pools GFP_ATOMIC can dig into, so netpoll needs
at a minimum to be able to reuse the memory for the skbs it has
transmitted.

Today this reclamation of transmitted packets happens in
zap_completion_queue, as dev_kfree_skb_irq places all packets to be
freed on a completion queue.  netpoll then searches this queue for
packets it thinks are freeable, and frees them.  Unfortunately
the current logic netpoll uses to decide a packet is freeable
is incorrect and thus unsafe :(

The logic netpoll uses to determine if a packet is freeable is to verify
that a skb does not have a destructor.  That works most of the time, but
in pathological cases it can report that a packet is freeable in hard
irq context when it is not.

This set of changes adds a function skb_irq_freeable and uses that
function in zap_completion_queue to remove the bug, and in the bowels
of kfree_skb, in skb_release_head_state, to warn if we are
inappropriately freeing a skb.

While I don't expect this will allow anything except skbs sent by
netpoll to be freed, solving the general problem rather than solving
this for just packets generated by netpoll seems like a robust way of
handling this.

Eric W. Biederman (3):
      net: Add a test to see if a skb is freeable in irq context
      netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
      net: Warn when a skb is freed inappropriately in hard irq context.

 include/linux/skbuff.h | 13 +++++++++++++
 net/core/netpoll.c     |  2 +-
 net/core/skbuff.c      |  6 +++---
 3 files changed, 17 insertions(+), 4 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH 1/3] net: Add a test to see if a skb is freeable in irq context
  2014-03-28  1:14                                                 ` [PATCH 0/3] netpoll: Freeing skbs in hard irq context Eric W. Biederman
@ 2014-03-28  1:15                                                   ` Eric W. Biederman
  2014-03-29 22:09                                                     ` David Miller
  2014-03-28  1:20                                                   ` [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
  2014-03-28  1:23                                                   ` [PATCH 3/3] net: Warn when a skb is freed inappropriately in hard irq context Eric W. Biederman
  2 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-28  1:15 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight


Currently netpoll and skb_release_head_state assume that a skb is
freeable in hard irq context except when skb->destructor is set.

The reality is far from this.  So add a function skb_irq_freeable to
compute the full test and in the process be the living documentation of
what the requirements are of actually freeing a skb in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/skbuff.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index aa2c22cb8158..50a909ca7b3b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
 { }
 #endif
 
+static inline bool skb_irq_freeable(struct sk_buff *skb)
+{
+	return !skb->destructor &&
+#if IS_ENABLED(CONFIG_XFRM)
+		!skb->sp &&
+#endif
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+		!skb->nfct &&
+#endif
+		!skb->_skb_refdst &&
+		!skb_has_frag_list(skb);
+}
+
 static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
 {
 	skb->queue_mapping = queue_mapping;
-- 
1.7.10.4
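
To make the difference from the old destructor-only test concrete, here
is a userspace sketch of the predicate.  The struct and its fields
(mock_skb, sp, nfct, refdst, frag_list) are simplified, hypothetical
stand-ins for the sk_buff members named in the patch, and the config
#ifdefs are left out, so treat it as an illustration rather than the
kernel helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the sk_buff fields the test reads. */
struct mock_skb {
	void (*destructor)(struct mock_skb *skb);
	void *sp;			/* xfrm security path */
	void *nfct;			/* conntrack reference */
	unsigned long refdst;		/* dst entry reference */
	struct mock_skb *frag_list;	/* chained fragment skbs */
};

/* Freeable in hard irq context only if nothing is attached that would
 * need a release operation which is unsafe there.
 */
static bool mock_skb_irq_freeable(const struct mock_skb *skb)
{
	assert(skb != NULL);
	return !skb->destructor &&
		!skb->sp &&
		!skb->nfct &&
		!skb->refdst &&
		!skb->frag_list;
}

/* Empty destructor, only used to mark an skb as "has a destructor". */
static void mock_destructor(struct mock_skb *skb)
{
	(void)skb;
}
```

An skb with no destructor but with a dst reference attached passes the
old check (!skb->destructor) and fails this one, which is the kind of
case the commit message is describing.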

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  2014-03-28  1:14                                                 ` [PATCH 0/3] netpoll: Freeing skbs in hard irq context Eric W. Biederman
  2014-03-28  1:15                                                   ` [PATCH 1/3] net: Add a test to see if a skb is freeable in " Eric W. Biederman
@ 2014-03-28  1:20                                                   ` Eric W. Biederman
  2014-03-28 13:17                                                     ` Sergei Shtylyov
  2014-03-28  1:23                                                   ` [PATCH 3/3] net: Warn when a skb is freed inappropriately in hard irq context Eric W. Biederman
  2 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-28  1:20 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight


Replace the test in zap_completion_queu to test when it is safe to
free skbs in hard irq context with skb_irq_freeable ensuring we only
free skbs when it is safe, and removing the possibility of subtle
problems.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index af8dc13d69d6..a0563642c5b5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -270,7 +270,7 @@ static void zap_completion_queue(void)
 		while (clist != NULL) {
 			struct sk_buff *skb = clist;
 			clist = clist->next;
-			if (skb->destructor) {
+			if (!skb_irq_freeable(skb)) {
 				atomic_inc(&skb->users);
 				dev_kfree_skb_any(skb); /* put this one back */
 			} else {
-- 
1.7.10.4
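
The loop being patched can be sketched in userspace as follows.  The names
(mock_skb, zap_result, mock_zap_completion_queue) are hypothetical, the
irq_freeable flag stands in for the result of skb_irq_freeable(), and
instead of really freeing or requeueing skbs the sketch just counts them
and returns the list of skbs that were put back.  It is meant only to show
the control flow: walk the detached completion list, put back anything
that is not safe to free here, free the rest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_skb {
	int users;		/* 0 while sitting on the completion queue */
	bool irq_freeable;	/* stand-in for skb_irq_freeable(skb) */
	struct mock_skb *next;
};

struct zap_result {
	int freed;		/* skbs released immediately */
	int deferred;		/* skbs put back for a later, safe free */
};

/* Walk a detached completion list; return the list of skbs put back. */
static struct mock_skb *mock_zap_completion_queue(struct mock_skb *clist,
						  struct zap_result *res)
{
	struct mock_skb *deferred = NULL;

	assert(res != NULL);
	while (clist != NULL) {
		struct mock_skb *skb = clist;

		clist = clist->next;
		if (!skb->irq_freeable) {
			/* Retake the reference, then "put this one back":
			 * here the deferred free drops it again and requeues.
			 */
			skb->users++;
			skb->users--;
			skb->next = deferred;
			deferred = skb;
			res->deferred++;
		} else {
			/* Safe to release right now (kernel: __kfree_skb). */
			skb->next = NULL;
			res->freed++;
		}
	}
	return deferred;
}
```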

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH 3/3] net: Warn when a skb is freed inappropriately in hard irq context.
  2014-03-28  1:14                                                 ` [PATCH 0/3] netpoll: Freeing skbs in hard irq context Eric W. Biederman
  2014-03-28  1:15                                                   ` [PATCH 1/3] net: Add a test to see if a skb is freeable in " Eric W. Biederman
  2014-03-28  1:20                                                   ` [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
@ 2014-03-28  1:23                                                   ` Eric W. Biederman
  2 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-03-28  1:23 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight


Use skb_irq_freeable to warn on all cases where it is not safe to free a
skb in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/skbuff.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3f14c638c2b1..aaee52840a7d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -554,14 +554,14 @@ static void kfree_skbmem(struct sk_buff *skb)
 
 static void skb_release_head_state(struct sk_buff *skb)
 {
+	WARN_ON(in_irq() && !skb_irq_freeable(skb));
+
 	skb_dst_drop(skb);
 #ifdef CONFIG_XFRM
 	secpath_put(skb->sp);
 #endif
-	if (skb->destructor) {
-		WARN_ON(in_irq());
+	if (skb->destructor)
 		skb->destructor(skb);
-	}
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	nf_conntrack_put(skb->nfct);
 #endif
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  2014-03-28  1:20                                                   ` [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
@ 2014-03-28 13:17                                                     ` Sergei Shtylyov
  2014-04-01 19:19                                                       ` [PATCH v2 0/2] " Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: Sergei Shtylyov @ 2014-03-28 13:17 UTC (permalink / raw)
  To: Eric W. Biederman, David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight

Hello.

On 28-03-2014 5:20, Eric W. Biederman wrote:

> Replace the test in zap_completion_queu to test when it is safe to

    Small typo: it's zap_completion_queue().

> free skbs in hard irq context with skb_irq_freeable ensuring we only
> free skbs when it is safe, and removing the possibility of subtle
> problems.

> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

WBR, Sergei

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH v2 0/6] netpoll: Cleanups and fixes
  2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
                                                         ` (5 preceding siblings ...)
  2014-03-27 22:42                                       ` [PATCH v2 6/6] netpoll: Respect NETIF_F_LLTX Eric W. Biederman
@ 2014-03-29 22:01                                       ` David Miller
  6 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-03-29 22:01 UTC (permalink / raw)
  To: ebiederm
  Cc: stephen, eric.dumazet, netdev, xiyou.wangcong, mpm, satyam.sharma

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Thu, 27 Mar 2014 15:35:20 -0700

> This should be a small set of safe cleanups and fixes to netpoll.
> 
> The fixes are vlan headers are now always inserted when needed, and
> napi polling is always avoided when network devices are closed.
> 
> There are a bunch of little cleanups removing unnecessary code, fixing
> function naming, not taking unnecessary locks and removing general
> silliness.

Looks good, series applied, thanks Eric.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 1/3] net: Add a test to see if a skb is freeable in irq context
  2014-03-28  1:15                                                   ` [PATCH 1/3] net: Add a test to see if a skb is freeable in " Eric W. Biederman
@ 2014-03-29 22:09                                                     ` David Miller
  2014-04-01  8:03                                                       ` Eric W. Biederman
  0 siblings, 1 reply; 288+ messages in thread
From: David Miller @ 2014-03-29 22:09 UTC (permalink / raw)
  To: ebiederm
  Cc: bjorn, eric.dumazet, ben, stephen, netdev, xiyou.wangcong, mpm,
	satyam.sharma, David.Laight

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Thu, 27 Mar 2014 18:15:47 -0700

> Currently netpoll and skb_release_head_state assume that a skb is
> freeable in hard irq context except when skb->destructor is set.
> 
> The reality is far from this.  So add a function skb_irq_freeable to
> compute the full test and in the process be the living documentation of
> what the requirements are of actually freeing a skb in hard irq context.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
 ...
> +	return !skb->destructor &&
> +#if IS_ENABLED(CONFIG_XFRM)
> +		!skb->sp &&
> +#endif
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
> +		!skb->nfct &&
> +#endif
> +		!skb->_skb_refdst &&
> +		!skb_has_frag_list(skb);

I think you need to add "!skb->nf_bridge &&" to this test.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 1/3] net: Add a test to see if a skb is freeable in irq context
  2014-03-29 22:09                                                     ` David Miller
@ 2014-04-01  8:03                                                       ` Eric W. Biederman
  2014-04-01 16:15                                                         ` David Miller
  0 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-04-01  8:03 UTC (permalink / raw)
  To: David Miller
  Cc: bjorn, eric.dumazet, ben, stephen, netdev, xiyou.wangcong, mpm,
	satyam.sharma, David.Laight

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Thu, 27 Mar 2014 18:15:47 -0700
>
>> Currently netpoll and skb_release_head_state assume that a skb is
>> freeable in hard irq context except when skb->destructor is set.
>> 
>> The reality is far from this.  So add a function skb_irq_freeable to
>> compute the full test and in the process be the living documentation of
>> what the requirements are of actually freeing a skb in hard irq context.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>  ...
>> +	return !skb->destructor &&
>> +#if IS_ENABLED(CONFIG_XFRM)
>> +		!skb->sp &&
>> +#endif
>> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
>> +		!skb->nfct &&
>> +#endif
>> +		!skb->_skb_refdst &&
>> +		!skb_has_frag_list(skb);
>
> I think you need to add "!skb->nf_bridge &&" to this test.

Given that the definition of nf_bridge_put is just:

static inline void nf_bridge_put(struct nf_bridge_info *nf_bridge)
{
	if (nf_bridge && atomic_dec_and_test(&nf_bridge->use))
		kfree(nf_bridge);
}

I don't see why.

atomic_dec_and_test and kfree are hard irq safe.

I can see the code evolving in a way where it wouldn't be safe to put a
nf_bridge from hard irq context but the code as it is today is trivially
safe.

Eric
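
As an aside for readers, the put pattern quoted above can be modelled in
userspace with C11 atomics.  The names here (mock_bridge_info,
mock_bridge_alloc, mock_bridge_put, mock_freed) are hypothetical, and
atomic_fetch_sub() == 1 is used as the userspace equivalent of the
kernel's atomic_dec_and_test().  The sketch only shows that the release
path is a single atomic decrement followed by a plain free, with no
locking and no bottom-half manipulation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nf_bridge_info: just the use count. */
struct mock_bridge_info {
	atomic_int use;
};

static int mock_freed;	/* how many objects have been released */

static struct mock_bridge_info *mock_bridge_alloc(int initial_refs)
{
	struct mock_bridge_info *info = malloc(sizeof(*info));

	assert(info != NULL && initial_refs > 0);
	atomic_init(&info->use, initial_refs);
	return info;
}

/* Drop one reference; the caller that drops the last one frees. */
static void mock_bridge_put(struct mock_bridge_info *info)
{
	/* atomic_fetch_sub returns the old value, so 1 means "now zero". */
	if (info && atomic_fetch_sub(&info->use, 1) == 1) {
		free(info);
		mock_freed++;
	}
}
```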

^ permalink raw reply	[flat|nested] 288+ messages in thread

* Re: [PATCH 1/3] net: Add a test to see if a skb is freeable in irq context
  2014-04-01  8:03                                                       ` Eric W. Biederman
@ 2014-04-01 16:15                                                         ` David Miller
  0 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-04-01 16:15 UTC (permalink / raw)
  To: ebiederm
  Cc: bjorn, eric.dumazet, ben, stephen, netdev, xiyou.wangcong, mpm,
	satyam.sharma, David.Laight

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 01 Apr 2014 01:03:53 -0700

> David Miller <davem@davemloft.net> writes:
> 
>> From: ebiederm@xmission.com (Eric W. Biederman)
>> Date: Thu, 27 Mar 2014 18:15:47 -0700
>>
>>> Currently netpoll and skb_release_head_state assume that a skb is
>>> freeable in hard irq context except when skb->destructor is set.
>>> 
>>> The reality is far from this.  So add a function skb_irq_freeable to
>>> compute the full test and in the process be the living documentation of
>>> what the requirements are of actually freeing a skb in hard irq context.
>>> 
>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>  ...
>>> +	return !skb->destructor &&
>>> +#if IS_ENABLED(CONFIG_XFRM)
>>> +		!skb->sp &&
>>> +#endif
>>> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
>>> +		!skb->nfct &&
>>> +#endif
>>> +		!skb->_skb_refdst &&
>>> +		!skb_has_frag_list(skb);
>>
>> I think you need to add "!skb->nf_bridge &&" to this test.
> 
> Given that the definition of nf_bridge_put is just:
> 
> static inline void nf_bridge_put(struct nf_bridge_info *nf_bridge)
> {
> 	if (nf_bridge && atomic_dec_and_test(&nf_bridge->use))
> 		kfree(nf_bridge);
> }
> 
> I don't see why.
> 
> atomic_dec_and_test and kfree are hard irq safe.
> 
> I can see the code evolving in a way where it wouldn't be safe to put a
> nf_bridge from hard irq context but the code as it is today is trivially
> safe.

Fair enough.

^ permalink raw reply	[flat|nested] 288+ messages in thread

* [PATCH v2 0/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  2014-03-28 13:17                                                     ` Sergei Shtylyov
@ 2014-04-01 19:19                                                       ` Eric W. Biederman
  2014-04-01 19:20                                                         ` [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context Eric W. Biederman
                                                                           ` (2 more replies)
  0 siblings, 3 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-04-01 19:19 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight, Sergei Shtylyov


Resending with the requested typo fix in the commit message and pruned
down to just the bare minimal bug fix, that this patchset is.

In some cases, such as trigger_all_cpu_backtrace, netpoll can wind up
generating a lot of packets in hard irq context.  My rough estimate is
perhaps 1500 packets.  That is larger than any driver tx ring, which
makes netpoll_poll_dev necessary to transmit all of the netconsole
packets immediately.  Those 1500+ packets can take up a couple megabytes
of memory if we aren't careful.  On some machines that is enough to
start depleting the pools GFP_ATOMIC can dig into, so netpoll needs
at a minimum to be able to reuse the memory for the skbs it has
transmitted.

Today this reclamation of transmitted packets happens in
zap_completion_queue, as dev_kfree_skb_irq places all packets to be
freed on a completion queue.  netpoll then searches this queue for
packets it thinks are freeable, and frees them.  Unfortunately
the current logic netpoll uses to decide a packet is freeable
is incorrect and thus unsafe :(

The logic netpoll uses to determine if a packet is freeable is to verify
that a skb does not have a destructor.  That works most of the time, but
in pathological cases it can report that a packet is freeable in hard
irq context when it is not.

This set of changes adds a function skb_irq_freeable and uses that
function in zap_completion_queue to remove the bug.

While I don't expect this will allow anything except skbs sent by
netpoll to be freed, finding all packets that are freeable instead
of just finding packets generated by netpoll that are guaranteed to be
freeable seems like the most robust and maintainable way of handling
this problem.

Eric W. Biederman (2):
      net: Add a test to see if a skb is freeable in irq context
      netpoll: Use skb_irq_freeable to make zap_completion_queue safe.

 include/linux/skbuff.h | 13 +++++++++++++
 net/core/netpoll.c     |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

---
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 18ef0224fb6a..113fee1b7b63 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
 { }
 #endif
 
+static inline bool skb_irq_freeable(struct sk_buff *skb)
+{
+       return !skb->destructor &&
+#if IS_ENABLED(CONFIG_XFRM)
+               !skb->sp &&
+#endif
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+               !skb->nfct &&
+#endif
+               !skb->_skb_refdst &&
+               !skb_has_frag_list(skb);
+}
+
 static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
 {
        skb->queue_mapping = queue_mapping;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ed7740f7a94d..e33937fb32a0 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -270,7 +270,7 @@ static void zap_completion_queue(void)
                while (clist != NULL) {
                        struct sk_buff *skb = clist;
                        clist = clist->next;
-                       if (skb->destructor) {
+                       if (!skb_irq_freeable(skb)) {
                                atomic_inc(&skb->users);
                                dev_kfree_skb_any(skb); /* put this one back */
                        } else {

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context
  2014-04-01 19:19                                                       ` [PATCH v2 0/2] " Eric W. Biederman
@ 2014-04-01 19:20                                                         ` Eric W. Biederman
  2014-04-01 19:49                                                           ` Eric Dumazet
  2014-04-01 19:21                                                         ` [PATCH v2 2/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
  2014-04-01 21:54                                                         ` [PATCH v2 0/2] " David Miller
  2 siblings, 1 reply; 288+ messages in thread
From: Eric W. Biederman @ 2014-04-01 19:20 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight, Sergei Shtylyov


Currently netpoll and skb_release_head_state assume that a skb is
freeable in hard irq context except when skb->destructor is set.

The reality is far from this.  So add a function skb_irq_freeable to
compute the full test and in the process be the living documentation
of what the requirements are of actually freeing a skb in hard irq
context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/skbuff.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 18ef0224fb6a..113fee1b7b63 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
 { }
 #endif
 
+static inline bool skb_irq_freeable(struct sk_buff *skb)
+{
+	return !skb->destructor &&
+#if IS_ENABLED(CONFIG_XFRM)
+		!skb->sp &&
+#endif
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+		!skb->nfct &&
+#endif
+		!skb->_skb_refdst &&
+		!skb_has_frag_list(skb);
+}
+
 static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
 {
 	skb->queue_mapping = queue_mapping;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* [PATCH v2 2/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  2014-04-01 19:19                                                       ` [PATCH v2 0/2] " Eric W. Biederman
  2014-04-01 19:20                                                         ` [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context Eric W. Biederman
@ 2014-04-01 19:21                                                         ` Eric W. Biederman
  2014-04-01 21:54                                                         ` [PATCH v2 0/2] " David Miller
  2 siblings, 0 replies; 288+ messages in thread
From: Eric W. Biederman @ 2014-04-01 19:21 UTC (permalink / raw)
  To: David Miller
  Cc: 'Bjørn Mork',
	Eric Dumazet, Ben Hutchings, stephen, netdev, xiyou.wangcong,
	mpm, satyam.sharma, David Laight, Sergei Shtylyov


Replace the test in zap_completion_queue to test when it is safe to
free skbs in hard irq context with skb_irq_freeable ensuring we only
free skbs when it is safe, and removing the possibility of subtle
problems.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/core/netpoll.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ed7740f7a94d..e33937fb32a0 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -270,7 +270,7 @@ static void zap_completion_queue(void)
 		while (clist != NULL) {
 			struct sk_buff *skb = clist;
 			clist = clist->next;
-			if (skb->destructor) {
+			if (!skb_irq_freeable(skb)) {
 				atomic_inc(&skb->users);
 				dev_kfree_skb_any(skb); /* put this one back */
 			} else {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 288+ messages in thread

* Re: [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context
  2014-04-01 19:20                                                         ` [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context Eric W. Biederman
@ 2014-04-01 19:49                                                           ` Eric Dumazet
  0 siblings, 0 replies; 288+ messages in thread
From: Eric Dumazet @ 2014-04-01 19:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, 'Bjørn Mork',
	Ben Hutchings, stephen, netdev, xiyou.wangcong, mpm,
	satyam.sharma, David Laight, Sergei Shtylyov

On Tue, 2014-04-01 at 12:20 -0700, Eric W. Biederman wrote:
> Currently netpoll and skb_release_head_state assume that a skb is
> freeable in hard irq context except when skb->destructor is set.
> 
> The reality is far from this.  So add a function skb_irq_freeable to
> compute the full test and in the process be the living documentation
> of what the requirements are of actually freeing a skb in hard irq
> context.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  include/linux/skbuff.h |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 18ef0224fb6a..113fee1b7b63 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
>  { }
>  #endif
>  
> +static inline bool skb_irq_freeable(struct sk_buff *skb)
> +{

This probably should be 'const struct sk_buff *skb'

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH v2 0/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  2014-04-01 19:19                                                       ` [PATCH v2 0/2] " Eric W. Biederman
  2014-04-01 19:20                                                         ` [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context Eric W. Biederman
  2014-04-01 19:21                                                         ` [PATCH v2 2/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
@ 2014-04-01 21:54                                                         ` David Miller
  2 siblings, 0 replies; 288+ messages in thread
From: David Miller @ 2014-04-01 21:54 UTC (permalink / raw)
  To: ebiederm
  Cc: bjorn, eric.dumazet, ben, stephen, netdev, xiyou.wangcong, mpm,
	satyam.sharma, David.Laight, sergei.shtylyov

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 01 Apr 2014 12:19:30 -0700

> Resending with the requested typo fix in the commit message and
> pruned down to just the bare minimal bug fix, that this patchset is.

Series applied, I added the "const" to the skb_irq_freeable() argument
as suggested by Eric Dumazet.


Thread overview: 288+ messages
2014-03-03 20:40 [PATCH] netpoll: Don't call driver methods from interrupt context Eric W. Biederman
2014-03-04  4:23 ` Cong Wang
2014-03-04 10:29   ` Eric W. Biederman
2014-03-04 21:09   ` David Miller
2014-03-04 21:08 ` David Miller
2014-03-05  0:03   ` Eric W. Biederman
2014-03-05  0:26     ` David Miller
2014-03-05 19:24       ` Eric W. Biederman
2014-03-07 19:30         ` David Miller
2014-03-08  5:13           ` Eric W. Biederman
2014-03-05 19:14   ` Eric W. Biederman
2014-03-11  3:16   ` [PATCH next-next 0/11] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
2014-03-11  3:18     ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
2014-03-11  3:44       ` Eric Dumazet
2014-03-11  4:00         ` Eric W. Biederman
2014-03-11  4:56           ` Eric Dumazet
2014-03-11  4:42         ` David Miller
2014-03-11  5:02           ` Eric Dumazet
2014-03-11  8:43             ` [RFC PATCH 0/2] remove netpoll rx support Eric W. Biederman
2014-03-11  8:44               ` [RFC PATCH 1/2] netpoll: Remove dead netpoll_rx code Eric W. Biederman
2014-03-11 12:29                 ` Eric Dumazet
2014-03-11 15:23                   ` Stephen Hemminger
2014-03-11 15:34                     ` Hannes Frederic Sowa
2014-03-11 20:48                     ` Eric W. Biederman
2014-03-12 18:31                       ` Cong Wang
2014-03-13 19:23                       ` David Miller
2014-03-13 20:46                         ` Eric W. Biederman
2014-03-15  1:30                         ` [PATCH 0/9] netpoll: Cleanup received packet processing Eric W. Biederman
2014-03-15  1:31                           ` [PATCH 1/9] netpoll: Pass budget into poll_napi Eric W. Biederman
2014-03-15  1:32                           ` [PATCH 2/9] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
2014-03-15  1:33                           ` [PATCH 3/9] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
2014-03-15  1:33                           ` [PATCH 4/9] netpoll: Add netpoll_rx_processing Eric W. Biederman
2014-03-15  1:34                           ` [PATCH 5/9] netpoll: Don't drop all received packets Eric W. Biederman
2014-03-15  1:35                           ` [PATCH 6/9] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
2014-03-15  1:36                           ` [PATCH 7/9] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
2014-03-15  1:37                           ` [PATCH 8/9] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
2014-03-15  1:39                           ` [PATCH 9/9] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
2014-03-15  2:59                           ` [PATCH 0/9] netpoll: Cleanup received packet processing David Miller
2014-03-15  3:39                             ` Eric W. Biederman
2014-03-15  3:43                               ` [PATCH 00/10] " Eric W. Biederman
2014-03-15  3:44                                 ` [PATCH 01/10] netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev Eric W. Biederman
2014-03-15  3:45                                 ` [PATCH 02/10] netpoll: Pass budget into poll_napi Eric W. Biederman
2014-03-15  3:45                                 ` [PATCH 03/10] netpoll: Visit all napi handlers in poll_napi Eric W. Biederman
2014-03-15  3:47                                 ` [PATCH 04/10] netpoll: Warn if more packets are processed than are budgeted Eric W. Biederman
2014-03-15  3:47                                 ` [PATCH 05/10] netpoll: Add netpoll_rx_processing Eric W. Biederman
2014-03-15  3:48                                 ` [PATCH 06/10] netpoll: Don't drop all received packets Eric W. Biederman
2014-03-15  3:49                                 ` [PATCH 07/10] netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP Eric W. Biederman
2014-03-15  3:50                                 ` [PATCH 08/10] netpoll: Consolidate neigh_tx processing in service_neigh_queue Eric W. Biederman
2014-03-15  3:50                                 ` [PATCH 09/10] netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP Eric W. Biederman
2014-03-15  3:51                                 ` [PATCH 10/10] netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) Eric W. Biederman
2014-03-17 19:49                                 ` [PATCH 00/10] netpoll: Cleanup received packet processing David Miller
2014-03-18  6:22                                   ` [PATCH 0/6] netpoll: Cleanups and fixes Eric W. Biederman
2014-03-18  6:24                                     ` [PATCH 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
2014-03-18  6:24                                     ` [PATCH 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
2014-03-18  6:25                                     ` [PATCH 3/6] netpoll: Don't allow on devices that perform their own xmit locking Eric W. Biederman
2014-03-18 18:26                                       ` Cong Wang
2014-03-18 18:38                                         ` David Miller
2014-03-18  6:26                                     ` [PATCH 4/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
2014-03-18  6:27                                     ` [PATCH 5/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
2014-03-18  6:27                                     ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
2014-03-18  9:32                                       ` David Laight
2014-03-18 13:22                                       ` Eric Dumazet
2014-03-18 17:51                                         ` Eric W. Biederman
2014-03-18 13:30                                       ` Ben Hutchings
2014-03-18 14:24                                         ` Bjørn Mork
2014-03-18 15:23                                           ` Eric Dumazet
2014-03-18 15:41                                             ` Bjørn Mork
2014-03-18 15:52                                               ` David Laight
2014-03-28  1:14                                                 ` [PATCH 0/3] netpoll: Freeing skbs in hard irq context Eric W. Biederman
2014-03-28  1:15                                                   ` [PATCH 1/3] net: Add a test to see if a skb is freeable in " Eric W. Biederman
2014-03-29 22:09                                                     ` David Miller
2014-04-01  8:03                                                       ` Eric W. Biederman
2014-04-01 16:15                                                         ` David Miller
2014-03-28  1:20                                                   ` [PATCH 2/3] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
2014-03-28 13:17                                                     ` Sergei Shtylyov
2014-04-01 19:19                                                       ` [PATCH v2 0/2] " Eric W. Biederman
2014-04-01 19:20                                                         ` [PATCH v2 1/2] net: Add a test to see if a skb is freeable in irq context Eric W. Biederman
2014-04-01 19:49                                                           ` Eric Dumazet
2014-04-01 19:21                                                         ` [PATCH v2 2/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe Eric W. Biederman
2014-04-01 21:54                                                         ` [PATCH v2 0/2] " David Miller
2014-03-28  1:23                                                   ` [PATCH 3/3] net: Warn when a skb is freed inappropriately in hard irq context Eric W. Biederman
2014-03-18 17:53                                         ` [PATCH 6/6] net: Free skbs from irqs when possible Eric W. Biederman
2014-03-18 15:23                                       ` Stephen Hemminger
2014-03-18 17:47                                         ` Eric W. Biederman
2014-03-18 18:37                                           ` David Miller
2014-03-27 23:02                                             ` Eric W. Biederman
2014-03-27 22:35                                     ` [PATCH v2 0/6] netpoll: Cleanups and fixes Eric W. Biederman
2014-03-27 22:36                                       ` [PATCH v2 1/6] netpoll: Remove gfp parameter from __netpoll_setup Eric W. Biederman
2014-03-27 22:37                                       ` [PATCH v2 2/6] netpoll: Only call ndo_start_xmit from a single place Eric W. Biederman
2014-03-27 22:38                                       ` [PATCH v2 3/6] netpoll: Move rx enable/disable into __dev_close_many Eric W. Biederman
2014-03-27 22:39                                       ` [PATCH v2 4/6] netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enable Eric W. Biederman
2014-03-27 22:41                                       ` [PATCH v2 5/6] netpoll: Remove strong unnecessary assumptions about skbs Eric W. Biederman
2014-03-27 22:42                                       ` [PATCH v2 6/6] netpoll: Respect NETIF_F_LLTX Eric W. Biederman
2014-03-29 22:01                                       ` [PATCH v2 0/6] netpoll: Cleanups and fixes David Miller
2014-03-11  8:45               ` [RFC PATCH 2/2] netpoll: Don't poll for received packets Eric W. Biederman
2014-03-11 12:44                 ` Eric Dumazet
2014-03-12 18:39                 ` Cong Wang
2014-03-13 20:48                   ` Eric W. Biederman
2014-03-11 12:24               ` [RFC PATCH 0/2] remove netpoll rx support Eric Dumazet
2014-03-11 16:49               ` David Miller
2014-03-11 19:48                 ` Eric W. Biederman
2014-03-11 20:09                   ` David Miller
2014-03-11 21:13                     ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric W. Biederman
2014-03-11 21:14                       ` [PATCH net-next 01/10] 8139cp: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
2014-03-11 21:15                       ` [PATCH net-next 02/10] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
2014-03-12  2:06                         ` Eric Dumazet
2014-03-12 21:24                           ` Francois Romieu
2014-03-12 22:01                             ` Eric Dumazet
2014-03-13 21:08                               ` Eric W. Biederman
2014-03-14  4:26                               ` [PATCH net-next] net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq Eric W. Biederman
2014-03-15  2:41                                 ` David Miller
2014-03-11 21:16                       ` [PATCH net-next 03/10] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
2014-03-12  2:02                         ` Eric Dumazet
2014-03-11 21:16                       ` [PATCH net-next 04/10] bonding: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
2014-03-11 21:17                       ` [PATCH net-next 05/10] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
2014-03-11 21:18                       ` [PATCH net-next 06/10] tg3: " Eric W. Biederman
2014-03-11 21:18                       ` [PATCH net-next 07/10] ixgb: " Eric W. Biederman
2014-03-11 21:19                       ` [PATCH net-next 08/10] mlx4: " Eric W. Biederman
2014-03-11 21:19                       ` [PATCH net-next 09/10] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
2014-03-11 21:20                       ` [PATCH net-next 10/10] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
2014-03-12  2:54                       ` [PATCH next-next 0/10] Using dev_kfree_skb_any for functions called in multiple contexts Eric Dumazet
2014-03-12 20:22                         ` David Miller
2014-03-25  5:58                       ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25  6:04                         ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25  6:04                           ` [PATCH 02/54] 3c509: " Eric W. Biederman
2014-03-25 13:03                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 03/54] 3c59x: " Eric W. Biederman
2014-03-25 13:04                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 04/54] 8390: " Eric W. Biederman
2014-03-25 13:06                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 05/54] bfin_mac: " Eric W. Biederman
2014-03-25 13:10                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 06/54] sun4i-emac: " Eric W. Biederman
2014-03-25 13:11                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 07/54] am79c961a: " Eric W. Biederman
2014-03-25 13:13                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 08/54] lance: " Eric W. Biederman
2014-03-25 13:14                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 09/54] pcnet32: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 13:15                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 10/54] alx: " Eric W. Biederman
2014-03-25 13:16                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 11/54] atl1c: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 13:18                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 12/54] bnad: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 13:19                             ` Eric Dumazet
2014-03-25  6:04                           ` [PATCH 13/54] macb: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 13:21                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 14/54] xgmac: Call dev_kfree/consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 15:16                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 15/54] cxgb3: Call dev_kfree/consume_skb_any instead of [dev_]kfree_skb Eric W. Biederman
2014-03-25 15:18                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 16/54] cxgb4: " Eric W. Biederman
2014-03-25 15:19                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 17/54] cxfb4vf: " Eric W. Biederman
2014-03-25 15:22                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 18/54] cs89x0: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 15:23                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 19/54] enic: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 15:24                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 20/54] dm9000: Call dev_consume_skb_any " Eric W. Biederman
2014-03-25 15:26                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 21/54] dmfe: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 15:28                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 22/54] uli526x: " Eric W. Biederman
2014-03-25 15:29                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 23/54] sundance: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 15:29                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 24/54] fec: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 15:30                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 25/54] ucc_geth: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 15:30                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 26/54] i825xx: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 15:31                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 27/54] ehea: Call dev_consume_skb_any " Eric W. Biederman
2014-03-25 15:39                             ` Eric Dumazet
2014-03-25 15:39                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 28/54] ibmveth: " Eric W. Biederman
2014-03-25  6:05                           ` [PATCH 29/54] jme: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 15:45                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 30/54] mv643xx_eth: " Eric W. Biederman
2014-03-25 15:46                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 31/54] skge: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 15:47                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 32/54] sky2: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 16:23                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 33/54] ksz884x: Call dev_consume_skb_any " Eric W. Biederman
2014-03-25 16:23                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 34/54] s2io: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 16:25                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 35/54] vxge: " Eric W. Biederman
2014-03-25 16:26                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 36/54] forcedeth: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 16:27                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 37/54] sc92031: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 20:39                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 38/54] sis900: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 20:39                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 39/54] smc911x: " Eric W. Biederman
2014-03-25 20:40                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 40/54] smc91x: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 20:40                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 41/54] smsc911x: Call dev_consume_skb_any " Eric W. Biederman
2014-03-25 20:41                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 42/54] stmmac: " Eric W. Biederman
2014-03-25 20:42                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 43/54] sungem: " Eric W. Biederman
2014-03-25 20:42                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 44/54] tilepro: Call dev_consume_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 20:43                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 45/54] spider_net: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 20:44                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 46/54] via-rhine: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 20:44                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 47/54] via-velocity: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 20:45                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 48/54] xilinx_emaclite: Call dev_consume_skb_any instead of dev_kfree_skb Eric W. Biederman
2014-03-25 20:46                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 49/54] vmxnet3: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 20:46                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 50/54] xen-netfront: " Eric W. Biederman
2014-03-25 20:46                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 51/54] wlags49_h2: Call dev_kfree/consume_skb_any " Eric W. Biederman
2014-03-25 20:47                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 52/54] staging/octeon-ethernet: " Eric W. Biederman
2014-03-25 20:47                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 53/54] virtio_net: Call dev_kfree_skb_any " Eric W. Biederman
2014-03-25 20:48                             ` Eric Dumazet
2014-03-25  6:05                           ` [PATCH 54/54] if_vlan: Call dev_kfree_skb_any instead of kfree_skb Eric W. Biederman
2014-03-25 20:48                             ` Eric Dumazet
2014-03-25 13:01                           ` [PATCH 01/54] uml/net_kern: Call dev_consume_skb_any instead of dev_kfree_skb Eric Dumazet
2014-03-25 18:05                             ` Eric W. Biederman
2014-03-26  9:49                               ` David Laight
2014-03-25 20:49                         ` [net-next 00/54][pull request] Using dev_kfree/consume_skb_any for functions called in multiple contexts Eric Dumazet
2014-03-25 22:54                           ` David Miller
2014-03-11 21:30                     ` [PATCH net-next 0/2] Don't receive packets when the napi budget == 0 Eric W. Biederman
2014-03-11 21:31                       ` [PATCH net-next 1/2] bnx2: " Eric W. Biederman
2014-03-12  5:07                         ` Eric Dumazet
2014-03-11 21:31                       ` [PATCH net-next 2/2] 8139cp: " Eric W. Biederman
2014-03-12  5:08                         ` Eric Dumazet
2014-03-13 19:19                       ` [PATCH net-next 0/2] " David Miller
2014-03-15  0:56                       ` [PATCH net-next 0/16] " Eric W. Biederman
2014-03-15  0:57                         ` [PATCH net-next 01/16] bnx2x: " Eric W. Biederman
2014-03-15  0:59                         ` [PATCH net-next 02/16] i40e: " Eric W. Biederman
2014-03-15  1:00                         ` [PATCH net-next 03/16] igb: " Eric W. Biederman
2014-03-15  1:00                         ` [PATCH net-next 04/16] ixgbe: " Eric W. Biederman
2014-03-15  1:01                         ` [PATCH net-next 05/16] amd8111e: " Eric W. Biederman
2014-03-15  1:02                         ` [PATCH net-next 06/16] enic: " Eric W. Biederman
2014-03-15  1:03                         ` [PATCH net-next 07/16] fs_enet: " Eric W. Biederman
2014-03-15  1:03                         ` [PATCH net-next 08/16] ibmveth: " Eric W. Biederman
2014-03-15  1:05                         ` [PATCH net-next 09/16] sky2: " Eric W. Biederman
2014-03-15  1:34                           ` Stephen Hemminger
2014-03-15  1:05                         ` [PATCH net-next 10/16] mlx4: " Eric W. Biederman
2014-03-15  1:06                         ` [PATCH net-next 11/16] s2io: " Eric W. Biederman
2014-03-15  1:08                         ` [PATCH net-next 12/16] tilegx: " Eric W. Biederman
2014-03-15  1:09                         ` [PATCH net-next 13/16] tilepro: " Eric W. Biederman
2014-03-15  1:10                         ` [PATCH net-next-test 14/16] tc35815: " Eric W. Biederman
2014-03-15  1:10                         ` [PATCH net-next 15/16] vxge: " Eric W. Biederman
2014-03-15  1:11                         ` [PATCH net-next 16/16] sfc: " Eric W. Biederman
2014-03-15 15:23                           ` Ben Hutchings
2014-03-15 16:29                             ` David Miller
2014-03-15 17:23                               ` Ben Hutchings
2014-03-15 18:54                                 ` Eric Dumazet
2014-03-15 19:25                                   ` Eric W. Biederman
2014-03-15 20:01                               ` mlx4 netpoll and rx/tx weirdness Eric W. Biederman
2014-03-16 16:17                                 ` Eric Dumazet
2014-03-17 21:22                                   ` David Miller
2014-03-17 21:40                                     ` Eric Dumazet
2014-03-15  2:54                         ` [PATCH net-next 0/16] Don't receive packets when the napi budget == 0 David Miller
2014-03-11 21:33                     ` [PATCH net-next] bcm63xx_enet: Stop pretending to support netpoll Eric W. Biederman
2014-03-13 19:26                       ` David Miller
2014-03-13 19:42                         ` Florian Fainelli
2014-03-13 19:58                           ` David Miller
2014-03-11 16:39             ` [PATCH 01/11] bonding: Call dev_kfree_skby_any instead of kfree_skb David Miller
2014-03-11  5:31           ` Eric W. Biederman
2014-03-11  3:18     ` [PATCH 02/11] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb Eric W. Biederman
2014-03-11  3:47       ` Eric Dumazet
2014-03-11  4:10         ` Eric W. Biederman
2014-03-11  4:43         ` David Miller
2014-03-11  3:19     ` [PATCH 03/11] bnx2x: " Eric W. Biederman
2014-03-11  3:19     ` [PATCH 04/11] tg3: " Eric W. Biederman
2014-03-11  3:20     ` [PATCH 05/11] bcm63xx_enet: " Eric W. Biederman
2014-03-11  3:21     ` [PATCH 06/11] e1000: " Eric W. Biederman
2014-03-11  3:22     ` [PATCH 07/11] igbvf: " Eric W. Biederman
2014-03-11  3:22     ` [PATCH 08/11] ixgb: " Eric W. Biederman
2014-03-11  3:23     ` [PATCH 09/11] mlx4: " Eric W. Biederman
2014-03-11  3:23     ` [PATCH 10/11] benet: Call dev_kfree_skby_any instead of kfree_skb Eric W. Biederman
2014-03-11  3:24     ` [PATCH 11/11] gianfar: Carefully free skbs in functions called by netpoll Eric W. Biederman
