All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tony Lu <tonylu@linux.alibaba.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: davem@davemloft.net, mingo@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net: add net namespace inode for all net_dev events
Date: Wed, 10 Mar 2021 17:03:40 +0800	[thread overview]
Message-ID: <YEiLbLiXe6ju/vCO@TonyMac-Alibaba> (raw)
In-Reply-To: <20210309124011.709c6cd3@gandalf.local.home>

On Tue, Mar 09, 2021 at 12:40:11PM -0500, Steven Rostedt wrote:
> On Tue,  9 Mar 2021 12:43:50 +0800
> Tony Lu <tonylu@linux.alibaba.com> wrote:
> 
> > There are lots of net namespaces on the host runs containers like k8s.
> > It is very common to see the same interface names among different net
> > namespaces, such as eth0. It is not possible to distinguish them without
> > net namespace inode.
> > 
> > This adds net namespace inode for all net_dev events, help us
> > distinguish between different net devices.
> > 
> > Output:
> >   <idle>-0       [006] ..s.   133.306989: net_dev_xmit: net_inum=4026531992 dev=eth0 skbaddr=0000000011a87c68 len=54 rc=0
> > 
> > Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
> > ---
> >  include/trace/events/net.h | 35 +++++++++++++++++++++++++----------
> >  1 file changed, 25 insertions(+), 10 deletions(-)
> > 
> > diff --git a/include/trace/events/net.h b/include/trace/events/net.h
> > index 2399073c3afc..a52f90d83411 100644
> > --- a/include/trace/events/net.h
> > +++ b/include/trace/events/net.h
> > @@ -35,6 +35,7 @@ TRACE_EVENT(net_dev_start_xmit,
> >  		__field(	u16,			gso_size	)
> >  		__field(	u16,			gso_segs	)
> >  		__field(	u16,			gso_type	)
> > +		__field(	unsigned int,		net_inum	)
> >  	),
> 
> This patch made me take a look at the net_dev_start_xmit trace event, and I
> see it's very "holy". That is, it has lots of holes in the structure.
> 
> 	TP_STRUCT__entry(
> 		__string(	name,			dev->name	)
> 		__field(	u16,			queue_mapping	)
> 		__field(	const void *,		skbaddr		)
> 		__field(	bool,			vlan_tagged	)
> 		__field(	u16,			vlan_proto	)
> 		__field(	u16,			vlan_tci	)
> 		__field(	u16,			protocol	)
> 		__field(	u8,			ip_summed	)
> 		__field(	unsigned int,		len		)
> 		__field(	unsigned int,		data_len	)
> 		__field(	int,			network_offset	)
> 		__field(	bool,			transport_offset_valid)
> 		__field(	int,			transport_offset)
> 		__field(	u8,			tx_flags	)
> 		__field(	u16,			gso_size	)
> 		__field(	u16,			gso_segs	)
> 		__field(	u16,			gso_type	)
> 		__field(	unsigned int,		net_inum	)
> 	),
> 
> If you look at /sys/kernel/tracing/events/net/net_dev_start_xmit/format
> 
> name: net_dev_start_xmit
> ID: 1581
> format:
> 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
> 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
> 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
> 	field:int common_pid;	offset:4;	size:4;	signed:1;
> 
> 	field:__data_loc char[] name;	offset:8;	size:4;	signed:1;
> 	field:u16 queue_mapping;	offset:12;	size:2;	signed:0;
> 	field:const void * skbaddr;	offset:16;	size:8;	signed:0;
> 
> Notice, queue_mapping is 2 bytes at offset 12 (ends at offset 14), but
> skbaddr starts at offset 16. That means there's two bytes wasted.
> 
> 	field:bool vlan_tagged;	offset:24;	size:1;	signed:0;
> 	field:u16 vlan_proto;	offset:26;	size:2;	signed:0;
> 
> Another byte missing above (24 + 1 != 26)
> 
> 	field:u16 vlan_tci;	offset:28;	size:2;	signed:0;
> 	field:u16 protocol;	offset:30;	size:2;	signed:0;
> 	field:u8 ip_summed;	offset:32;	size:1;	signed:0;
> 	field:unsigned int len;	offset:36;	size:4;	signed:0;
> 
> Again another three bytes missing (32 + 1 != 36)
> 
> 	field:unsigned int data_len;	offset:40;	size:4;	signed:0;
> 	field:int network_offset;	offset:44;	size:4;	signed:1;
> 	field:bool transport_offset_valid;	offset:48;	size:1;	signed:0;
> 	field:int transport_offset;	offset:52;	size:4;	signed:1;
> 
> Again, another 3 bytes missing (48 + 1 != 52)
> 
> 	field:u8 tx_flags;	offset:56;	size:1;	signed:0;
> 	field:u16 gso_size;	offset:58;	size:2;	signed:0;
> 
> Another byte missing (56 + 1 != 58)
> 
> 	field:u16 gso_segs;	offset:60;	size:2;	signed:0;	
> 	field:u16 gso_type;	offset:62;	size:2;	signed:0;
> 	field:unsigned int net_inum;	offset:64;	size:4;	signed:0;
> 
> The above shows 10 bytes wasted for this event.
> 
> The order of the fields is important. Don't worry about breaking API by
> fixing it. The parsing code uses this output to find where the binary data
> is.
> 
> Hmm, I should write a script that reads all the format files and point out
> areas that have holes in it.

I use pahole to read vmlinux.o directly with defconfig and
CONFIG_DEBUG_INFO enabled, the result shows 22 structs prefixed with
trace_event_raw_ that have at least one hole.

Command:
	# structs have at least one hole and can be packed
	pahole vmlinux.o -H 1 -R -P -y trace_event_raw_ > output.txt

	# details (result includes above)
	pahole vmlinux.o -H 1 -y trace_event_raw_ > output_detail.txt

Here is the lists (>= 1 hole and can be packed, #1 size, #2 ):

	trace_event_raw_mm_shrink_slab_start    80      72      8
	trace_event_raw_mm_shrink_slab_end      64      56      8
	trace_event_raw_percpu_alloc_percpu     56      48      8
	trace_event_raw_writeback_inode_template        48      40      8
	trace_event_raw_filelock_lock   88      80      8
	trace_event_raw_iomap_apply     72      64      8
	trace_event_raw_ext4__map_blocks_exit   56      48      8
	trace_event_raw_ext4_ext_rm_leaf        64      56      8
	trace_event_raw_ext4_ext_remove_space_done      64      56      8
	trace_event_raw_nfs_rename_event_done   56      48      8
	trace_event_raw_nfs4_rename     56      48      8
	trace_event_raw_net_dev_start_xmit      72      64      8
	trace_event_raw_net_dev_rx_verbose_template     88      72      16
	trace_event_raw_tcp_probe       120     112     8
	trace_event_raw_qdisc_dequeue   64      56      8
	trace_event_raw_rpc_xdr_alignment       88      80      8
	trace_event_raw_rdev_return_int_mesh_config     108     104     4
	trace_event_raw_rdev_update_mesh_config 128     120     8
	trace_event_raw_rdev_get_ftm_responder_stats    120     112     8
	trace_event_raw_drv_bss_info_changed    184     168     16
	trace_event_raw_drv_ampdu_action        88      80      8
	trace_event_raw_drv_tdls_recv_channel_switch    112     104     8


Here is the detail (net_dev_start_xmit as example):

	struct trace_event_raw_net_dev_start_xmit {
	        struct trace_entry         ent;                  /*     0     8 */
	        u32                        __data_loc_name;      /*     8     4 */
	        u16                        queue_mapping;        /*    12     2 */
	
	        /* XXX 2 bytes hole, try to pack */
	
	        const void  *              skbaddr;              /*    16     8 */
	        bool                       vlan_tagged;          /*    24     1 */
	
	        /* XXX 1 byte hole, try to pack */
	
	        u16                        vlan_proto;           /*    26     2 */
	        u16                        vlan_tci;             /*    28     2 */
	        u16                        protocol;             /*    30     2 */
	        u8                         ip_summed;            /*    32     1 */
	
	        /* XXX 3 bytes hole, try to pack */
	
	        unsigned int               len;                  /*    36     4 */
	        unsigned int               data_len;             /*    40     4 */
	        int                        network_offset;       /*    44     4 */
	        bool                       transport_offset_valid; /*    48     1 */
	
	        /* XXX 3 bytes hole, try to pack */
	
	        int                        transport_offset;     /*    52     4 */
	        u8                         tx_flags;             /*    56     1 */
	
	        /* XXX 1 byte hole, try to pack */
	
	        u16                        gso_size;             /*    58     2 */
	        u16                        gso_segs;             /*    60     2 */
	        u16                        gso_type;             /*    62     2 */
	        /* --- cacheline 1 boundary (64 bytes) --- */
	        unsigned int               net_inum;             /*    64     4 */
	        char                       __data[];             /*    68     0 */
	
	        /* size: 72, cachelines: 2, members: 20 */
	        /* sum members: 58, holes: 5, sum holes: 10 */
	        /* padding: 4 */
	        /* last cacheline: 8 bytes */
	};

pahole shows there are 5 holes with 10 bytes in net_dev_start_xmit event.


Cheers,
Tony Lu

> 
> I haven't looked at the other trace events, but they may all have the same
> issues.
> 
> -- Steve
> 
> 
> 
> >  
> >  	TP_fast_assign(
> > @@ -56,10 +57,12 @@ TRACE_EVENT(net_dev_start_xmit,
> >  		__entry->gso_size = skb_shinfo(skb)->gso_size;
> >  		__entry->gso_segs = skb_shinfo(skb)->gso_segs;
> >  		__entry->gso_type = skb_shinfo(skb)->gso_type;
> > +		__entry->net_inum = dev_net(skb->dev)->ns.inum;
> >  	),
> >  
> >

  parent reply	other threads:[~2021-03-10  9:04 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-09  4:43 [PATCH] net: add net namespace inode for all net_dev events Tony Lu
2021-03-09 17:40 ` Steven Rostedt
2021-03-09 19:53   ` David Ahern
2021-03-09 20:02     ` Steven Rostedt
2021-03-09 20:17       ` David Ahern
2021-03-09 20:20         ` Alexei Starovoitov
2021-03-09 20:35         ` Steven Rostedt
2021-03-09 20:39           ` Alexei Starovoitov
2021-03-09 20:18       ` Alexei Starovoitov
2021-03-10  9:03   ` Tony Lu [this message]
2021-03-10 16:31     ` Steven Rostedt
2021-03-11  6:39       ` Tony Lu
2021-03-10 16:49     ` Steven Rostedt
2021-03-09 20:12 ` Eric Dumazet
2021-03-10  9:22   ` Lorenz Bauer
2021-03-10 12:01     ` Tony Lu
2021-03-10  9:33   ` Tony Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YEiLbLiXe6ju/vCO@TonyMac-Alibaba \
    --to=tonylu@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.