* [RFC 0/3] vsockmon: virtual device to monitor AF_VSOCK sockets. @ 2016-05-28 16:29 ggarcia 2016-05-28 16:29 ` [RFC 1/3] vsockmon: Add tap functions ggarcia ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: ggarcia @ 2016-05-28 16:29 UTC (permalink / raw) To: netdev; +Cc: jhansen, stefanha, Gerard Garcia From: Gerard Garcia <ggarcia@deic.uab.cat> Virtual socket transports operate at kernel level therefore, there is no easy way to see the traffic exchanged between virtual machines and hypervisors that communicate using AF_VSOCK sockets. In addition, being able to see the control messages exchanged by the transports may be useful for debugging and optimization purposes. This patch adds a virtual device that may be used to see the traffic exchanged between virtual machines and hypervisors through AF_VSOCK sockets. Its structure is based on the nlmon device and this version just targets the virtio transport, but support for the VMCI transport can be easily implemented. The vsockmon header consists of two structs: a generic header and a header specific to the transport. The generic header allows to follow an AF_VSOCK stream without having to understand the details of the transport while the transport header gives more detail which may be useful for troubleshooting and debugging. The repository https://github.com/GerardGarcia/linux/tree/vsockmon implements these patches over the Stefan Hajnoczi vsock-next repository https://github.com/stefanha/linux/tree/vsock-next where the virtio trasnport is implemented. In the repository there is also a simple program that shows the traffic from a vsockmon device: https://github.com/GerardGarcia/linux/blob/vsockmon/vsockmon.c that can be used for testing. Any thoughts and comments will be greatly appreciated. Thanks to Stefan Hajnoczi for his help. Gerard Gerard Garcia (3): vsockmon: Add tap functions. vsockmon: Add vsockmon device vsockmon: Add vsock hooks drivers/net/Kconfig | 8 ++ drivers/net/Makefile | 1 + drivers/net/vsockmon.c | 171 ++++++++++++++++++++++++++++++++++++++++++ drivers/vhost/vsock.c | 71 ++++++++++++++++++ include/net/af_vsock.h | 13 ++++ include/uapi/linux/Kbuild | 1 + include/uapi/linux/if_arp.h | 1 + include/uapi/linux/vsockmon.h | 37 +++++++++ net/vmw_vsock/af_vsock.c | 105 ++++++++++++++++++++++++++ 9 files changed, 408 insertions(+) create mode 100644 drivers/net/vsockmon.c create mode 100644 include/uapi/linux/vsockmon.h -- 2.8.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC 1/3] vsockmon: Add tap functions. 2016-05-28 16:29 [RFC 0/3] vsockmon: virtual device to monitor AF_VSOCK sockets ggarcia @ 2016-05-28 16:29 ` ggarcia 2016-06-01 21:07 ` Stefan Hajnoczi 2016-05-28 16:29 ` [RFC 2/3] vsockmon: Add vsockmon device ggarcia 2016-05-28 16:29 ` [RFC 3/3] vsockmon: Add vsock hooks ggarcia 2 siblings, 1 reply; 13+ messages in thread From: ggarcia @ 2016-05-28 16:29 UTC (permalink / raw) To: netdev; +Cc: jhansen, stefanha, Gerard Garcia From: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> --- include/net/af_vsock.h | 13 ++++++ include/uapi/linux/if_arp.h | 1 + net/vmw_vsock/af_vsock.c | 105 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 119 insertions(+) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 15694c9..c4593d8 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -187,4 +187,17 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, struct sockaddr_vm *dst); void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); +/**** TAP ****/ + +struct vsock_tap { + struct net_device *dev; + struct module *module; + struct list_head list; +}; + +extern int vsock_init_tap(void); +extern int vsock_add_tap(struct vsock_tap *vt); +extern int vsock_remove_tap(struct vsock_tap *vt); +extern void vsock_deliver_tap(struct sk_buff *skb); + #endif /* __AF_VSOCK_H__ */ diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h index 4d024d7..869262a 100644 --- a/include/uapi/linux/if_arp.h +++ b/include/uapi/linux/if_arp.h @@ -95,6 +95,7 @@ #define ARPHRD_IP6GRE 823 /* GRE over IPv6 */ #define ARPHRD_NETLINK 824 /* Netlink header */ #define ARPHRD_6LOWPAN 825 /* IPv6 over LoWPAN */ +#define ARPHRD_VSOCKMON 826 /* Vsock monitor header */ #define ARPHRD_VOID 0xFFFF /* Void type, nothing is known */ #define ARPHRD_NONE 0xFFFE /* zero header length */ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 6b158ab..ec7a05d 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -96,6 +96,7 @@ #include <linux/unistd.h> #include <linux/wait.h> #include <linux/workqueue.h> +#include <linux/if_arp.h> #include <net/sock.h> #include <net/af_vsock.h> @@ -2012,6 +2013,110 @@ const struct vsock_transport *vsock_core_get_transport(void) } EXPORT_SYMBOL_GPL(vsock_core_get_transport); +/**** TAP ****/ +static DEFINE_SPINLOCK(vsock_tap_lock); +static struct list_head vsock_tap_all __read_mostly; + +int vsock_init_tap(void) { + INIT_LIST_HEAD(&vsock_tap_all); + + return 0; +} +EXPORT_SYMBOL_GPL(vsock_init_tap); + +int vsock_add_tap(struct vsock_tap *vt) { + if (unlikely(vt->dev->type != ARPHRD_VSOCKMON)) + return -EINVAL; + + spin_lock(&vsock_tap_lock); + list_add_rcu(&vt->list, &vsock_tap_all); + spin_unlock(&vsock_tap_lock); + + __module_get(vt->module); + + return 0; +} +EXPORT_SYMBOL_GPL(vsock_add_tap); + +int __vsock_remove_tap(struct vsock_tap *vt) { + bool found = false; + struct vsock_tap *tmp; + + spin_lock(&vsock_tap_lock); + + list_for_each_entry(tmp, &vsock_tap_all, list) { + if (vt == tmp) { + list_del_rcu(&vt->list); + found = true; + goto out; + } + } + + pr_warn("__vsock_remove_tap: %p not found\n", vt); +out: + spin_unlock(&vsock_tap_lock); + + if (found) + module_put(vt->module); + + return found ? 0 : -ENODEV; +} + +int vsock_remove_tap(struct vsock_tap *vt) +{ + int ret; + + ret = __vsock_remove_tap(vt); + synchronize_net(); + + return ret; +} +EXPORT_SYMBOL_GPL(vsock_remove_tap); + +static int __vsock_deliver_tap_skb(struct sk_buff *skb, + struct net_device *dev) +{ + struct sk_buff *vskb; + int ret = 0; + + if (skb) { + dev_hold(dev); + vskb = skb_clone(skb, GFP_ATOMIC); + vskb->dev = dev; + vskb->pkt_type = PACKET_USER; + ret = dev_queue_xmit(vskb); + if (unlikely(ret > 0)) + ret = net_xmit_errno(ret); + + dev_put(dev); + } + + return ret; +} + +static void __vsock_deliver_tap(struct sk_buff *skb) +{ + int ret; + struct vsock_tap *tmp; + + list_for_each_entry_rcu(tmp, &vsock_tap_all, list) { + ret = __vsock_deliver_tap_skb(skb, tmp->dev); + if (unlikely(ret)) + break; + } +} + +void vsock_deliver_tap(struct sk_buff *skb) +{ + rcu_read_lock(); + + if (unlikely(!list_empty(&vsock_tap_all))) + __vsock_deliver_tap(skb); + + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(vsock_deliver_tap); + MODULE_AUTHOR("VMware, Inc."); MODULE_DESCRIPTION("VMware Virtual Socket Family"); MODULE_VERSION("1.0.1.0-k"); -- 2.8.3 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC 1/3] vsockmon: Add tap functions. 2016-05-28 16:29 ` [RFC 1/3] vsockmon: Add tap functions ggarcia @ 2016-06-01 21:07 ` Stefan Hajnoczi 2016-06-09 15:02 ` Gerard Garcia 0 siblings, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2016-06-01 21:07 UTC (permalink / raw) To: ggarcia; +Cc: netdev, jhansen [-- Attachment #1: Type: text/plain, Size: 6419 bytes --] On Sat, May 28, 2016 at 06:29:05PM +0200, ggarcia@abra.uab.cat wrote: > From: Gerard Garcia <ggarcia@deic.uab.cat> > > Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> > --- > include/net/af_vsock.h | 13 ++++++ > include/uapi/linux/if_arp.h | 1 + > net/vmw_vsock/af_vsock.c | 105 ++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 119 insertions(+) > > diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h > index 15694c9..c4593d8 100644 > --- a/include/net/af_vsock.h > +++ b/include/net/af_vsock.h > @@ -187,4 +187,17 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, > struct sockaddr_vm *dst); > void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); > > +/**** TAP ****/ > + > +struct vsock_tap { > + struct net_device *dev; > + struct module *module; > + struct list_head list; > +}; > + > +extern int vsock_init_tap(void); > +extern int vsock_add_tap(struct vsock_tap *vt); > +extern int vsock_remove_tap(struct vsock_tap *vt); > +extern void vsock_deliver_tap(struct sk_buff *skb); The other function prototypes in this header don't use "extern" either. Please drop for consistency. > + > #endif /* __AF_VSOCK_H__ */ > diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h > index 4d024d7..869262a 100644 > --- a/include/uapi/linux/if_arp.h > +++ b/include/uapi/linux/if_arp.h > @@ -95,6 +95,7 @@ > #define ARPHRD_IP6GRE 823 /* GRE over IPv6 */ > #define ARPHRD_NETLINK 824 /* Netlink header */ > #define ARPHRD_6LOWPAN 825 /* IPv6 over LoWPAN */ > +#define ARPHRD_VSOCKMON 826 /* Vsock monitor header */ Previous lines use a tab character (^I) before the numeric constant. Please follow that style for consistency. I suggest calling it just ARPHRD_VSOCK. Netlink also uses ARPHRD_NETLINK instead of ARPHRD_NLMON. > > #define ARPHRD_VOID 0xFFFF /* Void type, nothing is known */ > #define ARPHRD_NONE 0xFFFE /* zero header length */ > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c > index 6b158ab..ec7a05d 100644 > --- a/net/vmw_vsock/af_vsock.c > +++ b/net/vmw_vsock/af_vsock.c > @@ -96,6 +96,7 @@ > #include <linux/unistd.h> > #include <linux/wait.h> > #include <linux/workqueue.h> > +#include <linux/if_arp.h> > #include <net/sock.h> > #include <net/af_vsock.h> > > @@ -2012,6 +2013,110 @@ const struct vsock_transport *vsock_core_get_transport(void) > } > EXPORT_SYMBOL_GPL(vsock_core_get_transport); > > +/**** TAP ****/ Feel free to put this in a separate source file. The Kbuild can link multiple objects into a single kernel module. That would be cleaner than using a big comment to separate it from af_vsock.c code. I wonder whether this tap mechanism should be generic so that netlink, vsock, and other address families can reuse the code. This is basically copied from netlink. > +static DEFINE_SPINLOCK(vsock_tap_lock); > +static struct list_head vsock_tap_all __read_mostly; > + > +int vsock_init_tap(void) { > + INIT_LIST_HEAD(&vsock_tap_all); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(vsock_init_tap); Can you use LIST_HEAD_INIT() on the vsock_tap_all declaration line to eliminate the need for vsock_init_tap() completely? > + > +int vsock_add_tap(struct vsock_tap *vt) { > + if (unlikely(vt->dev->type != ARPHRD_VSOCKMON)) > + return -EINVAL; > + > + spin_lock(&vsock_tap_lock); > + list_add_rcu(&vt->list, &vsock_tap_all); > + spin_unlock(&vsock_tap_lock); > + > + __module_get(vt->module); It's slightly safer to get the module before publishing it on the list. But in practice I guess the caller is the module so the module won't disappear underneath us. > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(vsock_add_tap); > + > +int __vsock_remove_tap(struct vsock_tap *vt) { > + bool found = false; > + struct vsock_tap *tmp; > + > + spin_lock(&vsock_tap_lock); > + > + list_for_each_entry(tmp, &vsock_tap_all, list) { > + if (vt == tmp) { > + list_del_rcu(&vt->list); > + found = true; > + goto out; > + } > + } > + > + pr_warn("__vsock_remove_tap: %p not found\n", vt); > +out: > + spin_unlock(&vsock_tap_lock); > + > + if (found) > + module_put(vt->module); > + > + return found ? 0 : -ENODEV; > +} > + > +int vsock_remove_tap(struct vsock_tap *vt) > +{ > + int ret; > + > + ret = __vsock_remove_tap(vt); > + synchronize_net(); module_put() should be called after synchronize_net(). That way we guarantee that no one is still accessing vt when the module is put. The caller is probably the module though, so maybe there is an implicit reference... > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(vsock_remove_tap); > + > +static int __vsock_deliver_tap_skb(struct sk_buff *skb, > + struct net_device *dev) > +{ > + struct sk_buff *vskb; > + int ret = 0; > + > + if (skb) { > + dev_hold(dev); > + vskb = skb_clone(skb, GFP_ATOMIC); You must handle the case where skb_clone() returns NULL. In other words, we don't have enough memory to capture the packet and just return -ENOMEM. > + vskb->dev = dev; > + vskb->pkt_type = PACKET_USER; I'm not sure if PACKET_USER is correct (or if pkt_type matters at all). PACKET_USER/PACKET_KERNEL seems to be purely a netlink concept to distinguish netlink packets originating from the kernel or userspace. AF_VSOCK is more like Ethernet where packets come from the host itself or from a VM. I would consider PACKET_HOST and PACKET_OTHERHOST. > + ret = dev_queue_xmit(vskb); > + if (unlikely(ret > 0)) > + ret = net_xmit_errno(ret); > + > + dev_put(dev); > + } > + > + return ret; > +} > + > +static void __vsock_deliver_tap(struct sk_buff *skb) > +{ > + int ret; > + struct vsock_tap *tmp; > + > + list_for_each_entry_rcu(tmp, &vsock_tap_all, list) { > + ret = __vsock_deliver_tap_skb(skb, tmp->dev); > + if (unlikely(ret)) > + break; > + } > +} > + > +void vsock_deliver_tap(struct sk_buff *skb) > +{ > + rcu_read_lock(); > + > + if (unlikely(!list_empty(&vsock_tap_all))) > + __vsock_deliver_tap(skb); > + > + rcu_read_unlock(); > +} > +EXPORT_SYMBOL_GPL(vsock_deliver_tap); > + > MODULE_AUTHOR("VMware, Inc."); > MODULE_DESCRIPTION("VMware Virtual Socket Family"); > MODULE_VERSION("1.0.1.0-k"); > -- > 2.8.3 > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 1/3] vsockmon: Add tap functions. 2016-06-01 21:07 ` Stefan Hajnoczi @ 2016-06-09 15:02 ` Gerard Garcia 2016-06-10 15:44 ` Stefan Hajnoczi 0 siblings, 1 reply; 13+ messages in thread From: Gerard Garcia @ 2016-06-09 15:02 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: netdev, jhansen On 06/01/2016 11:07 PM, Stefan Hajnoczi wrote: > On Sat, May 28, 2016 at 06:29:05PM +0200, ggarcia@abra.uab.cat wrote: >> From: Gerard Garcia <ggarcia@deic.uab.cat> >> >> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> >> --- >> include/net/af_vsock.h | 13 ++++++ >> include/uapi/linux/if_arp.h | 1 + >> net/vmw_vsock/af_vsock.c | 105 ++++++++++++++++++++++++++++++++++++++++++++ >> 3 files changed, 119 insertions(+) >> >> diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h >> index 15694c9..c4593d8 100644 >> --- a/include/net/af_vsock.h >> +++ b/include/net/af_vsock.h >> @@ -187,4 +187,17 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, >> struct sockaddr_vm *dst); >> void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); >> >> +/**** TAP ****/ >> + >> +struct vsock_tap { >> + struct net_device *dev; >> + struct module *module; >> + struct list_head list; >> +}; >> + >> +extern int vsock_init_tap(void); >> +extern int vsock_add_tap(struct vsock_tap *vt); >> +extern int vsock_remove_tap(struct vsock_tap *vt); >> +extern void vsock_deliver_tap(struct sk_buff *skb); > The other function prototypes in this header don't use "extern" either. > Please drop for consistency. Ok >> + >> #endif /* __AF_VSOCK_H__ */ >> diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h >> index 4d024d7..869262a 100644 >> --- a/include/uapi/linux/if_arp.h >> +++ b/include/uapi/linux/if_arp.h >> @@ -95,6 +95,7 @@ >> #define ARPHRD_IP6GRE 823 /* GRE over IPv6 */ >> #define ARPHRD_NETLINK 824 /* Netlink header */ >> #define ARPHRD_6LOWPAN 825 /* IPv6 over LoWPAN */ >> +#define ARPHRD_VSOCKMON 826 /* Vsock monitor header */ > Previous lines use a tab character (^I) before the numeric constant. > Please follow that style for consistency. Ok > I suggest calling it just ARPHRD_VSOCK. Netlink also uses > ARPHRD_NETLINK instead of ARPHRD_NLMON. We define a new header used exclusively by the vsockmon device, nlmon copies the netlink header entirely. I'm not sure if we should *link* the ARPHRD_VSOCK identifier to the vsockmon header. >> >> #define ARPHRD_VOID 0xFFFF /* Void type, nothing is known */ >> #define ARPHRD_NONE 0xFFFE /* zero header length */ >> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >> index 6b158ab..ec7a05d 100644 >> --- a/net/vmw_vsock/af_vsock.c >> +++ b/net/vmw_vsock/af_vsock.c >> @@ -96,6 +96,7 @@ >> #include <linux/unistd.h> >> #include <linux/wait.h> >> #include <linux/workqueue.h> >> +#include <linux/if_arp.h> >> #include <net/sock.h> >> #include <net/af_vsock.h> >> >> @@ -2012,6 +2013,110 @@ const struct vsock_transport *vsock_core_get_transport(void) >> } >> EXPORT_SYMBOL_GPL(vsock_core_get_transport); >> >> +/**** TAP ****/ > Feel free to put this in a separate source file. The Kbuild can link > multiple objects into a single kernel module. That would be cleaner > than using a big comment to separate it from af_vsock.c code. I'm following the af_vsock.c style, where different logic is separated using this style of comments. It is not a lot of code so I thought it would be cleaner to have it in the same file. > I wonder whether this tap mechanism should be generic so that netlink, > vsock, and other address families can reuse the code. This is basically > copied from netlink. > >> +static DEFINE_SPINLOCK(vsock_tap_lock); >> +static struct list_head vsock_tap_all __read_mostly; >> + >> +int vsock_init_tap(void) { >> + INIT_LIST_HEAD(&vsock_tap_all); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL_GPL(vsock_init_tap); > Can you use LIST_HEAD_INIT() on the vsock_tap_all declaration line to > eliminate the need for vsock_init_tap() completely? Yes, I'll do that. I though we may need to do more initialization at some point, but for now it is better to just statically initialize it. >> + >> +int vsock_add_tap(struct vsock_tap *vt) { >> + if (unlikely(vt->dev->type != ARPHRD_VSOCKMON)) >> + return -EINVAL; >> + >> + spin_lock(&vsock_tap_lock); >> + list_add_rcu(&vt->list, &vsock_tap_all); >> + spin_unlock(&vsock_tap_lock); >> + >> + __module_get(vt->module); > It's slightly safer to get the module before publishing it on the list. > But in practice I guess the caller is the module so the module won't > disappear underneath us. This function is equal to the function in af_netlink.c used by nlmon. As you said, in practice the caller is the module so it won't disappear. >> + >> + return 0; >> +} >> +EXPORT_SYMBOL_GPL(vsock_add_tap); >> + >> +int __vsock_remove_tap(struct vsock_tap *vt) { >> + bool found = false; >> + struct vsock_tap *tmp; >> + >> + spin_lock(&vsock_tap_lock); >> + >> + list_for_each_entry(tmp, &vsock_tap_all, list) { >> + if (vt == tmp) { >> + list_del_rcu(&vt->list); >> + found = true; >> + goto out; >> + } >> + } >> + >> + pr_warn("__vsock_remove_tap: %p not found\n", vt); >> +out: >> + spin_unlock(&vsock_tap_lock); >> + >> + if (found) >> + module_put(vt->module); >> + >> + return found ? 0 : -ENODEV; >> +} >> + >> +int vsock_remove_tap(struct vsock_tap *vt) >> +{ >> + int ret; >> + >> + ret = __vsock_remove_tap(vt); >> + synchronize_net(); > module_put() should be called after synchronize_net(). That way we > guarantee that no one is still accessing vt when the module is put. The > caller is probably the module though, so maybe there is an implicit > reference... Also, it is equal to the function in af_netlink.c used by nlmon. And as before, this function is supposed to be called from the same module so it can't disappear before calling synchronize_net(). >> + >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(vsock_remove_tap); >> + >> +static int __vsock_deliver_tap_skb(struct sk_buff *skb, >> + struct net_device *dev) >> +{ >> + struct sk_buff *vskb; >> + int ret = 0; >> + >> + if (skb) { >> + dev_hold(dev); >> + vskb = skb_clone(skb, GFP_ATOMIC); > You must handle the case where skb_clone() returns NULL. In other > words, we don't have enough memory to capture the packet and just return > -ENOMEM. You are right! >> + vskb->dev = dev; >> + vskb->pkt_type = PACKET_USER; > I'm not sure if PACKET_USER is correct (or if pkt_type matters at all). > > PACKET_USER/PACKET_KERNEL seems to be purely a netlink concept to > distinguish netlink packets originating from the kernel or userspace. > > AF_VSOCK is more like Ethernet where packets come from the host itself > or from a VM. I would consider PACKET_HOST and PACKET_OTHERHOST. > I'm not sure either if pkt_type matters. I haven't found any code that uses it so I'll just leave it unset. >> + ret = dev_queue_xmit(vskb); >> + if (unlikely(ret > 0)) >> + ret = net_xmit_errno(ret); >> + >> + dev_put(dev); >> + } >> + >> + return ret; >> +} >> + >> +static void __vsock_deliver_tap(struct sk_buff *skb) >> +{ >> + int ret; >> + struct vsock_tap *tmp; >> + >> + list_for_each_entry_rcu(tmp, &vsock_tap_all, list) { >> + ret = __vsock_deliver_tap_skb(skb, tmp->dev); >> + if (unlikely(ret)) >> + break; >> + } >> +} >> + >> +void vsock_deliver_tap(struct sk_buff *skb) >> +{ >> + rcu_read_lock(); >> + >> + if (unlikely(!list_empty(&vsock_tap_all))) >> + __vsock_deliver_tap(skb); >> + >> + rcu_read_unlock(); >> +} >> +EXPORT_SYMBOL_GPL(vsock_deliver_tap); >> + >> MODULE_AUTHOR("VMware, Inc."); >> MODULE_DESCRIPTION("VMware Virtual Socket Family"); >> MODULE_VERSION("1.0.1.0-k"); >> -- >> 2.8.3 >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 1/3] vsockmon: Add tap functions. 2016-06-09 15:02 ` Gerard Garcia @ 2016-06-10 15:44 ` Stefan Hajnoczi 2016-06-14 12:05 ` Jorgen S. Hansen 0 siblings, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2016-06-10 15:44 UTC (permalink / raw) To: Gerard Garcia; +Cc: netdev, jhansen [-- Attachment #1: Type: text/plain, Size: 2390 bytes --] On Thu, Jun 09, 2016 at 05:02:47PM +0200, Gerard Garcia wrote: > On 06/01/2016 11:07 PM, Stefan Hajnoczi wrote: > > On Sat, May 28, 2016 at 06:29:05PM +0200, ggarcia@abra.uab.cat wrote: > > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c > > > index 6b158ab..ec7a05d 100644 > > > --- a/net/vmw_vsock/af_vsock.c > > > +++ b/net/vmw_vsock/af_vsock.c > > > @@ -96,6 +96,7 @@ > > > #include <linux/unistd.h> > > > #include <linux/wait.h> > > > #include <linux/workqueue.h> > > > +#include <linux/if_arp.h> > > > #include <net/sock.h> > > > #include <net/af_vsock.h> > > > @@ -2012,6 +2013,110 @@ const struct vsock_transport *vsock_core_get_transport(void) > > > } > > > EXPORT_SYMBOL_GPL(vsock_core_get_transport); > > > +/**** TAP ****/ > > Feel free to put this in a separate source file. The Kbuild can link > > multiple objects into a single kernel module. That would be cleaner > > than using a big comment to separate it from af_vsock.c code. > I'm following the af_vsock.c style, where different logic is separated using > this style of comments. It is not a lot of code > so I thought it would be cleaner to have it in the same file. It's up to the af_vsock.c maintainer, but if we keep appending independent chunks of code to one file it becomes hard to manage and chances of conflicts during patch merging increases. > > > +int vsock_add_tap(struct vsock_tap *vt) { > > > + if (unlikely(vt->dev->type != ARPHRD_VSOCKMON)) > > > + return -EINVAL; > > > + > > > + spin_lock(&vsock_tap_lock); > > > + list_add_rcu(&vt->list, &vsock_tap_all); > > > + spin_unlock(&vsock_tap_lock); > > > + > > > + __module_get(vt->module); > > It's slightly safer to get the module before publishing it on the list. > > But in practice I guess the caller is the module so the module won't > > disappear underneath us. > This function is equal to the function in af_netlink.c used by nlmon. As you > said, in practice the caller is the module > so it won't disappear. Yes, there isn't a huge win right now but given that it's easy to resolve the issue I'd do it. The problem comes when people copy-paste code that contains assumptions and the assumption no longer holds. Better to write it in the safe way, eliminating the assumption, so that derived code will be correct under more conditions. There is no drawback to moving __module_get() above the spin_lock(). [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 1/3] vsockmon: Add tap functions. 2016-06-10 15:44 ` Stefan Hajnoczi @ 2016-06-14 12:05 ` Jorgen S. Hansen 0 siblings, 0 replies; 13+ messages in thread From: Jorgen S. Hansen @ 2016-06-14 12:05 UTC (permalink / raw) To: Stefan Hajnoczi, Gerard Garcia; +Cc: netdev On 6/10/16, 5:44 PM, "Stefan Hajnoczi" <stefanha@redhat.com> wrote: >On Thu, Jun 09, 2016 at 05:02:47PM +0200, Gerard Garcia wrote: >> On 06/01/2016 11:07 PM, Stefan Hajnoczi wrote: >> > On Sat, May 28, 2016 at 06:29:05PM +0200, ggarcia@abra.uab.cat wrote: >> > > diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >> > > index 6b158ab..ec7a05d 100644 >> > > --- a/net/vmw_vsock/af_vsock.c >> > > +++ b/net/vmw_vsock/af_vsock.c >> > > @@ -96,6 +96,7 @@ >> > > #include <linux/unistd.h> >> > > #include <linux/wait.h> >> > > #include <linux/workqueue.h> >> > > +#include <linux/if_arp.h> >> > > #include <net/sock.h> >> > > #include <net/af_vsock.h> >> > > @@ -2012,6 +2013,110 @@ const struct vsock_transport *vsock_core_get_transport(void) >> > > } >> > > EXPORT_SYMBOL_GPL(vsock_core_get_transport); >> > > +/**** TAP ****/ >> > Feel free to put this in a separate source file. The Kbuild can link >> > multiple objects into a single kernel module. That would be cleaner >> > than using a big comment to separate it from af_vsock.c code. >> I'm following the af_vsock.c style, where different logic is separated using >> this style of comments. It is not a lot of code >> so I thought it would be cleaner to have it in the same file. > >It's up to the af_vsock.c maintainer, but if we keep appending >independent chunks of code to one file it becomes hard to manage and >chances of conflicts during patch merging increases. I agree with Stefan - af_vsock.c is already undesirably large, so it would be great if you could make this a separate file. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC 2/3] vsockmon: Add vsockmon device 2016-05-28 16:29 [RFC 0/3] vsockmon: virtual device to monitor AF_VSOCK sockets ggarcia 2016-05-28 16:29 ` [RFC 1/3] vsockmon: Add tap functions ggarcia @ 2016-05-28 16:29 ` ggarcia 2016-06-01 21:15 ` Stefan Hajnoczi 2016-05-28 16:29 ` [RFC 3/3] vsockmon: Add vsock hooks ggarcia 2 siblings, 1 reply; 13+ messages in thread From: ggarcia @ 2016-05-28 16:29 UTC (permalink / raw) To: netdev; +Cc: jhansen, stefanha, Gerard Garcia From: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> --- drivers/net/Kconfig | 8 ++ drivers/net/Makefile | 1 + drivers/net/vsockmon.c | 171 ++++++++++++++++++++++++++++++++++++++++++ include/uapi/linux/Kbuild | 1 + include/uapi/linux/vsockmon.h | 37 +++++++++ 5 files changed, 218 insertions(+) create mode 100644 drivers/net/vsockmon.c create mode 100644 include/uapi/linux/vsockmon.h diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig index 0c5415b..42c43b6 100644 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -330,6 +330,14 @@ config NET_VRF This option enables the support for mapping interfaces into VRF's. The support enables VRF devices. +config VSOCKMON + tristate "Virtual vsock monitoring device" + depends on VHOST_VSOCK + ---help--- + This option enables a monitoring net device for vsock sockets. It is + mostly intended for developers or support to debug vsock issues. If + unsure, say N. + endif # NET_CORE config SUNGEM_PHY diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 7336cbd..e2188d4 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -28,6 +28,7 @@ obj-$(CONFIG_GENEVE) += geneve.o obj-$(CONFIG_GTP) += gtp.o obj-$(CONFIG_NLMON) += nlmon.o obj-$(CONFIG_NET_VRF) += vrf.o +obj-$(CONFIG_VSOCKMON) += vsockmon.o # # Networking Drivers diff --git a/drivers/net/vsockmon.c b/drivers/net/vsockmon.c new file mode 100644 index 0000000..becddc9 --- /dev/null +++ b/drivers/net/vsockmon.c @@ -0,0 +1,171 @@ +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/if_arp.h> +#include <net/rtnetlink.h> +#include <net/sock.h> +#include <net/af_vsock.h> +#include <uapi/linux/vsockmon.h> + +/* Virtio transport max packet size plus header */ +#define DEFAULT_MTU 1024 * 64 + sizeof(struct af_vsockmon_hdr); + +struct pcpu_lstats { + u64 rx_packets; + u64 rx_bytes; + struct u64_stats_sync syncp; +}; + +static int vsockmon_dev_init(struct net_device *dev) +{ + dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats); + return dev->lstats == NULL ? -ENOMEM : 0; +} + +static void vsockmon_dev_uninit(struct net_device *dev) +{ + free_percpu(dev->lstats); +} + +struct vsockmon { + struct vsock_tap vt; +}; + +static int vsockmon_open(struct net_device *dev) +{ + struct vsockmon *vsockmon = netdev_priv(dev); + + vsockmon->vt.dev = dev; + vsockmon->vt.module = THIS_MODULE; + return vsock_add_tap(&vsockmon->vt); +} + +static int vsockmon_close(struct net_device *dev) { + struct vsockmon *vsockmon = netdev_priv(dev); + + return vsock_remove_tap(&vsockmon->vt); +} + +static netdev_tx_t vsockmon_xmit(struct sk_buff *skb, struct net_device *dev) +{ + int len = skb->len; + struct pcpu_lstats *stats = this_cpu_ptr(dev->lstats); + + u64_stats_update_begin(&stats->syncp); + stats->rx_bytes += len; + stats->rx_packets++; + u64_stats_update_end(&stats->syncp); + + dev_kfree_skb(skb); + + return NETDEV_TX_OK; +} + +static struct rtnl_link_stats64 * +vsockmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) +{ + int i; + u64 bytes = 0, packets = 0; + + for_each_possible_cpu(i) { + const struct pcpu_lstats *vstats; + u64 tbytes, tpackets; + unsigned int start; + + vstats = per_cpu_ptr(dev->lstats, i); + + do { + start = u64_stats_fetch_begin_irq(&vstats->syncp); + tbytes = vstats->rx_bytes; + tpackets = vstats->rx_packets; + } while (u64_stats_fetch_retry_irq(&vstats->syncp, start)); + + packets += tpackets; + bytes += tbytes; + } + + stats->rx_packets = packets; + stats->tx_packets = 0; + + stats->rx_bytes = bytes; + stats->tx_bytes = 0; + + return stats; +} + +static int vsockmon_is_valid_mtu(int new_mtu) +{ + return new_mtu >= (int) sizeof(struct af_vsockmon_hdr); +} + +static int vsockmon_change_mtu(struct net_device *dev, int new_mtu) +{ + if (!vsockmon_is_valid_mtu(new_mtu)) + return -EINVAL; + + dev->mtu = new_mtu; + return 0; +} + +static const struct net_device_ops vsockmon_ops = { + .ndo_init = vsockmon_dev_init, + .ndo_uninit = vsockmon_dev_uninit, + .ndo_open = vsockmon_open, + .ndo_stop = vsockmon_close, + .ndo_start_xmit = vsockmon_xmit, + .ndo_get_stats64 = vsockmon_get_stats64, + .ndo_change_mtu = vsockmon_change_mtu, +}; + +static u32 always_on(struct net_device *dev) +{ + return 1; +} + +static const struct ethtool_ops vsockmon_ethtool_ops = { + .get_link = always_on, +}; + +static void vsockmon_setup(struct net_device *dev) +{ + dev->type = ARPHRD_VSOCKMON; + dev->priv_flags |= IFF_NO_QUEUE; + + dev->netdev_ops = &vsockmon_ops; + dev->ethtool_ops = &vsockmon_ethtool_ops; + dev->destructor = free_netdev; + + dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | + NETIF_F_HIGHDMA | NETIF_F_LLTX; + + dev->flags = IFF_NOARP; + + dev->mtu = DEFAULT_MTU; +} + +static struct rtnl_link_ops vsockmon_link_ops __read_mostly = { + .kind = "vsockmon", + .priv_size = sizeof(struct vsockmon), + .setup = vsockmon_setup, +}; + +static __init int vsockmon_register(void) +{ + int ret = rtnl_link_register(&vsockmon_link_ops); + if (!ret) + vsock_init_tap(); + + return ret; +} + +static __exit void vsockmon_unregister(void) +{ + rtnl_link_unregister(&vsockmon_link_ops); +} + +module_init(vsockmon_register); +module_exit(vsockmon_unregister); + +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Gerard Garcia <ggarcia@deic.uab.cat>"); +MODULE_DESCRIPTION("Vsock monitoring device"); +MODULE_ALIAS_RTNL_LINK("vsockmon"); diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild index 5f047d2..b1836cc 100644 --- a/include/uapi/linux/Kbuild +++ b/include/uapi/linux/Kbuild @@ -454,6 +454,7 @@ header-y += virtio_scsi.h header-y += virtio_types.h header-y += virtio_vsock.h header-y += vm_sockets.h +header-y += vsockmon.h header-y += vt.h header-y += wait.h header-y += wanrouter.h diff --git a/include/uapi/linux/vsockmon.h b/include/uapi/linux/vsockmon.h new file mode 100644 index 0000000..c73166f --- /dev/null +++ b/include/uapi/linux/vsockmon.h @@ -0,0 +1,37 @@ +#ifndef _UAPI_VSOCKMON_H +#define _UAPI_VSOCKMON_H + +#include <linux/virtio_vsock.h> + +/* Packet structure of packets received from the vsockmon device. */ + +struct af_vsockmon_g { + unsigned short op; /* enum af_vsock_g_ops */ + unsigned int src_cid; + unsigned int src_port; + unsigned int dst_cid; + unsigned int dst_port; +}; + +struct af_vsockmon_hdr { + unsigned short type; /* enum af_vosck_type */ + struct af_vsockmon_g g_hdr; + union { + struct virtio_vsock_hdr virtio_hdr; + } t_hdr; +}; + +enum af_vsockmon_type { + AF_VSOCK_GENERIC = 1, /* No transport header */ + AF_VSOCK_VIRTIO = 2, /* Addtional virtio transport header */ +}; + +enum af_vsockmon_g_ops { + AF_VSOCK_G_OP_UNKNOWN = 0, + AF_VSOCK_G_OP_CONNECT = 1, + AF_VSOCK_G_OP_DISCONNECT = 2, + AF_VSOCK_G_OP_CONTROL = 3, + AF_VSOCK_G_OP_PAYLOAD = 4, +}; + +#endif -- 2.8.3 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC 2/3] vsockmon: Add vsockmon device 2016-05-28 16:29 ` [RFC 2/3] vsockmon: Add vsockmon device ggarcia @ 2016-06-01 21:15 ` Stefan Hajnoczi 2016-06-09 15:21 ` Gerard Garcia 0 siblings, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2016-06-01 21:15 UTC (permalink / raw) To: ggarcia; +Cc: netdev, jhansen [-- Attachment #1: Type: text/plain, Size: 8115 bytes --] On Sat, May 28, 2016 at 06:29:06PM +0200, ggarcia@abra.uab.cat wrote: > From: Gerard Garcia <ggarcia@deic.uab.cat> > > Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> > --- > drivers/net/Kconfig | 8 ++ > drivers/net/Makefile | 1 + > drivers/net/vsockmon.c | 171 ++++++++++++++++++++++++++++++++++++++++++ > include/uapi/linux/Kbuild | 1 + > include/uapi/linux/vsockmon.h | 37 +++++++++ > 5 files changed, 218 insertions(+) > create mode 100644 drivers/net/vsockmon.c > create mode 100644 include/uapi/linux/vsockmon.h > > diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig > index 0c5415b..42c43b6 100644 > --- a/drivers/net/Kconfig > +++ b/drivers/net/Kconfig > @@ -330,6 +330,14 @@ config NET_VRF > This option enables the support for mapping interfaces into VRF's. The > support enables VRF devices. > > +config VSOCKMON > + tristate "Virtual vsock monitoring device" > + depends on VHOST_VSOCK > + ---help--- > + This option enables a monitoring net device for vsock sockets. It is > + mostly intended for developers or support to debug vsock issues. If > + unsure, say N. > + > endif # NET_CORE > > config SUNGEM_PHY > diff --git a/drivers/net/Makefile b/drivers/net/Makefile > index 7336cbd..e2188d4 100644 > --- a/drivers/net/Makefile > +++ b/drivers/net/Makefile > @@ -28,6 +28,7 @@ obj-$(CONFIG_GENEVE) += geneve.o > obj-$(CONFIG_GTP) += gtp.o > obj-$(CONFIG_NLMON) += nlmon.o > obj-$(CONFIG_NET_VRF) += vrf.o > +obj-$(CONFIG_VSOCKMON) += vsockmon.o > > # > # Networking Drivers > diff --git a/drivers/net/vsockmon.c b/drivers/net/vsockmon.c > new file mode 100644 > index 0000000..becddc9 > --- /dev/null > +++ b/drivers/net/vsockmon.c > @@ -0,0 +1,171 @@ > +#include <linux/module.h> > +#include <linux/kernel.h> > +#include <linux/if_arp.h> > +#include <net/rtnetlink.h> > +#include <net/sock.h> > +#include <net/af_vsock.h> > +#include <uapi/linux/vsockmon.h> > + > +/* Virtio transport max packet size plus header */ > +#define DEFAULT_MTU 1024 * 64 + sizeof(struct af_vsockmon_hdr); > + > +struct pcpu_lstats { > + u64 rx_packets; > + u64 rx_bytes; > + struct u64_stats_sync syncp; > +}; > + > +static int vsockmon_dev_init(struct net_device *dev) > +{ > + dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats); > + return dev->lstats == NULL ? -ENOMEM : 0; > +} > + > +static void vsockmon_dev_uninit(struct net_device *dev) > +{ > + free_percpu(dev->lstats); > +} > + > +struct vsockmon { > + struct vsock_tap vt; > +}; > + > +static int vsockmon_open(struct net_device *dev) > +{ > + struct vsockmon *vsockmon = netdev_priv(dev); > + > + vsockmon->vt.dev = dev; > + vsockmon->vt.module = THIS_MODULE; > + return vsock_add_tap(&vsockmon->vt); > +} > + > +static int vsockmon_close(struct net_device *dev) { > + struct vsockmon *vsockmon = netdev_priv(dev); > + > + return vsock_remove_tap(&vsockmon->vt); > +} > + > +static netdev_tx_t vsockmon_xmit(struct sk_buff *skb, struct net_device *dev) > +{ > + int len = skb->len; > + struct pcpu_lstats *stats = this_cpu_ptr(dev->lstats); > + > + u64_stats_update_begin(&stats->syncp); > + stats->rx_bytes += len; > + stats->rx_packets++; > + u64_stats_update_end(&stats->syncp); > + > + dev_kfree_skb(skb); > + > + return NETDEV_TX_OK; > +} > + > +static struct rtnl_link_stats64 * > +vsockmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) > +{ > + int i; > + u64 bytes = 0, packets = 0; > + > + for_each_possible_cpu(i) { > + const struct pcpu_lstats *vstats; > + u64 tbytes, tpackets; > + unsigned int start; > + > + vstats = per_cpu_ptr(dev->lstats, i); > + > + do { > + start = u64_stats_fetch_begin_irq(&vstats->syncp); > + tbytes = vstats->rx_bytes; > + tpackets = vstats->rx_packets; > + } while (u64_stats_fetch_retry_irq(&vstats->syncp, start)); > + > + packets += tpackets; > + bytes += tbytes; > + } > + > + stats->rx_packets = packets; > + stats->tx_packets = 0; > + > + stats->rx_bytes = bytes; > + stats->tx_bytes = 0; > + > + return stats; > +} > + > +static int vsockmon_is_valid_mtu(int new_mtu) > +{ > + return new_mtu >= (int) sizeof(struct af_vsockmon_hdr); > +} > + > +static int vsockmon_change_mtu(struct net_device *dev, int new_mtu) > +{ > + if (!vsockmon_is_valid_mtu(new_mtu)) > + return -EINVAL; > + > + dev->mtu = new_mtu; > + return 0; > +} I wonder if the mtu serves any purpose. What happens when you change it from the default value? > + > +static const struct net_device_ops vsockmon_ops = { > + .ndo_init = vsockmon_dev_init, > + .ndo_uninit = vsockmon_dev_uninit, > + .ndo_open = vsockmon_open, > + .ndo_stop = vsockmon_close, > + .ndo_start_xmit = vsockmon_xmit, > + .ndo_get_stats64 = vsockmon_get_stats64, > + .ndo_change_mtu = vsockmon_change_mtu, > +}; > + > +static u32 always_on(struct net_device *dev) > +{ > + return 1; > +} > + > +static const struct ethtool_ops vsockmon_ethtool_ops = { > + .get_link = always_on, > +}; > + > +static void vsockmon_setup(struct net_device *dev) > +{ > + dev->type = ARPHRD_VSOCKMON; > + dev->priv_flags |= IFF_NO_QUEUE; > + > + dev->netdev_ops = &vsockmon_ops; > + dev->ethtool_ops = &vsockmon_ethtool_ops; > + dev->destructor = free_netdev; > + > + dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | > + NETIF_F_HIGHDMA | NETIF_F_LLTX; > + > + dev->flags = IFF_NOARP; > + > + dev->mtu = DEFAULT_MTU; > +} > + > +static struct rtnl_link_ops vsockmon_link_ops __read_mostly = { > + .kind = "vsockmon", > + .priv_size = sizeof(struct vsockmon), > + .setup = vsockmon_setup, > +}; > + > +static __init int vsockmon_register(void) > +{ > + int ret = rtnl_link_register(&vsockmon_link_ops); > + if (!ret) > + vsock_init_tap(); > + > + return ret; > +} > + > +static __exit void vsockmon_unregister(void) > +{ > + rtnl_link_unregister(&vsockmon_link_ops); > +} > + > +module_init(vsockmon_register); > +module_exit(vsockmon_unregister); > + > +MODULE_LICENSE("GPL v2"); > +MODULE_AUTHOR("Gerard Garcia <ggarcia@deic.uab.cat>"); > +MODULE_DESCRIPTION("Vsock monitoring device"); > +MODULE_ALIAS_RTNL_LINK("vsockmon"); there should be some mention that this is derived from nlmon.c. > diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild > index 5f047d2..b1836cc 100644 > --- a/include/uapi/linux/Kbuild > +++ b/include/uapi/linux/Kbuild > @@ -454,6 +454,7 @@ header-y += virtio_scsi.h > header-y += virtio_types.h > header-y += virtio_vsock.h > header-y += vm_sockets.h > +header-y += vsockmon.h > header-y += vt.h > header-y += wait.h > header-y += wanrouter.h > diff --git a/include/uapi/linux/vsockmon.h b/include/uapi/linux/vsockmon.h > new file mode 100644 > index 0000000..c73166f > --- /dev/null > +++ b/include/uapi/linux/vsockmon.h > @@ -0,0 +1,37 @@ > +#ifndef _UAPI_VSOCKMON_H > +#define _UAPI_VSOCKMON_H > + > +#include <linux/virtio_vsock.h> > + > +/* Packet structure of packets received from the vsockmon device. */ > + > +struct af_vsockmon_g { > + unsigned short op; /* enum af_vsock_g_ops */ > + unsigned int src_cid; > + unsigned int src_port; > + unsigned int dst_cid; > + unsigned int dst_port; > +}; > + > +struct af_vsockmon_hdr { > + unsigned short type; /* enum af_vosck_type */ > + struct af_vsockmon_g g_hdr; > + union { > + struct virtio_vsock_hdr virtio_hdr; > + } t_hdr; > +}; How does endianness work? virtio_hdr uses little-endian fields on the wire. I guess that af_vsockmon_g is always CPU-endian. > + > +enum af_vsockmon_type { > + AF_VSOCK_GENERIC = 1, /* No transport header */ > + AF_VSOCK_VIRTIO = 2, /* Addtional virtio transport header */ > +}; > + > +enum af_vsockmon_g_ops { > + AF_VSOCK_G_OP_UNKNOWN = 0, > + AF_VSOCK_G_OP_CONNECT = 1, > + AF_VSOCK_G_OP_DISCONNECT = 2, > + AF_VSOCK_G_OP_CONTROL = 3, > + AF_VSOCK_G_OP_PAYLOAD = 4, > +}; > + > +#endif > -- > 2.8.3 > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 2/3] vsockmon: Add vsockmon device 2016-06-01 21:15 ` Stefan Hajnoczi @ 2016-06-09 15:21 ` Gerard Garcia 2016-06-10 15:37 ` Stefan Hajnoczi 0 siblings, 1 reply; 13+ messages in thread From: Gerard Garcia @ 2016-06-09 15:21 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: netdev, jhansen On 06/01/2016 11:15 PM, Stefan Hajnoczi wrote: > On Sat, May 28, 2016 at 06:29:06PM +0200, ggarcia@abra.uab.cat wrote: >> From: Gerard Garcia <ggarcia@deic.uab.cat> >> >> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> >> --- >> drivers/net/Kconfig | 8 ++ >> drivers/net/Makefile | 1 + >> drivers/net/vsockmon.c | 171 ++++++++++++++++++++++++++++++++++++++++++ >> include/uapi/linux/Kbuild | 1 + >> include/uapi/linux/vsockmon.h | 37 +++++++++ >> 5 files changed, 218 insertions(+) >> create mode 100644 drivers/net/vsockmon.c >> create mode 100644 include/uapi/linux/vsockmon.h >> >> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig >> index 0c5415b..42c43b6 100644 >> --- a/drivers/net/Kconfig >> +++ b/drivers/net/Kconfig >> @@ -330,6 +330,14 @@ config NET_VRF >> This option enables the support for mapping interfaces into VRF's. The >> support enables VRF devices. >> >> +config VSOCKMON >> + tristate "Virtual vsock monitoring device" >> + depends on VHOST_VSOCK >> + ---help--- >> + This option enables a monitoring net device for vsock sockets. It is >> + mostly intended for developers or support to debug vsock issues. If >> + unsure, say N. >> + >> endif # NET_CORE >> >> config SUNGEM_PHY >> diff --git a/drivers/net/Makefile b/drivers/net/Makefile >> index 7336cbd..e2188d4 100644 >> --- a/drivers/net/Makefile >> +++ b/drivers/net/Makefile >> @@ -28,6 +28,7 @@ obj-$(CONFIG_GENEVE) += geneve.o >> obj-$(CONFIG_GTP) += gtp.o >> obj-$(CONFIG_NLMON) += nlmon.o >> obj-$(CONFIG_NET_VRF) += vrf.o >> +obj-$(CONFIG_VSOCKMON) += vsockmon.o >> >> # >> # Networking Drivers >> diff --git a/drivers/net/vsockmon.c b/drivers/net/vsockmon.c >> new file mode 100644 >> index 0000000..becddc9 >> --- /dev/null >> +++ b/drivers/net/vsockmon.c >> @@ -0,0 +1,171 @@ >> +#include <linux/module.h> >> +#include <linux/kernel.h> >> +#include <linux/if_arp.h> >> +#include <net/rtnetlink.h> >> +#include <net/sock.h> >> +#include <net/af_vsock.h> >> +#include <uapi/linux/vsockmon.h> >> + >> +/* Virtio transport max packet size plus header */ >> +#define DEFAULT_MTU 1024 * 64 + sizeof(struct af_vsockmon_hdr); >> + >> +struct pcpu_lstats { >> + u64 rx_packets; >> + u64 rx_bytes; >> + struct u64_stats_sync syncp; >> +}; >> + >> +static int vsockmon_dev_init(struct net_device *dev) >> +{ >> + dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats); >> + return dev->lstats == NULL ? -ENOMEM : 0; >> +} >> + >> +static void vsockmon_dev_uninit(struct net_device *dev) >> +{ >> + free_percpu(dev->lstats); >> +} >> + >> +struct vsockmon { >> + struct vsock_tap vt; >> +}; >> + >> +static int vsockmon_open(struct net_device *dev) >> +{ >> + struct vsockmon *vsockmon = netdev_priv(dev); >> + >> + vsockmon->vt.dev = dev; >> + vsockmon->vt.module = THIS_MODULE; >> + return vsock_add_tap(&vsockmon->vt); >> +} >> + >> +static int vsockmon_close(struct net_device *dev) { >> + struct vsockmon *vsockmon = netdev_priv(dev); >> + >> + return vsock_remove_tap(&vsockmon->vt); >> +} >> + >> +static netdev_tx_t vsockmon_xmit(struct sk_buff *skb, struct net_device *dev) >> +{ >> + int len = skb->len; >> + struct pcpu_lstats *stats = this_cpu_ptr(dev->lstats); >> + >> + u64_stats_update_begin(&stats->syncp); >> + stats->rx_bytes += len; >> + stats->rx_packets++; >> + u64_stats_update_end(&stats->syncp); >> + >> + dev_kfree_skb(skb); >> + >> + return NETDEV_TX_OK; >> +} >> + >> +static struct rtnl_link_stats64 * >> +vsockmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >> +{ >> + int i; >> + u64 bytes = 0, packets = 0; >> + >> + for_each_possible_cpu(i) { >> + const struct pcpu_lstats *vstats; >> + u64 tbytes, tpackets; >> + unsigned int start; >> + >> + vstats = per_cpu_ptr(dev->lstats, i); >> + >> + do { >> + start = u64_stats_fetch_begin_irq(&vstats->syncp); >> + tbytes = vstats->rx_bytes; >> + tpackets = vstats->rx_packets; >> + } while (u64_stats_fetch_retry_irq(&vstats->syncp, start)); >> + >> + packets += tpackets; >> + bytes += tbytes; >> + } >> + >> + stats->rx_packets = packets; >> + stats->tx_packets = 0; >> + >> + stats->rx_bytes = bytes; >> + stats->tx_bytes = 0; >> + >> + return stats; >> +} >> + >> +static int vsockmon_is_valid_mtu(int new_mtu) >> +{ >> + return new_mtu >= (int) sizeof(struct af_vsockmon_hdr); >> +} >> + >> +static int vsockmon_change_mtu(struct net_device *dev, int new_mtu) >> +{ >> + if (!vsockmon_is_valid_mtu(new_mtu)) >> + return -EINVAL; >> + >> + dev->mtu = new_mtu; >> + return 0; >> +} > I wonder if the mtu serves any purpose. What happens when you change it > from the default value? It does not happen anything as we don't consider the mtu anywhere. I think it is interesting to set it so monitoring programs can adjust their buffers and it may be interesting to be able to change it so if the transport changes the mtu (or if vmci support is added) it can be adjusted. >> + >> +static const struct net_device_ops vsockmon_ops = { >> + .ndo_init = vsockmon_dev_init, >> + .ndo_uninit = vsockmon_dev_uninit, >> + .ndo_open = vsockmon_open, >> + .ndo_stop = vsockmon_close, >> + .ndo_start_xmit = vsockmon_xmit, >> + .ndo_get_stats64 = vsockmon_get_stats64, >> + .ndo_change_mtu = vsockmon_change_mtu, >> +}; >> + >> +static u32 always_on(struct net_device *dev) >> +{ >> + return 1; >> +} >> + >> +static const struct ethtool_ops vsockmon_ethtool_ops = { >> + .get_link = always_on, >> +}; >> + >> +static void vsockmon_setup(struct net_device *dev) >> +{ >> + dev->type = ARPHRD_VSOCKMON; >> + dev->priv_flags |= IFF_NO_QUEUE; >> + >> + dev->netdev_ops = &vsockmon_ops; >> + dev->ethtool_ops = &vsockmon_ethtool_ops; >> + dev->destructor = free_netdev; >> + >> + dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | >> + NETIF_F_HIGHDMA | NETIF_F_LLTX; >> + >> + dev->flags = IFF_NOARP; >> + >> + dev->mtu = DEFAULT_MTU; >> +} >> + >> +static struct rtnl_link_ops vsockmon_link_ops __read_mostly = { >> + .kind = "vsockmon", >> + .priv_size = sizeof(struct vsockmon), >> + .setup = vsockmon_setup, >> +}; >> + >> +static __init int vsockmon_register(void) >> +{ >> + int ret = rtnl_link_register(&vsockmon_link_ops); >> + if (!ret) >> + vsock_init_tap(); >> + >> + return ret; >> +} >> + >> +static __exit void vsockmon_unregister(void) >> +{ >> + rtnl_link_unregister(&vsockmon_link_ops); >> +} >> + >> +module_init(vsockmon_register); >> +module_exit(vsockmon_unregister); >> + >> +MODULE_LICENSE("GPL v2"); >> +MODULE_AUTHOR("Gerard Garcia <ggarcia@deic.uab.cat>"); >> +MODULE_DESCRIPTION("Vsock monitoring device"); >> +MODULE_ALIAS_RTNL_LINK("vsockmon"); > there should be some mention that this is derived from nlmon.c. Ok, I'll add in the description that it is based on the nlmon code. >> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild >> index 5f047d2..b1836cc 100644 >> --- a/include/uapi/linux/Kbuild >> +++ b/include/uapi/linux/Kbuild >> @@ -454,6 +454,7 @@ header-y += virtio_scsi.h >> header-y += virtio_types.h >> header-y += virtio_vsock.h >> header-y += vm_sockets.h >> +header-y += vsockmon.h >> header-y += vt.h >> header-y += wait.h >> header-y += wanrouter.h >> diff --git a/include/uapi/linux/vsockmon.h b/include/uapi/linux/vsockmon.h >> new file mode 100644 >> index 0000000..c73166f >> --- /dev/null >> +++ b/include/uapi/linux/vsockmon.h >> @@ -0,0 +1,37 @@ >> +#ifndef _UAPI_VSOCKMON_H >> +#define _UAPI_VSOCKMON_H >> + >> +#include <linux/virtio_vsock.h> >> + >> +/* Packet structure of packets received from the vsockmon device. */ >> + >> +struct af_vsockmon_g { >> + unsigned short op; /* enum af_vsock_g_ops */ >> + unsigned int src_cid; >> + unsigned int src_port; >> + unsigned int dst_cid; >> + unsigned int dst_port; >> +}; >> + >> +struct af_vsockmon_hdr { >> + unsigned short type; /* enum af_vosck_type */ >> + struct af_vsockmon_g g_hdr; >> + union { >> + struct virtio_vsock_hdr virtio_hdr; >> + } t_hdr; >> +}; > How does endianness work? virtio_hdr uses little-endian fields on the > wire. I guess that af_vsockmon_g is always CPU-endian. Yes, af_vsockmon_g is CPU-endian. I don't know what criteria is normally used regarding the endianness of structs facing user space but I think it is better to not modify the vsock transport structs. >> + >> +enum af_vsockmon_type { >> + AF_VSOCK_GENERIC = 1, /* No transport header */ >> + AF_VSOCK_VIRTIO = 2, /* Addtional virtio transport header */ >> +}; >> + >> +enum af_vsockmon_g_ops { >> + AF_VSOCK_G_OP_UNKNOWN = 0, >> + AF_VSOCK_G_OP_CONNECT = 1, >> + AF_VSOCK_G_OP_DISCONNECT = 2, >> + AF_VSOCK_G_OP_CONTROL = 3, >> + AF_VSOCK_G_OP_PAYLOAD = 4, >> +}; >> + >> +#endif >> -- >> 2.8.3 >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 2/3] vsockmon: Add vsockmon device 2016-06-09 15:21 ` Gerard Garcia @ 2016-06-10 15:37 ` Stefan Hajnoczi 0 siblings, 0 replies; 13+ messages in thread From: Stefan Hajnoczi @ 2016-06-10 15:37 UTC (permalink / raw) To: Gerard Garcia; +Cc: netdev, jhansen [-- Attachment #1: Type: text/plain, Size: 1323 bytes --] On Thu, Jun 09, 2016 at 05:21:26PM +0200, Gerard Garcia wrote: > > > diff --git a/include/uapi/linux/vsockmon.h b/include/uapi/linux/vsockmon.h > > > new file mode 100644 > > > index 0000000..c73166f > > > --- /dev/null > > > +++ b/include/uapi/linux/vsockmon.h > > > @@ -0,0 +1,37 @@ > > > +#ifndef _UAPI_VSOCKMON_H > > > +#define _UAPI_VSOCKMON_H > > > + > > > +#include <linux/virtio_vsock.h> > > > + > > > +/* Packet structure of packets received from the vsockmon device. */ > > > + > > > +struct af_vsockmon_g { > > > + unsigned short op; /* enum af_vsock_g_ops */ > > > + unsigned int src_cid; > > > + unsigned int src_port; > > > + unsigned int dst_cid; > > > + unsigned int dst_port; > > > +}; > > > + > > > +struct af_vsockmon_hdr { > > > + unsigned short type; /* enum af_vosck_type */ > > > + struct af_vsockmon_g g_hdr; > > > + union { > > > + struct virtio_vsock_hdr virtio_hdr; > > > + } t_hdr; > > > +}; > > How does endianness work? virtio_hdr uses little-endian fields on the > > wire. I guess that af_vsockmon_g is always CPU-endian. > Yes, af_vsockmon_g is CPU-endian. I don't know what criteria is normally > used regarding the endianness of structs facing user space but I think it is > better to not modify the vsock transport structs. Endianness must to be documented in this uapi header file. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC 3/3] vsockmon: Add vsock hooks 2016-05-28 16:29 [RFC 0/3] vsockmon: virtual device to monitor AF_VSOCK sockets ggarcia 2016-05-28 16:29 ` [RFC 1/3] vsockmon: Add tap functions ggarcia 2016-05-28 16:29 ` [RFC 2/3] vsockmon: Add vsockmon device ggarcia @ 2016-05-28 16:29 ` ggarcia 2016-06-01 21:19 ` Stefan Hajnoczi 2 siblings, 1 reply; 13+ messages in thread From: ggarcia @ 2016-05-28 16:29 UTC (permalink / raw) To: netdev; +Cc: jhansen, stefanha, Gerard Garcia From: Gerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> --- drivers/vhost/vsock.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 17bfe4e..8fd5125 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -14,8 +14,10 @@ #include <net/sock.h> #include <linux/virtio_vsock.h> #include <linux/vhost.h> +#include <linux/skbuff.h> #include <net/af_vsock.h> +#include <uapi/linux/vsockmon.h> #include "vhost.h" #define VHOST_VSOCK_DEFAULT_HOST_CID 2 @@ -45,6 +47,67 @@ struct vhost_vsock { u32 guest_cid; }; +static struct sk_buff * +virtio_vsock_pkt_to_skb(struct virtio_vsock_pkt *pkt) +{ + struct sk_buff *skb; + struct af_vsockmon_hdr *hdr; + void *payload; + + u32 skb_len = sizeof(struct af_vsockmon_hdr) + pkt->len; + + skb = alloc_skb(skb_len, GFP_ATOMIC); + if (!skb) + return NULL; + + skb_reserve(skb, sizeof(struct af_vsockmon_hdr)); + + if (pkt->len) { + payload = skb_put(skb, pkt->len); + memcpy(payload, pkt->buf, pkt->len); + } + + hdr = (struct af_vsockmon_hdr *) skb_push(skb, sizeof(*hdr)); + hdr->type = AF_VSOCK_VIRTIO; + + switch(pkt->hdr.op) { + case VIRTIO_VSOCK_OP_REQUEST: + case VIRTIO_VSOCK_OP_RESPONSE: + hdr->g_hdr.op = AF_VSOCK_G_OP_CONNECT; + break; + case VIRTIO_VSOCK_OP_RST: + case VIRTIO_VSOCK_OP_SHUTDOWN: + hdr->g_hdr.op = AF_VSOCK_G_OP_DISCONNECT; + break; + case VIRTIO_VSOCK_OP_RW: + hdr->g_hdr.op = AF_VSOCK_G_OP_PAYLOAD; + break; + case VIRTIO_VSOCK_OP_CREDIT_UPDATE: + case VIRTIO_VSOCK_OP_CREDIT_REQUEST: + hdr->g_hdr.op = AF_VSOCK_G_OP_CONTROL; + break; + default: + hdr->g_hdr.op = AF_VSOCK_G_OP_UNKNOWN; + break; + } + + hdr->g_hdr.src_cid = pkt->hdr.src_cid; + hdr->g_hdr.src_port = pkt->hdr.src_port; + hdr->g_hdr.dst_cid = pkt->hdr.dst_cid; + hdr->g_hdr.dst_port = pkt->hdr.dst_port; + + hdr->t_hdr.virtio_hdr = pkt->hdr; + + return skb; +} + +static void vsock_deliver_tap_pkt(struct virtio_vsock_pkt *pkt) +{ + struct sk_buff *skb = virtio_vsock_pkt_to_skb(pkt); + if (skb) + vsock_deliver_tap(skb); +} + static u32 vhost_transport_get_local_cid(void) { return VHOST_VSOCK_DEFAULT_HOST_CID; @@ -147,6 +210,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, vsock->total_tx_buf -= pkt->len; + /* Deliver to monitoring devices all correctly transmitted + * packets. + */ + vsock_deliver_tap_pkt(pkt); + virtio_transport_free_pkt(pkt); } if (added) @@ -367,6 +435,9 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) continue; } + /* Deliver to monitoring devices all received packets */ + vsock_deliver_tap_pkt(pkt); + /* Only accept correctly addressed packets */ if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid) virtio_transport_recv_pkt(pkt); -- 2.8.3 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC 3/3] vsockmon: Add vsock hooks 2016-05-28 16:29 ` [RFC 3/3] vsockmon: Add vsock hooks ggarcia @ 2016-06-01 21:19 ` Stefan Hajnoczi 2016-06-09 15:27 ` Gerard Garcia 0 siblings, 1 reply; 13+ messages in thread From: Stefan Hajnoczi @ 2016-06-01 21:19 UTC (permalink / raw) To: ggarcia; +Cc: netdev, jhansen [-- Attachment #1: Type: text/plain, Size: 3638 bytes --] On Sat, May 28, 2016 at 06:29:07PM +0200, ggarcia@abra.uab.cat wrote: > From: Gerard Garcia <ggarcia@deic.uab.cat> > > Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> > --- > drivers/vhost/vsock.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 71 insertions(+) > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c > index 17bfe4e..8fd5125 100644 > --- a/drivers/vhost/vsock.c > +++ b/drivers/vhost/vsock.c > @@ -14,8 +14,10 @@ > #include <net/sock.h> > #include <linux/virtio_vsock.h> > #include <linux/vhost.h> > +#include <linux/skbuff.h> > > #include <net/af_vsock.h> > +#include <uapi/linux/vsockmon.h> > #include "vhost.h" > > #define VHOST_VSOCK_DEFAULT_HOST_CID 2 > @@ -45,6 +47,67 @@ struct vhost_vsock { > u32 guest_cid; > }; > > +static struct sk_buff * > +virtio_vsock_pkt_to_skb(struct virtio_vsock_pkt *pkt) > +{ > + struct sk_buff *skb; > + struct af_vsockmon_hdr *hdr; > + void *payload; > + > + u32 skb_len = sizeof(struct af_vsockmon_hdr) + pkt->len; > + > + skb = alloc_skb(skb_len, GFP_ATOMIC); > + if (!skb) > + return NULL; > + > + skb_reserve(skb, sizeof(struct af_vsockmon_hdr)); > + > + if (pkt->len) { > + payload = skb_put(skb, pkt->len); > + memcpy(payload, pkt->buf, pkt->len); > + } > + > + hdr = (struct af_vsockmon_hdr *) skb_push(skb, sizeof(*hdr)); > + hdr->type = AF_VSOCK_VIRTIO; > + > + switch(pkt->hdr.op) { > + case VIRTIO_VSOCK_OP_REQUEST: > + case VIRTIO_VSOCK_OP_RESPONSE: > + hdr->g_hdr.op = AF_VSOCK_G_OP_CONNECT; > + break; > + case VIRTIO_VSOCK_OP_RST: > + case VIRTIO_VSOCK_OP_SHUTDOWN: > + hdr->g_hdr.op = AF_VSOCK_G_OP_DISCONNECT; > + break; > + case VIRTIO_VSOCK_OP_RW: > + hdr->g_hdr.op = AF_VSOCK_G_OP_PAYLOAD; > + break; > + case VIRTIO_VSOCK_OP_CREDIT_UPDATE: > + case VIRTIO_VSOCK_OP_CREDIT_REQUEST: > + hdr->g_hdr.op = AF_VSOCK_G_OP_CONTROL; > + break; > + default: > + hdr->g_hdr.op = AF_VSOCK_G_OP_UNKNOWN; > + break; > + } > + > + hdr->g_hdr.src_cid = pkt->hdr.src_cid; > + hdr->g_hdr.src_port = pkt->hdr.src_port; > + hdr->g_hdr.dst_cid = pkt->hdr.dst_cid; > + hdr->g_hdr.dst_port = pkt->hdr.dst_port; Careful with endianness here. pkt->hdr uses little-endian fields. > + > + hdr->t_hdr.virtio_hdr = pkt->hdr; > + > + return skb; > +} > + > +static void vsock_deliver_tap_pkt(struct virtio_vsock_pkt *pkt) > +{ > + struct sk_buff *skb = virtio_vsock_pkt_to_skb(pkt); > + if (skb) > + vsock_deliver_tap(skb); vsock_deliver_tap() doesn't take ownership of skb so it is leaked here! In fact, given that the core vsock code is not skb-based, perhaps the tap code shouldn't clone at all. Just take ownership of the skb. This avoids the extra allocation and memcpy. > +} > + > static u32 vhost_transport_get_local_cid(void) > { > return VHOST_VSOCK_DEFAULT_HOST_CID; > @@ -147,6 +210,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, > > vsock->total_tx_buf -= pkt->len; > > + /* Deliver to monitoring devices all correctly transmitted > + * packets. > + */ > + vsock_deliver_tap_pkt(pkt); > + > virtio_transport_free_pkt(pkt); > } > if (added) > @@ -367,6 +435,9 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) > continue; > } > > + /* Deliver to monitoring devices all received packets */ > + vsock_deliver_tap_pkt(pkt); > + > /* Only accept correctly addressed packets */ > if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid) > virtio_transport_recv_pkt(pkt); > -- > 2.8.3 > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC 3/3] vsockmon: Add vsock hooks 2016-06-01 21:19 ` Stefan Hajnoczi @ 2016-06-09 15:27 ` Gerard Garcia 0 siblings, 0 replies; 13+ messages in thread From: Gerard Garcia @ 2016-06-09 15:27 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: netdev, jhansen On 06/01/2016 11:19 PM, Stefan Hajnoczi wrote: > On Sat, May 28, 2016 at 06:29:07PM +0200, ggarcia@abra.uab.cat wrote: >> From: Gerard Garcia <ggarcia@deic.uab.cat> >> >> Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat> >> --- >> drivers/vhost/vsock.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 71 insertions(+) >> >> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c >> index 17bfe4e..8fd5125 100644 >> --- a/drivers/vhost/vsock.c >> +++ b/drivers/vhost/vsock.c >> @@ -14,8 +14,10 @@ >> #include <net/sock.h> >> #include <linux/virtio_vsock.h> >> #include <linux/vhost.h> >> +#include <linux/skbuff.h> >> >> #include <net/af_vsock.h> >> +#include <uapi/linux/vsockmon.h> >> #include "vhost.h" >> >> #define VHOST_VSOCK_DEFAULT_HOST_CID 2 >> @@ -45,6 +47,67 @@ struct vhost_vsock { >> u32 guest_cid; >> }; >> >> +static struct sk_buff * >> +virtio_vsock_pkt_to_skb(struct virtio_vsock_pkt *pkt) >> +{ >> + struct sk_buff *skb; >> + struct af_vsockmon_hdr *hdr; >> + void *payload; >> + >> + u32 skb_len = sizeof(struct af_vsockmon_hdr) + pkt->len; >> + >> + skb = alloc_skb(skb_len, GFP_ATOMIC); >> + if (!skb) >> + return NULL; >> + >> + skb_reserve(skb, sizeof(struct af_vsockmon_hdr)); >> + >> + if (pkt->len) { >> + payload = skb_put(skb, pkt->len); >> + memcpy(payload, pkt->buf, pkt->len); >> + } >> + >> + hdr = (struct af_vsockmon_hdr *) skb_push(skb, sizeof(*hdr)); >> + hdr->type = AF_VSOCK_VIRTIO; >> + >> + switch(pkt->hdr.op) { >> + case VIRTIO_VSOCK_OP_REQUEST: >> + case VIRTIO_VSOCK_OP_RESPONSE: >> + hdr->g_hdr.op = AF_VSOCK_G_OP_CONNECT; >> + break; >> + case VIRTIO_VSOCK_OP_RST: >> + case VIRTIO_VSOCK_OP_SHUTDOWN: >> + hdr->g_hdr.op = AF_VSOCK_G_OP_DISCONNECT; >> + break; >> + case VIRTIO_VSOCK_OP_RW: >> + hdr->g_hdr.op = AF_VSOCK_G_OP_PAYLOAD; >> + break; >> + case VIRTIO_VSOCK_OP_CREDIT_UPDATE: >> + case VIRTIO_VSOCK_OP_CREDIT_REQUEST: >> + hdr->g_hdr.op = AF_VSOCK_G_OP_CONTROL; >> + break; >> + default: >> + hdr->g_hdr.op = AF_VSOCK_G_OP_UNKNOWN; >> + break; >> + } >> + >> + hdr->g_hdr.src_cid = pkt->hdr.src_cid; >> + hdr->g_hdr.src_port = pkt->hdr.src_port; >> + hdr->g_hdr.dst_cid = pkt->hdr.dst_cid; >> + hdr->g_hdr.dst_port = pkt->hdr.dst_port; > Careful with endianness here. pkt->hdr uses little-endian fields. True, I have not considered this. >> + >> + hdr->t_hdr.virtio_hdr = pkt->hdr; >> + >> + return skb; >> +} >> + >> +static void vsock_deliver_tap_pkt(struct virtio_vsock_pkt *pkt) >> +{ >> + struct sk_buff *skb = virtio_vsock_pkt_to_skb(pkt); >> + if (skb) >> + vsock_deliver_tap(skb); > vsock_deliver_tap() doesn't take ownership of skb so it is leaked here! > > In fact, given that the core vsock code is not skb-based, perhaps the > tap code shouldn't clone at all. Just take ownership of the skb. This > avoids the extra allocation and memcpy. > You are right! and yes, I agree that would be better to avoid the clone. As you say I can just take ownership of the skb before dev_queue_xmit() in vsock_deliver_tap() so it won't be free'd after the call and avoid the clone. >> +} >> + >> static u32 vhost_transport_get_local_cid(void) >> { >> return VHOST_VSOCK_DEFAULT_HOST_CID; >> @@ -147,6 +210,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, >> >> vsock->total_tx_buf -= pkt->len; >> >> + /* Deliver to monitoring devices all correctly transmitted >> + * packets. >> + */ >> + vsock_deliver_tap_pkt(pkt); >> + >> virtio_transport_free_pkt(pkt); >> } >> if (added) >> @@ -367,6 +435,9 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work) >> continue; >> } >> >> + /* Deliver to monitoring devices all received packets */ >> + vsock_deliver_tap_pkt(pkt); >> + >> /* Only accept correctly addressed packets */ >> if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid) >> virtio_transport_recv_pkt(pkt); >> -- >> 2.8.3 >> ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-06-14 12:05 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-05-28 16:29 [RFC 0/3] vsockmon: virtual device to monitor AF_VSOCK sockets ggarcia 2016-05-28 16:29 ` [RFC 1/3] vsockmon: Add tap functions ggarcia 2016-06-01 21:07 ` Stefan Hajnoczi 2016-06-09 15:02 ` Gerard Garcia 2016-06-10 15:44 ` Stefan Hajnoczi 2016-06-14 12:05 ` Jorgen S. Hansen 2016-05-28 16:29 ` [RFC 2/3] vsockmon: Add vsockmon device ggarcia 2016-06-01 21:15 ` Stefan Hajnoczi 2016-06-09 15:21 ` Gerard Garcia 2016-06-10 15:37 ` Stefan Hajnoczi 2016-05-28 16:29 ` [RFC 3/3] vsockmon: Add vsock hooks ggarcia 2016-06-01 21:19 ` Stefan Hajnoczi 2016-06-09 15:27 ` Gerard Garcia
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.