From: Pavel Fedin
Subject: Re: [RFC 0/5] virtio support for container
Date: Thu, 31 Dec 2015 18:39:22 +0300
Message-ID: <000301d143e1$6c5c0e90$45142bb0$@samsung.com>
References: <002a01d142e6$fbfeb4e0$f3fc1ea0$@samsung.com> <002401d143af$38a6fa60$a9f4ef20$@samsung.com> <002c01d143b7$568aace0$03a006a0$@samsung.com>
To: "'Tan, Jianfeng'", dev@dpdk.org

 Hello!

 Last minute note. I have found the problem, but I have no time to research it further and fix it.
 It happens because ovs first creates the device, starts it, then stops it and reconfigures the
queues. The second queue allocation happens from within netdev_set_multiq(). Then ovs restarts
the device and proceeds to actually use it. But the queues are not initialized properly in DPDK
after the second allocation, because of this check in the driver:

	/* On restart after stop do not touch queues */
	if (hw->started)
		return 0;

 It prevents virtio_dev_rxtx_start() from being called, which should in turn call
virtio_dev_vring_start(), which calls vring_init(). So VIRTQUEUE_NUSED() crashes, because
vq->vq_ring contains only NULLs.

 See you all after the 10th. And happy New Year again!

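 The sequence above can be reduced to a toy model. This is only a sketch under stated
assumptions, not driver code: every toy_* name is hypothetical, the structures are not the real
DPDK types, and it assumes that stopping the device leaves the "started" flag set, as the quoted
comment suggests. The "guard" parameter exists only so that both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures (not the real DPDK types). */
struct toy_ring  { int num; };                /* plays the role of the vring      */
struct toy_queue { struct toy_ring *ring; };  /* plays the role of the virtqueue  */
struct toy_hw    { bool started; struct toy_queue *vq; };

static void toy_queue_release(struct toy_hw *hw)
{
    if (hw->vq != NULL) {
        free(hw->vq->ring);
        free(hw->vq);
        hw->vq = NULL;
    }
}

/* Queue setup: allocates a fresh queue whose ring is not initialized yet. */
static void toy_queue_setup(struct toy_hw *hw)
{
    toy_queue_release(hw);
    hw->vq = calloc(1, sizeof(*hw->vq));
    assert(hw->vq != NULL);
}

/* Ring initialization: the step that the early return skips. */
static void toy_ring_init(struct toy_queue *vq)
{
    free(vq->ring);
    vq->ring = calloc(1, sizeof(*vq->ring));
    assert(vq->ring != NULL);
    vq->ring->num = 256;
}

/* Device start; with guard == true it mimics the check quoted above. */
static int toy_dev_start(struct toy_hw *hw, bool guard)
{
    /* On restart after stop do not touch queues */
    if (guard && hw->started)
        return 0;
    toy_ring_init(hw->vq);
    hw->started = true;
    return 0;
}

/* Assumption: stop does not clear hw->started. */
static void toy_dev_stop(struct toy_hw *hw)
{
    (void)hw;
}

/* Replays: setup, start, stop, second setup, restart.
 * Returns true if the ring of the current queue is usable afterwards. */
static bool toy_replay_sequence(bool guard)
{
    struct toy_hw hw = { .started = false, .vq = NULL };
    bool usable;

    toy_queue_setup(&hw);
    toy_dev_start(&hw, guard);
    toy_dev_stop(&hw);
    toy_queue_setup(&hw);          /* second allocation, ring is NULL again */
    toy_dev_start(&hw, guard);     /* with the guard, this returns early    */

    usable = (hw.vq->ring != NULL);
    toy_queue_release(&hw);
    return usable;
}
```

 With the guard enabled, toy_replay_sequence() reports an unusable ring after the restart;
without it, the second start initializes the newly allocated queue.
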
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

> -----Original Message-----
> From: Pavel Fedin [mailto:p.fedin@samsung.com]
> Sent: Thursday, December 31, 2015 4:47 PM
> To: 'Tan, Jianfeng'; 'dev@dpdk.org'
> Subject: RE: [dpdk-dev] [RFC 0/5] virtio support for container
>
>  Hello!
>
> > > a) ovs_in_container does not send VHOST_USER_SET_MEM_TABLE
> > Please check if rte_eth_dev_start() is called.
> > (rte_eth_dev_start -> virtio_dev_start -> vtpci_reinit_complete -> kick_all_vq)
>
>  I've figured out what happened, and it's my fault only :( I have modified your patchset and
> added --shared-mem option. And forgot to specify it to gdb :) Without it memory is not shared,
> and rte_memseg_info_get() returned fd = -1. And if you put it into control message for
> sendmsg(), you get your -EBADF.
>  So please ignore this.
>  But, nevertheless, ovs in container still dies with:
> --- cut ---
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fff97fff700 (LWP 3866)]
> virtio_recv_mergeable_pkts (rx_queue=0x7fffd46a9a80, rx_pkts=0x7fff97ffe850, nb_pkts=32) at
> /home/p.fedin/dpdk/drivers/net/virtio/virtio_rxtx.c:683
> 683	/home/p.fedin/dpdk/drivers/net/virtio/virtio_rxtx.c: No such file or directory.
> Missing separate debuginfos, use: dnf debuginfo-install keyutils-libs-1.5.9-7.fc23.x86_64
> krb5-libs-1.13.2-11.fc23.x86_64 libcap-ng-0.7.7-2.fc23.x86_64
> libcom_err-1.42.13-3.fc23.x86_64 libselinux-2.4-4.fc23.x86_64
> openssl-libs-1.0.2d-2.fc23.x86_64 pcre-8.37-4.fc23.x86_64 zlib-1.2.8-9.fc23.x86_64
> (gdb) where
> #0  virtio_recv_mergeable_pkts (rx_queue=0x7fffd46a9a80, rx_pkts=0x7fff97ffe850, nb_pkts=32)
> at /home/p.fedin/dpdk/drivers/net/virtio/virtio_rxtx.c:683
> #1  0x0000000000669ee8 in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7fff97ffe850, queue_id=0,
> port_id=0 '\000') at /home/p.fedin/dpdk/build/include/rte_ethdev.h:2510
> #2  netdev_dpdk_rxq_recv (rxq_=, packets=0x7fff97ffe850, c=0x7fff97ffe84c) at
> lib/netdev-dpdk.c:1033
> #3  0x00000000005e8ca1 in netdev_rxq_recv (rx=,
> buffers=buffers@entry=0x7fff97ffe850, cnt=cnt@entry=0x7fff97ffe84c) at lib/netdev.c:654
> #4  0x00000000005cb338 in dp_netdev_process_rxq_port (pmd=pmd@entry=0x7fffac7f8010,
> rxq=, port=, port=) at lib/dpif-netdev.c:2510
> #5  0x00000000005cc649 in pmd_thread_main (f_=0x7fffac7f8010) at lib/dpif-netdev.c:2671
> #6  0x0000000000628424 in ovsthread_wrapper (aux_=) at lib/ovs-thread.c:340
> #7  0x00007ffff70f660a in start_thread () from /lib64/libpthread.so.0
> #8  0x00007ffff6926bbd in clone () from /lib64/libc.so.6
> (gdb)
> --- cut ---
>
>  and l2fwd does not reproduce this. So, let's wait until 11.01.2016. And happy New Year to
> everybody who reads it (and who doesn't) :)
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
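
 On the -EBADF mentioned in the quoted message: the effect of putting fd = -1 into the control
message can be reproduced outside of DPDK. The following is a minimal sketch, assuming Linux and
an AF_UNIX socket pair; send_one_fd() and try_pass_fd() are hypothetical helpers written for this
illustration and are not part of the patchset or of DPDK.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends one payload byte plus one descriptor as
 * SCM_RIGHTS ancillary data. Returns 0 on success, else the errno value. */
static int send_one_fd(int sock, int fd_to_pass)
{
    char payload = 'x';
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;              /* forces correct alignment */
    } control;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    if (sendmsg(sock, &msg, 0) < 0)
        return errno;
    return 0;
}

/* Hypothetical helper: tries to pass fd_to_pass over a fresh socket pair. */
static int try_pass_fd(int fd_to_pass)
{
    int sv[2];
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return errno;
    rc = send_one_fd(sv[0], fd_to_pass);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

 Calling try_pass_fd(-1) is expected to fail with EBADF, because the kernel validates each
descriptor in the control message before sending, while a valid descriptor passes through.
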