From: "Michael S. Tsirkin" <mst@redhat.com>
To: Krishna Kumar2 <krkumar2@in.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	Carsten Otte <cotte@de.ibm.com>,
	habanero@linux.vnet.ibm.com,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	kvm@vger.kernel.org, lguest@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
	linux390@de.ibm.com, netdev@vger.kernel.org,
	Rusty Russell <rusty@rustcorp.com.au>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	steved@us.ibm.com, Tom Lendacky <tahm@linux.vnet.ibm.com>,
	virtualization@lists.linux-foundation.org,
	Shirley Ma <xma@us.ibm.com>
Subject: Re: [PATCHv2 10/14] virtio_net: limit xmit polling
Date: Tue, 24 May 2011 14:29:39 +0300	[thread overview]
Message-ID: <20110524112901.GB17087@redhat.com> (raw)
In-Reply-To: <OF69E520FD.340352AC-ON6525789A.003308A2-6525789A.0033F2DF@in.ibm.com>

On Tue, May 24, 2011 at 02:57:43PM +0530, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 05/24/2011 02:42:55 PM:
> 
> > > > > To do this properly, we should really be using the actual number of
> > > > > sg elements needed, but we'd have to do most of xmit_skb beforehand
> > > > > so we know how many.
> > > > >
> > > > > Cheers,
> > > > > Rusty.
> > > >
> > > > Maybe I'm confused here.  The problem isn't the failing
> > > > add_buf for the given skb IIUC.  What we are trying to do here is
> > > > stop the queue *before xmit_skb fails*. We can't look at the
> > > > number of fragments in the current skb - the next one can be
> > > > much larger.  That's why we check capacity after xmit_skb,
> > > > not before it, right?
> > >
> > > Maybe Rusty means it is a simpler model to free the amount
> > > of space that this xmit needs. We will still fail anyway
> > > at some point, but it is unlikely, since the earlier iteration
> > > freed up at least the space that it was going to use.
> >
> > Not sure I understand.  We can't know space was freed in the previous
> > iteration, as the buffers might not have been used by then.
> 
> Yes, the first few iterations may not have freed up space, but
> later ones should. The amount of free space should increase
> from then on, especially since we try to free double of what
> we consume.

Hmm. That is only an upper limit on the number of entries in the queue.
Assume the vq size is 4 and we transmit 4 entries without
getting anything back in the used ring. The next transmit will fail.

So I don't really see why it's unlikely that we reach the packet
drop code with your patch.
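
To illustrate, here is a minimal user-space sketch of that scenario
(free_old() and add_buf() are hypothetical stand-ins, not the virtio API):
with a 4-entry ring and a host that consumes nothing, asking to free
"double of what we consume" reclaims nothing, and the fifth transmit still
hits the drop path.

/* Hypothetical user-space model, not kernel code: a 4-entry tx ring
 * where the host never completes anything, so the used ring stays
 * empty and nothing can be reclaimed before the next add. */
#include <stdio.h>

#define VQ_SIZE 4

static int in_flight;          /* descriptors posted by the guest */
static int used;               /* descriptors completed by the host */

static int free_old(int want)  /* stand-in for free_old_xmit_skbs() */
{
        int freed = 0;

        while (freed < want && used > 0) {
                used--;
                in_flight--;
                freed++;
        }
        return freed;          /* 0 while the used ring is empty */
}

static int add_buf(void)       /* stand-in for adding the skb to the vq */
{
        if (in_flight >= VQ_SIZE)
                return -1;     /* ring full: the drop path above */
        in_flight++;
        return 0;
}

int main(void)
{
        for (int i = 0; i < 5; i++) {
                free_old(2);   /* "free double of what we consume" */
                if (add_buf() < 0)
                        printf("xmit %d fails, ring full\n", i);
                else
                        printf("xmit %d queued, in flight %d\n", i, in_flight);
        }
        return 0;
}

Run as-is, the first four transmits queue and the fifth fails, which is
exactly the case the patch has to handle.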

> > > The
> > > code could become much simpler:
> > >
> > > start_xmit()
> > > {
> > >         num_sgs = get num_sgs for this skb;
> > >
> > >         /* Free enough pending old buffers to enable queueing this one */
> > >         free_old_xmit_skbs(vi, num_sgs * 2);     /* ?? */
> > >
> > >         if (virtqueue_get_capacity() < num_sgs) {
> > >                 netif_stop_queue(dev);
> > >                 if (virtqueue_enable_cb_delayed(vi->svq) ||
> > >                     free_old_xmit_skbs(vi, num_sgs)) {
> > >                         /* Nothing freed up, or not enough freed up */
> > >                         kfree_skb(skb);
> > >                         return NETDEV_TX_OK;
> >
> > This packet drop is what we wanted to avoid.
> 
> Please see below on returning NETDEV_TX_BUSY.
> 
> >
> > >                 }
> > >                 netif_start_queue(dev);
> > >                 virtqueue_disable_cb(vi->svq);
> > >         }
> > >
> > >         /* xmit_skb cannot fail now, also pass 'num_sgs' */
> > >         xmit_skb(vi, skb, num_sgs);
> > >         virtqueue_kick(vi->svq);
> > >
> > >         skb_orphan(skb);
> > >         nf_reset(skb);
> > >
> > >         return NETDEV_TX_OK;
> > > }
> > >
> > > We could even return TX_BUSY since that makes the dequeue
> > > code more efficient. See dev_dequeue_skb() - you can skip a
> > > lot of code (and avoid taking locks) that checks whether the
> > > queue is already stopped, but that code runs only if you
> > > returned TX_BUSY in the earlier iteration.
> > >
> > > BTW, shouldn't the check in start_xmit be:
> > >    if (likely(!free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
> > >       ...
> > >    }
> > >
> > > Thanks,
> > >
> > > - KK
> >
> > I thought we used to do basically this but other devices moved to a
> > model where they stop *before* queueing fails, so we did too.
> 
> I am not sure why it was changed, since returning TX_BUSY
> seems more efficient IMHO.
> qdisc_restart() handles requeued
> packets much better than a stopped queue, as a significant
> part of this code is skipped if gso_skb is present

I think this is the argument:
http://www.mail-archive.com/virtualization@lists.linux-foundation.org/msg06364.html


> (qdisc
> will eventually start dropping packets when tx_queue_len is
> exceeded anyway).
> 
> Thanks,
> 
> - KK

tx_queue_len is a pretty large buffer, so maybe not.
I think the packet drops from the scheduler queue can also be
done intelligently (e.g. with CHOKe), which should
work better than dropping a random packet?
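
For reference, a toy sketch of the CHOKe idea (flow ids and the queue
length are made up; this is not the kernel's sch_choke): on congestion,
compare the arriving packet against a randomly chosen queued packet and
drop both if they belong to the same flow, so the flow hogging the queue
is the one most likely to lose packets.

#include <stdio.h>
#include <stdlib.h>

#define QLEN 8

static int queue[QLEN];        /* each entry is just a flow id */
static int qlen;

static void enqueue(int flow)
{
        if (qlen == QLEN) {
                int victim = rand() % qlen;

                if (queue[victim] == flow) {
                        /* same flow as the arrival: drop both packets */
                        queue[victim] = queue[--qlen];
                        printf("flow %d: matched drop\n", flow);
                } else {
                        /* otherwise fall back to dropping the arrival */
                        printf("flow %d: tail drop\n", flow);
                }
                return;
        }
        queue[qlen++] = flow;
}

int main(void)
{
        srand(1);
        /* flow 1 floods the queue, flow 2 sends the odd packet */
        for (int i = 0; i < 20; i++)
                enqueue(i % 5 ? 1 : 2);
        return 0;
}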

-- 
MST
