* subtle change in behavior with tun driver
From: Ani Sinha @ 2015-01-21 22:36 UTC
  To: mst, netdev

Hi guys:

Commit 5d097109257c03 ("tun: only queue packets on device") seems to
have introduced a subtle change in behavior in the tun driver in the
default (non-IFF_ONE_QUEUE) case. Previously, when the queues got full
and the socket's sk_wmem_alloc eventually exceeded the sk_sndbuf value,
the user was given feedback in the form of EAGAIN from sendto()
etc. That way, the user could retry sending the packet.
Unfortunately, with this new default single-queue mode, the driver
silently drops the packet when the device queue is full without giving
userland any feedback. This makes it appear to userland as though the
packet was transmitted successfully. It seems there is a semantic
change in the driver with this commit.
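To illustrate what senders used to rely on (a minimal sketch, not code
from any real application; sock, buf, len and dst are placeholders, and
it assumes a non-blocking socket whose traffic is routed over the tun
interface; needs <errno.h>, <poll.h>, <sys/socket.h>, <netinet/in.h>):

	/* retry on the EAGAIN that the old tun behavior propagated back */
	static ssize_t send_with_retry(int sock, const void *buf, size_t len,
				       const struct sockaddr_in *dst)
	{
		for (;;) {
			ssize_t n = sendto(sock, buf, len, 0,
					   (const struct sockaddr *)dst,
					   sizeof(*dst));
			if (n >= 0 || errno != EAGAIN)
				return n;
			/* queues full: wait until writable, then try again */
			struct pollfd p = { .fd = sock, .events = POLLOUT };
			poll(&p, 1, -1);
		}
	}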

If the receiving process gets stuck for a short interval, is unable
to drain packets, and then resumes, one might see unexplained packet
drops in the kernel without getting any error back on the sender's
side. It kind of feels wrong.

Any thoughts?

Ani


* Re: subtle change in behavior with tun driver
From: Michael S. Tsirkin @ 2015-01-22  9:53 UTC
  To: Ani Sinha; +Cc: netdev

On Wed, Jan 21, 2015 at 02:36:17PM -0800, Ani Sinha wrote:
> Hi guys:
> 
> Commit 5d097109257c03 ("tun: only queue packets on device") seems to
> have introduced a subtle change in behavior in the tun driver in the
> default (non-IFF_ONE_QUEUE) case. Previously, when the queues got full
> and the socket's sk_wmem_alloc eventually exceeded the sk_sndbuf value,
> the user was given feedback in the form of EAGAIN from sendto()
> etc. That way, the user could retry sending the packet.

This behaviour is common, but by no means guaranteed.
For example, if the socket buffer is large enough,
packets are small enough, or multiple sockets are
transmitting through tun, packets would previously
accumulate in the qdisc, followed by packet drops
without EAGAIN.

> Unfortunately, with this new default single-queue mode, the driver
> silently drops the packet when the device queue is full without giving
> userland any feedback. This makes it appear to userland as though the
> packet was transmitted successfully. It seems there is a semantic
> change in the driver with this commit.
> 
> If the receiving process gets stuck for a short interval, is unable
> to drain packets, and then resumes, one might see unexplained packet
> drops in the kernel without getting any error back on the sender's
> side. It kind of feels wrong.
> 
> Any thoughts?
> 
> Ani

Unfortunately, since it's pretty common for unprivileged userspace to
drive the tun device, blocking the queue indefinitely as was done
previously leads to deadlocks for some apps; this was deemed worse than
some performance degradation.

As a simple work-around, if you want packets to accumulate in the qdisc,
it's easy to implement by using a non-work-conserving qdisc.
Set the limits to match the speed at which your application
is able to consume the packets.
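Something along these lines should do it (an untested sketch; "tun0"
and the rate/latency numbers are placeholders to tune for your reader):

	# tbf is non-work-conserving: excess packets wait in the qdisc
	# instead of being pushed into tun's queue and dropped there
	tc qdisc add dev tun0 root tbf rate 10mbit burst 10kb latency 70ms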

I've been thinking about using some kind of watchdog to
make it safe to put the old non-IFF_ONE_QUEUE semantics back;
unfortunately, since the application can be consuming packets at the
same time, it's not trivial to do in a non-racy way.

-- 
MST


* Re: subtle change in behavior with tun driver
From: Ani Sinha @ 2015-01-23  0:28 UTC
  To: Michael S. Tsirkin, fruggeri; +Cc: netdev

On Thu, Jan 22, 2015 at 1:53 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Jan 21, 2015 at 02:36:17PM -0800, Ani Sinha wrote:
>> Hi guys:
>>
>> Commit 5d097109257c03 ("tun: only queue packets on device") seems to
>> have introduced a subtle change in behavior in the tun driver in the
>> default (non-IFF_ONE_QUEUE) case. Previously, when the queues got full
>> and the socket's sk_wmem_alloc eventually exceeded the sk_sndbuf value,
>> the user was given feedback in the form of EAGAIN from sendto()
>> etc. That way, the user could retry sending the packet.
>
> This behaviour is common, but by no means guaranteed.
> For example, if the socket buffer is large enough,
> packets are small enough, or multiple sockets are
> transmitting through tun, packets would previously
> accumulate in the qdisc, followed by packet drops
> without EAGAIN.

Ah, I see. pfifo_fast_enqueue() also starts dropping packets when its
queue length exceeds a threshold. So I suppose we do not have a strong
argument for bringing back the old semantics, because it wasn't
guaranteed in every scenario in the first place.
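For reference, the enqueue path is roughly (paraphrased from
net/sched/sch_generic.c of that era, simplified to a single band, not
a verbatim copy):

	/* cap the queue at the device's tx_queue_len; anything beyond
	 * that is dropped without any error reaching the sender */
	static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc)
	{
		if (skb_queue_len(&qdisc->q) < qdisc_dev(qdisc)->tx_queue_len)
			return __qdisc_enqueue_tail(skb, qdisc, &qdisc->q);

		return qdisc_drop(skb, qdisc);
	}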

>
>> Unfortunately, with this new default single-queue mode, the driver
>> silently drops the packet when the device queue is full without giving
>> userland any feedback. This makes it appear to userland as though the
>> packet was transmitted successfully. It seems there is a semantic
>> change in the driver with this commit.
>>
>> If the receiving process gets stuck for a short interval, is unable
>> to drain packets, and then resumes, one might see unexplained packet
>> drops in the kernel without getting any error back on the sender's
>> side. It kind of feels wrong.
>>
>> Any thoughts?
>>
>> Ani
>
> Unfortunately, since it's pretty common for unprivileged userspace to
> drive the tun device, blocking the queue indefinitely as was done
> previously leads to deadlocks for some apps; this was deemed worse than
> some performance degradation.

Agreed. However, applications that need protection from deadlock can
explicitly set IFF_ONE_QUEUE mode and live with the fact that they
might see dropped packets.
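For the record, opting in looks roughly like this (a minimal sketch
with error handling omitted; "tun0" is a placeholder):

	#include <fcntl.h>
	#include <string.h>
	#include <sys/ioctl.h>
	#include <linux/if.h>
	#include <linux/if_tun.h>

	int open_one_queue_tun(void)
	{
		struct ifreq ifr;
		int fd = open("/dev/net/tun", O_RDWR);

		memset(&ifr, 0, sizeof(ifr));
		/* IFF_ONE_QUEUE: queue on the device only; accept that
		 * overflow means silent drops rather than backpressure */
		ifr.ifr_flags = IFF_TUN | IFF_NO_PI | IFF_ONE_QUEUE;
		strncpy(ifr.ifr_name, "tun0", IFNAMSIZ);
		ioctl(fd, TUNSETIFF, &ifr);
		return fd;
	}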

>
> As a simple work-around, if you want packets to accumulate in the qdisc,
> it's easy to implement by using a non-work-conserving qdisc.
> Set the limits to match the speed at which your application
> is able to consume the packets.
>
> I've been thinking about using some kind of watchdog to
> make it safe to put the old non-IFF_ONE_QUEUE semantics back;
> unfortunately, since the application can be consuming packets at the
> same time, it's not trivial to do in a non-racy way.

I do not have an answer to this either. I agree that it is a hard
problem to solve.


* Re: subtle change in behavior with tun driver
From: Michael S. Tsirkin @ 2015-01-23  7:38 UTC
  To: Ani Sinha; +Cc: fruggeri, netdev

On Thu, Jan 22, 2015 at 04:28:09PM -0800, Ani Sinha wrote:
> On Thu, Jan 22, 2015 at 1:53 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, Jan 21, 2015 at 02:36:17PM -0800, Ani Sinha wrote:
> >> Hi guys:
> >>
> >> Commit 5d097109257c03 ("tun: only queue packets on device") seems to
> >> have introduced a subtle change in behavior in the tun driver in the
> >> default (non-IFF_ONE_QUEUE) case. Previously, when the queues got full
> >> and the socket's sk_wmem_alloc eventually exceeded the sk_sndbuf value,
> >> the user was given feedback in the form of EAGAIN from sendto()
> >> etc. That way, the user could retry sending the packet.
> >
> > This behaviour is common, but by no means guaranteed.
> > For example, if the socket buffer is large enough,
> > packets are small enough, or multiple sockets are
> > transmitting through tun, packets would previously
> > accumulate in the qdisc, followed by packet drops
> > without EAGAIN.
> 
> Ah, I see. pfifo_fast_enqueue() also starts dropping packets when its
> queue length exceeds a threshold. So I suppose we do not have a strong
> argument for bringing back the old semantics, because it wasn't
> guaranteed in every scenario in the first place.
> 
> >
> >> Unfortunately, with this new default single-queue mode, the driver
> >> silently drops the packet when the device queue is full without giving
> >> userland any feedback. This makes it appear to userland as though the
> >> packet was transmitted successfully. It seems there is a semantic
> >> change in the driver with this commit.
> >>
> >> If the receiving process gets stuck for a short interval, is unable
> >> to drain packets, and then resumes, one might see unexplained packet
> >> drops in the kernel without getting any error back on the sender's
> >> side. It kind of feels wrong.
> >>
> >> Any thoughts?
> >>
> >> Ani
> >
> > Unfortunately, since it's pretty common for unprivileged userspace to
> > drive the tun device, blocking the queue indefinitely as was done
> > previously leads to deadlocks for some apps; this was deemed worse than
> > some performance degradation.
> 
> Agreed. However, applications that need protection from deadlock can
> explicitly set IFF_ONE_QUEUE mode and live with the fact that they
> might see dropped packets.

This doesn't work, because applications are just using the Linux
networking stack; packets end up in tun. It's up to tun not to
break the stack, and we can't change all the applications out there.

> >
> > As a simple work-around, if you want packets to accumulate in the qdisc,
> > it's easy to implement by using a non-work-conserving qdisc.
> > Set the limits to match the speed at which your application
> > is able to consume the packets.
> >
> > I've been thinking about using some kind of watchdog to
> > make it safe to put the old non-IFF_ONE_QUEUE semantics back;
> > unfortunately, since the application can be consuming packets at the
> > same time, it's not trivial to do in a non-racy way.
> 
> I do not have an answer to this either. I agree that it is a hard
> problem to solve.

