linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.4.27 Potential race in n_tty.c:write_chan()
@ 2004-12-05 20:54 Ryan Reading
  2004-12-05 22:40 ` Paul Fulghum
  0 siblings, 1 reply; 4+ messages in thread
From: Ryan Reading @ 2004-12-05 20:54 UTC (permalink / raw)
  To: linux-kernel

I have been staring at this for a while and it seems there is a
potential race condition in the write_chan() function in n_tty.c.  The
problem I have noticed is in tty USB drivers.  Here is a rough sketch
of the current write_chan().

static ssize_t write_chan()
{
...
add_wait_queue(&tty->write_wait, &wait);
while(1) {     
  set_current_state(TASK_INTERRUPTIBLE);

  ...  // <- Window for race

  c = tty->driver.write(tty, 1, b, nr);
  nr -= c
  if (!nr)
    break;	<-- schedule() isn't called if all data was written!

  schedule();
}
set_current_state(TASK_RUNNING);
remove_wait_queue(&tty->write_wait, &wait);
return;
}

So when write_chan() calls usb_driver::write(), typically the driver
calls usb_submit_urb().  The write() call then returns immediately
indicating that all of the data has been written (assuming it is less
than the USB packets size).  The driver however is still waiting for an
interrupt to complete the write and wakeup the current kernel path.  If
write_chan() is called again and the interrupt is received within the
window I outlined above, the current_state will be reset to TASK_RUNNING
before the next usb_driver::write() is ever called!  If this happens, it
seems that we would lose synchronisity and potentially lock the kernel
path.

It is also my understanding that the usb interrupt is generated from the
ACK/NAK of the original usb_submit_urb().  If the driver is returning
immediately without waiting on the interrupt and schedule() is never
being called, there is no guarantee that the write() happened
successfully (although we return that it has).  It seems if a driver
wanted to guarantee this, it would have to artificially wait of the
interrupt before returning.

Anyone have any thoughts on this?  Thanks.

-- Ryan


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 2.4.27 Potential race in n_tty.c:write_chan()
  2004-12-05 20:54 2.4.27 Potential race in n_tty.c:write_chan() Ryan Reading
@ 2004-12-05 22:40 ` Paul Fulghum
  2004-12-06  4:42   ` Ryan Reading
       [not found]   ` <1102307945.1876.27.camel@localhost>
  0 siblings, 2 replies; 4+ messages in thread
From: Paul Fulghum @ 2004-12-05 22:40 UTC (permalink / raw)
  To: Ryan Reading; +Cc: linux-kernel

On Sun, 2004-12-05 at 15:54 -0500, Ryan Reading wrote:
> So when write_chan() calls usb_driver::write(), typically the driver
> calls usb_submit_urb().  The write() call then returns immediately
> indicating that all of the data has been written (assuming it is less
> than the USB packets size).  The driver however is still waiting for an
> interrupt to complete the write and wakeup the current kernel path.  If
> write_chan() is called again and the interrupt is received within the
> window I outlined above, the current_state will be reset to TASK_RUNNING
> before the next usb_driver::write() is ever called!  If this happens, it
> seems that we would lose synchronisity and potentially lock the kernel
> path.

The line discipline write routine is serialized
on a per tty basis in do_tty_write() of tty_io.c
using tty->atomic_write semaphore, so you will not
reenter write_chan() for a particular tty instance.

Even if this were not the case, if the task state
changes to TASK_RUNNING inside the window
you describe, the only thing that happens is the loop
executes again. The driver must decide if it can accept
more data or not and return the appropriate value.

There is no potential for deadlock.

> It is also my understanding that the usb interrupt is generated from the
> ACK/NAK of the original usb_submit_urb().  If the driver is returning
> immediately without waiting on the interrupt and schedule() is never
> being called, there is no guarantee that the write() happened
> successfully (although we return that it has).  It seems if a driver
> wanted to guarantee this, it would have to artificially wait of the
> interrupt before returning.

True, but this is a matter of layering.

The line discipline knows nothing about the driver's concept
of write completion apart from the driver's write method
return value. If it is critical for the write not to complete
until the URB is sent, it is up to the driver to block
and return the appropriate return value.
 
--
Paul Fulghum
paulkf@microgate.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 2.4.27 Potential race in n_tty.c:write_chan()
  2004-12-05 22:40 ` Paul Fulghum
@ 2004-12-06  4:42   ` Ryan Reading
       [not found]   ` <1102307945.1876.27.camel@localhost>
  1 sibling, 0 replies; 4+ messages in thread
From: Ryan Reading @ 2004-12-06  4:42 UTC (permalink / raw)
  To: linux-kernel

On Sun, 2004-12-05 at 17:40, Paul Fulghum wrote:
> On Sun, 2004-12-05 at 15:54 -0500, Ryan Reading wrote:
> > So when write_chan() calls usb_driver::write(), typically the driver
> > calls usb_submit_urb().  The write() call then returns immediately
> > indicating that all of the data has been written (assuming it is
less
> > than the USB packets size).  The driver however is still waiting for
an
> > interrupt to complete the write and wakeup the current kernel path. 
If
> > write_chan() is called again and the interrupt is received within
the
> > window I outlined above, the current_state will be reset to
TASK_RUNNING
> > before the next usb_driver::write() is ever called!  If this
happens, it
> > seems that we would lose synchronisity and potentially lock the
kernel
> > path.
> 
> The line discipline write routine is serialized
> on a per tty basis in do_tty_write() of tty_io.c
> using tty->atomic_write semaphore, so you will not
> reenter write_chan() for a particular tty instance.

Per tty instance this is true, but this is not the case for multiple
userland::write() calls.

> Even if this were not the case, if the task state
> changes to TASK_RUNNING inside the window
> you describe, the only thing that happens is the loop
> executes again. The driver must decide if it can accept
> more data or not and return the appropriate value.
> 
> There is no potential for deadlock.

Yes this does seem to be true.  However, it can throw a driver into a
wierd state if this happens.  Ideally the driver should be able to
recover.  I'm not sure if we can guarantee that user_land protocol could
recover though, especially if write()s aren't guaranteed to complete
successfully.

> > It is also my understanding that the usb interrupt is generated from
the
> > ACK/NAK of the original usb_submit_urb().  If the driver is
returning
> > immediately without waiting on the interrupt and schedule() is never
> > being called, there is no guarantee that the write() happened
> > successfully (although we return that it has).  It seems if a driver
> > wanted to guarantee this, it would have to artificially wait of the
> > interrupt before returning.
> 
> True, but this is a matter of layering.
> 
> The line discipline knows nothing about the driver's concept
> of write completion apart from the driver's write method
> return value. If it is critical for the write not to complete
> until the URB is sent, it is up to the driver to block
> and return the appropriate return value.

I guess my biggest concern here is the definition of the
tty_driver::write() interface.  Should a driver be able to rely on the
schedule() call to complete its write?  It seems that since the driver
is responsible for waking up the tty->write_wait, it should be able to
rely on the schedule() call.  However in this implementation it cannot
which I'm sure is for performance reasons.  Because of this, I don't
think the USB layer should be interacting with the tty layer in this
fashion.

So for this implementation of write_chan(), the tty_driver::write()
interface should note that the driver needs only to wakeup the
tty->write_wait iff it does not write all of the requested data in this
call.  Is this a fair assessment and can this be depended on for future
implementations?

Anyway, part of this discussion needs to happen on the USB mailing list
since the usb-skeleton driver outlines this method of interacting with
the tty layer.  Given the current implementation I don't think the
usb-skeleton is correct.

Thanks for your help.

-- Ryan


> --
> Paul Fulghum
> paulkf@microgate.com
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 2.4.27 Potential race in n_tty.c:write_chan()
       [not found]   ` <1102307945.1876.27.camel@localhost>
@ 2004-12-06 14:15     ` Paul Fulghum
  0 siblings, 0 replies; 4+ messages in thread
From: Paul Fulghum @ 2004-12-06 14:15 UTC (permalink / raw)
  To: Ryan Reading; +Cc: Linux Kernel Mailing List

Ryan Reading wrote:
> Per tty instance this is true, but this is not the case for multiple
> userland::write() calls.

If the write calls are accessing different
tty instances there is no interaction
between the write calls.

> Ideally the driver should be able to
> recover.  I'm not sure if we can guarantee that user_land protocol could
> recover though, especially if write()s aren't guaranteed to complete
> successfully.

I don't see that there is anything for the driver
to recover from. The driver sees a series of
write method calls and blocks, processes, or
rejects them as appropriate.

> Should a driver be able to rely on the
> schedule() call to complete its write?  It seems that since the driver
> is responsible for waking up the tty->write_wait, it should be able to
> rely on the schedule() call.  However in this implementation it cannot
> which I'm sure is for performance reasons.  Because of this, I don't
> think the USB layer should be interacting with the tty layer in this
> fashion.

The driver should not require anything from the caller
after the write method has returned. The driver waking
a caller is a service *to* the caller *if*
the caller is interested (on a wait queue).

The only purpose of the schedule() call is for the
caller to temporarily yield execution while waiting
on write_wait. If the caller doesn't care, such as when
the driver indicates all data accepted, then there
is no reason to call schedule(). In this case, the
break is hit and the caller removes itself from the wait queue.

> So for this implementation of write_chan(), the tty_driver::write()
> interface should note that the driver needs only to wakeup the
> tty->write_wait iff it does not write all of the requested data in this
> call.  Is this a fair assessment and can this be depended on for future
> implementations?

The write_wait queue wakes code blocked in the write path
due to a component (ldisc, driver, etc) being busy
so it can try again. It is not an indication that a particular
write call completed successfully.
It is a write throttling mechanism. When a waiting
process is woken, there is no guarantee that *all*
components in the write path are ready for more data.
The process must observe return codes and wait
again if so indicated.

Short of a major rewrite of the tty code,
this behavior will be stable.

-- 
Paul Fulghum
paulkf@microgate.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-12-06 14:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-05 20:54 2.4.27 Potential race in n_tty.c:write_chan() Ryan Reading
2004-12-05 22:40 ` Paul Fulghum
2004-12-06  4:42   ` Ryan Reading
     [not found]   ` <1102307945.1876.27.camel@localhost>
2004-12-06 14:15     ` Paul Fulghum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).