* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2005-01-06 14:55 Tim_T_Murphy
  0 siblings, 0 replies; 22+ messages in thread
From: Tim_T_Murphy @ 2005-01-06 14:55 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel

sorry for the huge delay since my last post on this, but disabling
low_latency is resulting in dropped characters.

this looks to be exactly what was reported in
http://www.uwsg.iu.edu/hypermail/linux/kernel/0212.0/0412.html

anything i can do to avoid dropping characters without using
low_latency, which still hangs SMP kernels?
thanks,
tim
> -----Original Message-----
> From: Murphy, Tim T 
> Sent: Monday, November 01, 2004 10:07 AM
> To: 'Russell King'
> Cc: linux-kernel@vger.kernel.org
> Subject: RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
> 
> 
> > Thanks for testing - I'll be adding this to mainline kernels.
> Thanks Russell.
> I'd be glad to help by testing any further low_latency 
> related patches also.
> Tim
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2005-01-07  1:54     ` Alan Cox
@ 2005-01-07 14:04       ` Paul Fulghum
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Fulghum @ 2005-01-07 14:04 UTC (permalink / raw)
  To: Alan Cox; +Cc: Tim_T_Murphy, rmk+lkml, Linux Kernel Mailing List

Alan Cox wrote:
> On Gwe, 2005-01-07 at 00:43, Paul Fulghum wrote:
> 
>>IIRC that guarantees a deadlock on SMP due to the
>>generic serial layer trying to grab a spinlock
>>that is already held. (Which prompted the original
>>bug report by Tim several months ago)
> 
> 
> I fixed the tty locking issues with that. If there are any left they
> should be solely in the serial generic code and I've no idea there

Yes, that is where the locking problems were.
When I last looked at it the problem call path was:

serial8250_interrupt();
    spin_lock(port->lock);
    serial8250_handle_port();
       receive_chars();
          flip.work.func(); /* if FLIP buffer full or low_latency set */
              ldisc->receive_buf(); /* N_TTY */
                  tty->driver->flush_chars();
                     uart_start();
                        spin_lock(port->lock); *BANG*

--
Paul Fulghum
Microgate Systems, Ltd


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2005-01-07  0:43   ` Paul Fulghum
@ 2005-01-07  1:54     ` Alan Cox
  2005-01-07 14:04       ` Paul Fulghum
  0 siblings, 1 reply; 22+ messages in thread
From: Alan Cox @ 2005-01-07  1:54 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Tim_T_Murphy, rmk+lkml, Linux Kernel Mailing List

On Gwe, 2005-01-07 at 00:43, Paul Fulghum wrote:
> IIRC that guarantees a deadlock on SMP due to the
> generic serial layer trying to grab a spinlock
> that is already held. (Which prompted the original
> bug report by Tim several months ago)

I fixed the tty locking issues with that. If there are any left they
should be solely in the serial generic code and I've no idea there


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2005-01-06 23:11 ` Alan Cox
@ 2005-01-07  0:43   ` Paul Fulghum
  2005-01-07  1:54     ` Alan Cox
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Fulghum @ 2005-01-07  0:43 UTC (permalink / raw)
  To: Alan Cox; +Cc: Tim_T_Murphy, rmk+lkml, Linux Kernel Mailing List

Alan Cox wrote:
> On Iau, 2005-01-06 at 22:47, Tim_T_Murphy@Dell.com wrote:
> 
>>>anything i can do to avoid dropping characters without using 
>>>low_latency, which still hangs SMP kernels?
>>
>>this patch fixes the problem for me, but it's probably an awful hack -- a
>>brief interrupt storm occurs until tty processes its buffer, but IMHO
>>that's better than dropping characters.
> 
> Presumably this is a device with a fake 8250 that produces sudden large
> bursts of data ? If so then for now you -need- to set low_latency and
> should probably do it by the PCI vendor subid/device id. The problem is
> that the serial layer expects serial data arriving at serial speeds. It
> completely breaks down when it hits an emulation of a generic uart that
> suddenly receives 32Kbytes of data at ethernet speed.
> 
> The longer term fix for this is when the flip buffers go away, and the
> same problem gets cleaned up for things like mainframes and some of the
> high performance DMA devices. Until then just set low_latency and
> comment it as "not your fault" 8)

IIRC that guarantees a deadlock on SMP due to the
generic serial layer trying to grab a spinlock
that is already held. (Which prompted the original
bug report by Tim several months ago)

Perhaps the FIFO trigger threshold for this
specific device can be altered
to try and smooth the amount of data dumped
per IRQ.
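
A sketch of what altering the trigger threshold would mean on a standard 16550A: bits 7:6 of the FIFO Control Register select an RX trigger level of 1, 4, 8 or 14 bytes. The helper name fcr_for_trigger() is hypothetical, and whether the emulated UART on this card honors the setting is not established anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define FCR_ENABLE_FIFO	0x01	/* bit 0: enable the FIFOs */
#define FCR_TRIGGER_1	0x00	/* bits 7:6 = 00, interrupt after 1 byte */
#define FCR_TRIGGER_4	0x40	/* bits 7:6 = 01, interrupt after 4 bytes */
#define FCR_TRIGGER_8	0x80	/* bits 7:6 = 10, interrupt after 8 bytes */
#define FCR_TRIGGER_14	0xC0	/* bits 7:6 = 11, interrupt after 14 bytes */

/*
 * Hypothetical helper: pick the largest standard trigger level that does
 * not exceed want_bytes and return the FCR value that selects it.
 */
static uint8_t fcr_for_trigger(unsigned int want_bytes)
{
	uint8_t trig;

	if (want_bytes >= 14)
		trig = FCR_TRIGGER_14;
	else if (want_bytes >= 8)
		trig = FCR_TRIGGER_8;
	else if (want_bytes >= 4)
		trig = FCR_TRIGGER_4;
	else
		trig = FCR_TRIGGER_1;

	return FCR_ENABLE_FIFO | trig;
}
```

A lower trigger level means more interrupts with fewer bytes each, which is the smoothing effect suggested above.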

--
Paul Fulghum
paulkf@microgate.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2005-01-06 23:50 Tim_T_Murphy
  0 siblings, 0 replies; 22+ messages in thread
From: Tim_T_Murphy @ 2005-01-06 23:50 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel


> this patch fixes the problem for me, but it's probably an awful hack --
> a brief interrupt storm occurs until tty processes its buffer,
> but IMHO that's better than dropping characters.

sorry, i see now that it's not an interrupt storm but rather the
interrupt handler doesn't end until it quits due to 'too much work'.
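
A user-space model of that behavior, as a sketch only: if receive_chars() leaves without reading the RX register, the device keeps its interrupt asserted, so the handler keeps looping until a pass limit stops it. The function model_irq_passes() is hypothetical, and PASS_LIMIT is assumed to be 256 as in 8250.c of that era.

```c
#include <assert.h>
#include <stdbool.h>

#define PASS_LIMIT 256	/* assumed value of the pass limit in 8250.c */

/*
 * Hypothetical model of the interrupt handler loop. Returns the number of
 * passes made before the interrupt clears or the handler gives up.
 */
static unsigned int model_irq_passes(unsigned int rx_pending, bool flip_full)
{
	unsigned int passes = 0;

	while (rx_pending > 0) {	/* device still asserts its interrupt */
		if (!flip_full)
			rx_pending = 0;	/* receive_chars() drains the FIFO */
		/* else: patched receive_chars() breaks out, FIFO stays full */
		if (++passes > PASS_LIMIT)
			break;		/* the "too much work" bail-out */
	}
	return passes;
}
```

With room in the flip buffer the model finishes in one pass; with a full flip buffer it only ends when the limit is exceeded.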

tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2005-01-06 22:47 Tim_T_Murphy
@ 2005-01-06 23:11 ` Alan Cox
  2005-01-07  0:43   ` Paul Fulghum
  0 siblings, 1 reply; 22+ messages in thread
From: Alan Cox @ 2005-01-06 23:11 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: rmk+lkml, Linux Kernel Mailing List

On Iau, 2005-01-06 at 22:47, Tim_T_Murphy@Dell.com wrote:
> > anything i can do to avoid dropping characters without using 
> > low_latency, which still hangs SMP kernels?
> 
> this patch fixes the problem for me, but it's probably an awful hack -- a
> brief interrupt storm occurs until tty processes its buffer, but IMHO
> that's better than dropping characters.

On a PCI device you may never get to process the buffer if you do that.
2.6.10 throws away the other bytes carefully and clears the IRQ.

Presumably this is a device with a fake 8250 that produces sudden large
bursts of data ? If so then for now you -need- to set low_latency and
should probably do it by the PCI vendor subid/device id. The problem is
that the serial layer expects serial data arriving at serial speeds. It
completely breaks down when it hits an emulation of a generic uart that
suddenly receives 32Kbytes of data at ethernet speed.

The longer term fix for this is when the flip buffers go away, and the
same problem gets cleaned up for things like mainframes and some of the
high performance DMA devices. Until then just set low_latency and
comment it as "not your fault" 8)
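
Selecting the setting by PCI ID could look roughly like the lookup below. This is a sketch, not kernel code: struct burst_quirk and needs_low_latency() are hypothetical names, and the only entry uses the 1028:0008 vendor/device pair that lspci reports for the Dell Remote Access Card III elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry; not an existing kernel structure. */
struct burst_quirk {
	uint16_t vendor;
	uint16_t device;
};

/* Devices assumed to deliver data in large bursts behind an emulated 8250. */
static const struct burst_quirk burst_quirks[] = {
	{ 0x1028, 0x0008 },	/* Dell Remote Access Card III */
};

/* Returns true if the device is listed as needing low_latency. */
static bool needs_low_latency(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(burst_quirks) / sizeof(burst_quirks[0]); i++) {
		if (burst_quirks[i].vendor == vendor &&
		    burst_quirks[i].device == device)
			return true;
	}
	return false;
}
```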

Alan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2005-01-06 22:47 Tim_T_Murphy
  2005-01-06 23:11 ` Alan Cox
  0 siblings, 1 reply; 22+ messages in thread
From: Tim_T_Murphy @ 2005-01-06 22:47 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel


> anything i can do to avoid dropping characters without using 
> low_latency, which still hangs SMP kernels?

this patch fixes the problem for me, but it's probably an awful hack -- a
brief interrupt storm occurs until tty processes its buffer, but IMHO
that's better than dropping characters.

is there a better alternative?
thanks,
tim

--- 8250-orig.c	2005-01-06 16:25:24.000000000 -0600
+++ 8250.c	2005-01-06 16:27:21.000000000 -0600
@@ -989,8 +989,10 @@
 		if (unlikely(tty->flip.count >= TTY_FLIPBUF_SIZE)) {
 			if(tty->low_latency)
 				tty_flip_buffer_push(tty);
-			/* If this failed then we will throw away the
-			   bytes but must do so to clear interrupts */
+			else
+				break;
+			/* If this failed then we will just leave now 
+			   rather than dropping bytes (interrupts not cleared) */
 		}
 		ch = serial_inp(up, UART_RX);
 		flag = TTY_NORMAL;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-11-01 16:06 Tim_T_Murphy
  0 siblings, 0 replies; 22+ messages in thread
From: Tim_T_Murphy @ 2004-11-01 16:06 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel

> Thanks for testing - I'll be adding this to mainline kernels.
Thanks Russell.
I'd be glad to help by testing any further low_latency related patches
also.
Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-11-01 14:28 Tim_T_Murphy
@ 2004-11-01 14:35 ` Russell King
  0 siblings, 0 replies; 22+ messages in thread
From: Russell King @ 2004-11-01 14:35 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: linux-kernel

On Mon, Nov 01, 2004 at 08:28:35AM -0600, Tim_T_Murphy@Dell.com wrote:
> > Ok, could you check whether this patch automatically detects 
> > the serial port please?
> 
> Yes, other than fixing a couple typos: 
> 	uart_offest -> uart_offset
> 	PCI_ID_ANY -> PCI_ANY_ID

Thanks for testing - I'll be adding this to mainline kernels.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-11-01 14:28 Tim_T_Murphy
  2004-11-01 14:35 ` Russell King
  0 siblings, 1 reply; 22+ messages in thread
From: Tim_T_Murphy @ 2004-11-01 14:28 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel

> Ok, could you check whether this patch automatically detects 
> the serial port please?

Yes, other than fixing a couple typos: 
	uart_offest -> uart_offset
	PCI_ID_ANY -> PCI_ANY_ID
I now get ttyS4 in my /proc/tty/driver/serial output, on bootup:

serinfo:1.0 driver revision:
0: uart:16550A port:000003F8 irq:4 tx:22 rx:0 RI
1: uart:16550A port:000002F8 irq:3 tx:22 rx:0 RI
2: uart:unknown port:000003E8 irq:4
3: uart:unknown port:000002E8 irq:3
4: uart:16550A port:0000EC40 irq:201 tx:0 rx:0 CTS|DSR|CD
5: uart:unknown port:00000000 irq:0
6: uart:unknown port:00000000 irq:0
7: uart:unknown port:00000000 irq:0

Also: the removal of "low_latency" does avoid the hang with the SMP
kernel; I am removing this setting from our service startup script.  In
addition, I will be changing the script to only perform the setserial
commands against an unused tty if it cannot first identify a tty that
already describes our virtual uart (ala Russell's 8250_pci fix).
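
A sketch of that identification step, written in C for illustration rather than as the actual startup script: scan the lines of /proc/tty/driver/serial and accept the entry whose I/O port matches the card's region and whose UART type was detected. The function match_serial_line() is a hypothetical helper; the line format is taken from the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns the ttyS index if the line describes a
 * detected UART at io_port, or -1 for headers, unknown UARTs, or a
 * different port.
 */
static int match_serial_line(const char *line, unsigned long io_port)
{
	int idx;
	char uart[32];
	unsigned long port;

	if (sscanf(line, "%d: uart:%31s port:%lx", &idx, uart, &port) != 3)
		return -1;
	if (strcmp(uart, "unknown") == 0)
		return -1;
	return port == io_port ? idx : -1;
}
```

For the listing above, the line for entry 4 matches port 0xEC40, so the script could use ttyS4 and skip the setserial setup.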

Thanks to all who replied, much appreciated!
Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-30 22:43     ` Alan Cox
@ 2004-10-31  0:26       ` Paul Fulghum
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Fulghum @ 2004-10-31  0:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: Russell King, Tim_T_Murphy, Linux Kernel Mailing List

On Sat, 2004-10-30 at 17:43, Alan Cox wrote:
> On Sad, 2004-10-30 at 00:40, Paul Fulghum wrote:
> > Would it make sense to do something like (in tty_io.c) the following?
> 
> Not really because it can legally occur if you flip the low latency
> flag while a transaction is queued. It might work if you waited for
> scheduled work to complete in the flag changing.

I don't see how having flush_to_ldisc() queued
or already running (on another processor) negates
the prohibition on calling tty_flip_buffer_push()
with low_latency set in interrupt context.

The comments for tty_flip_buffer_push() state the
function should not be called in interrupt context
if low_latency is set (no exceptions are listed).
Meaning flush_to_ldisc() should only be called
in process context.

If flush_to_ldisc() is queued or already executing,
there is no protection against calling
flush_to_ldisc() again, directly in interrupt context.
TTY_DONT_FLIP is no protection, that is only set
in read_chan() of n_tty.c

If I'm missing something, please point it out.

-- 
Paul Fulghum
paulkf@microgate.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 23:40   ` Paul Fulghum
@ 2004-10-30 22:43     ` Alan Cox
  2004-10-31  0:26       ` Paul Fulghum
  0 siblings, 1 reply; 22+ messages in thread
From: Alan Cox @ 2004-10-30 22:43 UTC (permalink / raw)
  To: Paul Fulghum; +Cc: Russell King, Tim_T_Murphy, Linux Kernel Mailing List

On Sad, 2004-10-30 at 00:40, Paul Fulghum wrote:
> On Fri, 2004-10-29 at 15:20, Russell King wrote:
> > At a guess, you've enabled "low latency" setting on this port ?
> 
> Would it make sense to do something like (in tty_io.c) the following?

Not really because it can legally occur if you flip the low latency
flag while a transaction is queued. It might work if you waited for
scheduled work to complete in the flag changing.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 23:30 Tim_T_Murphy
@ 2004-10-30 16:02 ` Russell King
  0 siblings, 0 replies; 22+ messages in thread
From: Russell King @ 2004-10-30 16:02 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: linux-kernel

On Fri, Oct 29, 2004 at 06:30:01PM -0500, Tim_T_Murphy@Dell.com wrote:
> 
> > Well, if you forward lspci -vvx and the "maddr" and "irqno" information
> > (in private mail if you prefer) then I'll fix 8250_pci to work.
> 
> maddr:	10		# note, this is for the UP kernel. for SMP, maddr=201
> irqno:	ec40
> lspci -d 1028:0008 -vvx:

Ok, could you check whether this patch automatically detects the serial
port please?

Thanks.

diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x *.orig -x *.rej orig/drivers/serial/8250_pci.c linux/drivers/serial/8250_pci.c
--- orig/drivers/serial/8250_pci.c	Sat Oct 23 11:39:13 2004
+++ linux/drivers/serial/8250_pci.c	Sat Oct 30 16:57:59 2004
@@ -1026,6 +1026,7 @@ enum pci_board_num_t {
 
 	pbn_b1_bt_2_921600,
 
+	pbn_b1_1_1382400,
 	pbn_b1_2_1382400,
 	pbn_b1_4_1382400,
 	pbn_b1_8_1382400,
@@ -1253,6 +1254,12 @@ static struct pci_board pci_boards[] __d
 		.uart_offset	= 8,
 	},
 
+	[pbn_b1_1_1382400] = {
+		.flags		= FL_BASE1,
+		.num_ports	= 1,
+		.base_baud	= 1382400,
+		.uart_offest	= 8,
+	},
 	[pbn_b1_2_1382400] = {
 		.flags		= FL_BASE1,
 		.num_ports	= 2,
@@ -2109,6 +2116,13 @@ static struct pci_device_id serial_pci_t
 		pbn_b0_bt_1_460800 },
 
 	/*
+	 * Dell Remote Access Card III - Tim_T_Murphy@Dell.com
+	 */
+	{	PCI_VENDOR_ID_DELL, PCI_DEVICE_ID_DELL_RACIII,
+		PCI_ID_ANY, PCI_ID_ANY, 0, 0,
+		pbn_b1_1_1382400 },
+
+	/*
 	 * RAStel 2 port modem, gerg@moreton.com.au
 	 */
 	{	PCI_VENDOR_ID_MORETON, PCI_DEVICE_ID_RASTEL_2PORT,
diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x *.orig -x *.rej orig/include/linux/pci_ids.h linux/include/linux/pci_ids.h
--- orig/include/linux/pci_ids.h	Sat Oct 23 11:40:03 2004
+++ linux/include/linux/pci_ids.h	Sat Oct 30 16:52:46 2004
@@ -522,6 +522,7 @@
 #define PCI_DEVICE_ID_AI_M1435		0x1435
 
 #define PCI_VENDOR_ID_DELL              0x1028
+#define PCI_DEVICE_ID_DELL_RACIII	0x0008
 
 #define PCI_VENDOR_ID_MATROX		0x102B
 #define PCI_DEVICE_ID_MATROX_MGA_2	0x0518

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 20:20 ` Russell King
  2004-10-29 22:18   ` Paul Fulghum
@ 2004-10-29 23:40   ` Paul Fulghum
  2004-10-30 22:43     ` Alan Cox
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Fulghum @ 2004-10-29 23:40 UTC (permalink / raw)
  To: Russell King; +Cc: Tim_T_Murphy, Linux Kernel list

On Fri, 2004-10-29 at 15:20, Russell King wrote:
> At a guess, you've enabled "low latency" setting on this port ?

Would it make sense to do something like (in tty_io.c) the following?

void tty_flip_buffer_push(struct tty_struct *tty)
{
	if (tty->low_latency) {
		if (in_interrupt()) {
			printk(KERN_ERR "tty_flip_buffer_push called with low latency from interrupt!\n");
			dump_stack();
			schedule_delayed_work(&tty->flip.work, 1);
		}
		else
			flush_to_ldisc((void *) tty);
	}
	else
		schedule_delayed_work(&tty->flip.work, 1);
}

-- 
Paul Fulghum
paulkf@microgate.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-10-29 23:33 Tim_T_Murphy
  0 siblings, 0 replies; 22+ messages in thread
From: Tim_T_Murphy @ 2004-10-29 23:33 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel

> maddr:	10		# note, this is for the UP kernel. for SMP, maddr=201
> irqno:	ec40

duh, i got maddr and irqno backwards in my last post, sorry.
Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-10-29 23:30 Tim_T_Murphy
  2004-10-30 16:02 ` Russell King
  0 siblings, 1 reply; 22+ messages in thread
From: Tim_T_Murphy @ 2004-10-29 23:30 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel


> Well, if you forward lspci -vvx and the "maddr" and "irqno" information
> (in private mail if you prefer) then I'll fix 8250_pci to work.

maddr:	10		# note, this is for the UP kernel. for SMP, maddr=201
irqno:	ec40
lspci -d 1028:0008 -vvx:

00:08.1 Class ff00: Dell Remote Access Card III
	Subsystem: Dell Remote Access Card III
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Interrupt: pin B routed to IRQ 10
	Region 0: Memory at fe202000 (32-bit, non-prefetchable) [size=4K]
	Region 1: I/O ports at ec40 [size=64]
	Region 2: Memory at feb00000 (32-bit, prefetchable) [size=512K]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 28 10 08 00 03 01 90 02 00 00 00 ff 10 20 80 00
10: 00 20 20 fe 41 ec 00 00 08 00 b0 fe 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 08 00
30: 00 00 00 00 48 00 00 00 00 00 00 00 0a 02 00 00
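
For reference, the hex dump above can be decoded by hand using the standard PCI configuration header layout; the sketch below does it in code. The helper names (cfg_read16, cfg_read32, bar_is_io, bar_io_base) are hypothetical. It shows that the UART's I/O ports sit in BAR 1 at 0xec40, which matches "Region 1" in the lspci text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First 64 bytes of config space, copied from the lspci -x dump above. */
static const uint8_t cfg[64] = {
	0x28, 0x10, 0x08, 0x00, 0x03, 0x01, 0x90, 0x02,
	0x00, 0x00, 0x00, 0xff, 0x10, 0x20, 0x80, 0x00,
	0x00, 0x20, 0x20, 0xfe, 0x41, 0xec, 0x00, 0x00,
	0x08, 0x00, 0xb0, 0xfe, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x28, 0x10, 0x08, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x0a, 0x02, 0x00, 0x00,
};

/* Config space is little-endian. */
static uint16_t cfg_read16(const uint8_t *c, unsigned int off)
{
	return (uint16_t)(c[off] | (c[off + 1] << 8));
}

static uint32_t cfg_read32(const uint8_t *c, unsigned int off)
{
	return (uint32_t)c[off] | ((uint32_t)c[off + 1] << 8) |
	       ((uint32_t)c[off + 2] << 16) | ((uint32_t)c[off + 3] << 24);
}

/* Bit 0 of a BAR is set for I/O space, clear for memory space. */
static bool bar_is_io(uint32_t bar)
{
	return (bar & 0x1u) != 0;
}

/* For an I/O BAR the low two bits are flags, not address bits. */
static uint32_t bar_io_base(uint32_t bar)
{
	return bar & ~0x3u;
}
```

Here cfg_read16(cfg, 0x00) and cfg_read16(cfg, 0x02) give the 1028:0008 vendor/device pair, BAR 1 at offset 0x14 reads 0x0000ec41 (an I/O BAR with base 0xec40), and the interrupt line byte at offset 0x3c is 0x0a, i.e. IRQ 10.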

> I think dropping low_latency will work around the problem for the time
> being.

Thanks a lot for the help and advice, I will try this and report
results.

Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 20:20 ` Russell King
@ 2004-10-29 22:18   ` Paul Fulghum
  2004-10-29 23:40   ` Paul Fulghum
  1 sibling, 0 replies; 22+ messages in thread
From: Paul Fulghum @ 2004-10-29 22:18 UTC (permalink / raw)
  To: Russell King; +Cc: Tim_T_Murphy, Linux Kernel list

On Fri, 2004-10-29 at 15:20, Russell King wrote:
> At a guess, you've enabled "low latency" setting on this port ?

Ah, that would explain the problem better than
the code path I saw (flip buffer full).
The problem is still the same: calling the flip
work routine from the ISR, which calls through
N_TTY receive_buf->flush_chars->start_tx.

-- 
Paul Fulghum
paulkf@microgate.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 21:04 Tim_T_Murphy
@ 2004-10-29 21:14 ` Russell King
  0 siblings, 0 replies; 22+ messages in thread
From: Russell King @ 2004-10-29 21:14 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: linux-kernel

On Fri, Oct 29, 2004 at 04:04:40PM -0500, Tim_T_Murphy@Dell.com wrote:
> > Shouldn't 8250_pci setup the ports already for you?  If not, what
> > needs to be done to achieve this.  Using setserial to setup ports
> > for PCI cards isn't the preferred way of doing this.
> 
> good question, i will have to understand more to answer it though.
> our product has used this method for almost 2 years now.

Well, if you forward lspci -vvx and the "maddr" and "irqno" information
(in private mail if you prefer) then I'll fix 8250_pci to work.

> > At a guess, you've enabled "low latency" setting on this port ?
> 
> yes.  here's a snippet from the script:
> 
> 	echo -n "Starting ${racsvc}: "
> 	# set serial characteristics for RAC device
> 	setserial /dev/${ttyid} \
> 		port 0x${maddr} irq ${irqno} ^skip_test autoconfig
> 	setserial /dev/${ttyid} \
> 		uart 16550A low_latency baud_base 1382400	\
> 		close_delay 0 closing_wait infinite
> 	# now start pppd
> 	/sbin/modprobe -q ppp >/dev/null 2>&1
> 	/sbin/modprobe -q ppp_async >/dev/null 2>&1
> 	daemon pppd call ${service}
> 	RETVAL=$?

I think dropping low_latency will work around the problem for the time
being.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 19:55 Tim_T_Murphy
  2004-10-29 20:20 ` Russell King
@ 2004-10-29 21:08 ` Paul Fulghum
  1 sibling, 0 replies; 22+ messages in thread
From: Paul Fulghum @ 2004-10-29 21:08 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: linux-kernel

On Fri, 2004-10-29 at 14:55, Tim_T_Murphy@Dell.com wrote:
> Oct 29 13:34:48 racjag-1 chat[3886]: expect (CLIENTSERVER)
> Oct 29 13:34:48 racjag-1 kernel: drivers/serial/serial_core.c:102: spin_lock(drivers/serial/serial_core.c:023f2548) already locked by drivers/serial/8250.c/1015
> Oct 29 13:34:48 racjag-1 kernel: drivers/serial/8250.c:1017: spin_unlock(drivers/serial/serial_core.c:023f2548) not locked
> Oct 29 13:34:48 racjag-1 chat[3886]: CLIENTSERVER

One way this can happen is a receive interrupt:

serial8250_interrupt();
    spin_lock(port->lock);
    serial8250_handle_port();
       receive_chars();
          flip.work.func(); /* if FLIP buffer full */
             ldisc->receive_buf(); /* N_TTY */
                 tty->driver->flush_chars();
                     uart_start();
                        spin_lock(port->lock); *BANG*
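
The same pattern in a user-space sketch, assuming a simple test-and-set flag as a stand-in for port->lock (all names here are hypothetical, none of this is kernel code). The inner function uses a try-lock so the model can report the nested acquisition instead of spinning forever, which is what a real spin_lock() does on SMP.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for port->lock: a non-recursive test-and-set lock. */
static atomic_flag port_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(void)
{
	return !atomic_flag_test_and_set(&port_lock);
}

static void model_unlock(void)
{
	atomic_flag_clear(&port_lock);
}

/* Models uart_start(): needs the port lock for itself. */
static bool model_uart_start(void)
{
	if (!model_trylock())
		return false;	/* a real spin_lock() would spin here forever */
	model_unlock();
	return true;
}

/*
 * Models serial8250_interrupt(): takes the lock, then the receive path
 * calls back into model_uart_start() while the lock is still held.
 * Returns whether the nested acquisition succeeded.
 */
static bool model_interrupt(void)
{
	bool nested_ok;

	while (!model_trylock())
		;		/* outer spin_lock(port->lock) */
	nested_ok = model_uart_start();
	model_unlock();
	return nested_ok;
}
```

In this model model_interrupt() returns false because the lock is already held by the same path, while a direct call to model_uart_start() outside the handler succeeds.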

Try the attached patch and report what happens.

-- 
Paul Fulghum
paulkf@microgate.com

--- linux-2.6.8/drivers/serial/8250.c	2004-08-14 00:36:13.000000000 -0500
+++ b/drivers/serial/8250.c	2004-10-29 15:58:28.076014336 -0500
@@ -830,9 +830,13 @@ receive_chars(struct uart_8250_port *up,
 
 	do {
 		if (unlikely(tty->flip.count >= TTY_FLIPBUF_SIZE)) {
-			tty->flip.work.func((void *)tty);
-			if (tty->flip.count >= TTY_FLIPBUF_SIZE)
-				return; // if TTY_DONT_FLIP is set
+			/* no room in flip buffer, discard rx FIFO contents to clear IRQ */
+			do {
+				serial_inp(up, UART_RX);
+				up->port.icount.overrun++;
+				*status = serial_inp(up, UART_LSR);
+			} while ((*status & UART_LSR_DR) && (max_count-- > 0));
+			return;	/* if TTY_DONT_FLIP is set */
 		}
 		ch = serial_inp(up, UART_RX);
 		*tty->flip.char_buf_ptr = ch;



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-10-29 21:04 Tim_T_Murphy
  2004-10-29 21:14 ` Russell King
  0 siblings, 1 reply; 22+ messages in thread
From: Tim_T_Murphy @ 2004-10-29 21:04 UTC (permalink / raw)
  To: rmk+lkml; +Cc: linux-kernel


> Shouldn't 8250_pci setup the ports already for you?  If not, what needs
> to be done to achieve this.  Using setserial to setup ports for PCI cards
> isn't the preferred way of doing this.

good question, i will have to understand more to answer it though.
our product has used this method for almost 2 years now.

> At a guess, you've enabled "low latency" setting on this port ?

yes.  here's a snippet from the script:

	echo -n "Starting ${racsvc}: "
	# set serial characteristics for RAC device
	setserial /dev/${ttyid} \
		port 0x${maddr} irq ${irqno} ^skip_test autoconfig
	setserial /dev/${ttyid} \
		uart 16550A low_latency baud_base 1382400	\
		close_delay 0 closing_wait infinite
	# now start pppd
	/sbin/modprobe -q ppp >/dev/null 2>&1
	/sbin/modprobe -q ppp_async >/dev/null 2>&1
	daemon pppd call ${service}
	RETVAL=$?
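
For context on the baud_base value in the script: a 16550-style UART derives its line rate by dividing baud_base by a divisor latch value, so baud_base 1382400 corresponds to a UART clock of 16 x 1382400 = 22118400 Hz. The sketch below computes the divisor with rounding to nearest; uart_divisor() is a hypothetical helper, and the rate pppd actually requests is not shown in this thread.

```c
#include <assert.h>

/*
 * Hypothetical helper: divisor latch value for a requested baud rate,
 * rounded to nearest. Returns 0 for a zero baud request.
 */
static unsigned int uart_divisor(unsigned int baud_base, unsigned int baud)
{
	if (baud == 0)
		return 0;
	return (baud_base + baud / 2) / baud;
}
```

With this baud_base, 115200 baud needs a divisor of 12 and 9600 baud a divisor of 144.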

Thanks
Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
  2004-10-29 19:55 Tim_T_Murphy
@ 2004-10-29 20:20 ` Russell King
  2004-10-29 22:18   ` Paul Fulghum
  2004-10-29 23:40   ` Paul Fulghum
  2004-10-29 21:08 ` Paul Fulghum
  1 sibling, 2 replies; 22+ messages in thread
From: Russell King @ 2004-10-29 20:20 UTC (permalink / raw)
  To: Tim_T_Murphy; +Cc: linux-kernel

On Fri, Oct 29, 2004 at 02:55:10PM -0500, Tim_T_Murphy@Dell.com wrote:
> I've read about several problems others are having with the new 2.6
> serial driver in the list, and tried to see if their solutions solved
> my issue also, but unfortunately none that I have tried yet have helped.

Well, this is the first I know of this kind of problem...

> We're migrating our applications for the Dell Remote Access Controller
> (DRAC) to run on a 2.6 kernel from a 2.4 kernel. Communication between
> the apps and the DRAC happen over a ppp link which is established via
> a service startup script; the script uses setserial to prepare an unused
> tty (based on the assigned hardware information, obtained via lspci),
> and the script then calls pppd to finish/establish the link.

Shouldn't 8250_pci setup the ports already for you?  If not, what needs
to be done to achieve this.  Using setserial to setup ports for PCI cards
isn't the preferred way of doing this.

At a guess, you've enabled "low latency" setting on this port ?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 PCMCIA      - http://pcmcia.arm.linux.org.uk/
                 2.6 Serial core

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel
@ 2004-10-29 19:55 Tim_T_Murphy
  2004-10-29 20:20 ` Russell King
  2004-10-29 21:08 ` Paul Fulghum
  0 siblings, 2 replies; 22+ messages in thread
From: Tim_T_Murphy @ 2004-10-29 19:55 UTC (permalink / raw)
  To: linux-kernel

I am new to the list, hope this is ok..
I've read about several problems others are having with the new 2.6 serial driver in the list, and tried to see if their solutions solved my issue also, but unfortunately none that I have tried yet have helped.

We're migrating our applications for the Dell Remote Access Controller (DRAC) to run on a 2.6 kernel from a 2.4 kernel. Communication between the apps and the DRAC happen over a ppp link which is established via a service startup script; the script uses setserial to prepare an unused tty (based on the assigned hardware information, obtained via lspci), and the script then calls pppd to finish/establish the link.

Everything works fine with the UP kernel -- Although, there is a message in syslog regarding a spinlock (issued at approximately the same point in time where the SMP kernel hangs):
---
Oct 29 13:34:47 racjag-1 kernel: CSLIP: code copyright 1989 Regents of the University of California
Oct 29 13:34:47 racjag-1 kernel: PPP generic driver version 2.4.2
Oct 29 13:34:47 racjag-1 udev[3875]: creating device node '/dev/ppp'
Oct 29 13:34:47 racjag-1 pppd[3884]: pppd 2.4.2 started by root, uid 0
Oct 29 13:34:47 racjag-1 racser: pppd startup succeeded
Oct 29 13:34:48 racjag-1 chat[3886]: send (CLIENT^M)
Oct 29 13:34:48 racjag-1 chat[3886]: expect (CLIENTSERVER)
Oct 29 13:34:48 racjag-1 kernel: drivers/serial/serial_core.c:102: spin_lock(drivers/serial/serial_core.c:023f2548) already locked by drivers/serial/8250.c/1015
Oct 29 13:34:48 racjag-1 kernel: drivers/serial/8250.c:1017: spin_unlock(drivers/serial/serial_core.c:023f2548) not locked
Oct 29 13:34:48 racjag-1 chat[3886]: CLIENTSERVER
Oct 29 13:34:48 racjag-1 chat[3886]:  -- got it 
Oct 29 13:34:48 racjag-1 chat[3886]: send ()
Oct 29 13:34:48 racjag-1 pppd[3884]: Serial connection established.
Oct 29 13:34:48 racjag-1 pppd[3884]: Using interface ppp0
Oct 29 13:34:48 racjag-1 pppd[3884]: Connect: ppp0 <--> /dev/ttyS2
Oct 29 13:34:49 racjag-1 pppd[3884]: local  IP address 192.168.234.235
Oct 29 13:34:49 racjag-1 pppd[3884]: remote IP address 192.168.234.236
---

With the SMP kernel, it hangs very soon after starting pppd.
I enabled DEBUG in the serial driver and captured the syslog when the problem happens, but this is not detailed enough for me to finger the exact problem:
---
Oct 28 14:04:52 racjag-1 kernel: CSLIP: code copyright 1989 Regents of the University of California
Oct 28 14:04:52 racjag-1 kernel: PPP generic driver version 2.4.2
Oct 28 14:04:52 racjag-1 udev[3621]: creating device node '/dev/ppp'
Oct 28 14:05:19 racjag-1 kernel: uart_open(2) called
Oct 28 14:05:19 racjag-1 kernel: Trying to free nonexistent resource <00000000-00000007>
Oct 28 14:05:19 racjag-1 kernel: uart_close(2) called
Oct 28 14:05:19 racjag-1 kernel: uart_flush_buffer(2) called
Oct 28 14:05:19 racjag-1 kernel: uart_open(2) called
Oct 28 14:05:19 racjag-1 kernel: uart_close(2) called
Oct 28 14:05:19 racjag-1 kernel: uart_flush_buffer(2) called
Oct 28 14:05:19 racjag-1 pppd[3681]: pppd 2.4.1 started by root, uid 0
Oct 28 14:05:19 racjag-1 kernel: uart_open(2) called
Oct 28 14:05:19 racjag-1 racser: pppd startup succeeded
Oct 28 14:05:20 racjag-1 kernel: uart_open(2) called
Oct 28 14:05:20 racjag-1 kernel: uart_close(2) called
Oct 28 14:05:20 racjag-1 chat[3683]: send (CLIENT^M)
---
The system hangs right there; must press and hold power to get the system to shut down.

Any suggestions to narrow down the cause?  Please cc my email as I do not subscribe to this list.
Thanks,
Tim


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2005-01-07 14:09 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-06 14:55 [BUG][2.6.8.1] serial driver hangs SMP kernel, but not the UP kernel Tim_T_Murphy
  -- strict thread matches above, loose matches on Subject: below --
2005-01-06 23:50 Tim_T_Murphy
2005-01-06 22:47 Tim_T_Murphy
2005-01-06 23:11 ` Alan Cox
2005-01-07  0:43   ` Paul Fulghum
2005-01-07  1:54     ` Alan Cox
2005-01-07 14:04       ` Paul Fulghum
2004-11-01 16:06 Tim_T_Murphy
2004-11-01 14:28 Tim_T_Murphy
2004-11-01 14:35 ` Russell King
2004-10-29 23:33 Tim_T_Murphy
2004-10-29 23:30 Tim_T_Murphy
2004-10-30 16:02 ` Russell King
2004-10-29 21:04 Tim_T_Murphy
2004-10-29 21:14 ` Russell King
2004-10-29 19:55 Tim_T_Murphy
2004-10-29 20:20 ` Russell King
2004-10-29 22:18   ` Paul Fulghum
2004-10-29 23:40   ` Paul Fulghum
2004-10-30 22:43     ` Alan Cox
2004-10-31  0:26       ` Paul Fulghum
2004-10-29 21:08 ` Paul Fulghum
