* [2.6.20.21 review 00/35] 2.6.20.21 -stable review
@ 2007-10-13 14:28 Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list Willy Tarreau
                   ` (30 more replies)
  0 siblings, 31 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 14:28 UTC (permalink / raw)
  To: linux-kernel, stable

This is the start of the review cycle for the stable 2.6.20.21
kernel release. This version catches up with 2.6.22.9, and 35
patches will be posted as a response to this message. I tried
hard to trim the patches to a minimally reasonable set, and
unless I have introduced a big regression or build error, this
version will remain the last 2.6.20.x (at least from me).

2.6.23.1 is already out, 2.6.22.9 is quite clean, and 2.6.16.54
is still maintained, so there is no reason to maintain a fourth branch.

The following security issues are fixed:
  CVE-2007-3104: store sysfs inode nrs in s_ino to avoid readdir oopses
  CVE-2007-4571: Convert snd-page-alloc proc file to use seq_file

No known vulnerability fixes remain pending.

The rolled up patch can be found here:
   ftp.kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.20.21-rc1.gz

Responses should be made by October 17, 2007, 19:00:00 UTC.
Anything received after that time might be too late.

Thanks,
Willy



* [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 02/35] AVR32: Fix atomic_add_unless() and atomic_sub_unless() Willy Tarreau
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Bob Moore, Len Brown, Chuck Ebbert, Greg Kroah-Hartman

[-- Attachment #1: 0008-ACPICA-Fixed-possible-corruption-of-global-GPE-list.patch --]
[-- Type: text/plain, Size: 1044 bytes --]

ACPICA: Fixed possible corruption of global GPE list

Fixed a problem in acpi_ev_delete_gpe_xrupt where the global interrupt
list could be corrupted if the interrupt being removed was at
the head of the list. Reported by Linn Crosetto.
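
As an illustration only, the head-of-list case described above can be
sketched in plain user-space C. The struct node type, the list_head
variable and both helper names are hypothetical stand-ins, not the ACPICA
definitions; only the shape of the unlink logic mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type standing in for the GPE interrupt block. */
struct node {
	struct node *previous;
	struct node *next;
};

/* Hypothetical global head, analogous to acpi_gbl_gpe_xrupt_list_head. */
static struct node *list_head;

/* Insert a node at the front of the list. */
static void list_push_front(struct node *n)
{
	assert(n != NULL);
	n->previous = NULL;
	n->next = list_head;
	if (list_head)
		list_head->previous = n;
	list_head = n;
}

/* Unlink a node; the else branch is the case the patch adds. */
static void list_unlink(struct node *n)
{
	assert(n != NULL);
	if (n->previous)
		n->previous->next = n->next;
	else
		list_head = n->next;	/* no previous: n was the head */

	if (n->next)
		n->next->previous = n->previous;

	n->previous = NULL;
	n->next = NULL;
}
```

Without the else branch, removing the first element would leave the
global head pointing at a node that is no longer on the list.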

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/acpi/events/evgpeblk.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Index: 2.6/drivers/acpi/events/evgpeblk.c
===================================================================
--- 2.6.orig/drivers/acpi/events/evgpeblk.c
+++ 2.6/drivers/acpi/events/evgpeblk.c
@@ -586,6 +586,10 @@ acpi_ev_delete_gpe_xrupt(struct acpi_gpe
 	flags = acpi_os_acquire_lock(acpi_gbl_gpe_lock);
 	if (gpe_xrupt->previous) {
 		gpe_xrupt->previous->next = gpe_xrupt->next;
+	} else {
+		/* No previous, update list head */
+
+		acpi_gbl_gpe_xrupt_list_head = gpe_xrupt->next;
 	}
 
 	if (gpe_xrupt->next) {

-- 


* [2.6.20.21 review 02/35] AVR32: Fix atomic_add_unless() and atomic_sub_unless()
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 03/35] r8169: avoid needless NAPI poll scheduling Willy Tarreau
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Haavard Skinnemoen, Greg Kroah-Hartman

[-- Attachment #1: 0010-AVR32-Fix-atomic_add_unless-and-atomic_sub_unless.patch --]
[-- Type: text/plain, Size: 1355 bytes --]

These functions depend on "result" being initialized to 0, but "result"
is not included as an input constraint to the inline assembly block
following its initialization, only as an output constraint. Thus gcc
thinks it doesn't need to initialize it, so result ends up undefined
if the "unless" condition is true.

This fixes an oops in sunrpc where the faulty atomics caused
rpciod_up() to not start the workqueue as it should.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 include/asm-avr32/atomic.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: 2.6/include/asm-avr32/atomic.h
===================================================================
--- 2.6.orig/include/asm-avr32/atomic.h
+++ 2.6/include/asm-avr32/atomic.h
@@ -101,7 +101,7 @@ static inline int atomic_sub_unless(atom
 		"	mov	%1, 1\n"
 		"1:"
 		: "=&r"(tmp), "=&r"(result), "=o"(v->counter)
-		: "m"(v->counter), "rKs21"(a), "rKs21"(u)
+		: "m"(v->counter), "rKs21"(a), "rKs21"(u), "1"(result)
 		: "cc", "memory");
 
 	return result;
@@ -137,7 +137,7 @@ static inline int atomic_add_unless(atom
 			"	mov	%1, 1\n"
 			"1:"
 			: "=&r"(tmp), "=&r"(result), "=o"(v->counter)
-			: "m"(v->counter), "r"(a), "ir"(u)
+			: "m"(v->counter), "r"(a), "ir"(u), "1"(result)
 			: "cc", "memory");
 	}
 

-- 


* [2.6.20.21 review 03/35] r8169: avoid needless NAPI poll scheduling
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 02/35] AVR32: Fix atomic_add_unless() and atomic_sub_unless() Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 04/35] i386: allow debuggers to access the vsyscall page with compat vDSO Willy Tarreau
                   ` (27 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Thomas Müller, Francois Romieu, Greg Kroah-Hartman

[-- Attachment #1: 0011-r8169-avoid-needless-NAPI-poll-scheduling.patch --]
[-- Type: text/plain, Size: 1667 bytes --]

Theory  : though needless, it should not have hurt.
Practice: it does not play nice with DEBUG_SHIRQ + LOCKDEP + UP
(see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242572).


The patch makes sense in itself, but I should dig into why it has an
effect on #242572 (assuming that NAPI does not change in the near future).

Patch in mainline as 313b0305b5a1e7e0fb39383befbf79558ce68a9c.
Backported to 2.6.22-stable by Thomas Müller.

Signed-off-by: Thomas Müller <thomas@mathtm.de>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/net/r8169.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

Index: 2.6/drivers/net/r8169.c
===================================================================
--- 2.6.orig/drivers/net/r8169.c
+++ 2.6/drivers/net/r8169.c
@@ -2646,14 +2646,16 @@ rtl8169_interrupt(int irq, void *dev_ins
 			rtl8169_check_link_status(dev, tp, ioaddr);
 
 #ifdef CONFIG_R8169_NAPI
-		RTL_W16(IntrMask, rtl8169_intr_mask & ~rtl8169_napi_event);
-		tp->intr_mask = ~rtl8169_napi_event;
+		if (status & rtl8169_napi_event) {
+			RTL_W16(IntrMask, rtl8169_intr_mask & ~rtl8169_napi_event);
+			tp->intr_mask = ~rtl8169_napi_event;
 
-		if (likely(netif_rx_schedule_prep(dev)))
-			__netif_rx_schedule(dev);
-		else if (netif_msg_intr(tp)) {
-			printk(KERN_INFO "%s: interrupt %04x taken in poll\n",
-			       dev->name, status);
+			if (likely(netif_rx_schedule_prep(dev)))
+				__netif_rx_schedule(dev);
+			else if (netif_msg_intr(tp)) {
+				printk(KERN_INFO "%s: interrupt %04x in poll\n",
+				       dev->name, status);
+			}
 		}
 		break;
 #else

-- 


* [2.6.20.21 review 04/35] i386: allow debuggers to access the vsyscall page with compat vDSO
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (2 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 03/35] r8169: avoid needless NAPI poll scheduling Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 05/35] DCCP: Fix DCCP GFP_KERNEL allocation in atomic context Willy Tarreau
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Jan Beulich, Andi Kleen, Greg Kroah-Hartman

[-- Attachment #1: 0015-i386-allow-debuggers-to-access-the-vsyscall-page-wi.patch --]
[-- Type: text/plain, Size: 777 bytes --]

From: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 arch/i386/kernel/sysenter.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

Index: 2.6/arch/i386/kernel/sysenter.c
===================================================================
--- 2.6.orig/arch/i386/kernel/sysenter.c
+++ 2.6/arch/i386/kernel/sysenter.c
@@ -183,7 +183,9 @@ struct vm_area_struct *get_gate_vma(stru
 
 int in_gate_area(struct task_struct *task, unsigned long addr)
 {
-	return 0;
+	const struct vm_area_struct *vma = get_gate_vma(task);
+
+	return vma && addr >= vma->vm_start && addr < vma->vm_end;
 }
 
 int in_gate_area_no_task(unsigned long addr)

-- 


* [2.6.20.21 review 05/35] DCCP: Fix DCCP GFP_KERNEL allocation in atomic context
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (3 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 04/35] i386: allow debuggers to access the vsyscall page with compat vDSO Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 06/35] Netfilter: Missing Kbuild entry for netfilter Willy Tarreau
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Gerrit Renker, Arnaldo Carvalho de Melo, David S. Miller,
	Greg Kroah-Hartman

[-- Attachment #1: 0029-DCCP-Fix-DCCP-GFP_KERNEL-allocation-in-atomic-conte.patch --]
[-- Type: text/plain, Size: 2039 bytes --]

This fixes the following bug reported in syslog:

[ 4039.051658] BUG: sleeping function called from invalid context at /usr/src/davem-2.6/mm/slab.c:3032
[ 4039.051668] in_atomic():1, irqs_disabled():0
[ 4039.051670] INFO: lockdep is turned off.
[ 4039.051674]  [<c0104c0f>] show_trace_log_lvl+0x1a/0x30
[ 4039.051687]  [<c0104d4d>] show_trace+0x12/0x14
[ 4039.051691]  [<c0104d65>] dump_stack+0x16/0x18
[ 4039.051695]  [<c011371e>] __might_sleep+0xaf/0xbe
[ 4039.051700]  [<c0157b66>] __kmalloc+0xb1/0xd0
[ 4039.051706]  [<f090416f>] ccid2_hc_tx_alloc_seq+0x35/0xc3 [dccp_ccid2]
[ 4039.051717]  [<f09048d6>] ccid2_hc_tx_packet_sent+0x27f/0x2d9 [dccp_ccid2]
[ 4039.051723]  [<f085486b>] dccp_write_xmit+0x1eb/0x338 [dccp]
[ 4039.051741]  [<f085603d>] dccp_sendmsg+0x113/0x18f [dccp]
[ 4039.051750]  [<c03907fc>] inet_sendmsg+0x2e/0x4c
[ 4039.051758]  [<c033a47d>] sock_aio_write+0xd5/0x107
[ 4039.051766]  [<c015abc1>] do_sync_write+0xcd/0x11c
[ 4039.051772]  [<c015b296>] vfs_write+0x118/0x11f
[ 4039.051840]  [<c015b932>] sys_write+0x3d/0x64
[ 4039.051845]  [<c0103e7c>] syscall_call+0x7/0xb
[ 4039.051848]  =======================

The problem was that GFP_KERNEL was used; fixed by using gfp_any().
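
The idea behind the fix, choosing the allocation mode from the calling
context instead of hard-coding one that may sleep, can be sketched in
user-space C as follows. Everything here is a hypothetical model: the
enum, struct exec_ctx and alloc_mode_any() are illustrative names, and the
exact predicate the kernel helper evaluates is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocation modes, modelled on GFP_KERNEL and GFP_ATOMIC. */
enum alloc_mode {
	ALLOC_MAY_SLEEP,	/* like GFP_KERNEL: the allocator may block */
	ALLOC_NO_SLEEP		/* like GFP_ATOMIC: the allocator must not block */
};

/* Hypothetical execution context tracked by the caller. */
struct exec_ctx {
	int atomic_depth;	/* > 0 while sleeping is forbidden */
};

/* Pick the mode from the context rather than assuming process context. */
static enum alloc_mode alloc_mode_any(const struct exec_ctx *ctx)
{
	assert(ctx != NULL && ctx->atomic_depth >= 0);
	return ctx->atomic_depth > 0 ? ALLOC_NO_SLEEP : ALLOC_MAY_SLEEP;
}
```

A caller reachable from both process and atomic context would pass the
result of alloc_mode_any() to its allocator instead of a fixed mode.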

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/dccp/ccids/ccid2.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/net/dccp/ccids/ccid2.c
===================================================================
--- 2.6.orig/net/dccp/ccids/ccid2.c
+++ 2.6/net/dccp/ccids/ccid2.c
@@ -298,7 +298,7 @@ static void ccid2_hc_tx_packet_sent(stru
 		int rc;
 
 		ccid2_pr_debug("allocating more space in history\n");
-		rc = ccid2_hc_tx_alloc_seq(hctx, CCID2_SEQBUF_LEN, GFP_KERNEL);
+		rc = ccid2_hc_tx_alloc_seq(hctx, CCID2_SEQBUF_LEN, gfp_any());
 		BUG_ON(rc); /* XXX what do we do? */
 
 		next = hctx->ccid2hctx_seqh->ccid2s_next;

-- 


* [2.6.20.21 review 06/35] Netfilter: Missing Kbuild entry for netfilter
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (4 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 05/35] DCCP: Fix DCCP GFP_KERNEL allocation in atomic context Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 07/35] SNAP: Fix SNAP protocol header accesses Willy Tarreau
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Chuck Ebbert, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0033-Netfilter-Missing-Kbuild-entry-for-netfilter.patch --]
[-- Type: text/plain, Size: 817 bytes --]

Author: Chuck Ebbert <cebbert@redhat.com>

Add xt_statistic.h to the list of headers to install.

Apparently needed to build newer versions of iptables.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 include/linux/netfilter/Kbuild |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

Index: 2.6/include/linux/netfilter/Kbuild
===================================================================
--- 2.6.orig/include/linux/netfilter/Kbuild
+++ 2.6/include/linux/netfilter/Kbuild
@@ -28,6 +28,7 @@ header-y += xt_policy.h
 header-y += xt_realm.h
 header-y += xt_sctp.h
 header-y += xt_state.h
+header-y += xt_statistic.h
 header-y += xt_string.h
 header-y += xt_tcpmss.h
 header-y += xt_tcpudp.h

-- 


* [2.6.20.21 review 07/35] SNAP: Fix SNAP protocol header accesses.
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (5 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 06/35] Netfilter: Missing Kbuild entry for netfilter Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 09/35] SPARC64: Fix sparc64 task stack traces Willy Tarreau
                   ` (23 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Herbert Xu, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0034-SNAP-Fix-SNAP-protocol-header-accesses.patch --]
[-- Type: text/plain, Size: 1280 bytes --]

The snap_rcv code reads 5 bytes so we should make sure that
we have 5 bytes in the head before proceeding.
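
A minimal user-space sketch of the same rule (check that all 5 header
bytes are present before reading any of them) might look like this. The
struct snap_hdr type and snap_parse() are hypothetical names; the split
into a 3-byte OUI and a 2-byte protocol id is the standard SNAP layout
rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNAP_HDR_LEN 5	/* 3-byte OUI + 2-byte protocol id */

/* Hypothetical parsed form of the SNAP header. */
struct snap_hdr {
	uint8_t oui[3];
	uint16_t pid;
};

/* Returns 0 and fills *hdr, or -1 if the buffer is too short (drop). */
static int snap_parse(const uint8_t *buf, size_t len, struct snap_hdr *hdr)
{
	assert(buf != NULL && hdr != NULL);

	/* Length check first, analogous to the pskb_may_pull() test. */
	if (len < SNAP_HDR_LEN)
		return -1;

	memcpy(hdr->oui, buf, 3);
	hdr->pid = (uint16_t)((buf[3] << 8) | buf[4]);	/* network byte order */
	return 0;
}
```

A short frame is rejected before any header byte is touched, which is
the property the patch restores in snap_rcv().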

Based on diagnosis and fix by Evgeniy Polyakov, reported by
Alan J. Wylie.

Patch also kills the skb->sk assignment before kfree_skb
since it's redundant.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/802/psnap.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

Index: 2.6/net/802/psnap.c
===================================================================
--- 2.6.orig/net/802/psnap.c
+++ 2.6/net/802/psnap.c
@@ -55,6 +55,9 @@ static int snap_rcv(struct sk_buff *skb,
 		.type = __constant_htons(ETH_P_SNAP),
 	};
 
+	if (unlikely(!pskb_may_pull(skb, 5)))
+		goto drop;
+
 	rcu_read_lock();
 	proto = find_snap_client(skb->h.raw);
 	if (proto) {
@@ -62,14 +65,18 @@ static int snap_rcv(struct sk_buff *skb,
 		skb->h.raw  += 5;
 		skb_pull_rcsum(skb, 5);
 		rc = proto->rcvfunc(skb, dev, &snap_packet_type, orig_dev);
-	} else {
-		skb->sk = NULL;
-		kfree_skb(skb);
-		rc = 1;
 	}
-
 	rcu_read_unlock();
+
+	if (unlikely(!proto))
+		goto drop;
+
+out:
 	return rc;
+
+drop:
+	kfree_skb(skb);
+	goto out;
 }
 
 /*

-- 


* [2.6.20.21 review 09/35] SPARC64: Fix sparc64 task stack traces.
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (6 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 07/35] SNAP: Fix SNAP protocol header accesses Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 10/35] TCP: Do not autobind ports for TCP sockets Willy Tarreau
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0036-SPARC64-Fix-sparc64-task-stack-traces.patch --]
[-- Type: text/plain, Size: 2095 bytes --]

show_stack() did not handle being called for a task without an
explicit stack pointer at all. It now does, so dump_stack() can be
implemented directly as show_stack(current, NULL).

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 arch/sparc64/kernel/traps.c |   18 +++++++++++-------
 arch/sparc64/mm/fault.c     |    5 +----
 2 files changed, 12 insertions(+), 11 deletions(-)

Index: 2.6/arch/sparc64/kernel/traps.c
===================================================================
--- 2.6.orig/arch/sparc64/kernel/traps.c
+++ 2.6/arch/sparc64/kernel/traps.c
@@ -2146,12 +2146,20 @@ static void user_instruction_dump (unsig
 void show_stack(struct task_struct *tsk, unsigned long *_ksp)
 {
 	unsigned long pc, fp, thread_base, ksp;
-	void *tp = task_stack_page(tsk);
+	struct thread_info *tp;
 	struct reg_window *rw;
 	int count = 0;
 
 	ksp = (unsigned long) _ksp;
-
+	if (!tsk)
+		tsk = current;
+	tp = task_thread_info(tsk);
+	if (ksp == 0UL) {
+		if (tsk == current)
+			asm("mov %%fp, %0" : "=r" (ksp));
+		else
+			ksp = tp->ksp;
+	}
 	if (tp == current_thread_info())
 		flushw_all();
 
@@ -2180,11 +2188,7 @@ void show_stack(struct task_struct *tsk,
 
 void dump_stack(void)
 {
-	unsigned long *ksp;
-
-	__asm__ __volatile__("mov	%%fp, %0"
-			     : "=r" (ksp));
-	show_stack(current, ksp);
+	show_stack(current, NULL);
 }
 
 EXPORT_SYMBOL(dump_stack);
Index: 2.6/arch/sparc64/mm/fault.c
===================================================================
--- 2.6.orig/arch/sparc64/mm/fault.c
+++ 2.6/arch/sparc64/mm/fault.c
@@ -129,15 +129,12 @@ static void __kprobes unhandled_fault(un
 
 static void bad_kernel_pc(struct pt_regs *regs, unsigned long vaddr)
 {
-	unsigned long *ksp;
-
 	printk(KERN_CRIT "OOPS: Bogus kernel PC [%016lx] in fault handler\n",
 	       regs->tpc);
 	printk(KERN_CRIT "OOPS: RPC [%016lx]\n", regs->u_regs[15]);
 	print_symbol("RPC: <%s>\n", regs->u_regs[15]);
 	printk(KERN_CRIT "OOPS: Fault was to vaddr[%lx]\n", vaddr);
-	__asm__("mov %%sp, %0" : "=r" (ksp));
-	show_stack(current, ksp);
+	dump_stack();
 	unhandled_fault(regs->tpc, current, regs);
 }
 

-- 


* [2.6.20.21 review 10/35] TCP: Do not autobind ports for TCP sockets
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (7 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 09/35] SPARC64: Fix sparc64 task stack traces Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 11/35] TCP: Fix TCP rate-halving on bidirectional flows Willy Tarreau
                   ` (21 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0038-TCP-Do-not-autobind-ports-for-TCP-sockets.patch --]
[-- Type: text/plain, Size: 3852 bytes --]

[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg().

As discovered by Evgeniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.

The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP.  Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.

TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 include/net/tcp.h   |    2 +-
 net/ipv4/af_inet.c  |    2 +-
 net/ipv4/tcp.c      |    3 ++-
 net/ipv4/tcp_ipv4.c |    1 -
 net/ipv6/af_inet6.c |    2 +-
 net/ipv6/tcp_ipv6.c |    1 -
 6 files changed, 5 insertions(+), 6 deletions(-)

Index: 2.6/include/net/tcp.h
===================================================================
--- 2.6.orig/include/net/tcp.h
+++ 2.6/include/net/tcp.h
@@ -273,7 +273,7 @@ extern int			tcp_v4_remember_stamp(struc
 
 extern int		    	tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
 
-extern int			tcp_sendmsg(struct kiocb *iocb, struct sock *sk,
+extern int			tcp_sendmsg(struct kiocb *iocb, struct socket *sock,
 					    struct msghdr *msg, size_t size);
 extern ssize_t			tcp_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags);
 
Index: 2.6/net/ipv4/af_inet.c
===================================================================
--- 2.6.orig/net/ipv4/af_inet.c
+++ 2.6/net/ipv4/af_inet.c
@@ -803,7 +803,7 @@ const struct proto_ops inet_stream_ops =
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
 	.getsockopt	   = sock_common_getsockopt,
-	.sendmsg	   = inet_sendmsg,
+	.sendmsg	   = tcp_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = tcp_sendpage,
Index: 2.6/net/ipv4/tcp.c
===================================================================
--- 2.6.orig/net/ipv4/tcp.c
+++ 2.6/net/ipv4/tcp.c
@@ -658,9 +658,10 @@ static inline int select_size(struct soc
 	return tmp;
 }
 
-int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+int tcp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		size_t size)
 {
+	struct sock *sk = sock->sk;
 	struct iovec *iov;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
Index: 2.6/net/ipv4/tcp_ipv4.c
===================================================================
--- 2.6.orig/net/ipv4/tcp_ipv4.c
+++ 2.6/net/ipv4/tcp_ipv4.c
@@ -2427,7 +2427,6 @@ struct proto tcp_prot = {
 	.shutdown		= tcp_shutdown,
 	.setsockopt		= tcp_setsockopt,
 	.getsockopt		= tcp_getsockopt,
-	.sendmsg		= tcp_sendmsg,
 	.recvmsg		= tcp_recvmsg,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.hash			= tcp_v4_hash,
Index: 2.6/net/ipv6/af_inet6.c
===================================================================
--- 2.6.orig/net/ipv6/af_inet6.c
+++ 2.6/net/ipv6/af_inet6.c
@@ -473,7 +473,7 @@ const struct proto_ops inet6_stream_ops 
 	.shutdown	   = inet_shutdown,		/* ok		*/
 	.setsockopt	   = sock_common_setsockopt,	/* ok		*/
 	.getsockopt	   = sock_common_getsockopt,	/* ok		*/
-	.sendmsg	   = inet_sendmsg,		/* ok		*/
+	.sendmsg	   = tcp_sendmsg,		/* ok		*/
 	.recvmsg	   = sock_common_recvmsg,	/* ok		*/
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = tcp_sendpage,
Index: 2.6/net/ipv6/tcp_ipv6.c
===================================================================
--- 2.6.orig/net/ipv6/tcp_ipv6.c
+++ 2.6/net/ipv6/tcp_ipv6.c
@@ -2127,7 +2127,6 @@ struct proto tcpv6_prot = {
 	.shutdown		= tcp_shutdown,
 	.setsockopt		= tcp_setsockopt,
 	.getsockopt		= tcp_getsockopt,
-	.sendmsg		= tcp_sendmsg,
 	.recvmsg		= tcp_recvmsg,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.hash			= tcp_v6_hash,

-- 


* [2.6.20.21 review 11/35] TCP: Fix TCP rate-halving on bidirectional flows.
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (8 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 10/35] TCP: Do not autobind ports for TCP sockets Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors Willy Tarreau
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ilpo Järvinen, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0039-TCP-Fix-TCP-rate-halving-on-bidirectional-flows.patch --]
[-- Type: text/plain, Size: 2073 bytes --]

Actually, the rate halving seems to work too well, as cwnd is
reduced on every second ACK even though the number of packets in
flight remains unchanged. Recoveries in bidirectional flows suffer
quite badly because of this; both NewReno and SACK are affected.

After this patch, rate halving is performed for an ACK only if
the number of packets in flight has supposedly changed too.
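
The gating can be modelled outside the kernel roughly as below. This is a
simplified sketch: struct cwnd_state and ACK_MADE_PROGRESS are
hypothetical, and the single flag bit stands in for the combined
FLAG_FORWARD_PROGRESS / Reno duplicate-ACK condition used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the kernel's FLAG_* values are not reproduced. */
#define ACK_MADE_PROGRESS 0x1

/* Hypothetical subset of the sender state touched by rate halving. */
struct cwnd_state {
	unsigned int cwnd;	/* congestion window, in packets */
	unsigned int cwnd_cnt;	/* parity counter: act on every second ACK */
	unsigned int cwnd_min;	/* lower bound for the window */
	unsigned int in_flight;	/* packets currently in flight */
};

/* Decrease cwnd on every second ACK, but only for ACKs that count. */
static void cwnd_down(struct cwnd_state *s, int flag)
{
	unsigned int decr;

	assert(s != NULL);
	if (!(flag & ACK_MADE_PROGRESS))
		return;		/* in-flight count unchanged: leave cwnd alone */

	decr = s->cwnd_cnt + 1;
	s->cwnd_cnt = decr & 1;
	decr >>= 1;

	if (decr && s->cwnd > s->cwnd_min)
		s->cwnd -= decr;

	if (s->cwnd > s->in_flight + 1)
		s->cwnd = s->in_flight + 1;
}
```

In this model an ACK without the progress bit changes neither the window
nor the parity counter, so pure ACKs of reverse-direction data no longer
speed up the reduction.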

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/ipv4/tcp_input.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

Index: 2.6/net/ipv4/tcp_input.c
===================================================================
--- 2.6.orig/net/ipv4/tcp_input.c
+++ 2.6/net/ipv4/tcp_input.c
@@ -1702,19 +1702,22 @@ static inline u32 tcp_cwnd_min(const str
 }
 
 /* Decrease cwnd each second ack. */
-static void tcp_cwnd_down(struct sock *sk)
+static void tcp_cwnd_down(struct sock *sk, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	int decr = tp->snd_cwnd_cnt + 1;
 
-	tp->snd_cwnd_cnt = decr&1;
-	decr >>= 1;
+	if ((flag&FLAG_FORWARD_PROGRESS) ||
+	    (IsReno(tp) && !(flag&FLAG_NOT_DUP))) {
+		tp->snd_cwnd_cnt = decr&1;
+		decr >>= 1;
 
-	if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
-		tp->snd_cwnd -= decr;
+		if (decr && tp->snd_cwnd > tcp_cwnd_min(sk))
+			tp->snd_cwnd -= decr;
 
-	tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
-	tp->snd_cwnd_stamp = tcp_time_stamp;
+		tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+1);
+		tp->snd_cwnd_stamp = tcp_time_stamp;
+	}
 }
 
 /* Nothing was retransmitted or returned timestamp is less
@@ -1899,7 +1902,7 @@ static void tcp_try_to_open(struct sock 
 		}
 		tcp_moderate_cwnd(tp);
 	} else {
-		tcp_cwnd_down(sk);
+		tcp_cwnd_down(sk, flag);
 	}
 }
 
@@ -2100,7 +2103,7 @@ tcp_fastretrans_alert(struct sock *sk, u
 
 	if (is_dupack || tcp_head_timedout(sk, tp))
 		tcp_update_scoreboard(sk, tp);
-	tcp_cwnd_down(sk);
+	tcp_cwnd_down(sk, flag);
 	tcp_xmit_retransmit_queue(sk);
 }
 

-- 


* [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (9 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 11/35] TCP: Fix TCP rate-halving on bidirectional flows Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 17:15   ` [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows Ilpo Järvinen
  2007-10-13 15:28 ` [2.6.20.21 review 14/35] USB: fix DoS in pwc USB video driver Willy Tarreau
                   ` (19 subsequent siblings)
  30 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Alan Stern, Greg Kroah-Hartman

[-- Attachment #1: 0046-USB-allow-retry-on-descriptor-fetch-errors.patch --]
[-- Type: text/plain, Size: 1182 bytes --]

This patch (as964) was suggested by Steffen Koepf.  It makes
usb_get_descriptor() retry on all errors other than ETIMEDOUT, instead
of only on EPIPE.  This helps with some devices.
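
The retry policy can be illustrated with a small user-space sketch. The
fetch_fn callback, struct script and scripted_fetch() are hypothetical
helpers used to simulate a flaky device; only the retry condition itself
is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_TRIES 3

/* Hypothetical transfer: returns bytes read (> 0), 0, or a negative errno. */
typedef int (*fetch_fn)(void *ctx);

/* Retry on length 0 or any error except a timeout, up to MAX_TRIES. */
static int fetch_with_retry(fetch_fn fetch, void *ctx)
{
	int result = -EINVAL;
	int i;

	assert(fetch != NULL);
	for (i = 0; i < MAX_TRIES; ++i) {
		result = fetch(ctx);
		if (result <= 0 && result != -ETIMEDOUT)
			continue;	/* flaky device: try again */
		break;			/* success, or a timeout we do not retry */
	}
	return result;
}

/* Hypothetical scripted device: replays a fixed list of results. */
struct script {
	const int *results;
	int pos;
};

static int scripted_fetch(void *ctx)
{
	struct script *s = ctx;

	return s->results[s->pos++];
}
```

A timeout is excluded from the retry, presumably because repeating a
request that already waited out the full timeout would only multiply
the delay.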

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/usb/core/message.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: 2.6/drivers/usb/core/message.c
===================================================================
--- 2.6.orig/drivers/usb/core/message.c
+++ 2.6/drivers/usb/core/message.c
@@ -608,12 +608,12 @@ int usb_get_descriptor(struct usb_device
 	memset(buf,0,size);	// Make sure we parse really received data
 
 	for (i = 0; i < 3; ++i) {
-		/* retry on length 0 or stall; some devices are flakey */
+		/* retry on length 0 or error; some devices are flakey */
 		result = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
 				USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,
 				(type << 8) + index, 0, buf, size,
 				USB_CTRL_GET_TIMEOUT);
-		if (result == 0 || result == -EPIPE)
+		if (result <= 0 && result != -ETIMEDOUT)
 			continue;
 		if (result > 1 && ((u8 *)buf)[1] != type) {
 			result = -EPROTO;

-- 


* [2.6.20.21 review 14/35] USB: fix DoS in pwc USB video driver
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (10 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 15/35] Convert snd-page-alloc proc file to use seq_file Willy Tarreau
                   ` (18 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Oliver Neukum, Greg Kroah-Hartman

[-- Attachment #1: 0047-USB-fix-DoS-in-pwc-USB-video-driver.patch --]
[-- Type: text/plain, Size: 3929 bytes --]

The pwc driver has a disconnect method that waits for user space to
close the device. This opens up an opportunity for a DoS attack,
blocking the USB subsystem and making khubd's task busy wait in
kernel space. This patch shifts freeing resources to close if an opened
device is disconnected.
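
The deferred-cleanup pattern can be sketched in user-space C as follows.
This is a simplified model, not the driver code: struct cam and the cam_*
functions are hypothetical, the released flag stands in for unregistering
and freeing the device, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state. */
struct cam {
	int open_count;		/* number of open file handles */
	bool unplugged;		/* set when disconnect had to defer cleanup */
	bool released;		/* stands in for unregister + free */
};

static void cam_release(struct cam *c)
{
	c->released = true;
}

/* Disconnect never waits for user space: it cleans up now or defers. */
static void cam_disconnect(struct cam *c)
{
	assert(c != NULL);
	if (c->open_count)
		c->unplugged = true;	/* still open: leave cleanup to close */
	else
		cam_release(c);
}

/* Close performs the deferred cleanup if the device has gone away. */
static void cam_close(struct cam *c)
{
	assert(c != NULL && c->open_count > 0);
	c->open_count--;
	if (c->unplugged && c->open_count == 0)
		cam_release(c);
}
```

Because disconnect returns immediately in both cases, a process that
keeps the device open can no longer stall the hub thread.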

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/media/video/pwc/pwc-if.c |   52 +++++++++++++++++++++++++------------
 drivers/media/video/pwc/pwc.h    |    1 +
 2 files changed, 36 insertions(+), 17 deletions(-)

Index: 2.6/drivers/media/video/pwc/pwc-if.c
===================================================================
--- 2.6.orig/drivers/media/video/pwc/pwc-if.c
+++ 2.6/drivers/media/video/pwc/pwc-if.c
@@ -1196,12 +1196,19 @@ static int pwc_video_open(struct inode *
 	return 0;
 }
 
+
+static void pwc_cleanup(struct pwc_device *pdev)
+{
+	pwc_remove_sysfs_files(pdev->vdev);
+	video_unregister_device(pdev->vdev);
+}
+
 /* Note that all cleanup is done in the reverse order as in _open */
 static int pwc_video_close(struct inode *inode, struct file *file)
 {
 	struct video_device *vdev = file->private_data;
 	struct pwc_device *pdev;
-	int i;
+	int i, hint;
 
 	PWC_DEBUG_OPEN(">> video_close called(vdev = 0x%p).\n", vdev);
 
@@ -1224,8 +1231,9 @@ static int pwc_video_close(struct inode 
 	pwc_isoc_cleanup(pdev);
 	pwc_free_buffers(pdev);
 
+	lock_kernel();
 	/* Turn off LEDS and power down camera, but only when not unplugged */
-	if (pdev->error_status != EPIPE) {
+	if (!pdev->unplugged) {
 		/* Turn LEDs off */
 		if (pwc_set_leds(pdev, 0, 0) < 0)
 			PWC_DEBUG_MODULE("Failed to set LED on/off time.\n");
@@ -1234,9 +1242,19 @@ static int pwc_video_close(struct inode 
 			if (i < 0)
 				PWC_ERROR("Failed to power down camera (%d)\n", i);
 		}
+		pdev->vopen--;
+		PWC_DEBUG_OPEN("<< video_close() vopen=%d\n", i);
+	} else {
+		pwc_cleanup(pdev);
+		/* Free memory (don't set pdev to 0 just yet) */
+		kfree(pdev);
+		/* search device_hint[] table if we occupy a slot, by any chance */
+		for (hint = 0; hint < MAX_DEV_HINTS; hint++)
+			if (device_hint[hint].pdev == pdev)
+				device_hint[hint].pdev = NULL;
 	}
-	pdev->vopen--;
-	PWC_DEBUG_OPEN("<< video_close() vopen=%d\n", pdev->vopen);
+	unlock_kernel();
+
 	return 0;
 }
 
@@ -1783,21 +1801,21 @@ static void usb_pwc_disconnect(struct us
 	/* Alert waiting processes */
 	wake_up_interruptible(&pdev->frameq);
 	/* Wait until device is closed */
-	while (pdev->vopen)
-		schedule();
-	/* Device is now closed, so we can safely unregister it */
-	PWC_DEBUG_PROBE("Unregistering video device in disconnect().\n");
-	pwc_remove_sysfs_files(pdev->vdev);
-	video_unregister_device(pdev->vdev);
-
-	/* Free memory (don't set pdev to 0 just yet) */
-	kfree(pdev);
+	if(pdev->vopen) {
+		pdev->unplugged = 1;
+	} else {
+		/* Device is closed, so we can safely unregister it */
+		PWC_DEBUG_PROBE("Unregistering video device in disconnect().\n");
+		pwc_cleanup(pdev);
+		/* Free memory (don't set pdev to 0 just yet) */
+		kfree(pdev);
 
 disconnect_out:
-	/* search device_hint[] table if we occupy a slot, by any chance */
-	for (hint = 0; hint < MAX_DEV_HINTS; hint++)
-		if (device_hint[hint].pdev == pdev)
-			device_hint[hint].pdev = NULL;
+		/* search device_hint[] table if we occupy a slot, by any chance */
+		for (hint = 0; hint < MAX_DEV_HINTS; hint++)
+			if (device_hint[hint].pdev == pdev)
+				device_hint[hint].pdev = NULL;
+	}
 
 	unlock_kernel();
 }
Index: 2.6/drivers/media/video/pwc/pwc.h
===================================================================
--- 2.6.orig/drivers/media/video/pwc/pwc.h
+++ 2.6/drivers/media/video/pwc/pwc.h
@@ -198,6 +198,7 @@ struct pwc_device
    char vsnapshot;		/* snapshot mode */
    char vsync;			/* used by isoc handler */
    char vmirror;		/* for ToUCaM series */
+	char unplugged;
 
    int cmd_len;
    unsigned char cmd_buf[13];

-- 


* [2.6.20.21 review 15/35] Convert snd-page-alloc proc file to use seq_file
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (11 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 14/35] USB: fix DoS in pwc USB video driver Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 16/35] setpgid(child) fails if the child was forked by sub-thread Willy Tarreau
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Takashi Iwai, Linus Torvalds

[-- Attachment #1: 0001-Convert-snd-page-alloc-proc-file-to-use-seq_file.patch2 --]
[-- Type: text/plain, Size: 5136 bytes --]

Use seq_file for the proc file read/write of snd-page-alloc module.
This automatically fixes bugs in the old proc code.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 sound/core/memalloc.c |   68 ++++++++++++++++++++++++++++---------------------
 1 files changed, 39 insertions(+), 29 deletions(-)

Index: 2.6/sound/core/memalloc.c
===================================================================
--- 2.6.orig/sound/core/memalloc.c
+++ 2.6/sound/core/memalloc.c
@@ -27,6 +27,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
+#include <linux/seq_file.h>
 #include <asm/uaccess.h>
 #include <linux/dma-mapping.h>
 #include <linux/moduleparam.h>
@@ -483,10 +484,8 @@ static void free_all_reserved_pages(void
 #define SND_MEM_PROC_FILE	"driver/snd-page-alloc"
 static struct proc_dir_entry *snd_mem_proc;
 
-static int snd_mem_proc_read(char *page, char **start, off_t off,
-			     int count, int *eof, void *data)
+static int snd_mem_proc_read(struct seq_file *seq, void *offset)
 {
-	int len = 0;
 	long pages = snd_allocated_pages >> (PAGE_SHIFT-12);
 	struct list_head *p;
 	struct snd_mem_list *mem;
@@ -494,44 +493,47 @@ static int snd_mem_proc_read(char *page,
 	static char *types[] = { "UNKNOWN", "CONT", "DEV", "DEV-SG", "SBUS" };
 
 	mutex_lock(&list_mutex);
-	len += snprintf(page + len, count - len,
-			"pages  : %li bytes (%li pages per %likB)\n",
-			pages * PAGE_SIZE, pages, PAGE_SIZE / 1024);
+	seq_printf(seq, "pages  : %li bytes (%li pages per %likB)\n",
+		   pages * PAGE_SIZE, pages, PAGE_SIZE / 1024);
 	devno = 0;
 	list_for_each(p, &mem_list_head) {
 		mem = list_entry(p, struct snd_mem_list, list);
 		devno++;
-		len += snprintf(page + len, count - len,
-				"buffer %d : ID %08x : type %s\n",
-				devno, mem->id, types[mem->buffer.dev.type]);
-		len += snprintf(page + len, count - len,
-				"  addr = 0x%lx, size = %d bytes\n",
-				(unsigned long)mem->buffer.addr, (int)mem->buffer.bytes);
+		seq_printf(seq, "buffer %d : ID %08x : type %s\n",
+			   devno, mem->id, types[mem->buffer.dev.type]);
+		seq_printf(seq, "  addr = 0x%lx, size = %d bytes\n",
+			   (unsigned long)mem->buffer.addr,
+			   (int)mem->buffer.bytes);
 	}
 	mutex_unlock(&list_mutex);
-	return len;
+	return 0;
+}
+
+static int snd_mem_proc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, snd_mem_proc_read, NULL);
 }
 
 /* FIXME: for pci only - other bus? */
 #ifdef CONFIG_PCI
 #define gettoken(bufp) strsep(bufp, " \t\n")
 
-static int snd_mem_proc_write(struct file *file, const char __user *buffer,
-			      unsigned long count, void *data)
+static ssize_t snd_mem_proc_write(struct file *file, const char __user * buffer,
+				  size_t count, loff_t * ppos)
 {
 	char buf[128];
 	char *token, *p;
 
-	if (count > ARRAY_SIZE(buf) - 1)
-		count = ARRAY_SIZE(buf) - 1;
+	if (count > sizeof(buf) - 1)
+		return -EINVAL;
 	if (copy_from_user(buf, buffer, count))
 		return -EFAULT;
-	buf[ARRAY_SIZE(buf) - 1] = '\0';
+	buf[count] = '\0';
 
 	p = buf;
 	token = gettoken(&p);
 	if (! token || *token == '#')
-		return (int)count;
+		return count;
 	if (strcmp(token, "add") == 0) {
 		char *endp;
 		int vendor, device, size, buffers;
@@ -552,7 +554,7 @@ static int snd_mem_proc_write(struct fil
 		    (buffers = simple_strtol(token, NULL, 0)) <= 0 ||
 		    buffers > 4) {
 			printk(KERN_ERR "snd-page-alloc: invalid proc write format\n");
-			return (int)count;
+			return count;
 		}
 		vendor &= 0xffff;
 		device &= 0xffff;
@@ -564,7 +566,7 @@ static int snd_mem_proc_write(struct fil
 				if (pci_set_dma_mask(pci, mask) < 0 ||
 				    pci_set_consistent_dma_mask(pci, mask) < 0) {
 					printk(KERN_ERR "snd-page-alloc: cannot set DMA mask %lx for pci %04x:%04x\n", mask, vendor, device);
-					return (int)count;
+					return count;
 				}
 			}
 			for (i = 0; i < buffers; i++) {
@@ -574,7 +576,7 @@ static int snd_mem_proc_write(struct fil
 							size, &dmab) < 0) {
 					printk(KERN_ERR "snd-page-alloc: cannot allocate buffer pages (size = %d)\n", size);
 					pci_dev_put(pci);
-					return (int)count;
+					return count;
 				}
 				snd_dma_reserve_buf(&dmab, snd_dma_pci_buf_id(pci));
 			}
@@ -600,9 +602,21 @@ static int snd_mem_proc_write(struct fil
 		free_all_reserved_pages();
 	else
 		printk(KERN_ERR "snd-page-alloc: invalid proc cmd\n");
-	return (int)count;
+	return count;
 }
 #endif /* CONFIG_PCI */
+
+static const struct file_operations snd_mem_proc_fops = {
+	.owner		= THIS_MODULE,
+	.open		= snd_mem_proc_open,
+	.read		= seq_read,
+#ifdef CONFIG_PCI
+	.write		= snd_mem_proc_write,
+#endif
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 #endif /* CONFIG_PROC_FS */
 
 /*
@@ -613,12 +627,8 @@ static int __init snd_mem_init(void)
 {
 #ifdef CONFIG_PROC_FS
 	snd_mem_proc = create_proc_entry(SND_MEM_PROC_FILE, 0644, NULL);
-	if (snd_mem_proc) {
-		snd_mem_proc->read_proc = snd_mem_proc_read;
-#ifdef CONFIG_PCI
-		snd_mem_proc->write_proc = snd_mem_proc_write;
-#endif
-	}
+	if (snd_mem_proc)
+		snd_mem_proc->proc_fops = &snd_mem_proc_fops;
 #endif
 	return 0;
 }

-- 


* [2.6.20.21 review 16/35] setpgid(child) fails if the child was forked by sub-thread
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (12 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 15/35] Convert snd-page-alloc proc file to use seq_file Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 17/35] sigqueue_free: fix the race with collect_signal() Willy Tarreau
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Oleg Nesterov, Roland McGrath, Andrew Morton, Linus Torvalds,
	Greg Kroah-Hartman

[-- Attachment #1: 0058-setpgid-child-fails-if-the-child-was-forked-by-sub.patch --]
[-- Type: text/plain, Size: 1465 bytes --]

commit b07e35f94a7b6a059f889b904529ee907dc0634d in mainline tree

Spotted by Marcin Kowalczyk <qrczak@knm.org.pl>.

sys_setpgid(child) fails if the child was forked by sub-thread.

Fix the "is it our child" check. The previous commit
ee0acf90d320c29916ba8c5c1b2e908d81f5057d was not complete.

(this patch asks for the new same_thread_group() helper, but mainline doesn't
 have it yet).
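
For illustration only, here is a minimal user-space sketch of the check.
The struct and function names are hypothetical stand-ins, not the kernel's
task_struct or any real helper. It models a child forked by a sub-thread:
the pointer comparison against the group leader fails, while the tgid
comparison succeeds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few task_struct fields involved. */
struct task {
	int pid;
	int tgid;			/* thread-group id, shared by all threads */
	struct task *real_parent;	/* the thread that actually forked us */
};

/* Old check: only true when the group leader itself did the fork. */
static int is_child_old(const struct task *p, const struct task *group_leader)
{
	return p->real_parent == group_leader;
}

/* New check: true when any thread of the caller's group did the fork. */
static int is_child_new(const struct task *p, const struct task *group_leader)
{
	return p->real_parent->tgid == group_leader->tgid;
}

/*
 * Build a leader, a sub-thread in the same group, and a child forked by
 * the sub-thread. Returns (old_result << 1) | new_result.
 */
static int demo_forked_by_sub_thread(void)
{
	struct task leader = { 100, 100, NULL };
	struct task sub_thread = { 101, 100, NULL };
	struct task child = { 200, 200, &sub_thread };

	return (is_child_old(&child, &leader) << 1) |
		is_child_new(&child, &leader);
}
```

In this model the old check rejects the child and the new one accepts it,
which is the behaviour change the one-line fix is after.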

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Tested-by: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 kernel/sys.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

Index: 2.6/kernel/sys.c
===================================================================
--- 2.6.orig/kernel/sys.c
+++ 2.6/kernel/sys.c
@@ -1358,7 +1358,6 @@ asmlinkage long sys_times(struct tms __u
  * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
  * LBT 04.03.94
  */
-
 asmlinkage long sys_setpgid(pid_t pid, pid_t pgid)
 {
 	struct task_struct *p;
@@ -1386,7 +1385,7 @@ asmlinkage long sys_setpgid(pid_t pid, p
 	if (!thread_group_leader(p))
 		goto out;
 
-	if (p->real_parent == group_leader) {
+	if (p->real_parent->tgid == group_leader->tgid) {
 		err = -EPERM;
 		if (process_session(p) != process_session(group_leader))
 			goto out;

-- 


* [2.6.20.21 review 17/35] sigqueue_free: fix the race with collect_signal()
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (13 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 16/35] setpgid(child) fails if the child was forked by sub-thread Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 18/35] USB: fix linked list insertion bugfix for usb core Willy Tarreau
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Oleg Nesterov, taoyue, Jeremy Katz, Sukadev Bhattiprolu,
	Alexey Dobriyan, Ingo Molnar, Thomas Gleixner, Roland McGrath,
	Andrew Morton, Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0059-sigqueue_free-fix-the-race-with-collect_signal.patch --]
[-- Type: text/plain, Size: 2706 bytes --]

commit 60187d2708caa870f0825d753df1612ea688eb9e in mainline.

Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.

collect_signal:				sigqueue_free:

	list_del_init(&first->list);
						if (!list_empty(&q->list)) {
							// not taken
						}
						q->flags &= ~SIGQUEUE_PREALLOC;

	__sigqueue_free(first);			__sigqueue_free(q);

Now, __sigqueue_free() is called twice on the same "struct sigqueue", with
obviously bad implications.

In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.

Hopefully this can explain these mysterious bug-reports, see

	http://marc.info/?t=118766926500003
	http://marc.info/?t=118466273000005

Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has access to the application which originally exposed the problem.

Also, this patch removes tasklist lock/unlock, ->siglock is enough.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: taoyue <yue.tao@windriver.com>
Cc: Jeremy Katz <jeremy.katz@windriver.com>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 kernel/signal.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

Index: 2.6/kernel/signal.c
===================================================================
--- 2.6.orig/kernel/signal.c
+++ 2.6/kernel/signal.c
@@ -1345,20 +1345,19 @@ struct sigqueue *sigqueue_alloc(void)
 void sigqueue_free(struct sigqueue *q)
 {
 	unsigned long flags;
+	spinlock_t *lock = &current->sighand->siglock;
+
 	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
 	/*
 	 * If the signal is still pending remove it from the
-	 * pending queue.
+	 * pending queue. We must hold ->siglock while testing
+	 * q->list to serialize with collect_signal().
 	 */
-	if (unlikely(!list_empty(&q->list))) {
-		spinlock_t *lock = &current->sighand->siglock;
-		read_lock(&tasklist_lock);
-		spin_lock_irqsave(lock, flags);
-		if (!list_empty(&q->list))
-			list_del_init(&q->list);
-		spin_unlock_irqrestore(lock, flags);
-		read_unlock(&tasklist_lock);
-	}
+	spin_lock_irqsave(lock, flags);
+	if (!list_empty(&q->list))
+		list_del_init(&q->list);
+	spin_unlock_irqrestore(lock, flags);
+
 	q->flags &= ~SIGQUEUE_PREALLOC;
 	__sigqueue_free(q);
 }

-- 


* [2.6.20.21 review 18/35] USB: fix linked list insertion bugfix for usb core
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (14 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 17/35] sigqueue_free: fix the race with collect_signal() Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 19/35] POWERPC: Flush registers to proper task context Willy Tarreau
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Nathael Pajani, Greg Kroah-Hartman

[-- Attachment #1: 0062-USB-fix-linked-list-insertion-bugfix-for-usb-core.patch --]
[-- Type: text/plain, Size: 950 bytes --]

commit e5dd01154c1e9ca2400f4682602d1a4fa54c25dd in mainline.

This patch fixes the order of list_add_tail() arguments in
usb_store_new_id() so the list can have more than one single element.
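
As a rough user-space illustration (a simplified re-implementation written
for this note, not the kernel's <linux/list.h>), the sketch below shows why
the argument order matters: list_add_tail() takes the new entry first and
the list head second. With the arguments swapped, every insertion re-links
the head to the newest node only, so the list never holds more than one
element.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified doubly linked circular list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert new_node just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *new_node, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new_node;
	new_node->next = head;
	new_node->prev = prev;
	prev->next = new_node;
}

/* Count the entries reachable from head. */
static size_t list_count(const struct list_head *head)
{
	const struct list_head *p;
	size_t n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Add two nodes, with correct or swapped arguments, and count them. */
static size_t count_after_two_adds(int swapped)
{
	struct list_head head, a, b;

	init_list_head(&head);
	init_list_head(&a);
	init_list_head(&b);
	if (swapped) {
		list_add_tail(&head, &a);	/* the bug: head passed as entry */
		list_add_tail(&head, &b);
	} else {
		list_add_tail(&a, &head);	/* the fix: entry first, head second */
		list_add_tail(&b, &head);
	}
	return list_count(&head);
}
```

With the correct order both nodes are reachable from the head; with the
swapped order only the most recently added one is.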

Signed-off-by: Nathael Pajani <nathael.pajani@cpe.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/usb/core/driver.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/drivers/usb/core/driver.c
===================================================================
--- 2.6.orig/drivers/usb/core/driver.c
+++ 2.6/drivers/usb/core/driver.c
@@ -66,7 +66,7 @@ static ssize_t store_new_id(struct devic
 	dynid->id.match_flags = USB_DEVICE_ID_MATCH_DEVICE;
 
 	spin_lock(&usb_drv->dynids.lock);
-	list_add_tail(&usb_drv->dynids.list, &dynid->node);
+	list_add_tail(&dynid->node, &usb_drv->dynids.list);
 	spin_unlock(&usb_drv->dynids.lock);
 
 	if (get_driver(driver)) {

-- 


* [2.6.20.21 review 19/35] POWERPC: Flush registers to proper task context
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (15 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 18/35] USB: fix linked list insertion bugfix for usb core Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 21/35] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open() Willy Tarreau
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kumar Gala, Greg Kroah-Hartman

[-- Attachment #1: 0064-POWERPC-Flush-registers-to-proper-task-context.patch --]
[-- Type: text/plain, Size: 1508 bytes --]

commit 0ee6c15e7ba7b36a217cdadb292eeaf32a057a59 in mainline.

When we flush register state for FP, Altivec, or SPE in flush_*_to_thread
we need to respect the task_struct that the caller has passed to us.

In most cases we are called with current; however, sometimes (ptrace) we may
be passed a different task_struct.

This showed up when using gdbserver debugging a simple program that used
floating point. When gdb tried to show the FP regs they all showed up as
0, because the child's FP registers were never properly flushed to memory.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 arch/powerpc/kernel/process.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Index: 2.6/arch/powerpc/kernel/process.c
===================================================================
--- 2.6.orig/arch/powerpc/kernel/process.c
+++ 2.6/arch/powerpc/kernel/process.c
@@ -84,7 +84,7 @@ void flush_fp_to_thread(struct task_stru
 			 */
 			BUG_ON(tsk != current);
 #endif
-			giveup_fpu(current);
+			giveup_fpu(tsk);
 		}
 		preempt_enable();
 	}
@@ -144,7 +144,7 @@ void flush_altivec_to_thread(struct task
 #ifdef CONFIG_SMP
 			BUG_ON(tsk != current);
 #endif
-			giveup_altivec(current);
+			giveup_altivec(tsk);
 		}
 		preempt_enable();
 	}
@@ -183,7 +183,7 @@ void flush_spe_to_thread(struct task_str
 #ifdef CONFIG_SMP
 			BUG_ON(tsk != current);
 #endif
-			giveup_spe(current);
+			giveup_spe(tsk);
 		}
 		preempt_enable();
 	}

-- 


* [2.6.20.21 review 21/35] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open()
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (16 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 19/35] POWERPC: Flush registers to proper task context Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 22/35] Fix "Fix DAC960 driver on machines which dont support 64-bit DMA" Willy Tarreau
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Steven Toth, Mauro Carvalho Chehab, Michael Krufky, Greg Kroah-Hartman

[-- Attachment #1: 0068-V4L-cx88-Avoid-a-NULL-pointer-dereference-during-m.patch --]
[-- Type: text/plain, Size: 1114 bytes --]

(cherry picked from commit 48200baeab95fd39a7f4c4f3536c7142a64ac335)

[PATCH] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open()

Bug: With a hardware encoder board installed as cx88[1] and a
non-encoder board installed as cx88[0], an OOPS is generated
during cx8802_get_device() called from mpeg_open().
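
A small user-space sketch of the guarded lookup follows. The types and
names here are hypothetical stand-ins rather than the cx88 structures; the
point is only that a board without an encoder has no mpeg_dev, so the
pointer must be tested before its minor number is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct mpeg_dev {
	int minor;
};

struct board {
	struct mpeg_dev *mpeg_dev;	/* NULL on non-encoder boards */
};

/* Return the board whose encoder device has the given minor, or NULL. */
static const struct board *find_by_minor(const struct board *boards,
					 size_t n, int minor)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct board *h = &boards[i];

		/* && short-circuits, so ->minor is never read through NULL */
		if (h->mpeg_dev && h->mpeg_dev->minor == minor)
			return h;
	}
	return NULL;
}

/* Non-encoder board first, encoder board second, as in the bug report. */
static int demo_lookup(void)
{
	struct mpeg_dev enc = { 3 };
	struct board boards[2] = { { NULL }, { &enc } };

	return find_by_minor(boards, 2, 3) == &boards[1];
}
```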

Signed-off-by: Steven Toth <stoth@hauppauge.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/media/video/cx88/cx88-mpeg.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/drivers/media/video/cx88/cx88-mpeg.c
===================================================================
--- 2.6.orig/drivers/media/video/cx88/cx88-mpeg.c
+++ 2.6/drivers/media/video/cx88/cx88-mpeg.c
@@ -556,7 +556,7 @@ struct cx8802_dev * cx8802_get_device(st
 
 	list_for_each(list,&cx8802_devlist) {
 		h = list_entry(list, struct cx8802_dev, devlist);
-		if (h->mpeg_dev->minor == minor)
+		if (h->mpeg_dev && h->mpeg_dev->minor == minor)
 			return h;
 	}
 

-- 


* [2.6.20.21 review 22/35] Fix "Fix DAC960 driver on machines which dont support 64-bit DMA"
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (17 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 21/35] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open() Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 23/35] futex_compat: fix list traversal bugs Willy Tarreau
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: dac, Alessandro Polverini, Jeff Garzik, Matthew Wilcox,
	Andrew Morton, Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0076-Fix-Fix-DAC960-driver-on-machines-which-don-t-suppo.patch --]
[-- Type: text/plain, Size: 1102 bytes --]

commit 3558c9b3232b5f0fd9f32043a191eca20fca64c6 in mainline.

sparc32:

drivers/block/DAC960.c: In function 'DAC960_V1_EnableMemoryMailboxInterface':
drivers/block/DAC960.c:1168: error: 'DMA_32BIT_MASK' undeclared (first use in this function)
drivers/block/DAC960.c:1168: error: (Each undeclared identifier is reported only

Cc: <dac@conglom-o.org>
Cc: Alessandro Polverini <alex@nibbles.it>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/block/DAC960.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

Index: 2.6/drivers/block/DAC960.c
===================================================================
--- 2.6.orig/drivers/block/DAC960.c
+++ 2.6/drivers/block/DAC960.c
@@ -31,6 +31,7 @@
 #include <linux/genhd.h>
 #include <linux/hdreg.h>
 #include <linux/blkpg.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/ioport.h>
 #include <linux/mm.h>

-- 


* [2.6.20.21 review 23/35] futex_compat: fix list traversal bugs
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (18 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 22/35] Fix "Fix DAC960 driver on machines which dont support 64-bit DMA" Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 24/35] Leases can be hidden by flocks Willy Tarreau
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ingo Molnar, Thomas Gleixner, David Miller, Arnd Bergmann,
	Andrew Morton, Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0078-futex_compat-fix-list-traversal-bugs.patch --]
[-- Type: text/plain, Size: 1998 bytes --]

commit 179c85ea53bef807621f335767e41e23f86f01df in mainline.

The futex list traversal on the compat side appears to have
a bug.

Its loop termination condition compares:

        while (compat_ptr(uentry) != &head->list)

But that can't be right because "uentry" has the special
"pi" indicator bit still potentially set at bit 0.  This
is cleared by fetch_robust_entry() into the "entry"
return value.

What this seems to mean is that the list won't terminate
when list iteration gets back to the head.  And we'll
also process the list head like a normal entry, which could
cause all kinds of problems.

So we should check for equality with "entry".  That pointer
is of the non-compat type so we have to do a little casting
to keep the compiler and sparse happy.

The same problem can in theory occur with the 'pending'
variable, although that has not been reported from users
so far.
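
To make the failure mode concrete, here is a minimal user-space sketch. It
uses plain 32-bit integers in place of compat pointers, and all names are
hypothetical; only the convention that bit 0 carries the "pi" flag is taken
from the description above. Comparing the raw value against the head
address misses when the flag is set; comparing the stripped value does not.

```c
#include <assert.h>
#include <stdint.h>

/* Split a user-supplied word into its address part and the pi flag. */
static uint32_t strip_pi(uint32_t uentry, int *pi)
{
	*pi = (int)(uentry & 1u);
	return uentry & ~1u;
}

/* Buggy termination test: compares the raw word, flag bit included. */
static int reached_head_raw(uint32_t uentry, uint32_t head)
{
	return uentry == head;
}

/* Fixed termination test: compares the address with the flag removed. */
static int reached_head_stripped(uint32_t uentry, uint32_t head)
{
	int pi;

	return strip_pi(uentry, &pi) == head;
}
```

For a head at 0x1000 and an entry word of 0x1001 (head address with the pi
bit set), the raw comparison reports "not at head yet" and the traversal
would carry on, while the stripped comparison terminates as intended.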

Based on the original patch from David Miller.

Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 kernel/futex_compat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: 2.6/kernel/futex_compat.c
===================================================================
--- 2.6.orig/kernel/futex_compat.c
+++ 2.6/kernel/futex_compat.c
@@ -61,10 +61,10 @@ void compat_exit_robust_list(struct task
 	if (fetch_robust_entry(&upending, &pending,
 			       &head->list_op_pending, &pip))
 		return;
-	if (upending)
+	if (pending)
 		handle_futex_death((void __user *)pending + futex_offset, curr, pip);
 
-	while (compat_ptr(uentry) != &head->list) {
+	while (entry != (struct robust_list __user *) &head->list) {
 		/*
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:

-- 


* [2.6.20.21 review 24/35] Leases can be hidden by flocks
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (19 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 23/35] futex_compat: fix list traversal bugs Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 25/35] nfs: fix oops re sysctls and V4 support Willy Tarreau
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Pavel Emelyanov, J. Bruce Fields, Trond Myklebust, Andrew Morton,
	Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0079-Leases-can-be-hidden-by-flocks.patch --]
[-- Type: text/plain, Size: 2625 bytes --]

commit 0e2f6db88a6900bc9db576d6b478b12ee60d61f7 in mainline.

The inode->i_flock list contains the leases, flocks and posix
locks in the specified order. However, the flocks are added at
the head of this list, thus hiding the leases from the F_GETLEASE
command, from time_out_leases() and from other code that expects
the leases to come first.

The following example will demonstrate this:

#define _GNU_SOURCE

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>

static void show_lease(int fd)
{
        int res;

        res = fcntl(fd, F_GETLEASE);
        switch (res) {
                case F_RDLCK:
                        printf("Read lease\n");
                        break;
                case F_WRLCK:
                        printf("Write lease\n");
                        break;
                case F_UNLCK:
                        printf("No leases\n");
                        break;
                default:
                        printf("Some shit\n");
                        break;
        }
}

int main(int argc, char **argv)
{
        int fd, res;

        fd = open(argv[1], O_RDONLY);
        if (fd == -1) {
                perror("Can't open file");
                return 1;
        }

        res = fcntl(fd, F_SETLEASE, F_WRLCK);
        if (res == -1) {
                perror("Can't set lease");
                return 1;
        }

        show_lease(fd);

        if (flock(fd, LOCK_SH) == -1) {
                perror("Can't flock shared");
                return 1;
        }

        show_lease(fd);

        return 0;
}

The first call to show_lease() will show the write lease set, but
the second will show no leases.

Fix the flock insertion so that the leases always stay at the head
of this list.

Found while making the flocks pid-namespace aware.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/locks.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/fs/locks.c
===================================================================
--- 2.6.orig/fs/locks.c
+++ 2.6/fs/locks.c
@@ -790,7 +790,7 @@ find_conflict:
 	if (request->fl_flags & FL_ACCESS)
 		goto out;
 	locks_copy_lock(new_fl, request);
-	locks_insert_lock(&inode->i_flock, new_fl);
+	locks_insert_lock(before, new_fl);
 	new_fl = NULL;
 	error = 0;
 

-- 


* [2.6.20.21 review 25/35] nfs: fix oops re sysctls and V4 support
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (20 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 24/35] Leases can be hidden by flocks Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 26/35] dir_index: error out instead of BUG on corrupt dx dirs Willy Tarreau
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Alexey Dobriyan, Trond Myklebust, J. Bruce Fields, Andrew Morton,
	Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0081-nfs-fix-oops-re-sysctls-and-V4-support.patch --]
[-- Type: text/plain, Size: 2876 bytes --]

commit 49af7ee181f4f516ac99eba85d3f70ed42cabe76 in mainline.

NFS unregisters sysctls only if V4 support is compiled in.  However, the sysctl
table is not V4-specific, so unregister it always.

Steps to reproduce:

	[build nfs.ko with CONFIG_NFS_V4=n]
	modprobe nfs
	rmmod nfs
	ls /proc/sys

Unable to handle kernel paging request at ffffffff880661c0 RIP:
 [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
PGD 203067 PUD 207063 PMD 7e216067 PTE 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: lockd nfs_acl sunrpc
Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
RIP: 0010:[<ffffffff802af8e3>]  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
RSP: 0018:ffff81007fd93e78  EFLAGS: 00010286
RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
FS:  00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
Stack:  ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
 ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
 2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
Call Trace:
 [<ffffffff80283f30>] filldir+0x0/0xf0
 [<ffffffff80283f30>] filldir+0x0/0xf0
 [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
 [<ffffffff80284376>] sys_getdents+0x96/0xe0
 [<ffffffff8020bb3e>] system_call+0x7e/0x83

Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
RIP  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
 RSP <ffff81007fd93e78>
CR2: ffffffff880661c0
Kernel panic - not syncing: Fatal exception

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/nfs/super.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/fs/nfs/super.c
===================================================================
--- 2.6.orig/fs/nfs/super.c
+++ 2.6/fs/nfs/super.c
@@ -180,8 +180,8 @@ void __exit unregister_nfs_fs(void)
 		remove_shrinker(acl_shrinker);
 #ifdef CONFIG_NFS_V4
 	unregister_filesystem(&nfs4_fs_type);
-	nfs_unregister_sysctl();
 #endif
+	nfs_unregister_sysctl();
 	unregister_filesystem(&nfs_fs_type);
 }
 

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 26/35] dir_index: error out instead of BUG on corrupt dx dirs
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (21 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 25/35] nfs: fix oops re sysctls and V4 support Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 27/35] ieee1394: ohci1394: fix initialization if built non-modular Willy Tarreau
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Sandeen, Duane Griffin, Theodore Tso, Andrew Morton,
	Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0082-dir_index-error-out-instead-of-BUG-on-corrupt-dx-di.patch --]
[-- Type: text/plain, Size: 4364 bytes --]

commit 3d82abae9523c33d4a16fdfdfd2bdde316d7b56a in mainline.

Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
errors with helpful warnings.  With help catching other asserts from Duane
Griffin <duaneg@dghda.com>
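
As a rough user-space illustration of the idea (not the ext3/ext4 code
itself), the sketch below validates an index header read from untrusted
storage and reports an error instead of asserting.  The struct, the
function name and the error constant are hypothetical and exist only for
this example; the kernel patch uses ERR_BAD_DX_DIR and ext3_warning().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error code for this sketch only. */
#define SKETCH_ERR_BAD_DX_DIR (-1)

/* Index header as read from a possibly corrupt on-disk block. */
struct dx_header_sketch {
	unsigned count;	/* entries in use */
	unsigned limit;	/* capacity recorded on disk */
};

/*
 * Return 0 if the header is sane, or SKETCH_ERR_BAD_DX_DIR after logging
 * a warning.  expected_limit is what the directory geometry implies.
 */
static int dx_check_header(const struct dx_header_sketch *h,
			   unsigned expected_limit)
{
	/* A NULL header is a caller bug, so an assert is still appropriate. */
	assert(h != NULL);

	/* Bad on-disk data is not a caller bug: warn and return an error. */
	if (h->limit != expected_limit) {
		fprintf(stderr, "dx entry: limit != expected limit\n");
		return SKETCH_ERR_BAD_DX_DIR;
	}
	if (h->count == 0 || h->count > h->limit) {
		fprintf(stderr, "dx entry: no count or count > limit\n");
		return SKETCH_ERR_BAD_DX_DIR;
	}
	return 0;
}
```

The split is the point: asserts stay for programming errors, while
anything that depends on disk contents becomes a recoverable error path.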

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/ext3/namei.c |   34 ++++++++++++++++++++++++++++++----
 fs/ext4/namei.c |   34 ++++++++++++++++++++++++++++++----
 2 files changed, 60 insertions(+), 8 deletions(-)

Index: 2.6/fs/ext3/namei.c
===================================================================
--- 2.6.orig/fs/ext3/namei.c
+++ 2.6/fs/ext3/namei.c
@@ -380,13 +380,28 @@ dx_probe(struct dentry *dentry, struct i
 
 	entries = (struct dx_entry *) (((char *)&root->info) +
 				       root->info.info_length);
-	assert(dx_get_limit(entries) == dx_root_limit(dir,
-						      root->info.info_length));
+
+	if (dx_get_limit(entries) != dx_root_limit(dir,
+						   root->info.info_length)) {
+		ext3_warning(dir->i_sb, __FUNCTION__,
+			     "dx entry: limit != root limit");
+		brelse(bh);
+		*err = ERR_BAD_DX_DIR;
+		goto fail;
+	}
+
 	dxtrace (printk("Look up %x", hash));
 	while (1)
 	{
 		count = dx_get_count(entries);
-		assert (count && count <= dx_get_limit(entries));
+		if (!count || count > dx_get_limit(entries)) {
+			ext3_warning(dir->i_sb, __FUNCTION__,
+				     "dx entry: no count or count > limit");
+			brelse(bh);
+			*err = ERR_BAD_DX_DIR;
+			goto fail2;
+		}
+
 		p = entries + 1;
 		q = entries + count - 1;
 		while (p <= q)
@@ -424,8 +439,15 @@ dx_probe(struct dentry *dentry, struct i
 		if (!(bh = ext3_bread (NULL,dir, dx_get_block(at), 0, err)))
 			goto fail2;
 		at = entries = ((struct dx_node *) bh->b_data)->entries;
-		assert (dx_get_limit(entries) == dx_node_limit (dir));
+		if (dx_get_limit(entries) != dx_node_limit (dir)) {
+			ext3_warning(dir->i_sb, __FUNCTION__,
+				     "dx entry: limit != node limit");
+			brelse(bh);
+			*err = ERR_BAD_DX_DIR;
+			goto fail2;
+		}
 		frame++;
+		frame->bh = NULL;
 	}
 fail2:
 	while (frame >= frame_in) {
@@ -433,6 +455,10 @@ fail2:
 		frame--;
 	}
 fail:
+	if (*err == ERR_BAD_DX_DIR)
+		ext3_warning(dir->i_sb, __FUNCTION__,
+			     "Corrupt dir inode %ld, running e2fsck is "
+			     "recommended.", dir->i_ino);
 	return NULL;
 }
 
Index: 2.6/fs/ext4/namei.c
===================================================================
--- 2.6.orig/fs/ext4/namei.c
+++ 2.6/fs/ext4/namei.c
@@ -380,13 +380,28 @@ dx_probe(struct dentry *dentry, struct i
 
 	entries = (struct dx_entry *) (((char *)&root->info) +
 				       root->info.info_length);
-	assert(dx_get_limit(entries) == dx_root_limit(dir,
-						      root->info.info_length));
+
+	if (dx_get_limit(entries) != dx_root_limit(dir,
+						   root->info.info_length)) {
+		ext4_warning(dir->i_sb, __FUNCTION__,
+			     "dx entry: limit != root limit");
+		brelse(bh);
+		*err = ERR_BAD_DX_DIR;
+		goto fail;
+	}
+
 	dxtrace (printk("Look up %x", hash));
 	while (1)
 	{
 		count = dx_get_count(entries);
-		assert (count && count <= dx_get_limit(entries));
+		if (!count || count > dx_get_limit(entries)) {
+			ext4_warning(dir->i_sb, __FUNCTION__,
+				     "dx entry: no count or count > limit");
+			brelse(bh);
+			*err = ERR_BAD_DX_DIR;
+			goto fail2;
+		}
+
 		p = entries + 1;
 		q = entries + count - 1;
 		while (p <= q)
@@ -424,8 +439,15 @@ dx_probe(struct dentry *dentry, struct i
 		if (!(bh = ext4_bread (NULL,dir, dx_get_block(at), 0, err)))
 			goto fail2;
 		at = entries = ((struct dx_node *) bh->b_data)->entries;
-		assert (dx_get_limit(entries) == dx_node_limit (dir));
+		if (dx_get_limit(entries) != dx_node_limit (dir)) {
+			ext4_warning(dir->i_sb, __FUNCTION__,
+				     "dx entry: limit != node limit");
+			brelse(bh);
+			*err = ERR_BAD_DX_DIR;
+			goto fail2;
+		}
 		frame++;
+		frame->bh = NULL;
 	}
 fail2:
 	while (frame >= frame_in) {
@@ -433,6 +455,10 @@ fail2:
 		frame--;
 	}
 fail:
+	if (*err == ERR_BAD_DX_DIR)
+		ext4_warning(dir->i_sb, __FUNCTION__,
+			     "Corrupt dir inode %ld, running e2fsck is "
+			     "recommended.", dir->i_ino);
 	return NULL;
 }
 

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 27/35] ieee1394: ohci1394: fix initialization if built non-modular
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (22 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 26/35] dir_index: error out instead of BUG on corrupt dx dirs Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 28/35] Fix race with shared tag queue maps Willy Tarreau
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Stefan Richter, Greg Kroah-Hartman

[-- Attachment #1: 0083-ieee1394-ohci1394-fix-initialization-if-built-non.patch --]
[-- Type: text/plain, Size: 1906 bytes --]

According to one reporter, initialization of ohci1394 was broken if the
driver was statically linked, i.e. not built as a loadable module.  Dmesg:

  PCI: Device 0000:02:07.0 not available because of resource collisions
  ohci1394: Failed to enable OHCI hardware.

This was reported for a Toshiba Satellite 5100-503.  The cause is commit
8df4083c5291b3647e0381d3c69ab2196f5dd3b7 in Linux 2.6.19-rc1 which only
served purposes of early remote debugging via FireWire.  This
functionality is better provided by the currently out-of-tree driver
ohci1394_earlyinit.  Reversal of the commit was OK'd by Andi Kleen.

Same as pre-2.6.23 commit be7963b7e7f08a149e247c0bf29a4abd174e0929.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/ieee1394/ieee1394_core.c |    2 +-
 drivers/ieee1394/ohci1394.c      |    4 +---
 2 files changed, 2 insertions(+), 4 deletions(-)

Index: 2.6/drivers/ieee1394/ieee1394_core.c
===================================================================
--- 2.6.orig/drivers/ieee1394/ieee1394_core.c
+++ 2.6/drivers/ieee1394/ieee1394_core.c
@@ -1170,7 +1170,7 @@ static void __exit ieee1394_cleanup(void
 	unregister_chrdev_region(IEEE1394_CORE_DEV, 256);
 }
 
-fs_initcall(ieee1394_init); /* same as ohci1394 */
+module_init(ieee1394_init);
 module_exit(ieee1394_cleanup);
 
 /* Exported symbols */
Index: 2.6/drivers/ieee1394/ohci1394.c
===================================================================
--- 2.6.orig/drivers/ieee1394/ohci1394.c
+++ 2.6/drivers/ieee1394/ohci1394.c
@@ -3785,7 +3785,5 @@ static int __init ohci1394_init(void)
 	return pci_register_driver(&ohci1394_pci_driver);
 }
 
-/* Register before most other device drivers.
- * Useful for remote debugging via physical DMA, e.g. using firescope. */
-fs_initcall(ohci1394_init);
+module_init(ohci1394_init);
 module_exit(ohci1394_cleanup);

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 28/35] Fix race with shared tag queue maps
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (23 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 27/35] ieee1394: ohci1394: fix initialization if built non-modular Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 29/35] crypto: blkcipher_get_spot() handling of buffer at end of page Willy Tarreau
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Jens Axboe, Linus Torvalds, Greg Kroah-Hartman

[-- Attachment #1: 0085-Fix-race-with-shared-tag-queue-maps.patch --]
[-- Type: text/plain, Size: 2017 bytes --]

The commit in Linus upstream git tree is
f3da54ba140c6427fa4a32913e1bf406f41b5dda.

Fix race with shared tag queue maps

There's a race condition in blk_queue_end_tag() for shared tag maps,
users include stex (promise supertrak thingy) and qla2xxx.  The former
at least has reported bugs in this area, not sure why we haven't seen
any for the latter.  It could be because the window is narrow and that
other conditions in the qla2xxx code hide this.  It's a real bug,
though, as the stex smp users can attest.

We need to ensure two things - the tag bit clearing needs to happen
AFTER we cleared the tag pointer, as the tag bit clearing/setting is
what protects this map.  Secondly, we need to ensure that the visibility
of the tag pointer and tag bit clear are ordered properly.

[ I removed the SMP barriers - "test_and_clear_bit()" already implies
  all the required barriers.  -- Linus ]
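
The ordering described above can be sketched in user space with C11
atomics.  This is only an analogue under stated assumptions: the struct,
the function name and the 64-tag limit are made up for the example, and
atomic_fetch_and() stands in for the kernel's test_and_clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_MAX_TAGS 64

struct tag_map_sketch {
	_Atomic unsigned long long busy_bits;	/* bit n set => tag n is owned */
	void *index[SKETCH_MAX_TAGS];		/* per-tag request pointer */
};

/* Returns 1 if the tag was busy and is now released, 0 otherwise. */
static int end_tag_sketch(struct tag_map_sketch *map, unsigned tag)
{
	unsigned long long bit, old;

	assert(tag < SKETCH_MAX_TAGS);
	bit = 1ULL << tag;

	/* 1. Drop the pointer first, while this context still owns the tag. */
	map->index[tag] = NULL;

	/*
	 * 2. Only then release the tag with an atomic read-modify-write.
	 * The default seq_cst ordering includes release semantics, so the
	 * pointer store above is visible to whoever next acquires the bit.
	 */
	old = atomic_fetch_and(&map->busy_bits, ~bit);

	return (old & bit) != 0;
}
```

Clearing the bit before the pointer, as the old code did, would let
another user of the shared map grab the tag and store its own pointer,
which the late NULL store would then wipe out.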

Also see http://bugzilla.kernel.org/show_bug.cgi?id=7842

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 block/ll_rw_blk.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

Index: 2.6/block/ll_rw_blk.c
===================================================================
--- 2.6.orig/block/ll_rw_blk.c
+++ 2.6/block/ll_rw_blk.c
@@ -1072,12 +1072,6 @@ void blk_queue_end_tag(request_queue_t *
 		 */
 		return;
 
-	if (unlikely(!__test_and_clear_bit(tag, bqt->tag_map))) {
-		printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
-		       __FUNCTION__, tag);
-		return;
-	}
-
 	list_del_init(&rq->queuelist);
 	rq->cmd_flags &= ~REQ_QUEUED;
 	rq->tag = -1;
@@ -1087,6 +1081,13 @@ void blk_queue_end_tag(request_queue_t *
 		       __FUNCTION__, tag);
 
 	bqt->tag_index[tag] = NULL;
+
+	if (unlikely(!test_and_clear_bit(tag, bqt->tag_map))) {
+		printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
+		       __FUNCTION__, tag);
+		return;
+	}
+
 	bqt->busy--;
 }
 

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 29/35] crypto: blkcipher_get_spot() handling of buffer at end of page
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (24 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 28/35] Fix race with shared tag queue maps Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 30/35] fix realtek phy id in forcedeth Willy Tarreau
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Herbert Xu, Greg Kroah-Hartman

[-- Attachment #1: 0086-crypto-blkcipher_get_spot-handling-of-buffer-at-e.patch --]
[-- Type: text/plain, Size: 1961 bytes --]

This corresponds to upstream changesets
e4630f9fd8cdc14eb1caa08dafe649eb5ae09985 and
32528d0fbda1093eeeaa7d0a2c498bbb5154099d.

[CRYPTO] blkcipher: Fix handling of kmalloc page straddling

The function blkcipher_get_spot tries to return a buffer of
the specified length that does not straddle a page.  It has
an off-by-one bug so it may advance a page unnecessarily.

What's worse, one of its callers doesn't provide a buffer
that's sufficiently long for this operation.

This patch fixes both problems.  Thanks to Bob Gilligan for
diagnosing this problem and providing a fix.
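
To make the off-by-one concrete, here is a small user-space sketch that
mirrors the old and new logic on plain integers.  The 4096-byte page
size and all names are assumptions of this example, not taken from the
kernel headers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(uintptr_t)(SKETCH_PAGE_SIZE - 1))

/* Old logic: tests the address one past the buffer's last byte. */
static uintptr_t get_spot_old(uintptr_t start, unsigned len)
{
	uintptr_t end = start + len;

	if ((end & ~SKETCH_PAGE_MASK) < len)
		return end & SKETCH_PAGE_MASK;
	return start;
}

/* New logic: tests the page that holds the buffer's last byte. */
static uintptr_t get_spot_new(uintptr_t start, unsigned len)
{
	uintptr_t end_page;

	assert(len > 0);
	end_page = (start + len - 1) & SKETCH_PAGE_MASK;
	return start > end_page ? start : end_page;
}
```

With a 16-byte buffer that ends exactly on a page boundary, the old
version skips to the next page although nothing straddles, while the new
version keeps start.  For a buffer that really does straddle, both
versions return the start of the following page.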

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 crypto/blkcipher.c |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

Index: 2.6/crypto/blkcipher.c
===================================================================
--- 2.6.orig/crypto/blkcipher.c
+++ 2.6/crypto/blkcipher.c
@@ -58,11 +58,13 @@ static inline void blkcipher_unmap_dst(s
 	scatterwalk_unmap(walk->dst.virt.addr, 1);
 }
 
+/* Get a spot of the specified length that does not straddle a page.
+ * The caller needs to ensure that there is enough space for this operation.
+ */
 static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len)
 {
-	if (offset_in_page(start + len) < len)
-		return (u8 *)((unsigned long)(start + len) & PAGE_MASK);
-	return start;
+	u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK);
+	return start > end_page ? start : end_page;
 }
 
 static inline unsigned int blkcipher_done_slow(struct crypto_blkcipher *tfm,
@@ -154,7 +156,8 @@ static inline int blkcipher_next_slow(st
 	if (walk->buffer)
 		goto ok;
 
-	n = bsize * 2 + (alignmask & ~(crypto_tfm_ctx_alignment() - 1));
+	n = bsize * 3 - (alignmask + 1) +
+	    (alignmask & ~(crypto_tfm_ctx_alignment() - 1));
 	walk->buffer = kmalloc(n, GFP_ATOMIC);
 	if (!walk->buffer)
 		return blkcipher_walk_done(desc, walk, -ENOMEM);

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 30/35] fix realtek phy id in forcedeth
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (25 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 29/35] crypto: blkcipher_get_spot() handling of buffer at end of page Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 31/35] Fix IPV6 append OOPS Willy Tarreau
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Willy Tarreau, Ayaz Abdulla, Chuck Ebbert, Jeff Garzik,
	Greg Kroah-Hartman

[-- Attachment #1: 0087-fix-realtek-phy-id-in-forcedeth.patch --]
[-- Type: text/plain, Size: 989 bytes --]

commit ba685fb2abd71162bea6895a99449c1071b01402 in mainline.

As noticed by Chuck Ebbert, commit c5e3ae8823693b260ce1f217adca8add1bc0b3de
introduced a copy-paste typo, as realtek phy is 0x732 and not 0x1c1. Obvious
fix below suggested by Ayaz Abdulla.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Cc: Ayaz Abdulla <aabdulla@nvidia.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 drivers/net/forcedeth.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Index: 2.6/drivers/net/forcedeth.c
===================================================================
--- 2.6.orig/drivers/net/forcedeth.c
+++ 2.6/drivers/net/forcedeth.c
@@ -554,6 +554,7 @@ union ring_type {
 #define PHY_OUI_MARVELL	0x5043
 #define PHY_OUI_CICADA	0x03f1
 #define PHY_OUI_VITESSE	0x01c1
+#define PHY_OUI_REALTEK	0x0732
 #define PHYID1_OUI_MASK	0x03ff
 #define PHYID1_OUI_SHFT	6
 #define PHYID2_OUI_MASK	0xfc00

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 31/35] Fix IPV6 append OOPS.
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (26 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 30/35] fix realtek phy id in forcedeth Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 33/35] Fix ipv6 double-sock-release with MSG_CONFIRM Willy Tarreau
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: YOSHIFUJI Hideaki, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0091-Fix-IPV6-append-OOPS.patch --]
[-- Type: text/plain, Size: 1639 bytes --]

commit e1f52208bb968291f7d9142eff60b62984b4a511 in mainline.

[IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames

Some of the skbs in sk->write_queue do not have skb->dst because
we do not fill skb->dst when we allocate a new skb in append_data().

BTW, I think we may not need to (or we should not) increment some stats
when using corking; if 100 sendmsg() calls (with MSG_MORE) result in 2
packets, how many should we increment?

If 100, we should set skb->dst for every queued skb.

If 1 (or 2 (*)), we increment the stats for the first queued skb and
we should just skip incrementing OutDiscards for the rest of the queued
skbs, and we should also implement these semantics in other places;
e.g., we should increment other stats just once, not 100 times.

*: depends on the place we are discarding the datagram.

I guess we should just increment by 1 (or 2).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/ipv6/ip6_output.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: 2.6/net/ipv6/ip6_output.c
===================================================================
--- 2.6.orig/net/ipv6/ip6_output.c
+++ 2.6/net/ipv6/ip6_output.c
@@ -1357,8 +1357,9 @@ void ip6_flush_pending_frames(struct soc
 	struct sk_buff *skb;
 
 	while ((skb = __skb_dequeue_tail(&sk->sk_write_queue)) != NULL) {
-		IP6_INC_STATS(ip6_dst_idev(skb->dst),
-			      IPSTATS_MIB_OUTDISCARDS);
+		if (skb->dst)
+			IP6_INC_STATS(ip6_dst_idev(skb->dst),
+				      IPSTATS_MIB_OUTDISCARDS);
 		kfree_skb(skb);
 	}
 

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 33/35] Fix ipv6 double-sock-release with MSG_CONFIRM
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (27 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 31/35] Fix IPV6 append OOPS Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 34/35] Fix datagram recvmsg NULL iov handling regression Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 35/35] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Willy Tarreau
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: YOSHIFUJI Hideaki, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0093-Fix-ipv6-double-sock-release-with-MSG_CONFIRM.patch --]
[-- Type: text/plain, Size: 780 bytes --]

commit 3ef9d943d26dea764f4fecf3767001c90b778b0c in mainline

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/ipv6/raw.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

Index: 2.6/net/ipv6/raw.c
===================================================================
--- 2.6.orig/net/ipv6/raw.c
+++ 2.6/net/ipv6/raw.c
@@ -851,11 +851,10 @@ back_from_confirm:
 			ip6_flush_pending_frames(sk);
 		else if (!(msg->msg_flags & MSG_MORE))
 			err = rawv6_push_pending_frames(sk, &fl, rp);
+		release_sock(sk);
 	}
 done:
 	dst_release(dst);
-	if (!inet->hdrincl)
-		release_sock(sk);
 out:	
 	fl6_sock_release(flowlabel);
 	return err<0?err:len;

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 34/35] Fix datagram recvmsg NULL iov handling regression.
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (28 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 33/35] Fix ipv6 double-sock-release with MSG_CONFIRM Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  2007-10-13 15:28 ` [2.6.20.21 review 35/35] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Willy Tarreau
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Herbert Xu, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: 0100-Fix-datagram-recvmsg-NULL-iov-handling-regression.patch --]
[-- Type: text/plain, Size: 1084 bytes --]

commit ef8aef55ce61fd0e2af798695f7386ac756ae1e7 in mainline

Subject: [2.6.20.21 review 34/35] [PATCH] [NET]: Do not dereference iov if length is zero

When msg_iovlen is zero we shouldn't try to dereference
msg_iov.  Right now the only thing that tries to do so
is skb_copy_and_csum_datagram_iovec.  Since the total
length should also be zero if msg_iovlen is zero, it's
sufficient to check the total length there and simply
return if it's zero.
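
The same guard can be shown in a user-space sketch.  The function below
is hypothetical and much simpler than
skb_copy_and_csum_datagram_iovec (no checksumming, no skb); it only
illustrates returning before the iovec pointer is touched when there is
nothing to copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy len bytes from src into the iovec array.
 * Returns 0 on success, -1 if the iovec array is too small.
 */
static int copy_to_iov_sketch(struct iovec *iov, int iovlen,
			      const unsigned char *src, size_t len)
{
	int i;

	/* Nothing to copy: return before iov is ever dereferenced. */
	if (len == 0)
		return 0;

	assert(iov != NULL && src != NULL);

	for (i = 0; i < iovlen && len > 0; i++) {
		size_t n = iov[i].iov_len < len ? iov[i].iov_len : len;

		memcpy(iov[i].iov_base, src, n);
		src += n;
		len -= n;
	}
	return len == 0 ? 0 : -1;
}
```

A call with a NULL iov and zero length returns 0 immediately, which is
the case the patch is meant to make safe.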

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 net/core/datagram.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

Index: 2.6/net/core/datagram.c
===================================================================
--- 2.6.orig/net/core/datagram.c
+++ 2.6/net/core/datagram.c
@@ -444,6 +444,9 @@ int skb_copy_and_csum_datagram_iovec(str
 	__wsum csum;
 	int chunk = skb->len - hlen;
 
+	if (!chunk)
+		return 0;
+
 	/* Skip filled elements.
 	 * Pretty silly, look at memcpy_toiovec, though 8)
 	 */

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [2.6.20.21 review 35/35] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
  2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
                   ` (29 preceding siblings ...)
  2007-10-13 15:28 ` [2.6.20.21 review 34/35] Fix datagram recvmsg NULL iov handling regression Willy Tarreau
@ 2007-10-13 15:28 ` Willy Tarreau
  30 siblings, 0 replies; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 15:28 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Sandeen, Tejun Heo, Greg Kroah-Hartman

[-- Attachment #1: 0001-sysfs-store-sysfs-inode-nrs-in-s_ino-to-avoid-readd.patch --]
[-- Type: text/plain, Size: 3788 bytes --]

Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir.  This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 fs/sysfs/dir.c   |   16 +++++++++++-----
 fs/sysfs/inode.c |    1 +
 fs/sysfs/mount.c |    1 +
 fs/sysfs/sysfs.h |    1 +
 4 files changed, 14 insertions(+), 5 deletions(-)

Index: 2.6/fs/sysfs/dir.c
===================================================================
--- 2.6.orig/fs/sysfs/dir.c
+++ 2.6/fs/sysfs/dir.c
@@ -29,6 +29,14 @@ static struct dentry_operations sysfs_de
 	.d_iput		= sysfs_d_iput,
 };
 
+static unsigned int sysfs_inode_counter;
+ino_t sysfs_get_inum(void)
+{
+	if (unlikely(sysfs_inode_counter < 3))
+		sysfs_inode_counter = 3;
+	return sysfs_inode_counter++;
+}
+
 /*
  * Allocates a new sysfs_dirent and links it to the parent sysfs_dirent
  */
@@ -42,6 +50,7 @@ static struct sysfs_dirent * sysfs_new_d
 		return NULL;
 
 	memset(sd, 0, sizeof(*sd));
+	sd->s_ino = sysfs_get_inum();
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_event, 1);
 	INIT_LIST_HEAD(&sd->s_children);
@@ -461,7 +470,7 @@ static int sysfs_readdir(struct file * f
 
 	switch (i) {
 		case 0:
-			ino = dentry->d_inode->i_ino;
+			ino = parent_sd->s_ino;
 			if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
 				break;
 			filp->f_pos++;
@@ -490,10 +499,7 @@ static int sysfs_readdir(struct file * f
 
 				name = sysfs_get_name(next);
 				len = strlen(name);
-				if (next->s_dentry)
-					ino = next->s_dentry->d_inode->i_ino;
-				else
-					ino = iunique(sysfs_sb, 2);
+				ino = next->s_ino;
 
 				if (filldir(dirent, name, len, filp->f_pos, ino,
 						 dt_type(next)) < 0)
Index: 2.6/fs/sysfs/inode.c
===================================================================
--- 2.6.orig/fs/sysfs/inode.c
+++ 2.6/fs/sysfs/inode.c
@@ -129,6 +129,7 @@ struct inode * sysfs_new_inode(mode_t mo
 		inode->i_mapping->a_ops = &sysfs_aops;
 		inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 		inode->i_op = &sysfs_inode_operations;
+		inode->i_ino = sd->s_ino;
 		lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 		if (sd->s_iattr) {
Index: 2.6/fs/sysfs/mount.c
===================================================================
--- 2.6.orig/fs/sysfs/mount.c
+++ 2.6/fs/sysfs/mount.c
@@ -29,6 +29,7 @@ static struct sysfs_dirent sysfs_root = 
 	.s_element	= NULL,
 	.s_type		= SYSFS_ROOT,
 	.s_iattr	= NULL,
+	.s_ino		= 1,
 };
 
 static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
Index: 2.6/include/linux/sysfs.h
===================================================================
--- 2.6.orig/include/linux/sysfs.h
+++ 2.6/include/linux/sysfs.h
@@ -73,6 +73,7 @@ struct sysfs_dirent {
 	void 			* s_element;
 	int			s_type;
 	umode_t			s_mode;
+	ino_t			s_ino;
 	struct dentry		* s_dentry;
 	struct iattr		* s_iattr;
 	atomic_t		s_event;

-- 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows.
  2007-10-13 15:28 ` [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors Willy Tarreau
@ 2007-10-13 17:15   ` Ilpo Järvinen
  2007-10-13 17:22     ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Ilpo Järvinen @ 2007-10-13 17:15 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: LKML, stable, David S. Miller, Greg Kroah-Hartman

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2711 bytes --]

On Sat, 13 Oct 2007, Willy Tarreau wrote:

> It's possible that new SACK blocks that should trigger new LOST
> markings arrive with new data (which previously made is_dupack
> false). In addition, I think this fixes a case where we get
> a cumulative ACK with enough SACK blocks to trigger the fast
> recovery (is_dupack would be false there too).
> 
> I'm not completely pleased with this solution because readability
> of the code is somewhat questionable as 'is_dupack' in SACK case
> is no longer about dupacks only but would mean something like
> 'lost_marker_work_todo' too... But because of Eifel stuff done
> in CA_Recovery, the FLAG_DATA_SACKED check cannot be placed to
> the if statement which seems attractive solution. Nevertheless,
> I didn't like adding another variable just for that either... :-)
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> ---
>  net/ipv4/tcp_input.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> Index: 2.6/net/ipv4/tcp_input.c
> ===================================================================
> --- 2.6.orig/net/ipv4/tcp_input.c
> +++ 2.6/net/ipv4/tcp_input.c
> @@ -1951,7 +1951,10 @@ tcp_fastretrans_alert(struct sock *sk, u
>  {
>  	struct inet_connection_sock *icsk = inet_csk(sk);
>  	struct tcp_sock *tp = tcp_sk(sk);
> -	int is_dupack = (tp->snd_una == prior_snd_una && !(flag&FLAG_NOT_DUP));
> +	int is_dupack = (tp->snd_una == prior_snd_una &&
> +			 (!(flag&FLAG_NOT_DUP) ||
> +			  ((flag&FLAG_DATA_SACKED) &&
> +			   (tp->fackets_out > tp->reordering))));
>  
>  	/* Some technical things:
>  	 * 1. Reno does not count dupacks (sacked_out) automatically. */

FYI,

This ended up being an incomplete fix. The day after these two patches
(11-12), I submitted two other patches to complete this fix series (I got
bitten by release-early-release-often and addressed my day-after-submission
thoughts in those two later patches). For some reason these two keep
floating around separately from those two later ones.

To make things even more complicated, eb7bdad82e8 (see stable-2.6.22) 
could have been split more logically to do_lost addition and 
FLAG_SND_UNA_ADVANCED parts (but that didn't occur to me back then).

All of them listed here (from stable-2.6.22 since one of them is
reduced from mainline version):

6d742fb6e2b8913457e1282e1be77d6f4e45af00 Fix TCP DSACK cwnd handling
eb7bdad82e8af48e1ed1b650268dc85ca7e9ff39 Handle snd_una in tcp_cwnd_down()
8385cffd22359ad561a173accefeb354bd606ce4 TCP: Fix TCP handling of SACK in 
783366ad4b212cde069c50903494eb6a6b83958c TCP: Fix TCP rate-halving on 


-- 
 i.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows.
  2007-10-13 17:15   ` [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows Ilpo Järvinen
@ 2007-10-13 17:22     ` Willy Tarreau
  2007-10-13 17:50       ` Adrian Bunk
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 17:22 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: LKML, stable, David S. Miller, Greg Kroah-Hartman

Hi Ilpo,

On Sat, Oct 13, 2007 at 08:15:52PM +0300, Ilpo Järvinen wrote:
> On Sat, 13 Oct 2007, Willy Tarreau wrote:
> 
> > It's possible that new SACK blocks that should trigger new LOST
> > markings arrive with new data (which previously made is_dupack
> > false). In addition, I think this fixes a case where we get
> > a cumulative ACK with enough SACK blocks to trigger the fast
> > recovery (is_dupack would be false there too).
> > 
> > I'm not completely pleased with this solution because readability
> > of the code is somewhat questionable as 'is_dupack' in SACK case
> > is no longer about dupacks only but would mean something like
> > 'lost_marker_work_todo' too... But because of Eifel stuff done
> > in CA_Recovery, the FLAG_DATA_SACKED check cannot be placed to
> > the if statement which seems attractive solution. Nevertheless,
> > I didn't like adding another variable just for that either... :-)
> > 
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> > ---
> >  net/ipv4/tcp_input.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > Index: 2.6/net/ipv4/tcp_input.c
> > ===================================================================
> > --- 2.6.orig/net/ipv4/tcp_input.c
> > +++ 2.6/net/ipv4/tcp_input.c
> > @@ -1951,7 +1951,10 @@ tcp_fastretrans_alert(struct sock *sk, u
> >  {
> >  	struct inet_connection_sock *icsk = inet_csk(sk);
> >  	struct tcp_sock *tp = tcp_sk(sk);
> > -	int is_dupack = (tp->snd_una == prior_snd_una && !(flag&FLAG_NOT_DUP));
> > +	int is_dupack = (tp->snd_una == prior_snd_una &&
> > +			 (!(flag&FLAG_NOT_DUP) ||
> > +			  ((flag&FLAG_DATA_SACKED) &&
> > +			   (tp->fackets_out > tp->reordering))));
> >  
> >  	/* Some technical things:
> >  	 * 1. Reno does not count dupacks (sacked_out) automatically. */
> 
> FYI,
> 
> This ended up being an incomplete fix. The day after these two patches 
> (11-12) I submitted two other patches to complete this fix series (got 
> bitten by release-early-release-often, fixed day-after-submission 
> thoughts in those two later patches). For some reason these two keep 
> floating around separately from those two later ones.
> 
> To make things even more complicated, eb7bdad82e8 (see stable-2.6.22) 
> could have been split more logically to do_lost addition and 
> FLAG_SND_UNA_ADVANCED parts (but that didn't occur to me back then).
> 
> All of them listed here (from stable-2.6.22 since one of them is
> reduced from mainline version):
> 
> 6d742fb6e2b8913457e1282e1be77d6f4e45af00 Fix TCP DSACK cwnd handling
> eb7bdad82e8af48e1ed1b650268dc85ca7e9ff39 Handle snd_una in tcp_cwnd_down()
> 8385cffd22359ad561a173accefeb354bd606ce4 TCP: Fix TCP handling of SACK in 
> 783366ad4b212cde069c50903494eb6a6b83958c TCP: Fix TCP rate-halving on 

Thanks for your help, I really appreciate it. In fact, I've reviewed all
four of them, but two of them did not apply and the code looked somewhat
different, so I considered them irrelevant to 2.6.20. I didn't understand
that they were all related, so maybe I checked them in the wrong order.

I'll recheck all that in the right sequence and will merge all four, or
get back to you if something still puzzles me.

Thanks!
Willy
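
For readers following the quoted diff, the change to is_dupack can be
sketched outside the kernel roughly as below. This is only an
illustration, not the kernel code: struct ack_state and both helper
names are made up for the sketch, and the flag bit values are
placeholders (the kernel defines its own in net/ipv4/tcp_input.c).
Only the shape of the two conditions is taken from the patch.

```c
#include <stdint.h>

/* Placeholder flag bits, assumed for this sketch only. */
#define FLAG_DATA        0x01u  /* incoming segment carried data     */
#define FLAG_WIN_UPDATE  0x02u  /* incoming ACK updated the window   */
#define FLAG_DATA_ACKED  0x04u  /* ACK acknowledged new data         */
#define FLAG_DATA_SACKED 0x20u  /* ACK carried new SACK information  */
#define FLAG_NOT_DUP     (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_DATA_ACKED)

/* Hypothetical container for the few fields the condition reads. */
struct ack_state {
	uint32_t snd_una;        /* first unacknowledged sequence number */
	uint32_t prior_snd_una;  /* snd_una before this ACK was handled  */
	uint32_t fackets_out;    /* packets up to the highest SACKed one */
	uint32_t reordering;     /* current reordering estimate          */
	unsigned int flag;       /* FLAG_* bits collected for this ACK   */
};

/* Condition before the patch: any data on the ACK makes it "not a dupack". */
static int is_dupack_old(const struct ack_state *s)
{
	return s->snd_una == s->prior_snd_una && !(s->flag & FLAG_NOT_DUP);
}

/* Condition after the patch: new SACK info beyond the reordering
 * estimate also counts, even when the ACK carries data. */
static int is_dupack_new(const struct ack_state *s)
{
	return s->snd_una == s->prior_snd_una &&
	       (!(s->flag & FLAG_NOT_DUP) ||
		((s->flag & FLAG_DATA_SACKED) &&
		 s->fackets_out > s->reordering));
}
```

With a bidirectional flow, an ACK that piggybacks data and new SACK
blocks (flag = FLAG_DATA | FLAG_DATA_SACKED, snd_una unchanged,
fackets_out above reordering) is rejected by the old condition and
accepted by the new one, which is the case the patch description
refers to.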



* Re: [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows.
  2007-10-13 17:22     ` Willy Tarreau
@ 2007-10-13 17:50       ` Adrian Bunk
  2007-10-13 18:10         ` Willy Tarreau
  0 siblings, 1 reply; 37+ messages in thread
From: Adrian Bunk @ 2007-10-13 17:50 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Ilpo Järvinen, LKML, stable, David S. Miller, Greg Kroah-Hartman

On Sat, Oct 13, 2007 at 07:22:14PM +0200, Willy Tarreau wrote:
>...
> Thanks for your help, I really appreciate it. In fact, I've reviewed all
> four of them, but two of them did not apply and the code looked somewhat
> different, so I considered them irrelevant to 2.6.20. I didn't understand
> that they were all related, so maybe I checked them in the wrong order.
> 
> I'll recheck all that in the right sequence and will merge all four, or
> get back to you if something still puzzles me.

I discussed this issue with Ilpo just yesterday regarding 2.6.16, and 
the result of our discussion was that I reverted it.

TCP being in some situations a bit more conservative than it should be 
isn't a big issue and not worth backporting with a risk of introducing
a regression.

I'd recommend you simply drop the two patches for 2.6.20.

> Thanks!
> Willy

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows.
  2007-10-13 17:50       ` Adrian Bunk
@ 2007-10-13 18:10         ` Willy Tarreau
  2007-10-14  8:55           ` Ilpo Järvinen
  0 siblings, 1 reply; 37+ messages in thread
From: Willy Tarreau @ 2007-10-13 18:10 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Ilpo Järvinen, LKML, stable, David S. Miller, Greg Kroah-Hartman

Hi Adrian,

On Sat, Oct 13, 2007 at 07:50:36PM +0200, Adrian Bunk wrote:
> On Sat, Oct 13, 2007 at 07:22:14PM +0200, Willy Tarreau wrote:
> >...
> > Thanks for your help, I really appreciate it. In fact, I've reviewed all
> > four of them, but two of them did not apply and the code looked somewhat
> > different, so I considered them irrelevant to 2.6.20. I didn't understand
> > that they were all related, so maybe I checked them in the wrong order.
> > 
> > I'll recheck all that in the right sequence and will merge all four, or
> > get back to you if something still puzzles me.
> 
> I discussed this issue with Ilpo just yesterday regarding 2.6.16, and 
> the result of our discussion was that I reverted it.

OK.

> TCP being in some situations a bit more conservative than it should be 
> isn't a big issue and not worth backporting with a risk of introducing
> a regression.

I agree with this. The impression I got from the description of the two
patches I merged was that the problems they fix were quite annoying. But
maybe I should take that with a grain of salt.

> I'd recommend you simply drop the two patches for 2.6.20.

That sounds OK to me. If 2.6.16 is fine without the patches, 2.6.20
certainly is, particularly if we keep in mind that it's a last version.

Thanks very much for your insights Adrian,
Willy



* Re: [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows.
  2007-10-13 18:10         ` Willy Tarreau
@ 2007-10-14  8:55           ` Ilpo Järvinen
  0 siblings, 0 replies; 37+ messages in thread
From: Ilpo Järvinen @ 2007-10-14  8:55 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Adrian Bunk, LKML, stable, David S. Miller, Greg Kroah-Hartman

On Sat, 13 Oct 2007, Willy Tarreau wrote:
> On Sat, Oct 13, 2007 at 07:50:36PM +0200, Adrian Bunk wrote:
> > On Sat, Oct 13, 2007 at 07:22:14PM +0200, Willy Tarreau wrote:
> > >...
> > > Thanks for your help, I really appreciate it. In fact, I've reviewed all
> > > four of them, but two of them did not apply and the code looked somewhat
> > > different, so I considered them irrelevant to 2.6.20. I didn't understand
> > > that they were all related, so maybe I checked them in the wrong order.
> > > 
> > > I'll recheck all that in the right sequence and will merge all four, or
> > > get back to you if something still puzzles me.
> > 
> > I discussed this issue with Ilpo just yesterday regarding 2.6.16, and 
> > the result of our discussion was that I reverted it.
> 
> OK.
> 
> > TCP being in some situations a bit more conservative than it should be 

This is of course an understatement of what I wrote about the performance:
"I would rather put it this way: Without those four patches TCP
just is very much more conservative than with them but still
works. " Emphasis on the word very. Yet this isn't a correctness issue.

> > isn't a big issue and not worth backporting with a risk of introducing
> > a regression.

I agree that it's not that big an issue, given the cases that are affected. 
The flow must simultaneously be bidirectional and have at least one of the 
directions suffer from congestion. Considering that this problem has 
been there for a very long time (it definitely predates the 2.6 series, and 
has probably occurred since rate-halving was added, I don't know when), and 
nobody has complained about poor performance, I'd claim it's highly 
unlikely that they will in the near future... :-)

> I agree with this. The impression I got from the description of the two
> patches I merged was that the problems they fix were quite annoying. But
> maybe I should take that with a grain of salt.

No, it's not a grain of salt. I would say it's utterly broken, out loud. 
But many people are not that much into time-seq graphs (which I'm familiar 
with); they are pleased when it seems to work well enough, even though 
from my perspective it is simply unacceptable in the worst cases (not 
speaking of theoretical ones here, I have seen very bad performance). Not 
that it is always that bad; what happens depends on the phase of the 
opposite direction.

Somebody asked me about this when those four patches were made, and I put 
these up back then:

http://www.cs.helsinki.fi/u/ijjarvin/bidir-showcase/

They are generated from my old test archives, so the TCP variant used 
may differ slightly from mainline here and there (my point was just to 
show how it breaks). Typical people wouldn't even notice those minor 
differences compared with the bidir brokenness, which is very visible in 
all cases except the fixed "ok" one. Please judge for yourself whether 
I exaggerated or not... :-)


-- 
 i.


Thread overview: 37+ messages
2007-10-13 14:28 [2.6.20.21 review 00/35] 2.6.20.21 -stable review Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 01/35] ACPICA: Fixed possible corruption of global GPE list Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 02/35] AVR32: Fix atomic_add_unless() and atomic_sub_unless() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 03/35] r8169: avoid needless NAPI poll scheduling Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 04/35] i386: allow debuggers to access the vsyscall page with compat vDSO Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 05/35] DCCP: Fix DCCP GFP_KERNEL allocation in atomic context Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 06/35] Netfilter: Missing Kbuild entry for netfilter Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 07/35] SNAP: Fix SNAP protocol header accesses Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 09/35] SPARC64: Fix sparc64 task stack traces Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 10/35] TCP: Do not autobind ports for TCP sockets Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 11/35] TCP: Fix TCP rate-halving on bidirectional flows Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 13/35] USB: allow retry on descriptor fetch errors Willy Tarreau
2007-10-13 17:15   ` [2.6.20.21 review 12/35] TCP: Fix TCP handling of SACK in bidirectional flows Ilpo Järvinen
2007-10-13 17:22     ` Willy Tarreau
2007-10-13 17:50       ` Adrian Bunk
2007-10-13 18:10         ` Willy Tarreau
2007-10-14  8:55           ` Ilpo Järvinen
2007-10-13 15:28 ` [2.6.20.21 review 14/35] USB: fix DoS in pwc USB video driver Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 15/35] Convert snd-page-alloc proc file to use seq_file Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 16/35] setpgid(child) fails if the child was forked by sub-thread Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 17/35] sigqueue_free: fix the race with collect_signal() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 18/35] USB: fix linked list insertion bugfix for usb core Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 19/35] POWERPC: Flush registers to proper task context Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 21/35] V4L: cx88: Avoid a NULL pointer dereference during mpeg_open() Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 22/35] Fix "Fix DAC960 driver on machines which dont support 64-bit DMA" Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 23/35] futex_compat: fix list traversal bugs Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 24/35] Leases can be hidden by flocks Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 25/35] nfs: fix oops re sysctls and V4 support Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 26/35] dir_index: error out instead of BUG on corrupt dx dirs Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 27/35] ieee1394: ohci1394: fix initialization if built non-modular Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 28/35] Fix race with shared tag queue maps Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 29/35] crypto: blkcipher_get_spot() handling of buffer at end of page Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 30/35] fix realtek phy id in forcedeth Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 31/35] Fix IPV6 append OOPS Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 33/35] Fix ipv6 double-sock-release with MSG_CONFIRM Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 34/35] Fix datagram recvmsg NULL iov handling regression Willy Tarreau
2007-10-13 15:28 ` [2.6.20.21 review 35/35] sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses Willy Tarreau
