* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Petr Vandrovec @ 2002-11-08 19:33 UTC
  To: Andrew Morton; +Cc: linux-kernel, bwindle, acme

On  8 Nov 02 at 12:01, Andrew Morton wrote:
> > Single-CPU system, running 2.5.46-bk3. While compiling bk4, and running
> > a script that was pinging every host on my subnet (I was running arp -a
> > to see what was in the arp table at the time), I hit this BUG.
> 
> I'd be suspecting the seq_file conversion in arp.c.  The read_lock_bh()
> stuff in there looks, umm, unclear ;)

Yes, see my emails from 23rd Oct and 25th Oct (2.5.44: Strange oopses from
userspace), and from Nov 6th + Nov 7th: Preempt count check when leaving
IRQ.

But while yesterday I had no idea, today I have one (it looks like
nobody else is going to fix it for me :-( ):
the seq subsystem can call arp_seq_start/next/stop several times, but
state->is_pneigh is set to 0 only once, by the memset in arp_seq_open :-(

I think that arp_seq_start should do

  {
+   struct arp_iter_state* state = seq->private;
+   state->is_pneigh = 0;
+   state->bucket = 0;
    read_lock_bh(&arp_tbl.lock);
    return *pos ? arp_get_bucket(seq, pos) : (void *)1;
  }

and we can drop the memset from arp_seq_open. I'll try it, and if it
survives my tests, I'll send a real patch.
  
                                        Best regards,
                                                Petr Vandrovec
                                                vandrove@vc.cvut.cz
                                                


* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Petr Vandrovec @ 2002-11-08 22:02 UTC
  To: akpm; +Cc: linux-kernel, bwindle, acme

On Fri, Nov 08, 2002 at 09:33:24PM +0200, Petr Vandrovec wrote:
> On  8 Nov 02 at 12:01, Andrew Morton wrote:
> > > Single-CPU system, running 2.5.46-bk3. While compiling bk4, and running
> > > a script that was pinging every host on my subnet (I was running arp -a
> > > to see what was in the arp table at the time), I hit this BUG.
> > 
> > I'd be suspecting the seq_file conversion in arp.c.  The read_lock_bh()
> > stuff in there looks, umm, unclear ;)
> 
> Yes, see my emails from 23rd Oct and 25th Oct (2.5.44: Strange oopses from
> userspace), and from Nov 6th + Nov 7th: Preempt count check when leaving
> IRQ.
> 
> But while yesterday I had no idea, today I have one (it looks like
> nobody else is going to fix it for me :-( ):
> the seq subsystem can call arp_seq_start/next/stop several times, but
> state->is_pneigh is set to 0 only once, by the memset in arp_seq_open :-(
> 
> I think that arp_seq_start should do
> 
>   {
> +   struct arp_iter_state* state = seq->private;
> +   state->is_pneigh = 0;
> +   state->bucket = 0;
>     read_lock_bh(&arp_tbl.lock);
>     return *pos ? arp_get_bucket(seq, pos) : (void *)1;
>   }
> 
> and we can drop the memset from arp_seq_open. I'll try it, and if it
> survives my tests, I'll send a real patch.

It was not that trivial: arp was storing the current position in three
fields: pos, bucket, and is_pneigh - so any code seeking in /proc/net/arp
just returned random data, and eventually locked up the box...

The patch below removes 'bucket' from arp_iter_state and merges it into
pos. It is based on the assumption that there are no more than 16M entries
in each bucket, and that NEIGH_HASHMASK + 1 + PNEIGH_HASHMASK + 1 < 127
(currently it is 48). As loff_t is 64-bit even on i386, there is plenty of
room to grow, but it could require apps compiled with O_LARGEFILE, so I
decided to use only the 31-bit space.
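
Standalone illustration of the encoding (my sketch, mirroring the
get_arp_pos/make_arp_pos helpers in the patch; compiles as plain
userspace C):

#include <assert.h>
#include <stdio.h>

/* bucket number lives in bits 24..30, entry index in bits 0..23 */
static unsigned int make_pos(unsigned int bucket, unsigned int idx)
{
	return (bucket << 24) | idx;
}

static unsigned int get_pos(unsigned int pos, unsigned int *idx)
{
	*idx = pos & 0x00FFFFFF;
	return pos >> 24;
}

int main(void)
{
	unsigned int idx;
	unsigned int pos = make_pos(3, 42);	/* 42nd entry of bucket 3 */
	unsigned int bucket = get_pos(pos, &idx);

	assert(bucket == 3 && idx == 42);	/* round-trips losslessly */
	printf("pos=0x%08x bucket=%u idx=%u\n", pos, bucket, idx);
	return 0;
}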

I also removed __inline__ from neigh_get_bucket. With that, gcc-2.95.4 on
i386 compiled all of these functions without spilling local variables to
the stack...

Because there is now only one field left in arp_iter_state, it would be
possible to use seq->private directly instead of allocating memory for
arp_iter_state (see the sketch below). The whole lock handling in
arp_seq_start could also be greatly simplified, but before I dig into this
more I'd like to hear your opinions on whether merging pos and bucket
together the way I did is the way to go.
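
Something like this is what I have in mind (hypothetical helpers,
untested; assumes <linux/seq_file.h> and that nothing else uses
seq->private):

#include <linux/seq_file.h>

/* store is_pneigh in the pointer value itself, so arp_seq_open would
 * need no kmalloc and arp_seq_release no kfree */
static inline int arp_is_pneigh(struct seq_file *seq)
{
	return seq->private != NULL;
}

static inline void arp_set_is_pneigh(struct seq_file *seq, int on)
{
	seq->private = (void *)(long)on;
}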

I tested the code below here: the box no longer crashes, and I believe
that the whole arp table is visible in /proc/net/arp.

					Best regards,
						Petr Vandrovec
						vandrove@vc.cvut.cz


--- linux-2.5.46-c985.dist/net/ipv4/arp.c	2002-11-08 21:44:01.000000000 +0100
+++ linux-2.5.46-c985/net/ipv4/arp.c	2002-11-08 22:46:44.000000000 +0100
@@ -1139,23 +1139,39 @@
 #endif /* CONFIG_AX25 */
 
 struct arp_iter_state {
-	int is_pneigh, bucket;
+	int is_pneigh;
 };
 
-static __inline__ struct neighbour *neigh_get_bucket(struct seq_file *seq,
+#define ARP_FIRST_NEIGH		(1)
+#define ARP_FIRST_PNEIGH	(ARP_FIRST_NEIGH + NEIGH_HASHMASK + 1)
+
+static inline unsigned int get_arp_pos(loff_t pos, unsigned int* idx) {
+	*idx = pos & 0x00FFFFFF;
+	return pos >> 24;
+}
+
+static inline unsigned int make_arp_pos(unsigned int bucket, unsigned int idx) {
+	return (bucket << 24) | idx;
+}
+
+static inline loff_t next_bucket(loff_t pos) {
+	return (pos + 0x00FFFFFF) & ~0x00FFFFFF;
+}
+
+static struct neighbour *neigh_get_bucket(struct seq_file *seq,
 						     loff_t *pos)
 {
 	struct neighbour *n = NULL;
-	struct arp_iter_state* state = seq->private;
-	loff_t l = *pos;
+	unsigned int l;
+	unsigned int bucket = get_arp_pos(*pos, &l) - ARP_FIRST_NEIGH;
 	int i;
 
-	for (; state->bucket <= NEIGH_HASHMASK; ++state->bucket)
-		for (i = 0, n = arp_tbl.hash_buckets[state->bucket]; n;
+	for (; bucket <= NEIGH_HASHMASK; ++bucket)
+		for (i = 0, n = arp_tbl.hash_buckets[bucket]; n;
 		     ++i, n = n->next)
 			/* Do not confuse users "arp -a" with magic entries */
 			if ((n->nud_state & ~NUD_NOARP) && !l--) {
-				*pos = i;
+				*pos = make_arp_pos(bucket + ARP_FIRST_NEIGH, i);
 				goto out;
 			}
 out:
@@ -1166,15 +1182,15 @@
 							 loff_t *pos)
 {
 	struct pneigh_entry *n = NULL;
-	struct arp_iter_state* state = seq->private;
-	loff_t l = *pos;
+	unsigned int l;
+	unsigned int bucket = get_arp_pos(*pos, &l) - ARP_FIRST_PNEIGH;
 	int i;
 
-	for (; state->bucket <= PNEIGH_HASHMASK; ++state->bucket)
-		for (i = 0, n = arp_tbl.phash_buckets[state->bucket]; n;
+	for (; bucket <= PNEIGH_HASHMASK; ++bucket)
+		for (i = 0, n = arp_tbl.phash_buckets[bucket]; n;
 		     ++i, n = n->next)
 			if (!l--) {
-				*pos = i;
+				*pos = make_arp_pos(bucket + ARP_FIRST_PNEIGH, i);
 				goto out;
 			}
 out:
@@ -1190,8 +1206,7 @@
 
 		read_unlock_bh(&arp_tbl.lock);
 		state->is_pneigh = 1;
-		state->bucket	 = 0;
-		*pos		 = 0;
+		*pos		 = make_arp_pos(ARP_FIRST_PNEIGH, 0);
 		rc = pneigh_get_bucket(seq, pos);
 	}
 	return rc;
@@ -1199,8 +1214,21 @@
 
 static void *arp_seq_start(struct seq_file *seq, loff_t *pos)
 {
+	struct arp_iter_state* state = seq->private;
+	unsigned int idx;
+	unsigned int bucket;
+	
+	state->is_pneigh = 0;
 	read_lock_bh(&arp_tbl.lock);
-	return *pos ? arp_get_bucket(seq, pos) : (void *)1;
+	bucket = get_arp_pos(*pos, &idx);
+	if (bucket < ARP_FIRST_NEIGH)
+		return (void *)1;
+	if (bucket < ARP_FIRST_PNEIGH) {
+		return arp_get_bucket(seq, pos);
+	}
+	read_unlock_bh(&arp_tbl.lock);
+	state->is_pneigh = 1;
+	return pneigh_get_bucket(seq, pos);
 }
 
 static void *arp_seq_next(struct seq_file *seq, void *v, loff_t *pos)
@@ -1209,34 +1237,33 @@
 	struct arp_iter_state* state;
 
 	if (v == (void *)1) {
+		*pos = make_arp_pos(1, 0);
 		rc = arp_get_bucket(seq, pos);
 		goto out;
 	}
-
 	state = seq->private;
 	if (!state->is_pneigh) {
 		struct neighbour *n = v;
 
+		BUG_ON((*pos < make_arp_pos(ARP_FIRST_NEIGH, 0)) || (*pos >= make_arp_pos(ARP_FIRST_PNEIGH, 0)));
 		rc = n = n->next;
 		if (n)
 			goto out;
-		*pos = 0;
-		++state->bucket;
+		*pos = next_bucket(*pos);
 		rc = neigh_get_bucket(seq, pos);
 		if (rc)
 			goto out;
 		read_unlock_bh(&arp_tbl.lock);
 		state->is_pneigh = 1;
-		state->bucket	 = 0;
-		*pos		 = 0;
+		*pos		 = make_arp_pos(ARP_FIRST_PNEIGH, 0);
 		rc = pneigh_get_bucket(seq, pos);
 	} else {
 		struct pneigh_entry *pn = v;
 
+		BUG_ON(*pos < make_arp_pos(ARP_FIRST_PNEIGH, 0));
 		pn = pn->next;
 		if (!pn) {
-			++state->bucket;
-			*pos = 0;
+			*pos = next_bucket(*pos);
 			pn   = pneigh_get_bucket(seq, pos);
 		}
 		rc = pn;


* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Arnaldo Carvalho de Melo @ 2002-11-10  4:18 UTC
  To: Petr Vandrovec; +Cc: linux-kernel

On Fri, Nov 08, 2002 at 11:02:15PM +0100, Petr Vandrovec wrote:
> On Fri, Nov 08, 2002 at 09:33:24PM +0200, Petr Vandrovec wrote:
> > On  8 Nov 02 at 12:01, Andrew Morton wrote:
 
> The patch below removes 'bucket' from arp_iter_state and merges it into pos.
> It is based on the assumption that there are no more than 16M entries in each
> bucket, and that NEIGH_HASHMASK + 1 + PNEIGH_HASHMASK + 1 < 127

I did that in the past, but it gets too ugly; see the previous changeset in
the bk tree, lemme see... 1.781.1.52. Anyway, I was aware of this bug, but I
was on the run, going to Japan and back in 5 days :-\ Well, I have already
sent this one to several people, so if you could review/test it...

===== net/ipv4/arp.c 1.13 vs edited =====
--- 1.13/net/ipv4/arp.c	Mon Oct 21 01:17:08 2002
+++ edited/net/ipv4/arp.c	Wed Nov  6 08:00:22 2002
@@ -1140,67 +1140,122 @@
 
 struct arp_iter_state {
 	int is_pneigh, bucket;
+	union {
+		struct neighbour    *n;
+		struct pneigh_entry *pn;
+	} u;
 };
 
-static __inline__ struct neighbour *neigh_get_bucket(struct seq_file *seq,
-						     loff_t *pos)
+static struct neighbour *neigh_get_first(struct seq_file *seq)
 {
-	struct neighbour *n = NULL;
 	struct arp_iter_state* state = seq->private;
-	loff_t l = *pos;
-	int i;
 
-	for (; state->bucket <= NEIGH_HASHMASK; ++state->bucket)
-		for (i = 0, n = arp_tbl.hash_buckets[state->bucket]; n;
-		     ++i, n = n->next)
-			/* Do not confuse users "arp -a" with magic entries */
-			if ((n->nud_state & ~NUD_NOARP) && !l--) {
-				*pos = i;
-				goto out;
-			}
-out:
-	return n;
+	state->is_pneigh = 0;
+
+	for (state->bucket = 0;
+	     state->bucket <= NEIGH_HASHMASK;
+	     ++state->bucket) {
+		state->u.n = arp_tbl.hash_buckets[state->bucket];
+		while (state->u.n && !(state->u.n->nud_state & ~NUD_NOARP))
+			state->u.n = state->u.n->next;
+		if (state->u.n)
+			break;
+	}
+
+	return state->u.n;
 }
 
-static __inline__ struct pneigh_entry *pneigh_get_bucket(struct seq_file *seq,
-							 loff_t *pos)
+static struct neighbour *neigh_get_next(struct seq_file *seq)
 {
-	struct pneigh_entry *n = NULL;
 	struct arp_iter_state* state = seq->private;
-	loff_t l = *pos;
-	int i;
 
-	for (; state->bucket <= PNEIGH_HASHMASK; ++state->bucket)
-		for (i = 0, n = arp_tbl.phash_buckets[state->bucket]; n;
-		     ++i, n = n->next)
-			if (!l--) {
-				*pos = i;
-				goto out;
-			}
-out:
-	return n;
+	for (; state->bucket <= NEIGH_HASHMASK;
+	     ++state->bucket,
+	     state->u.n = arp_tbl.hash_buckets[state->bucket]) {
+		if (state->u.n)
+			do {
+				state->u.n = state->u.n->next;
+				/* Don't confuse "arp -a" w/ magic entries */
+			} while (state->u.n &&
+				 !(state->u.n->nud_state & ~NUD_NOARP));
+		if (state->u.n)
+			break;
+	}
+
+	return state->u.n;
+}
+
+static loff_t neigh_get_idx(struct seq_file *seq, loff_t pos)
+{
+	neigh_get_first(seq);
+	while (pos && neigh_get_next(seq))
+		--pos;
+	return pos;
+}
+
+static struct pneigh_entry *pneigh_get_first(struct seq_file *seq)
+{
+	struct arp_iter_state* state = seq->private;
+
+	state->is_pneigh = 1;
+
+	for (state->bucket = 0;
+	     state->bucket <= PNEIGH_HASHMASK;
+	     ++state->bucket) {
+		state->u.pn = arp_tbl.phash_buckets[state->bucket];
+		if (state->u.pn)
+			break;
+	}
+	return state->u.pn;
+}
+
+static struct pneigh_entry *pneigh_get_next(struct seq_file *seq)
+{
+	struct arp_iter_state* state = seq->private;
+
+	for (; state->bucket <= PNEIGH_HASHMASK;
+	     ++state->bucket,
+	     state->u.pn = arp_tbl.phash_buckets[state->bucket]) {
+		if (state->u.pn)
+			state->u.pn = state->u.pn->next;
+		
+		if (state->u.pn)
+			break;
+	}
+	return state->u.pn;
 }
 
-static __inline__ void *arp_get_bucket(struct seq_file *seq, loff_t *pos)
+static loff_t pneigh_get_idx(struct seq_file *seq, loff_t pos)
 {
-	void *rc = neigh_get_bucket(seq, pos);
+	pneigh_get_first(seq);
+	while (pos && pneigh_get_next(seq))
+		--pos;
+	return pos;
+}
+
+static void *arp_get_idx(struct seq_file *seq, loff_t pos)
+{
+	struct arp_iter_state* state = seq->private;
+	void *rc;
+	loff_t p;
+
+	read_lock_bh(&arp_tbl.lock);
+	p = neigh_get_idx(seq, pos);
 
-	if (!rc) {
+	if (p || !state->u.n) {
 		struct arp_iter_state* state = seq->private;
 
 		read_unlock_bh(&arp_tbl.lock);
-		state->is_pneigh = 1;
-		state->bucket	 = 0;
-		*pos		 = 0;
-		rc = pneigh_get_bucket(seq, pos);
-	}
+		pneigh_get_idx(seq, p);
+		rc = state->u.pn;
+	} else
+		rc = state->u.n;
 	return rc;
 }
 
 static void *arp_seq_start(struct seq_file *seq, loff_t *pos)
 {
-	read_lock_bh(&arp_tbl.lock);
-	return *pos ? arp_get_bucket(seq, pos) : (void *)1;
+	return *pos ? arp_get_idx(seq, *pos - 1) : (void *)1;
 }
 
 static void *arp_seq_next(struct seq_file *seq, void *v, loff_t *pos)
@@ -1209,38 +1264,19 @@
 	struct arp_iter_state* state;
 
 	if (v == (void *)1) {
-		rc = arp_get_bucket(seq, pos);
+		rc = arp_get_idx(seq, 0);
 		goto out;
 	}
 
 	state = seq->private;
 	if (!state->is_pneigh) {
-		struct neighbour *n = v;
-
-		rc = n = n->next;
-		if (n)
-			goto out;
-		*pos = 0;
-		++state->bucket;
-		rc = neigh_get_bucket(seq, pos);
+		rc = neigh_get_next(seq);
 		if (rc)
 			goto out;
 		read_unlock_bh(&arp_tbl.lock);
-		state->is_pneigh = 1;
-		state->bucket	 = 0;
-		*pos		 = 0;
-		rc = pneigh_get_bucket(seq, pos);
-	} else {
-		struct pneigh_entry *pn = v;
-
-		pn = pn->next;
-		if (!pn) {
-			++state->bucket;
-			*pos = 0;
-			pn   = pneigh_get_bucket(seq, pos);
-		}
-		rc = pn;
-	}
+		rc = pneigh_get_first(seq);
+	} else
+		rc = pneigh_get_next(seq);
 out:
 	++*pos;
 	return rc;
@@ -1291,7 +1327,6 @@
 static __inline__ void arp_format_pneigh_entry(struct seq_file *seq,
 					       struct pneigh_entry *n)
 {
-
 	struct net_device *dev = n->dev;
 	int hatype = dev ? dev->type : 0;
 	char tbuf[16];


* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Petr Vandrovec @ 2002-11-11  2:26 UTC
  To: Arnaldo Carvalho de Melo; +Cc: linux-kernel

On Sun, Nov 10, 2002 at 02:18:55AM -0200, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 08, 2002 at 11:02:15PM +0100, Petr Vandrovec wrote:
> > On Fri, Nov 08, 2002 at 09:33:24PM +0200, Petr Vandrovec wrote:
> > > On  8 Nov 02 at 12:01, Andrew Morton wrote:
>  
> > The patch below removes 'bucket' from arp_iter_state and merges it into pos.
> > It is based on the assumption that there are no more than 16M entries in each
> > bucket, and that NEIGH_HASHMASK + 1 + PNEIGH_HASHMASK + 1 < 127
> 
> I did that in the past, but it gets too ugly; see the previous changeset in
> the bk tree, lemme see... 1.781.1.52. Anyway, I was aware of this bug, but I
> was on the run, going to Japan and back in 5 days :-\ Well, I have already
> sent this one to several people, so if you could review/test it...

I tried to figure out how it is supposed to work, and after booting a
kernel (at home) with it, I can say that it does not work...

I tried it only at home (where the arp table is empty by default), so I did
not test whether the lock is released properly: if arp_seq_start and
arp_seq_stop get called with pos == 0 and without an intervening
arp_seq_next, you'll unlock the unlocked arp_tbl.lock in arp_seq_stop (and
from what I see in seq_file.c, that can happen).
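
To make the lock pairing concrete, here is a userspace model (my
construction, not the kernel code) of a start/stop cycle where start skips
the lock at pos == 0 but stop unlocks anyway; an error-checking mutex
reports the bogus unlock instead of silently corrupting the lock state:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t tbl_lock;

static void *seq_start(long pos)
{
	if (pos == 0)
		return (void *)1;	/* header token: lock NOT taken */
	pthread_mutex_lock(&tbl_lock);
	return NULL;			/* pretend the table is empty */
}

static void seq_stop(void)
{
	/* unconditional unlock, the failure mode described above */
	int err = pthread_mutex_unlock(&tbl_lock);
	if (err)
		printf("unbalanced unlock: %s\n", strerror(err));
}

int main(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&tbl_lock, &attr);

	seq_start(0);	/* start at pos 0: no lock taken */
	seq_stop();	/* -> EPERM: unlock of a lock we never took */
	return 0;
}
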
But when I just tried to ping various addresses on vmware's vmnet8, I got
very short output from /proc/net/arp, although it should contain a couple
of entries:

ppc:~# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
ppc:~# ping 192.168.27.2
PING 192.168.27.2 (192.168.27.2) 56(84) bytes of data.

--- 192.168.27.2 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

ppc:~# ping 192.168.27.3
PING 192.168.27.3 (192.168.27.3) 56(84) bytes of data.

--- 192.168.27.3 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

ppc:~# ping 192.168.27.4
PING 192.168.27.4 (192.168.27.4) 56(84) bytes of data.

--- 192.168.27.4 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

ppc:~# ping 192.168.27.5
PING 192.168.27.5 (192.168.27.5) 56(84) bytes of data.

--- 192.168.27.5 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

ppc:~# ping 192.168.27.6
PING 192.168.27.6 (192.168.27.6) 56(84) bytes of data.

--- 192.168.27.6 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

ppc:~# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.27.2     0x1         0x0         00:00:00:00:00:00     *        vmnet8
ppc:~#

Not what I expected. Before the reboot it was listing all 6 addresses, not
only the first.
						Best regards,
							Petr Vandrovec
							vandrove@vc.cvut.cz


* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Arnaldo Carvalho de Melo @ 2002-11-11  2:42 UTC
  To: Petr Vandrovec; +Cc: linux-kernel

On Mon, Nov 11, 2002 at 03:26:02AM +0100, Petr Vandrovec wrote:
> On Sun, Nov 10, 2002 at 02:18:55AM -0200, Arnaldo Carvalho de Melo wrote:
> > On Fri, Nov 08, 2002 at 11:02:15PM +0100, Petr Vandrovec wrote:
> > > On Fri, Nov 08, 2002 at 09:33:24PM +0200, Petr Vandrovec wrote:
> > > > On  8 Nov 02 at 12:01, Andrew Morton wrote:
> >  
> > > The patch below removes 'bucket' from arp_iter_state and merges it into pos.
> > > It is based on the assumption that there are no more than 16M entries in each
> > > bucket, and that NEIGH_HASHMASK + 1 + PNEIGH_HASHMASK + 1 < 127
> > 
> > I did that in the past, but it gets too ugly; see the previous changeset in
> > the bk tree, lemme see... 1.781.1.52. Anyway, I was aware of this bug, but I
> > was on the run, going to Japan and back in 5 days :-\ Well, I have already
> > sent this one to several people, so if you could review/test it...
> 
> I tried to figure out how it is supposed to work, and after booting a
> kernel (at home) with it, I can say that it does not work...

I'm working on this now... :-\


* Re: 2.5.46-bk3: BUG in skbuff.c:178
From: Andrew Morton @ 2002-11-08 20:01 UTC
  To: Burton Windle, Arnaldo Carvalho de Melo; +Cc: linux-kernel

Burton Windle wrote:
> 
> Single-CPU system, running 2.5.46-bk3. While compiling bk4, and running
> a script that was pinging every host on my subnet (I was running arp -a
> to see what was in the arp table at the time), I hit this BUG.
> 
> Debug: sleeping function called from illegal context at mm/slab.c:1305
> Call Trace:
>  [<c011247c>] __might_sleep+0x54/0x58
>  [<c012a3e2>] kmem_flagcheck+0x1e/0x50
>  [<c012ab6a>] kmem_cache_alloc+0x12/0xc8
>  [<c0226e0c>] sock_alloc_inode+0x10/0x68
>  [<c014cb65>] alloc_inode+0x15/0x180
>  [<c014d397>] new_inode+0xb/0x78
>  [<c0227093>] sock_alloc+0xf/0x68
>  [<c0227d65>] sock_create+0x8d/0xe4
>  [<c0227dd9>] sys_socket+0x1d/0x58
>  [<c0228a13>] sys_socketcall+0x5f/0x1f4
>  [<c0108903>] syscall_call+0x7/0xb
> 
> bad: scheduling while atomic!

Something somewhere has caused a preempt_count imbalance.  What
you're seeing here are the downstream effects of an earlier bug.
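
Roughly the mechanism, as a userspace caricature (my sketch, not the real
preempt machinery): read_lock_bh() bumps the per-task atomic-context
count, and a path that returns without the unlock leaves it elevated, so a
later, unrelated allocation trips the might_sleep() check:

#include <stdio.h>

static int preempt_count;	/* models the per-task atomic-context count */

static void read_lock_bh(void)   { ++preempt_count; }
static void read_unlock_bh(void) { --preempt_count; }

static void might_sleep(void)
{
	if (preempt_count)
		printf("Debug: sleeping function called from illegal context\n");
}

static void buggy_seq_read(void)
{
	read_lock_bh();
	/* bug: some exit path returns without read_unlock_bh() */
}

int main(void)
{
	buggy_seq_read();	/* e.g. "arp -a" reading /proc/net/arp */
	might_sleep();		/* later, an unrelated GFP_KERNEL allocation */
	return 0;
}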

I'd be suspecting the seq_file conversion in arp.c.  The read_lock_bh()
stuff in there looks, umm, unclear ;)

(Could we pleeeeze nuke the __inline__'s in there too?)


* 2.5.46-bk3: BUG in skbuff.c:178
From: Burton Windle @ 2002-11-08 19:42 UTC
  To: linux-kernel

Single-CPU system, running 2.5.46-bk3. While compiling bk4, and running
a script that was pinging every host on my subnet (I was running arp -a
to see what was in the arp table at the time), I hit this BUG.

Debug: sleeping function called from illegal context at mm/slab.c:1305
Call Trace:
 [<c011247c>] __might_sleep+0x54/0x58
 [<c012a3e2>] kmem_flagcheck+0x1e/0x50
 [<c012ab6a>] kmem_cache_alloc+0x12/0xc8
 [<c0226e0c>] sock_alloc_inode+0x10/0x68
 [<c014cb65>] alloc_inode+0x15/0x180
 [<c014d397>] new_inode+0xb/0x78
 [<c0227093>] sock_alloc+0xf/0x68
 [<c0227d65>] sock_create+0x8d/0xe4
 [<c0227dd9>] sys_socket+0x1d/0x58
 [<c0228a13>] sys_socketcall+0x5f/0x1f4
 [<c0108903>] syscall_call+0x7/0xb

bad: scheduling while atomic!
Call Trace:
 [<c01110b1>] schedule+0x3d/0x2c8
 [<c010892a>] work_resched+0x5/0x16

alloc_skb called nonatomically from interrupt c022966e
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:178!
invalid operand: 0000
CPU:    0
EIP:    0060:[<c022a073>]    Not tainted
EFLAGS: 00010202
EIP is at alloc_skb+0x43/0x1a4
eax: 0000003a   ebx: c27d1044   ecx: c3fff360   edx: c0343e50
esi: 00000000   edi: 000001d0   ebp: c27d1ca4   esp: c1ad3e90
ds: 0068   es: 0068   ss: 0068
Process arp (pid: 5029, threadinfo=c1ad2000 task=c3fff360)
Stack: c02bf140 c022966e c27d1044 00000000 0000006e c022966e 00000001 000001d0
       c6bb65e4 c02679a1 c27d1044 00000001 00000000 000001d0 c6bb65e4 c1ad3f14
       0000006e bffff78c 00000018 7fffffff 00000000 c27d1044 fffffff4 bffff71c
Call Trace:
 [<c022966e>] sock_wmalloc+0x26/0x50
 [<c022966e>] sock_wmalloc+0x26/0x50
 [<c02679a1>] unix_stream_connect+0xb1/0x3e8
 [<c0228177>] sys_connect+0x5b/0x78
 [<c0228a40>] sys_socketcall+0x8c/0x1f4
 [<c0108903>] syscall_call+0x7/0xb

Code: 0f 0b b2 00 e3 f0 2b c0 83 c4 08 83 e7 ef 31 c0 9c 59 fa be
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

--
Burton Windle                           burton@fint.org
Linux: the "grim reaper of innocent orphaned children."
          from /usr/src/linux-2.4.18/init/main.c:461



