linux-kernel.vger.kernel.org archive mirror
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Dave Jones <davej@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, Hugh Dickins <hugh@veritas.com>,
	Chris Rankin <cj.rankin@ntlworld.com>
Subject: Re: -mm merge plans for 2.6.20
Date: Tue, 19 Dec 2006 18:02:34 +1100	[thread overview]
Message-ID: <45878E8A.6000506@yahoo.com.au> (raw)
In-Reply-To: <20061219064454.GG31146@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2953 bytes --]

Dave Jones wrote:
> On Tue, Dec 19, 2006 at 04:20:37PM +1100, Nick Piggin wrote:
>  > Dave Jones wrote:
>  > 
>  > > Eeek! page_mapcount(page) went negative! (-2)
>  > 
>  > Hmm, probably happened once before, too.
> 
> You're right. Going back further in the log, I noticed
> that it had happened again exactly at the time that cron restarted vpnc.
> The first time, the flags were different..
> 
>  Dec  4 00:01:03 firewall kernel: Eeek! page_mapcount(page) went negative! (-1)
>  Dec  4 00:01:03 firewall kernel:   page->flags = 400
>  Dec  4 00:01:03 firewall kernel:   page->count = 1
>  Dec  4 00:01:03 firewall kernel:   page->mapping = 00000000

Still reserved, with a NULL mapping. I'd say it could be the same page.
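
For anyone following along, here is a minimal userspace sketch of how the two flag values in these reports (400 and 404, both hex) decode. The bit numbers are the ones I recall from include/linux/page-flags.h around 2.6.19, so treat the table as an assumption and check it against your own tree. The helper name decode_page_flags is made up for illustration, and only the low 12 bits are covered.

```c
#include <stddef.h>
#include <string.h>

/* Assumed bit layout (2.6.19-era include/linux/page-flags.h).
 * Verify against the kernel you are actually running. */
static const char *const pg_names[] = {
	"PG_locked", "PG_error", "PG_referenced", "PG_uptodate",
	"PG_dirty", "PG_lru", "PG_active", "PG_slab",
	"PG_checked", "PG_arch_1", "PG_reserved", "PG_private",
};

/* Hypothetical helper: writes the names of the set bits, joined by '|',
 * into buf (len must be at least 1). Bits above the table are ignored. */
static char *decode_page_flags(unsigned long flags, char *buf, size_t len)
{
	size_t i, n = sizeof(pg_names) / sizeof(pg_names[0]);

	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (!(flags & (1UL << i)))
			continue;
		if (buf[0])
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, pg_names[i], len - strlen(buf) - 1);
	}
	return buf;
}
```

Under that assumed layout, 0x400 decodes to PG_reserved alone and 0x404 to PG_referenced|PG_reserved, which matches the reading of both reports in this thread.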

> 
>  > >   page->flags = 404
>  > 
>  > What's that? PG_referenced|PG_reserved? So I'd say it is likely
>  > that some driver has got its refcounting wrong.
> 
> At the time that it bit me, here's what was loaded..
> 
> tun ipt_MASQUERADE iptable_nat ip_nat ipt_LOG xt_limit ipv6
> ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp
> iptable_filter ip_tables x_tables video sbs i2c_ec button battery asus_acpi ac
> parport_pc lp parport pcspkr ide_cd i2c_viapro i2c_core cdrom 3c59x via_rhine
> via_ircc mii irda crc_ccitt serio_raw dm_snapshot dm_zero dm_mirror dm_mod ext3
> jbd ehci_hcd ohci_hcd uhci_hcd
> 
> The scary ones (i2c, irda) weren't in use at all, and had never been opened afaik,
> so the potential for those to be corrupting memory is slim, but not out of the
> question. (Why the hell asus_acpi is loaded is a mystery, this isn't an Asus,
> or a laptop. Probably dumb initscripts).

OK, that could be useful: I'll do some grepping and see which of those
are setting PG_reserved.

>  > And I see we've got another report for 2.6.19.1 from Chris, which
>  > is equally vague.
> 
> I'll be moving that box to 2.6.19.x at some point real soon, so I'll holler
> if I see it again on a later kernel.
> 
>  > IMO the pattern is much too consistent to be able to attribute
>  > them all to hardware problems. And considering it takes so long
>  > for these things to appear, can we get something like the attached
>  > patch upstream at least until we manage to stamp them out?
> 
> Sounds like a good idea to me.
> 
> ACKed-by: Dave Jones <davej@redhat.com>

Thanks.

> 
>  > Any other debugging info we can add?
> 
> Would it be useful to print the pfn of the page ?
> In cases like mine, where it bit twice before it killed the box, it
> might be interesting to see if it's always the same page.  Not sure
> what that would prove/disprove though.

Might help. I guess the site where it is allocated from might be
another one, although I'm hoping that if we know what ->nopage is
being used then we'll be able to track it. OTOH it may be using
remap_pfn_range from fops->mmap, rather than nopage... I wonder
how we could get at that info? vma->vm_file->f_op->mmap?

-- 
SUSE Labs, Novell Inc.

[-- Attachment #2: mm-rmap-debug-more.patch --]
[-- Type: text/plain, Size: 4401 bytes --]

Index: linux-2.6/include/linux/rmap.h
===================================================================
--- linux-2.6.orig/include/linux/rmap.h	2006-12-04 19:56:17.000000000 +1100
+++ linux-2.6/include/linux/rmap.h	2006-12-19 16:14:30.000000000 +1100
@@ -72,7 +72,7 @@ void __anon_vma_link(struct vm_area_stru
 void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long);
 void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, unsigned long);
 void page_add_file_rmap(struct page *);
-void page_remove_rmap(struct page *);
+void page_remove_rmap(struct page *, struct vm_area_struct *);
 
 /**
  * page_dup_rmap - duplicate pte mapping to a page
Index: linux-2.6/mm/filemap_xip.c
===================================================================
--- linux-2.6.orig/mm/filemap_xip.c	2006-12-04 19:07:10.000000000 +1100
+++ linux-2.6/mm/filemap_xip.c	2006-12-19 16:14:30.000000000 +1100
@@ -189,7 +189,7 @@ __xip_unmap (struct address_space * mapp
 			/* Nuke the page table entry. */
 			flush_cache_page(vma, address, pte_pfn(*pte));
 			pteval = ptep_clear_flush(vma, address, pte);
-			page_remove_rmap(page);
+			page_remove_rmap(page, vma);
 			dec_mm_counter(mm, file_rss);
 			BUG_ON(pte_dirty(pteval));
 			pte_unmap_unlock(pte, ptl);
Index: linux-2.6/mm/fremap.c
===================================================================
--- linux-2.6.orig/mm/fremap.c	2006-12-04 19:56:20.000000000 +1100
+++ linux-2.6/mm/fremap.c	2006-12-19 16:14:30.000000000 +1100
@@ -33,7 +33,7 @@ static int zap_pte(struct mm_struct *mm,
 		if (page) {
 			if (pte_dirty(pte))
 				set_page_dirty(page);
-			page_remove_rmap(page);
+			page_remove_rmap(page, vma);
 			page_cache_release(page);
 		}
 	} else {
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2006-12-04 19:56:21.000000000 +1100
+++ linux-2.6/mm/memory.c	2006-12-19 16:14:30.000000000 +1100
@@ -681,7 +681,7 @@ static unsigned long zap_pte_range(struc
 					mark_page_accessed(page);
 				file_rss--;
 			}
-			page_remove_rmap(page);
+			page_remove_rmap(page, vma);
 			tlb_remove_page(tlb, page);
 			continue;
 		}
@@ -1576,7 +1576,7 @@ gotten:
 	page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
 	if (likely(pte_same(*page_table, orig_pte))) {
 		if (old_page) {
-			page_remove_rmap(old_page);
+			page_remove_rmap(old_page, vma);
 			if (!PageAnon(old_page)) {
 				dec_mm_counter(mm, file_rss);
 				inc_mm_counter(mm, anon_rss);
Index: linux-2.6/mm/rmap.c
===================================================================
--- linux-2.6.orig/mm/rmap.c	2006-12-04 19:56:21.000000000 +1100
+++ linux-2.6/mm/rmap.c	2006-12-19 18:02:18.000000000 +1100
@@ -47,6 +47,7 @@
 #include <linux/rmap.h>
 #include <linux/rcupdate.h>
 #include <linux/module.h>
+#include <linux/kallsyms.h>
 
 #include <asm/tlbflush.h>
 
@@ -567,14 +568,20 @@ void page_add_file_rmap(struct page *pag
  *
  * The caller needs to hold the pte lock.
  */
-void page_remove_rmap(struct page *page)
+void page_remove_rmap(struct page *page, struct vm_area_struct *vma)
 {
 	if (atomic_add_negative(-1, &page->_mapcount)) {
 		if (unlikely(page_mapcount(page) < 0)) {
 			printk (KERN_EMERG "Eeek! page_mapcount(page) went negative! (%d)\n", page_mapcount(page));
+			printk (KERN_EMERG "  page pfn = %lx\n", page_to_pfn(page));
 			printk (KERN_EMERG "  page->flags = %lx\n", page->flags);
 			printk (KERN_EMERG "  page->count = %x\n", page_count(page));
 			printk (KERN_EMERG "  page->mapping = %p\n", page->mapping);
+			print_symbol (KERN_EMERG "  vma->vm_ops = %s\n", (unsigned long)vma->vm_ops);
+			if (vma->vm_ops)
+				print_symbol (KERN_EMERG "  vma->vm_ops->nopage = %s\n", (unsigned long)vma->vm_ops->nopage);
+			if (vma->vm_file && vma->vm_file->f_op)
+				print_symbol (KERN_EMERG "  vma->vm_file->f_op->mmap = %s\n", (unsigned long)vma->vm_file->f_op->mmap);
 			BUG();
 		}
 
@@ -679,7 +686,7 @@ static int try_to_unmap_one(struct page 
 		dec_mm_counter(mm, file_rss);
 
 
-	page_remove_rmap(page);
+	page_remove_rmap(page, vma);
 	page_cache_release(page);
 
 out_unmap:
@@ -769,7 +776,7 @@ static void try_to_unmap_cluster(unsigne
 		if (pte_dirty(pteval))
 			set_page_dirty(page);
 
-		page_remove_rmap(page);
+		page_remove_rmap(page, vma);
 		page_cache_release(page);
 		dec_mm_counter(mm, file_rss);
 		(*mapcount)--;
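
As a reading aid for the check in page_remove_rmap(), here is a small userspace model of the _mapcount convention as I understand it for kernels of this era: the counter starts at -1, page_mapcount() reports the counter plus one, and removing a mapping that was never added drives the reported value below zero. All names prefixed with model_ are hypothetical, and a plain int stands in for the kernel's atomic_t, so this is only a sketch of the arithmetic, not of the locking or atomicity.

```c
/* Userspace model only; not kernel code. */
struct model_page {
	int _mapcount;	/* biased: -1 means "no mappings" */
};

static void model_page_init(struct model_page *p)
{
	p->_mapcount = -1;
}

/* Mirrors the idea of page_mapcount(): stored value plus one. */
static int model_page_mapcount(const struct model_page *p)
{
	return p->_mapcount + 1;
}

static void model_add_rmap(struct model_page *p)
{
	p->_mapcount++;
}

/* Returns 1 where the real function would print "Eeek!" and BUG(). */
static int model_remove_rmap(struct model_page *p)
{
	/* Stand-in for atomic_add_negative(-1, &page->_mapcount):
	 * decrement, then test whether the stored value is negative. */
	if (--p->_mapcount < 0) {
		if (model_page_mapcount(p) < 0)
			return 1;
	}
	return 0;
}
```

With one add followed by two removes, the second remove returns 1 and model_page_mapcount() reports -1, the same value as in the Dec 4 log line quoted in the message.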
