From: Andrew Morton <akpm@linux-foundation.org>
To: Jeremiah Mahler <jmmahler@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Hugh Dickins <hughd@google.com>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
Date: Fri, 5 Feb 2016 13:59:09 -0800
Message-ID: <20160205135909.1f80ffb93b0c8beee0a72be0@linux-foundation.org>
In-Reply-To: <20160205180502.GA5869@hudson.localdomain>

On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:

> all,
> 
> On a Lenovo X1 Carbon running -next (20160201+, 20160203+) I have
> experienced several system hangs.  I usually notice it first when
> my browser (Chrome) stops responding but then other programs will stop
> responding as well.  The only fix is a reboot.  It is sporadic but it
> will usually occur once a day.
> 
> In the logs there will be a
> 
>   unable to handle kernel NULL pointer dereference
> 
> message related to filemap_map_pages+0x10d/0x290 (below).
> 
> ------------------------------------------------------------
> ...
> [51985.993033] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [51985.993087] IP: [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.993123] PGD 2c772067 PUD 0 
> [51985.993144] Oops: 0000 [#1] SMP 
> [51985.993166] Modules linked in: ctr ccm cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave binfmt_misc i915 arc4 iwldvm mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul crc32c_intel iTCO_wdt ghash_clmulni_intel iTCO_vendor_support jitterentropy_rng sha256_generic hmac drbg snd_hda_codec_hdmi aesni_intel snd_hda_codec_realtek aes_x86_64 iwlwifi glue_helper snd_hda_codec_generic i2c_algo_bit lrw drm_kms_helper gf128mul ablk_helper cryptd snd_hda_intel drm psmouse snd_hda_codec cfg80211 pcspkr evdev serio_raw snd_hwdep i2c_i801 snd_hda_core sg snd_pcm mei_me lpc_ich mfd_core mei shpchp snd_timer i2c_core wmi thinkpad_acpi nvram snd battery tpm_tis soundcore ac tpm video button intel_smartconnect btusb btbcm btintel bluetooth rfkill loop ipv6 autofs4
> [51985.993591]  ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata ehci_pci sdhci_pci scsi_mod xhci_pci sdhci xhci_hcd ehci_hcd mmc_core usbcore usb_common thermal
> [51985.993680] CPU: 2 PID: 22993 Comm: chrome Not tainted 4.5.0-rc2-next-20160203+ #11
> [51985.993714] Hardware name: LENOVO 3443CTO/3443CTO, BIOS G6ET59WW (2.03 ) 09/11/2012
> [51985.993760] task: ffff88004bb04dc0 ti: ffff88002a2f8000 task.ti: ffff88002a2f8000
> [51985.993804] RIP: 0010:[<ffffffff8114a19d>]  [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.993845] RSP: 0000:ffff88002a2fbdf8  EFLAGS: 00010202
> [51985.993874] RAX: 00000007fffffff8 RBX: 0000000000000001 RCX: 0000000000000003
> [51985.993911] RDX: 0000000000000000 RSI: ffffea00005bdd1c RDI: ffffea00005bdd00
> [51985.993948] RBP: ffff8800beff4220 R08: 000000000000007f R09: 0000000000000000
> [51985.993985] R10: 0000000000000000 R11: ffff8800a39382b8 R12: ffff8801182b9440
> [51985.994023] R13: ffff88002a2fbe90 R14: ffff8800be568d80 R15: 0000000000000008
> [51985.994061] FS:  00007f3e20276a40(0000) GS:ffff88011e300000(0000) knlGS:0000000000000000
> [51985.994103] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [51985.994134] CR2: 0000000000000008 CR3: 00000000be6eb000 CR4: 00000000001406e0
> [51985.994172] Stack:
> [51985.994184]  ffff8800beff4228 00007f3e0ae63000 0000000000000001 0000000000000000
> [51985.994229]  0000000000000001 00007f3e0ae63000 ffff8800be568d80 0000000000000054
> [51985.994273]  ffff880000000318 ffff88003584c318 ffff8800840076c0 ffffffff8117b073
> [51985.994318] Call Trace:
> [51985.994336]  [<ffffffff8117b073>] ? handle_mm_fault+0x13b3/0x1790
> [51985.994370]  [<ffffffff810aa7f1>] ? up_write+0x21/0x30
> [51985.994400]  [<ffffffff81059052>] ? __do_page_fault+0x192/0x410
> [51985.994434]  [<ffffffff814ffcc8>] ? page_fault+0x28/0x30
> [51985.994463] Code: 00 00 00 48 8b 54 24 10 49 3b 55 28 74 48 48 8b 44 24 18 83 e8 01 29 d0 49 8d 04 c7 49 39 c7 74 19 49 83 c7 08 48 83 44 24 10 01 <49> 83 3f 00 74 eb 4d 85 ff 0f 85 3b ff ff ff 48 8b 3c 24 48 8d 
> [51985.994656] RIP  [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.994692]  RSP <ffff88002a2fbdf8>
> [51985.994711] CR2: 0000000000000008
> 
> ...
>
> Referring again to the RIP line from the trace.
> 
> [51985.994656] RIP  [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> 
> jeri@hudson:~/linux-next$ gdb vmlinux
> (gdb) list *0xffffffff8114a19d
> 0xffffffff8114a19d is in filemap_map_pages
> (include/linux/radix-tree.h:465).
> 460			unsigned size = radix_tree_chunk_size(iter) - 1;
> 461	
> 462			while (size--) {
> 463				slot++;
> 464				iter->index++;
> 465				if (likely(*slot))
> 466					return slot;
> 467				if (flags & RADIX_TREE_ITER_CONTIG) {
> 468					/* forbid switching to the next chunk */
> 469					iter->next_index = 0;
> (gdb)
> 
> Assuming I traced the addresses correctly, this indicates that the
> fault is triggered when the value in the slot pointer is accessed.
> Perhaps slot is being incremented beyond its valid range?

That's super helpful, thanks.

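The faulting address was 0x0000000000000008 (CR2, and also the value in
R15).  That is exactly what `slot++' produces from a NULL `void **' on
x86-64, so radix_tree_next_slot() was called with slot==NULL.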
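
To make the arithmetic concrete, here is a minimal user-space sketch of how a
NULL slot pointer turns into a fault at 0x8.  The helper name
address_after_increment() is hypothetical and not part of the kernel; it does
the computation on integers because incrementing an actual NULL pointer is
undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the address that `slot++` yields for a `void **`
 * whose current address is slot_addr.  A `void **` advances by the size
 * of one pointer, which is 8 bytes on x86-64.
 */
static uintptr_t address_after_increment(uintptr_t slot_addr)
{
	uintptr_t next = slot_addr + sizeof(void *);

	/* Sanity check: the step is exactly one pointer wide. */
	assert(next - slot_addr == sizeof(void *));
	return next;
}
```

Calling address_after_increment(0) gives sizeof(void *), so on a 64-bit
kernel the subsequent `*slot' read lands on address 0x8, matching CR2 in the
oops.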

And looking at it, I don't see how this code can work at all:


: 	radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, vmf->pgoff) {
: 		if (iter.index > vmf->max_pgoff)
: 			break;
: repeat:
: 		page = radix_tree_deref_slot(slot);
: 		if (unlikely(!page))
: 			goto next;
: 		if (radix_tree_exception(page)) {
: 			if (radix_tree_deref_retry(page)) {
: 				slot = radix_tree_iter_retry(&iter);

radix_tree_iter_retry() unconditionally returns NULL

: 				continue;

here we go and execute the third clause of the
radix_tree_for_each_slot() `for' statement:

	     slot = radix_tree_next_slot(slot, iter, 0))

with slot==NULL.  This will dereference 0x8 every time.

: 			}
: 			goto next;
: 		}
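
The control flow above can be sketched in user space.  This is only a model
under stated assumptions: model_iter, model_iter_retry(), model_next_slot()
and walk_with_retry() are hypothetical stand-ins, not the kernel's radix-tree
API, and the advance step records a NULL slot instead of dereferencing it so
that the program does not crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct radix_tree_iter. */
struct model_iter {
	size_t index;		/* current position in the toy chunk */
	int null_advances;	/* times the advance step saw slot == NULL */
};

static void *slots[4];		/* toy chunk; the contents do not matter */

/* Models radix_tree_iter_retry() as described above: always NULL. */
static void **model_iter_retry(struct model_iter *iter)
{
	assert(iter != NULL);
	return NULL;
}

/*
 * Models the third clause of the for statement.  The real helper does
 * slot++ and then reads *slot; this model only counts the NULL case.
 */
static void **model_next_slot(void **slot, struct model_iter *iter)
{
	if (slot == NULL) {
		iter->null_advances++;
		return NULL;	/* end the loop instead of faulting */
	}
	iter->index++;
	return (iter->index < 4) ? slot + 1 : NULL;
}

/*
 * Walks the toy chunk and requests a retry at position retry_at.
 * Returns how often the advance step ran with slot == NULL.
 */
static int walk_with_retry(size_t retry_at)
{
	struct model_iter iter = { 0, 0 };
	void **slot;

	for (slot = slots; slot != NULL;
	     slot = model_next_slot(slot, &iter)) {
		if (iter.index == retry_at) {
			slot = model_iter_retry(&iter);
			continue;	/* jumps to the third clause */
		}
	}
	return iter.null_advances;
}
```

In this model walk_with_retry(1) reports one advance with slot == NULL, while
a walk that never hits the retry path (for example walk_with_retry(10))
reports none.  The point is only that `continue' inside a for loop always
runs the third clause next, so whatever the retry path leaves in slot is what
the advance step receives.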

Thread overview: 7+ messages
2016-02-05 18:05 [REGRESSION] mm: filemap_map_pages NULL pointer dereference Jeremiah Mahler
2016-02-05 21:59 ` Andrew Morton [this message]
2016-02-05 22:19 ` Andrew Morton
2016-02-06 18:18   ` Jeremiah Mahler
2016-02-07  8:27     ` Konstantin Khlebnikov
2016-02-07 15:46       ` Jeremiah Mahler
2016-02-07 15:50   ` Jeremiah Mahler
