* [REGRESSION] mm: filemap_map_pages NULL pointer dereference
@ 2016-02-05 18:05 Jeremiah Mahler
2016-02-05 21:59 ` Andrew Morton
2016-02-05 22:19 ` Andrew Morton
0 siblings, 2 replies; 7+ messages in thread
From: Jeremiah Mahler @ 2016-02-05 18:05 UTC (permalink / raw)
To: linux-kernel; +Cc: Matthew Wilcox, Andrew Morton, Hugh Dickins, Mel Gorman
all,
On a Lenovo X1 Carbon running -next (20160201+, 20160203+) I have
experienced several system hangs. I usually notice it first when
my browser (Chrome) stops responding but then other programs will stop
responding as well. The only fix is a reboot. It is sporadic but it
will usually occur once a day.
In the logs there will be a
unable to handle kernel NULL pointer dereference
message related to filemap_map_pages+0x10d/0x290 (below).
------------------------------------------------------------
...
[51985.993033] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[51985.993087] IP: [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
[51985.993123] PGD 2c772067 PUD 0
[51985.993144] Oops: 0000 [#1] SMP
[51985.993166] Modules linked in: ctr ccm cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave binfmt_misc i915 arc4 iwldvm mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul crc32c_intel iTCO_wdt ghash_clmulni_intel iTCO_vendor_support jitterentropy_rng sha256_generic hmac drbg snd_hda_codec_hdmi aesni_intel snd_hda_codec_realtek aes_x86_64 iwlwifi glue_helper snd_hda_codec_generic i2c_algo_bit lrw drm_kms_helper gf128mul ablk_helper cryptd snd_hda_intel drm psmouse snd_hda_codec cfg80211 pcspkr evdev serio_raw snd_hwdep i2c_i801 snd_hda_core sg snd_pcm mei_me lpc_ich mfd_core mei shpchp snd_timer i2c_core wmi thinkpad_acpi nvram snd battery tpm_tis soundcore ac tpm video button intel_smartconnect btusb btbcm btintel bluetooth rfkill loop ipv6 autofs4
[51985.993591] ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata ehci_pci sdhci_pci scsi_mod xhci_pci sdhci xhci_hcd ehci_hcd mmc_core usbcore usb_common thermal
[51985.993680] CPU: 2 PID: 22993 Comm: chrome Not tainted 4.5.0-rc2-next-20160203+ #11
[51985.993714] Hardware name: LENOVO 3443CTO/3443CTO, BIOS G6ET59WW (2.03 ) 09/11/2012
[51985.993760] task: ffff88004bb04dc0 ti: ffff88002a2f8000 task.ti: ffff88002a2f8000
[51985.993804] RIP: 0010:[<ffffffff8114a19d>] [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
[51985.993845] RSP: 0000:ffff88002a2fbdf8 EFLAGS: 00010202
[51985.993874] RAX: 00000007fffffff8 RBX: 0000000000000001 RCX: 0000000000000003
[51985.993911] RDX: 0000000000000000 RSI: ffffea00005bdd1c RDI: ffffea00005bdd00
[51985.993948] RBP: ffff8800beff4220 R08: 000000000000007f R09: 0000000000000000
[51985.993985] R10: 0000000000000000 R11: ffff8800a39382b8 R12: ffff8801182b9440
[51985.994023] R13: ffff88002a2fbe90 R14: ffff8800be568d80 R15: 0000000000000008
[51985.994061] FS: 00007f3e20276a40(0000) GS:ffff88011e300000(0000) knlGS:0000000000000000
[51985.994103] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[51985.994134] CR2: 0000000000000008 CR3: 00000000be6eb000 CR4: 00000000001406e0
[51985.994172] Stack:
[51985.994184] ffff8800beff4228 00007f3e0ae63000 0000000000000001 0000000000000000
[51985.994229] 0000000000000001 00007f3e0ae63000 ffff8800be568d80 0000000000000054
[51985.994273] ffff880000000318 ffff88003584c318 ffff8800840076c0 ffffffff8117b073
[51985.994318] Call Trace:
[51985.994336] [<ffffffff8117b073>] ? handle_mm_fault+0x13b3/0x1790
[51985.994370] [<ffffffff810aa7f1>] ? up_write+0x21/0x30
[51985.994400] [<ffffffff81059052>] ? __do_page_fault+0x192/0x410
[51985.994434] [<ffffffff814ffcc8>] ? page_fault+0x28/0x30
[51985.994463] Code: 00 00 00 48 8b 54 24 10 49 3b 55 28 74 48 48 8b 44 24 18 83 e8 01 29 d0 49 8d 04 c7 49 39 c7 74 19 49 83 c7 08 48 83 44 24 10 01 <49> 83 3f 00 74 eb 4d 85 ff 0f 85 3b ff ff ff 48 8b 3c 24 48 8d
[51985.994656] RIP [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
[51985.994692] RSP <ffff88002a2fbdf8>
[51985.994711] CR2: 0000000000000008
[51986.002154] ---[ end trace da60309b42c1da53 ]---
[52006.988971] INFO: rcu_sched self-detected stall on CPU
[52006.988978] 3-...: (5249 ticks this GP) idle=765/140000000000001/0 softirq=931903/931903 fqs=5247
[52006.988980] (t=5250 jiffies g=821290 c=821289 q=1968)
[52006.988982] Task dump for CPU 3:
[52006.988985] CompositorTileW R running task 0 22999 1425 0x00000108
[52006.988988] ffffffff81851580 ffffffff81148a21 ffff88011e397540 ffffffff81851580
[52006.988991] 0000000000000000 ffff880117daa180 ffffffff810c5e19 0000000000983e0c
[52006.988993] ffffffff810cfd7e 0000000000000092 0000000000000092 003b9aca00000000
[52006.988995] Call Trace:
[52006.988996] <IRQ> [<ffffffff81148a21>] ? rcu_dump_cpu_stacks+0x71/0x8a
[52006.989004] [<ffffffff810c5e19>] ? rcu_check_callbacks+0x6e9/0x790
[52006.989007] [<ffffffff810cfd7e>] ? timekeeping_update+0xee/0x150
[52006.989009] [<ffffffff810d8bd0>] ? tick_sched_handle.isra.14+0x50/0x50
[52006.989011] [<ffffffff810ca712>] ? update_process_times+0x32/0x60
[52006.989013] [<ffffffff810d8ba0>] ? tick_sched_handle.isra.14+0x20/0x50
[52006.989014] [<ffffffff810d8c08>] ? tick_sched_timer+0x38/0x70
[52006.989016] [<ffffffff810caf1c>] ? __hrtimer_run_queues+0xec/0x230
[52006.989017] [<ffffffff810cb5fa>] ? hrtimer_interrupt+0x9a/0x1a0
[52006.989020] [<ffffffff81500599>] ? smp_apic_timer_interrupt+0x39/0x50
[52006.989022] [<ffffffff814fea62>] ? apic_timer_interrupt+0x82/0x90
[52006.989023] <EOI> [<ffffffff812dfa05>] ? delay_tsc+0x25/0x50
[52006.989028] [<ffffffff810ac4d6>] ? do_raw_spin_lock+0x86/0x150
[52006.989031] [<ffffffff8117a175>] ? handle_mm_fault+0x4b5/0x1790
[52006.989033] [<ffffffff812e0be4>] ? call_rwsem_down_read_failed+0x14/0x30
[52006.989035] [<ffffffff81059052>] ? __do_page_fault+0x192/0x410
[52006.989037] [<ffffffff814ffcc8>] ? page_fault+0x28/0x30
------------------------------------------------------------
Referring again to the RIP line from the trace.
[51985.994656] RIP [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
jeri@hudson:~/linux-next$ gdb vmlinux
(gdb) list *0xffffffff8114a19d
0xffffffff8114a19d is in filemap_map_pages
(include/linux/radix-tree.h:465).
460 unsigned size = radix_tree_chunk_size(iter) - 1;
461
462 while (size--) {
463 slot++;
464 iter->index++;
465 if (likely(*slot))
466 return slot;
467 if (flags & RADIX_TREE_ITER_CONTIG) {
468 /* forbid switching to the next chunk */
469 iter->next_index = 0;
(gdb)
Assuming I traced the addresses correctly, this indicates that the
fault is triggered when the value in the slot pointer is accessed.
Perhaps slot is being incremented beyond its valid range?
--
- Jeremiah Mahler
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-05 18:05 [REGRESSION] mm: filemap_map_pages NULL pointer dereference Jeremiah Mahler
@ 2016-02-05 21:59 ` Andrew Morton
2016-02-05 22:19 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2016-02-05 21:59 UTC (permalink / raw)
To: Jeremiah Mahler; +Cc: linux-kernel, Matthew Wilcox, Hugh Dickins, Mel Gorman
On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:
> all,
>
> On a Lenovo X1 Carbon running -next (20160201+, 20160203+) I have
> experienced several system hangs. I usually notice it first when
> my browser (Chrome) stops responding but then other programs will stop
> responding as well. The only fix is a reboot. It is sporadic but it
> will usually occur once a day.
>
> In the logs there will be a
>
> unable to handle kernel NULL pointer dereference
>
> message related to filemap_map_pages+0x10d/0x290 (below).
>
> ------------------------------------------------------------
> ...
> [51985.993033] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [51985.993087] IP: [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.993123] PGD 2c772067 PUD 0
> [51985.993144] Oops: 0000 [#1] SMP
> [51985.993166] Modules linked in: ctr ccm cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave binfmt_misc i915 arc4 iwldvm mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul crc32c_intel iTCO_wdt ghash_clmulni_intel iTCO_vendor_support jitterentropy_rng sha256_generic hmac drbg snd_hda_codec_hdmi aesni_intel snd_hda_codec_realtek aes_x86_64 iwlwifi glue_helper snd_hda_codec_generic i2c_algo_bit lrw drm_kms_helper gf128mul ablk_helper cryptd snd_hda_intel drm psmouse snd_hda_codec cfg80211 pcspkr evdev serio_raw snd_hwdep i2c_i801 snd_hda_core sg snd_pcm mei_me lpc_ich mfd_core mei shpchp snd_timer i2c_core wmi thinkpad_acpi nvram snd battery tpm_tis soundcore ac tpm video button intel_smartconnect btusb btbcm btintel bluetooth rfkill loop ipv6 autofs4
> [51985.993591] ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata ehci_pci sdhci_pci scsi_mod xhci_pci sdhci xhci_hcd ehci_hcd mmc_core usbcore usb_common thermal
> [51985.993680] CPU: 2 PID: 22993 Comm: chrome Not tainted 4.5.0-rc2-next-20160203+ #11
> [51985.993714] Hardware name: LENOVO 3443CTO/3443CTO, BIOS G6ET59WW (2.03 ) 09/11/2012
> [51985.993760] task: ffff88004bb04dc0 ti: ffff88002a2f8000 task.ti: ffff88002a2f8000
> [51985.993804] RIP: 0010:[<ffffffff8114a19d>] [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.993845] RSP: 0000:ffff88002a2fbdf8 EFLAGS: 00010202
> [51985.993874] RAX: 00000007fffffff8 RBX: 0000000000000001 RCX: 0000000000000003
> [51985.993911] RDX: 0000000000000000 RSI: ffffea00005bdd1c RDI: ffffea00005bdd00
> [51985.993948] RBP: ffff8800beff4220 R08: 000000000000007f R09: 0000000000000000
> [51985.993985] R10: 0000000000000000 R11: ffff8800a39382b8 R12: ffff8801182b9440
> [51985.994023] R13: ffff88002a2fbe90 R14: ffff8800be568d80 R15: 0000000000000008
> [51985.994061] FS: 00007f3e20276a40(0000) GS:ffff88011e300000(0000) knlGS:0000000000000000
> [51985.994103] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [51985.994134] CR2: 0000000000000008 CR3: 00000000be6eb000 CR4: 00000000001406e0
> [51985.994172] Stack:
> [51985.994184] ffff8800beff4228 00007f3e0ae63000 0000000000000001 0000000000000000
> [51985.994229] 0000000000000001 00007f3e0ae63000 ffff8800be568d80 0000000000000054
> [51985.994273] ffff880000000318 ffff88003584c318 ffff8800840076c0 ffffffff8117b073
> [51985.994318] Call Trace:
> [51985.994336] [<ffffffff8117b073>] ? handle_mm_fault+0x13b3/0x1790
> [51985.994370] [<ffffffff810aa7f1>] ? up_write+0x21/0x30
> [51985.994400] [<ffffffff81059052>] ? __do_page_fault+0x192/0x410
> [51985.994434] [<ffffffff814ffcc8>] ? page_fault+0x28/0x30
> [51985.994463] Code: 00 00 00 48 8b 54 24 10 49 3b 55 28 74 48 48 8b 44 24 18 83 e8 01 29 d0 49 8d 04 c7 49 39 c7 74 19 49 83 c7 08 48 83 44 24 10 01 <49> 83 3f 00 74 eb 4d 85 ff 0f 85 3b ff ff ff 48 8b 3c 24 48 8d
> [51985.994656] RIP [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
> [51985.994692] RSP <ffff88002a2fbdf8>
> [51985.994711] CR2: 0000000000000008
>
> ...
>
> Referring again to the RIP line from the trace.
>
> [51985.994656] RIP [<ffffffff8114a19d>] filemap_map_pages+0x10d/0x290
>
> jeri@hudson:~/linux-next$ gdb vmlinux
> (gdb) list *0xffffffff8114a19d
> 0xffffffff8114a19d is in filemap_map_pages
> (include/linux/radix-tree.h:465).
> 460 unsigned size = radix_tree_chunk_size(iter) - 1;
> 461
> 462 while (size--) {
> 463 slot++;
> 464 iter->index++;
> 465 if (likely(*slot))
> 466 return slot;
> 467 if (flags & RADIX_TREE_ITER_CONTIG) {
> 468 /* forbid switching to the next chunk */
> 469 iter->next_index = 0;
> (gdb)
>
> Assuming I traced the addresses correctly, this indicates that the
> fault is triggered when the value in the slot pointer is accessed.
> Perhaps slot is being incremented beyond its valid range?
That's super helpful, thanks.
The faulting address was 0x0000000000000008, so radix_tree_next_slot()
was called with slot==NULL.
And looking at it, I don't see how this code can work at all:
: radix_tree_for_each_slot(slot, &mapping->page_tree, &iter, vmf->pgoff) {
: if (iter.index > vmf->max_pgoff)
: break;
: repeat:
: page = radix_tree_deref_slot(slot);
: if (unlikely(!page))
: goto next;
: if (radix_tree_exception(page)) {
: if (radix_tree_deref_retry(page)) {
: slot = radix_tree_iter_retry(&iter);
radix_tree_iter_retry() unconditionally returns NULL
: continue;
here we go and execute the third clause of the
radix_tree_for_each_slot() `for' statement:
slot = radix_tree_next_slot(slot, iter, 0))
with slot==NULL. This will dereference 0x8 every time.
: }
: goto next;
: }
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-05 18:05 [REGRESSION] mm: filemap_map_pages NULL pointer dereference Jeremiah Mahler
2016-02-05 21:59 ` Andrew Morton
@ 2016-02-05 22:19 ` Andrew Morton
2016-02-06 18:18 ` Jeremiah Mahler
2016-02-07 15:50 ` Jeremiah Mahler
1 sibling, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2016-02-05 22:19 UTC (permalink / raw)
To: Jeremiah Mahler
Cc: linux-kernel, Matthew Wilcox, Hugh Dickins, Mel Gorman,
Stephen Rothwell, Konstantin Khlebnikov
On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:
> On a Lenovo X1 Carbon running -next (20160201+, 20160203+) I have
> experienced several system hangs. I usually notice it first when
> my browser (Chrome) stops responding but then other programs will stop
> responding as well. The only fix is a reboot. It is sporadic but it
> will usually occur once a day.
>
> In the logs there will be a
>
> unable to handle kernel NULL pointer dereference
This should fix it up.
From: Konstantin Khlebnikov <koct9i@gmail.com>
Subject: radix-tree: fix oops after radix_tree_iter_retry
Helper radix_tree_iter_retry() resets next_index to the current index. In
following radix_tree_next_slot current chunk size becomes zero. This
isn't checked and it tries to dereference null pointer in slot.
Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.
Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/radix-tree.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff -puN include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry include/linux/radix-tree.h
--- a/include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry
+++ a/include/linux/radix-tree.h
@@ -400,7 +400,7 @@ void **radix_tree_iter_retry(struct radi
* @iter: pointer to radix tree iterator
* Returns: current chunk size
*/
-static __always_inline unsigned
+static __always_inline long
radix_tree_chunk_size(struct radix_tree_iter *iter)
{
return iter->next_index - iter->index;
@@ -434,9 +434,9 @@ radix_tree_next_slot(void **slot, struct
return slot + offset + 1;
}
} else {
- unsigned size = radix_tree_chunk_size(iter) - 1;
+ long size = radix_tree_chunk_size(iter);
- while (size--) {
+ while (--size > 0) {
slot++;
iter->index++;
if (likely(*slot))
_
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-05 22:19 ` Andrew Morton
@ 2016-02-06 18:18 ` Jeremiah Mahler
2016-02-07 8:27 ` Konstantin Khlebnikov
2016-02-07 15:50 ` Jeremiah Mahler
1 sibling, 1 reply; 7+ messages in thread
From: Jeremiah Mahler @ 2016-02-06 18:18 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matthew Wilcox, Hugh Dickins, Mel Gorman,
Stephen Rothwell, Konstantin Khlebnikov
Andrew,
On Fri, Feb 05, 2016 at 02:19:40PM -0800, Andrew Morton wrote:
> On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:
>
[...]
> > unable to handle kernel NULL pointer dereference
>
> This should fix it up.
>
[...]
>
> include/linux/radix-tree.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff -puN include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry include/linux/radix-tree.h
> --- a/include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry
> +++ a/include/linux/radix-tree.h
> @@ -400,7 +400,7 @@ void **radix_tree_iter_retry(struct radi
> * @iter: pointer to radix tree iterator
> * Returns: current chunk size
> */
> -static __always_inline unsigned
> +static __always_inline long
> radix_tree_chunk_size(struct radix_tree_iter *iter)
> {
> return iter->next_index - iter->index;
> @@ -434,9 +434,9 @@ radix_tree_next_slot(void **slot, struct
> return slot + offset + 1;
> }
> } else {
> - unsigned size = radix_tree_chunk_size(iter) - 1;
> + long size = radix_tree_chunk_size(iter);
>
> - while (size--) {
> + while (--size > 0) {
> slot++;
> iter->index++;
> if (likely(*slot))
> _
>
I have applied this patch to my kernel and so far the bug has not
come back. Thanks for the quick fix.
Although I don't quite understand how this fixes the slot==NULL problem.
Unless I am missing something, it looks like the while loop will be
executed the same number of times but the size variable will no
longer go negative as it did before.
--
- Jeremiah Mahler
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-06 18:18 ` Jeremiah Mahler
@ 2016-02-07 8:27 ` Konstantin Khlebnikov
2016-02-07 15:46 ` Jeremiah Mahler
0 siblings, 1 reply; 7+ messages in thread
From: Konstantin Khlebnikov @ 2016-02-07 8:27 UTC (permalink / raw)
To: Jeremiah Mahler, Andrew Morton, Linux Kernel Mailing List,
Matthew Wilcox, Hugh Dickins, Mel Gorman, Stephen Rothwell,
Konstantin Khlebnikov
On Sat, Feb 6, 2016 at 9:18 PM, Jeremiah Mahler <jmmahler@gmail.com> wrote:
> Andrew,
>
> On Fri, Feb 05, 2016 at 02:19:40PM -0800, Andrew Morton wrote:
>> On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:
>>
> [...]
>> > unable to handle kernel NULL pointer dereference
>>
>> This should fix it up.
>>
> [...]
>>
>> include/linux/radix-tree.h | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff -puN include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry include/linux/radix-tree.h
>> --- a/include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry
>> +++ a/include/linux/radix-tree.h
>> @@ -400,7 +400,7 @@ void **radix_tree_iter_retry(struct radi
>> * @iter: pointer to radix tree iterator
>> * Returns: current chunk size
>> */
>> -static __always_inline unsigned
>> +static __always_inline long
>> radix_tree_chunk_size(struct radix_tree_iter *iter)
>> {
>> return iter->next_index - iter->index;
>> @@ -434,9 +434,9 @@ radix_tree_next_slot(void **slot, struct
>> return slot + offset + 1;
>> }
>> } else {
>> - unsigned size = radix_tree_chunk_size(iter) - 1;
>> + long size = radix_tree_chunk_size(iter);
>>
>> - while (size--) {
>> + while (--size > 0) {
>> slot++;
>> iter->index++;
>> if (likely(*slot))
>> _
>>
>
> I have applied this patch to my kernel and so far the bug has not
> come back. Thanks for the quick fix.
>
> Although I don't quite understand how this fixes the slot==NULL problem.
> Unless I am missing something, it looks like the while loop will be
> executed the same number of times but the size variable will no
> longer go negative as it did before.
That's simple. Slot is dereferenced after checking remaining size.
Old version checked only for != 0. After iter-retry size is zero and
afrer "- 1" it overlaps into positive range. In new version it's signed and
checked for > 0.
>
> --
> - Jeremiah Mahler
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-07 8:27 ` Konstantin Khlebnikov
@ 2016-02-07 15:46 ` Jeremiah Mahler
0 siblings, 0 replies; 7+ messages in thread
From: Jeremiah Mahler @ 2016-02-07 15:46 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Andrew Morton, Linux Kernel Mailing List, Matthew Wilcox,
Hugh Dickins, Mel Gorman, Stephen Rothwell
Konstantin,
On Sun, Feb 07, 2016 at 11:27:53AM +0300, Konstantin Khlebnikov wrote:
> On Sat, Feb 6, 2016 at 9:18 PM, Jeremiah Mahler <jmmahler@gmail.com> wrote:
[...]
> >> -static __always_inline unsigned
> >> +static __always_inline long
> >> radix_tree_chunk_size(struct radix_tree_iter *iter)
> >> {
> >> return iter->next_index - iter->index;
> >> @@ -434,9 +434,9 @@ radix_tree_next_slot(void **slot, struct
> >> return slot + offset + 1;
> >> }
> >> } else {
> >> - unsigned size = radix_tree_chunk_size(iter) - 1;
> >> + long size = radix_tree_chunk_size(iter);
> >>
> >> - while (size--) {
> >> + while (--size > 0) {
> >> slot++;
> >> iter->index++;
> >> if (likely(*slot))
> >> _
> >>
> >
> > I have applied this patch to my kernel and so far the bug has not
> > come back. Thanks for the quick fix.
> >
> > Although I don't quite understand how this fixes the slot==NULL problem.
> > Unless I am missing something, it looks like the while loop will be
> > executed the same number of times but the size variable will no
> > longer go negative as it did before.
>
> That's simple. Slot is dereferenced after checking remaining size.
> Old version checked only for != 0. After iter-retry size is zero and
> afrer "- 1" it overlaps into positive range. In new version it's signed and
> checked for > 0.
>
OK, I get it now. Thanks for the explanation.
--
- Jeremiah Mahler
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [REGRESSION] mm: filemap_map_pages NULL pointer dereference
2016-02-05 22:19 ` Andrew Morton
2016-02-06 18:18 ` Jeremiah Mahler
@ 2016-02-07 15:50 ` Jeremiah Mahler
1 sibling, 0 replies; 7+ messages in thread
From: Jeremiah Mahler @ 2016-02-07 15:50 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Matthew Wilcox, Hugh Dickins, Mel Gorman,
Stephen Rothwell, Konstantin Khlebnikov
Konstantin, Andrew,
On Fri, Feb 05, 2016 at 02:19:40PM -0800, Andrew Morton wrote:
> On Fri, 5 Feb 2016 10:05:02 -0800 Jeremiah Mahler <jmmahler@gmail.com> wrote:
>
[...]
>
> This should fix it up.
>
> From: Konstantin Khlebnikov <koct9i@gmail.com>
> Subject: radix-tree: fix oops after radix_tree_iter_retry
>
> Helper radix_tree_iter_retry() resets next_index to the current index. In
> following radix_tree_next_slot current chunk size becomes zero. This
> isn't checked and it tries to dereference null pointer in slot.
>
> Tagged iterator is fine because retry happens only at slot 0 where tag
> bitmask in iter->tags is filled with single bit.
>
> Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Cc: Matthew Wilcox <willy@linux.intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Ohad Ben-Cohen <ohad@wizery.com>
> Cc: Jeremiah Mahler <jmmahler@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> include/linux/radix-tree.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff -puN include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry include/linux/radix-tree.h
> --- a/include/linux/radix-tree.h~radix-tree-fix-oops-after-radix_tree_iter_retry
> +++ a/include/linux/radix-tree.h
> @@ -400,7 +400,7 @@ void **radix_tree_iter_retry(struct radi
> * @iter: pointer to radix tree iterator
> * Returns: current chunk size
> */
> -static __always_inline unsigned
> +static __always_inline long
> radix_tree_chunk_size(struct radix_tree_iter *iter)
> {
> return iter->next_index - iter->index;
> @@ -434,9 +434,9 @@ radix_tree_next_slot(void **slot, struct
> return slot + offset + 1;
> }
> } else {
> - unsigned size = radix_tree_chunk_size(iter) - 1;
> + long size = radix_tree_chunk_size(iter);
>
> - while (size--) {
> + while (--size > 0) {
> slot++;
> iter->index++;
> if (likely(*slot))
> _
>
Fix is still working great after a couple days.
Tested-by: Jeremiah Mahler <jmmahler@gmail.com>
--
- Jeremiah Mahler
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-02-07 15:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-05 18:05 [REGRESSION] mm: filemap_map_pages NULL pointer dereference Jeremiah Mahler
2016-02-05 21:59 ` Andrew Morton
2016-02-05 22:19 ` Andrew Morton
2016-02-06 18:18 ` Jeremiah Mahler
2016-02-07 8:27 ` Konstantin Khlebnikov
2016-02-07 15:46 ` Jeremiah Mahler
2016-02-07 15:50 ` Jeremiah Mahler
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.