* memcontrol.c BUG
@ 2015-01-27 22:13 Dave Airlie
  2015-01-28  8:48 ` Chris Wilson
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Airlie @ 2015-01-27 22:13 UTC
  To: intel-gfx

https://bugzilla.redhat.com/show_bug.cgi?id=1165369

Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
mapcount:0 mapping:  (null) index:0x0
Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
0x80090029(locked|uptodate|lru|swapcache|swapbacked)
Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
Nov 18 09:23:23 elissa.gathman.org kernel: invalid opcode: 0000 [#1] SMP
Nov 18 09:23:23 elissa.gathman.org kernel: Modules linked in: tcp_lp
vfat fat fuse tun ccm bnep bluetooth ip6t_rpfilter ip6t_REJECT
xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_security ip6table_raw ip6table_filter
ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
snd_hda_codec_idt snd_hda_codec_hdmi snd_hda_codec_generic mmc_block
snd_hda_intel snd_hda_controller iTCO_wdt snd_hda_codec
iTCO_vendor_support dell_wmi sdhci_pci gpio_ich sparse_keymap arc4
joydev iwl3945 snd_hwdep serio_raw sdhci snd_seq dell_laptop iwlegacy
dcdbas mac80211 coretemp i2c_i801 r592 mmc_core snd_seq_device snd_pcm
cfg80211 memstick rfkill snd_timer lpc_ich
Nov 18 09:23:23 elissa.gathman.org kernel:  snd soundcore wmi
acpi_cpufreq hid_logitech_dj firewire_ohci firewire_core ata_generic
i915 pata_acpi crc_itu_t i2c_algo_bit sky2 drm_kms_helper drm video
Nov 18 09:23:23 elissa.gathman.org kernel: CPU: 0 PID: 966 Comm: Xorg
Tainted: G        W      3.17.2-200.fc20.i686 #1
Nov 18 09:23:23 elissa.gathman.org kernel: Hardware name: Dell Inc.
Inspiron 1525                   /0U990C, BIOS A16 10/16/2008
Nov 18 09:23:24 elissa.gathman.org kernel: task: e88adf80 ti: e88a2000
task.ti: e88a2000
Nov 18 09:23:24 elissa.gathman.org kernel: EIP: 0060:[<c05763ab>]
EFLAGS: 00013282 CPU: 0
Nov 18 09:23:24 elissa.gathman.org kernel: EIP is at
mem_cgroup_migrate+0x14b/0x180
Nov 18 09:23:24 elissa.gathman.org kernel: EAX: 00000000 EBX: f54eec40
ECX: f73d8ac8 EDX: 00000000
Nov 18 09:23:24 elissa.gathman.org kernel: ESI: f5e36a40 EDI: c0cd3a40
EBP: e88a3b9c ESP: e88a3b84
Nov 18 09:23:24 elissa.gathman.org kernel:  DS: 007b ES: 007b FS: 00d8
GS: 00e0 SS: 0068
Nov 18 09:23:24 elissa.gathman.org kernel: CR0: 80050033 CR2: b69e6000
CR3: 2f3ab000 CR4: 000007d0
Nov 18 09:23:24 elissa.gathman.org kernel: Stack:
Nov 18 09:23:24 elissa.gathman.org kernel:  c0d215c0 f54eec40 2a278f72
f54eec40 00000000 c0cd3a40 e88a3bc8 c053d71b
Nov 18 09:23:24 elissa.gathman.org kernel:  f54eec40 e88a3c20 00008537
fffba000 c0cd3a40 f5e36a40 00008537 ef3df828
Nov 18 09:23:24 elissa.gathman.org kernel:  ffffffef e88a3c34 c053db59
e88a3c04 000214de ef3df8f8 e88adf80 e88a3c48
Nov 18 09:23:24 elissa.gathman.org kernel: Call Trace:
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053d71b>]
shmem_replace_page.isra.28+0x11b/0x200
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053db59>]
shmem_getpage_gfp+0x239/0x770
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053e15f>]
shmem_read_mapping_page_gfp+0x3f/0x70
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c06cd870>] ? sg_kfree+0x30/0x30
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f61be1>]
i915_gem_object_get_pages_gtt+0x141/0x2c0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c0409f50>] ?
nommu_map_sg+0x40/0xb0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c07a8c92>] ?
intel_gtt_insert_sg_entries+0x72/0xa0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f5df66>]
i915_gem_object_get_pages+0x66/0xa0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f621ed>]
i915_gem_object_pin+0x3ed/0x6e0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f57665>]
i915_gem_execbuffer_reserve_vma.isra.11+0x75/0x130 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f579c5>]
i915_gem_execbuffer_reserve+0x2a5/0x2d0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f582ae>]
i915_gem_do_execbuffer.isra.18+0x51e/0x11d0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f6363c>] ?
i915_gem_object_set_to_gtt_domain+0xfc/0x1b0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f639f5>] ?
i915_gem_object_ggtt_unpin+0x15/0x90 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59499>] ?
i915_gem_execbuffer2+0x59/0x2b0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f594cb>]
i915_gem_execbuffer2+0x8b/0x2b0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59440>] ?
i915_gem_execbuffer+0x4e0/0x4e0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7e062bf>]
drm_ioctl+0x1cf/0x520 [drm]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59440>] ?
i915_gem_execbuffer+0x4e0/0x4e0 [i915]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c06c5790>] ?
timerqueue_add+0x50/0xb0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c04b1515>] ? ktime_get+0x45/0xe0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7e060f0>] ?
drm_copy_field+0x70/0x70 [drm]
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c058df5a>]
do_vfs_ioctl+0x2ea/0x4b0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065cc72>] ?
inode_has_perm.isra.32+0x32/0x40
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065cdc7>] ?
file_has_perm+0x97/0xa0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065d83c>] ?
selinux_file_ioctl+0x4c/0xf0
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c058e180>] SyS_ioctl+0x60/0x90
Nov 18 09:23:25 elissa.gathman.org kernel:  [<c0a1ea9f>]
sysenter_do_call+0x12/0x12
Nov 18 09:23:25 elissa.gathman.org kernel: Code: f3 bd c0 89 d8 e8 b6
79 fb ff 0f 0b ba 18 f3 bd c0 89 f0 e8 a8 79 fb ff 0f 0b e8 01 cb ed
ff ba 4c fc bd c0 89 f0 e8 95 79 fb ff <0f> 0b a8 04 0f 85 60 ff ff ff
ba 0c fd bd c0 89 f0 e8 7f 79 fb
Nov 18 09:23:25 elissa.gathman.org kernel: EIP: [<c05763ab>]
mem_cgroup_migrate+0x14b/0x180 SS:ESP 0068:e88a3b84
Nov 18 09:23:25 elissa.gathman.org kernel: ---[ end trace 9aaca320302a6e84 ]---

Discuss. 965GM, I believe.

Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: memcontrol.c BUG
  2015-01-27 22:13 memcontrol.c BUG Dave Airlie
@ 2015-01-28  8:48 ` Chris Wilson
  2015-01-28 14:32   ` [Intel-gfx] " Michal Hocko
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2015-01-28  8:48 UTC
  To: Dave Airlie, Johannes Weiner
  Cc: Vladimir Davydov, intel-gfx, Hugh Dickins, Felipe Balbi,
	Michal Hocko, Tejun Heo, Andrew Morton, Jet Chen

On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> 
> Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> mapcount:0 mapping:  (null) index:0x0
> Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
> Nov 18 09:23:23 elissa.gathman.org kernel: invalid opcode: 0000 [#1] SMP
> Nov 18 09:23:23 elissa.gathman.org kernel: Modules linked in: tcp_lp
> vfat fat fuse tun ccm bnep bluetooth ip6t_rpfilter ip6t_REJECT
> xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter
> ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
> ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
> snd_hda_codec_idt snd_hda_codec_hdmi snd_hda_codec_generic mmc_block
> snd_hda_intel snd_hda_controller iTCO_wdt snd_hda_codec
> iTCO_vendor_support dell_wmi sdhci_pci gpio_ich sparse_keymap arc4
> joydev iwl3945 snd_hwdep serio_raw sdhci snd_seq dell_laptop iwlegacy
> dcdbas mac80211 coretemp i2c_i801 r592 mmc_core snd_seq_device snd_pcm
> cfg80211 memstick rfkill snd_timer lpc_ich
> Nov 18 09:23:23 elissa.gathman.org kernel:  snd soundcore wmi
> acpi_cpufreq hid_logitech_dj firewire_ohci firewire_core ata_generic
> i915 pata_acpi crc_itu_t i2c_algo_bit sky2 drm_kms_helper drm video
> Nov 18 09:23:23 elissa.gathman.org kernel: CPU: 0 PID: 966 Comm: Xorg
> Tainted: G        W      3.17.2-200.fc20.i686 #1
> Nov 18 09:23:23 elissa.gathman.org kernel: Hardware name: Dell Inc.
> Inspiron 1525                   /0U990C, BIOS A16 10/16/2008
> Nov 18 09:23:24 elissa.gathman.org kernel: task: e88adf80 ti: e88a2000
> task.ti: e88a2000
> Nov 18 09:23:24 elissa.gathman.org kernel: EIP: 0060:[<c05763ab>]
> EFLAGS: 00013282 CPU: 0
> Nov 18 09:23:24 elissa.gathman.org kernel: EIP is at
> mem_cgroup_migrate+0x14b/0x180
> Nov 18 09:23:24 elissa.gathman.org kernel: EAX: 00000000 EBX: f54eec40
> ECX: f73d8ac8 EDX: 00000000
> Nov 18 09:23:24 elissa.gathman.org kernel: ESI: f5e36a40 EDI: c0cd3a40
> EBP: e88a3b9c ESP: e88a3b84
> Nov 18 09:23:24 elissa.gathman.org kernel:  DS: 007b ES: 007b FS: 00d8
> GS: 00e0 SS: 0068
> Nov 18 09:23:24 elissa.gathman.org kernel: CR0: 80050033 CR2: b69e6000
> CR3: 2f3ab000 CR4: 000007d0
> Nov 18 09:23:24 elissa.gathman.org kernel: Stack:
> Nov 18 09:23:24 elissa.gathman.org kernel:  c0d215c0 f54eec40 2a278f72
> f54eec40 00000000 c0cd3a40 e88a3bc8 c053d71b
> Nov 18 09:23:24 elissa.gathman.org kernel:  f54eec40 e88a3c20 00008537
> fffba000 c0cd3a40 f5e36a40 00008537 ef3df828
> Nov 18 09:23:24 elissa.gathman.org kernel:  ffffffef e88a3c34 c053db59
> e88a3c04 000214de ef3df8f8 e88adf80 e88a3c48
> Nov 18 09:23:24 elissa.gathman.org kernel: Call Trace:
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053d71b>]
> shmem_replace_page.isra.28+0x11b/0x200
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053db59>]
> shmem_getpage_gfp+0x239/0x770
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c053e15f>]
> shmem_read_mapping_page_gfp+0x3f/0x70
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c06cd870>] ? sg_kfree+0x30/0x30
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f61be1>]
> i915_gem_object_get_pages_gtt+0x141/0x2c0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c0409f50>] ?
> nommu_map_sg+0x40/0xb0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c07a8c92>] ?
> intel_gtt_insert_sg_entries+0x72/0xa0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f5df66>]
> i915_gem_object_get_pages+0x66/0xa0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f621ed>]
> i915_gem_object_pin+0x3ed/0x6e0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f57665>]
> i915_gem_execbuffer_reserve_vma.isra.11+0x75/0x130 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f579c5>]
> i915_gem_execbuffer_reserve+0x2a5/0x2d0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f582ae>]
> i915_gem_do_execbuffer.isra.18+0x51e/0x11d0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f6363c>] ?
> i915_gem_object_set_to_gtt_domain+0xfc/0x1b0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f639f5>] ?
> i915_gem_object_ggtt_unpin+0x15/0x90 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59499>] ?
> i915_gem_execbuffer2+0x59/0x2b0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f594cb>]
> i915_gem_execbuffer2+0x8b/0x2b0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59440>] ?
> i915_gem_execbuffer+0x4e0/0x4e0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7e062bf>]
> drm_ioctl+0x1cf/0x520 [drm]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7f59440>] ?
> i915_gem_execbuffer+0x4e0/0x4e0 [i915]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c06c5790>] ?
> timerqueue_add+0x50/0xb0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c04b1515>] ? ktime_get+0x45/0xe0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<f7e060f0>] ?
> drm_copy_field+0x70/0x70 [drm]
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c058df5a>]
> do_vfs_ioctl+0x2ea/0x4b0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065cc72>] ?
> inode_has_perm.isra.32+0x32/0x40
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065cdc7>] ?
> file_has_perm+0x97/0xa0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c065d83c>] ?
> selinux_file_ioctl+0x4c/0xf0
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c058e180>] SyS_ioctl+0x60/0x90
> Nov 18 09:23:25 elissa.gathman.org kernel:  [<c0a1ea9f>]
> sysenter_do_call+0x12/0x12
> Nov 18 09:23:25 elissa.gathman.org kernel: Code: f3 bd c0 89 d8 e8 b6
> 79 fb ff 0f 0b ba 18 f3 bd c0 89 f0 e8 a8 79 fb ff 0f 0b e8 01 cb ed
> ff ba 4c fc bd c0 89 f0 e8 95 79 fb ff <0f> 0b a8 04 0f 85 60 ff ff ff
> ba 0c fd bd c0 89 f0 e8 7f 79 fb
> Nov 18 09:23:25 elissa.gathman.org kernel: EIP: [<c05763ab>]
> mem_cgroup_migrate+0x14b/0x180 SS:ESP 0068:e88a3b84
> Nov 18 09:23:25 elissa.gathman.org kernel: ---[ end trace 9aaca320302a6e84 ]---
> 
> discuss, 965GM I believe.

Yes, 965GM, which uniquely uses

mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
	/* 965gm cannot relocate objects above 4GiB. */
	mask &= ~__GFP_HIGHMEM;
	mask |= __GFP_DMA32;
}

for its shmemfs gfp mask.
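
The mask adjustment quoted above can be tried outside the kernel with a small userspace model. This is only a sketch: the `MODEL_GFP_*` bit values and the `model_shmem_gfp_mask()` helper are hypothetical stand-ins, not the kernel's definitions, and `is_965` stands in for the `IS_CRESTLINE(dev) || IS_BROADWATER(dev)` test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bit values for illustration only. The real flags are
 * defined in include/linux/gfp.h and differ between kernel versions.
 */
typedef uint32_t model_gfp_t;

#define MODEL_GFP_HIGHMEM     ((model_gfp_t)0x02)
#define MODEL_GFP_DMA32       ((model_gfp_t)0x04)
#define MODEL_GFP_RECLAIMABLE ((model_gfp_t)0x10)
#define MODEL_GFP_USER        ((model_gfp_t)0x20) /* stand-in for the non-zone bits */
#define MODEL_GFP_HIGHUSER    (MODEL_GFP_USER | MODEL_GFP_HIGHMEM)

/*
 * Same shape as the quoted snippet: start from a highmem-capable mask,
 * then, for the 965 family, drop the highmem zone bit and request a
 * page below 4GiB instead.
 */
static model_gfp_t model_shmem_gfp_mask(bool is_965)
{
	model_gfp_t mask = MODEL_GFP_HIGHUSER | MODEL_GFP_RECLAIMABLE;

	if (is_965) {
		mask &= ~MODEL_GFP_HIGHMEM;
		mask |= MODEL_GFP_DMA32;
	}
	return mask;
}
```

In this model, `model_shmem_gfp_mask(true)` carries the DMA32 bit and no highmem bit, while `model_shmem_gfp_mask(false)` keeps highmem. A page that was allocated under the second kind of mask does not satisfy the first, which is the situation in which shmem has to replace the page.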

I think we decided that

commit 0a31bc97c80c3fa87b32c091d9a930ac19cd0c40
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Fri Aug 8 14:19:22 2014 -0700

    mm: memcontrol: rewrite uncharge API

was the likely candidate (given that it was the only change to
mm/shmem.c in the regressing time frame).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [Intel-gfx] memcontrol.c BUG
  2015-01-28  8:48 ` Chris Wilson
@ 2015-01-28 14:32   ` Michal Hocko
  2015-01-29  8:16     ` Chris Wilson
  2015-01-30  2:04       ` Hugh Dickins
  0 siblings, 2 replies; 10+ messages in thread
From: Michal Hocko @ 2015-01-28 14:32 UTC
  To: Chris Wilson, Dave Airlie, Johannes Weiner, intel-gfx,
	Hugh Dickins, Tejun Heo, Vladimir Davydov, Jet Chen,
	Felipe Balbi, Andrew Morton
  Cc: linux-mm

On Wed 28-01-15 08:48:52, Chris Wilson wrote:
> On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> > 
> > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> > mapcount:0 mapping:  (null) index:0x0
> > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!

I guess this matches the following bugon in your kernel:
        VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);

so the oldpage is on the LRU list already. I am completely unfamiliar
with 965GM but is the page perhaps shared with somebody with a different
gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
the other (racing) caller didn't need to move the page and put it on
LRU.

If so, we need to tell shmem_replace_page() to do the lrucare handling.

diff --git a/mm/shmem.c b/mm/shmem.c
index 339e06639956..e3cdc1a16c0f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1013,7 +1013,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
 		 */
 		oldpage = newpage;
 	} else {
-		mem_cgroup_migrate(oldpage, newpage, false);
+		mem_cgroup_migrate(oldpage, newpage, true);
 		lru_cache_add_anon(newpage);
 		*pagep = newpage;
 	}

[...]
> 965GM and that it uniquely uses
> 
> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
> 	/* 965gm cannot relocate objects above 4GiB. */
> 	mask &= ~__GFP_HIGHMEM;
> 	mask |= __GFP_DMA32;
> }
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [Intel-gfx] memcontrol.c BUG
  2015-01-28 14:32   ` [Intel-gfx] " Michal Hocko
@ 2015-01-29  8:16     ` Chris Wilson
  2015-01-29 23:26       ` Dave Airlie
  2015-01-30  2:04       ` Hugh Dickins
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2015-01-29  8:16 UTC
  To: Michal Hocko
  Cc: Dave Airlie, Johannes Weiner, intel-gfx, Hugh Dickins, Tejun Heo,
	Vladimir Davydov, Jet Chen, Felipe Balbi, Andrew Morton,
	linux-mm

On Wed, Jan 28, 2015 at 03:32:43PM +0100, Michal Hocko wrote:
> On Wed 28-01-15 08:48:52, Chris Wilson wrote:
> > On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> > > 
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> > > mapcount:0 mapping:  (null) index:0x0
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> > > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> > > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> > > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> > > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
> 
> I guess this matches the following bugon in your kernel:
>         VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
> 
> so the oldpage is on the LRU list already. I am completely unfamiliar
> with 965GM but is the page perhaps shared with somebody with a different
> gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
> the other (racing) caller didn't need to move the page and put it on
> LRU.

Generally, yes. The shmemfs filp is exported through a vm_mmap() as well
as pinned into the GPU via shmem_read_mapping_page_gfp(). But I would
not expect that to be the case very often, if at all, on 965GM as the
two access paths are incoherent. Still it sounds promising, hopefully
Dave can put it into a fedora kernel for testing?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [Intel-gfx] memcontrol.c BUG
  2015-01-29  8:16     ` Chris Wilson
@ 2015-01-29 23:26       ` Dave Airlie
  0 siblings, 0 replies; 10+ messages in thread
From: Dave Airlie @ 2015-01-29 23:26 UTC
  To: Chris Wilson, Michal Hocko, Dave Airlie, Johannes Weiner,
	intel-gfx, Hugh Dickins, Tejun Heo, Vladimir Davydov, Jet Chen,
	Felipe Balbi, Andrew Morton, Linux Memory Management List

On 29 January 2015 at 18:16, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Wed, Jan 28, 2015 at 03:32:43PM +0100, Michal Hocko wrote:
>> On Wed 28-01-15 08:48:52, Chris Wilson wrote:
>> > On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
>> > > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
>> > >
>> > > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
>> > > mapcount:0 mapping:  (null) index:0x0
>> > > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
>> > > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
>> > > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
>> > > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
>> > > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
>> > > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
>>
>> I guess this matches the following bugon in your kernel:
>>         VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
>>
>> so the oldpage is on the LRU list already. I am completely unfamiliar
>> with 965GM but is the page perhaps shared with somebody with a different
>> gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
>> the other (racing) caller didn't need to move the page and put it on
>> LRU.
>
> Generally, yes. The shmemfs filp is exported through a vm_mmap() as well
> as pinned into the GPU via shmem_read_mapping_page_gfp(). But I would
> not expect that to be the case very often, if at all, on 965GM as the
> two access paths are incoherent. Still it sounds promising, hopefully
> Dave can put it into a fedora kernel for testing?

http://kojipkgs.fedoraproject.org/scratch/airlied/task_8760024/

done, also asked on the bug for testers.

Dave.


* Re: [Intel-gfx] memcontrol.c BUG
  2015-01-28 14:32   ` [Intel-gfx] " Michal Hocko
@ 2015-01-30  2:04       ` Hugh Dickins
  2015-01-30  2:04       ` Hugh Dickins
  1 sibling, 0 replies; 10+ messages in thread
From: Hugh Dickins @ 2015-01-30  2:04 UTC
  To: Michal Hocko
  Cc: Chris Wilson, Dave Airlie, Johannes Weiner, intel-gfx,
	Hugh Dickins, Tejun Heo, Vladimir Davydov, Jet Chen,
	Felipe Balbi, Andrew Morton, linux-mm

On Wed, 28 Jan 2015, Michal Hocko wrote:
> On Wed 28-01-15 08:48:52, Chris Wilson wrote:
> > On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> > > 
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> > > mapcount:0 mapping:  (null) index:0x0
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> > > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> > > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> > > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> > > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> > > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
> 
> I guess this matches the following bugon in your kernel:
>         VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
> 
> so the oldpage is on the LRU list already. I am completely unfamiliar
> with 965GM but is the page perhaps shared with somebody with a different
> gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
> the other (racing) caller didn't need to move the page and put it on
> LRU.

It would be surprising (but not impossible) for oldpage not to be on
the LRU already: it's a swapin readahead page that has every right to
be on LRU, but turns out to have been allocated from an unsuitable zone,
once we discover that it's needed in one of these odd hardware-limited
mappings.  (Whereas newpage is newly allocated and not yet on LRU.)
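
The state described here (oldpage already on the LRU, newpage not yet) and the check it trips can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: `struct model_page` and `model_migrate_would_bug()` are hypothetical names, and only the boolean condition is taken from the `VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))` line in the report; nothing else of mem_cgroup_migrate() is reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the LRU state is modelled. */
struct model_page {
	bool on_lru;
};

/*
 * Returns true when the reported assertion would fire, i.e. when the
 * caller passed lrucare == false although oldpage is on the LRU.
 */
static bool model_migrate_would_bug(const struct model_page *oldpage,
				    bool lrucare)
{
	return !lrucare && oldpage->on_lru;
}
```

For a swapin readahead page (`on_lru == true`), calling `model_migrate_would_bug(&page, false)` yields true, matching the reported BUG, while passing `true` for `lrucare` yields false. A page that is not on the LRU never trips the check, whatever the flag.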

> 
> If yes we need to tell shmem_replace_page to do the lrucare handling.

Absolutely, thanks Michal.  It would also be good to change the comment
on mem_cgroup_migrate() in mm/memcontrol.c, from "@lrucare: both pages..."
to "@lrucare: either or both pages..." - though I certainly won't pretend
that the corrected wording would have prevented this bug creeping in!

> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 339e06639956..e3cdc1a16c0f 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1013,7 +1013,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
>  		 */
>  		oldpage = newpage;
>  	} else {
> -		mem_cgroup_migrate(oldpage, newpage, false);
> +		mem_cgroup_migrate(oldpage, newpage, true);
>  		lru_cache_add_anon(newpage);
>  		*pagep = newpage;
>  	}

Acked-by: Hugh Dickins <hughd@google.com>



* [PATCH] memcg, shmem: fix shmem migration to use lrucare. (was: Re: [Intel-gfx] memcontrol.c BUG)
  2015-01-30  2:04       ` Hugh Dickins
@ 2015-02-02 15:00         ` Michal Hocko
  -1 siblings, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2015-02-02 15:00 UTC
  To: Hugh Dickins
  Cc: Chris Wilson, Dave Airlie, Johannes Weiner, intel-gfx, Tejun Heo,
	Vladimir Davydov, Jet Chen, Felipe Balbi, Andrew Morton,
	linux-mm

On Thu 29-01-15 18:04:15, Hugh Dickins wrote:
> On Wed, 28 Jan 2015, Michal Hocko wrote:
> > On Wed 28-01-15 08:48:52, Chris Wilson wrote:
> > > On Wed, Jan 28, 2015 at 08:13:06AM +1000, Dave Airlie wrote:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1165369
> > > > 
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page:f5e36a40 count:2
> > > > mapcount:0 mapping:  (null) index:0x0
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page flags:
> > > > 0x80090029(locked|uptodate|lru|swapcache|swapbacked)
> > > > Nov 18 09:23:22 elissa.gathman.org kernel: page dumped because:
> > > > VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage))
> > > > Nov 18 09:23:23 elissa.gathman.org kernel: ------------[ cut here ]------------
> > > > Nov 18 09:23:23 elissa.gathman.org kernel: kernel BUG at mm/memcontrol.c:6733!
> > 
> > I guess this matches the following bugon in your kernel:
> >         VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
> > 
> > so the oldpage is on the LRU list already. I am completely unfamiliar
> > with 965GM but is the page perhaps shared with somebody with a different
> > gfp mask requirement (e.g. userspace accessing the memory via mmap)? So
> > the other (racing) caller didn't need to move the page and put it on
> > LRU.
> 
> It would be surprising (but not impossible) for oldpage not to be on
> the LRU already: it's a swapin readahead page that has every right to
> be on LRU,

True, thanks for pointing this out.

> but turns out to have been allocated from an unsuitable zone,
> once we discover that it's needed in one of these odd hardware-limited
> mappings.  (Whereas newpage is newly allocated and not yet on LRU.)
> 
> > 
> > If yes we need to tell shmem_replace_page to do the lrucare handling.
> 
> Absolutely, thanks Michal.  It would also be good to change the comment
> on mem_cgroup_migrate() in mm/memcontrol.c, from "@lrucare: both pages..."
> to "@lrucare: either or both pages..." - though I certainly won't pretend
> that the corrected wording would have prevented this bug creeping in!

Yes, I have updated the wording.
 
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 339e06639956..e3cdc1a16c0f 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1013,7 +1013,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> >  		 */
> >  		oldpage = newpage;
> >  	} else {
> > -		mem_cgroup_migrate(oldpage, newpage, false);
> > +		mem_cgroup_migrate(oldpage, newpage, true);
> >  		lru_cache_add_anon(newpage);
> >  		*pagep = newpage;
> >  	}
> 
> Acked-by: Hugh Dickins <hughd@google.com>

Thanks! The full patch is below. I wasn't sure who was the one to report
the issue so I hope the credits are right. I have marked the patch for
stable because some people are running with VM debugging enabled. AFAICS
the issue is not so harmful without debugging on because the stale
oldpage would be removed from the LRU list eventually.
---


the issue is not so harmful without debugging on because the stale
oldpage would be removed from the LRU list eventually.
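
To make the failing condition concrete, here is a minimal user-space
sketch of the precondition that the VM_BUG_ON_PAGE above enforces. It is
a toy model, not the kernel code: struct toy_page and the toy_*
functions are hypothetical names invented for illustration, and the real
mem_cgroup_migrate() does considerably more (locking and LRU handling in
particular).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state discussed here. */
struct toy_page {
	bool on_lru;	/* models PageLRU(page) */
	int memcg_id;	/* models the charged memcg; 0 means uncharged */
};

/*
 * Models the condition in
 *   VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage);
 * Returns true when a CONFIG_DEBUG_VM kernel would hit the BUG.
 */
static bool toy_migrate_would_bug(const struct toy_page *oldpage, bool lrucare)
{
	return !lrucare && oldpage->on_lru;
}

/*
 * Simplified charge transfer (an assumption, not the kernel logic).
 * Returns -1 where the debug assertion would fire, 0 otherwise.
 */
static int toy_mem_cgroup_migrate(struct toy_page *oldpage,
				  struct toy_page *newpage, bool lrucare)
{
	if (toy_migrate_would_bug(oldpage, lrucare))
		return -1;

	/* Move the charge from the old page to the new one. */
	newpage->memcg_id = oldpage->memcg_id;
	oldpage->memcg_id = 0;
	return 0;
}
```

With an oldpage that models a swapin readahead page (on_lru set) and a
freshly allocated newpage (on_lru clear), the call with lrucare == false
returns -1 in this model, which corresponds to the reported BUG, while
the call with lrucare == true transfers the charge.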
---
From 508815bfdaae75e3286ab2dd714a07201665709c Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Mon, 2 Feb 2015 15:22:19 +0100
Subject: [PATCH] memcg, shmem: fix shmem migration to use lrucare.

It has been reported that 965GM might trigger

VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage)

in mem_cgroup_migrate when shmem wants to replace a swap cache page
because of shmem_should_replace_page (the page is allocated from an
inappropriate zone). shmem_replace_page expects that the oldpage is not
on LRU list and calls mem_cgroup_migrate without lrucare. This is obviously
incorrect because swapcache pages might be on the LRU list (e.g. swapin
readahead page).

Fix this by enabling lrucare for the migration in shmem_replace_page.
Also clarify that lrucare should be used even if one of the pages might
be on LRU list.

The BUG_ON will trigger only when CONFIG_DEBUG_VM is enabled but even
without that the migration code might leave the old page on an
inappropriate memcg's LRU which is not that critical because the page
would get removed with its last reference but it is still confusing.

Fixes: 0a31bc97c80c (mm: memcontrol: rewrite uncharge API)
Cc: stable@vger.kernel.org # 3.17+
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Dave Airlie <airlied@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 2 +-
 mm/shmem.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7ce5aa24bc19..c5ac0e209868 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5743,7 +5743,7 @@ void mem_cgroup_uncharge_list(struct list_head *page_list)
  * mem_cgroup_migrate - migrate a charge to another page
  * @oldpage: currently charged page
  * @newpage: page to transfer the charge to
- * @lrucare: both pages might be on the LRU already
+ * @lrucare: either or both pages might be on the LRU already
  *
  * Migrate the charge from @oldpage to @newpage.
  *
diff --git a/mm/shmem.c b/mm/shmem.c
index 339e06639956..e3cdc1a16c0f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1013,7 +1013,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
 		 */
 		oldpage = newpage;
 	} else {
-		mem_cgroup_migrate(oldpage, newpage, false);
+		mem_cgroup_migrate(oldpage, newpage, true);
 		lru_cache_add_anon(newpage);
 		*pagep = newpage;
 	}
-- 
2.1.4

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH] memcg, shmem: fix shmem migration to use lrucare. (was: Re: [Intel-gfx] memcontrol.c BUG)
  2015-02-02 15:00         ` Michal Hocko
  (?)
@ 2015-02-02 16:18         ` Johannes Weiner
  -1 siblings, 0 replies; 10+ messages in thread
From: Johannes Weiner @ 2015-02-02 16:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hugh Dickins, Chris Wilson, Dave Airlie, intel-gfx, Tejun Heo,
	Vladimir Davydov, Jet Chen, Felipe Balbi, Andrew Morton,
	linux-mm

On Mon, Feb 02, 2015 at 04:00:51PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.cz>
> Date: Mon, 2 Feb 2015 15:22:19 +0100
> Subject: [PATCH] memcg, shmem: fix shmem migration to use lrucare.
> 
> It has been reported that 965GM might trigger
> 
> VM_BUG_ON_PAGE(!lrucare && PageLRU(oldpage), oldpage)
> 
> in mem_cgroup_migrate when shmem wants to replace a swap cache page
> because of shmem_should_replace_page (the page is allocated from an
> inappropriate zone). shmem_replace_page expects that the oldpage is not
> on LRU list and calls mem_cgroup_migrate without lrucare. This is obviously
> incorrect because swapcache pages might be on the LRU list (e.g. swapin
> readahead page).
> 
> Fix this by enabling lrucare for the migration in shmem_replace_page.
> Also clarify that lrucare should be used even if one of the pages might
> be on LRU list.
> 
> The BUG_ON will trigger only when CONFIG_DEBUG_VM is enabled but even
> without that the migration code might leave the old page on an
> inappropriate memcg's LRU which is not that critical because the page
> would get removed with its last reference but it is still confusing.
> 
> Fixes: 0a31bc97c80c (mm: memcontrol: rewrite uncharge API)
> Cc: stable@vger.kernel.org # 3.17+
> Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reported-by: Dave Airlie <airlied@gmail.com>
> Acked-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks, Michal.

Thread overview: 10+ messages
2015-01-27 22:13 memcontrol.c BUG Dave Airlie
2015-01-28  8:48 ` Chris Wilson
2015-01-28 14:32   ` [Intel-gfx] " Michal Hocko
2015-01-29  8:16     ` Chris Wilson
2015-01-29 23:26       ` Dave Airlie
2015-01-30  2:04     ` Hugh Dickins
2015-01-30  2:04       ` Hugh Dickins
2015-02-02 15:00       ` [PATCH] memcg, shmem: fix shmem migration to use lrucare. (was: Re: [Intel-gfx] memcontrol.c BUG) Michal Hocko
2015-02-02 15:00         ` Michal Hocko
2015-02-02 16:18         ` Johannes Weiner
