linux-kernel.vger.kernel.org archive mirror
From: Jerome Marchand <jmarchan@redhat.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Steve Capper <steve.capper@linaro.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	Sasha Levin <sasha.levin@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv6 00/36] THP refcounting redesign
Date: Tue, 16 Jun 2015 15:17:13 +0200
Message-ID: <558021D9.4050304@redhat.com>
In-Reply-To: <1433351167-125878-1-git-send-email-kirill.shutemov@linux.intel.com>

On 06/03/2015 07:05 PM, Kirill A. Shutemov wrote:
> Hello everybody,
> 
> Here's a new revision of the refcounting patchset. Please review and
> consider applying.
> 
> The goal of the patchset is to make refcounting on THP pages cheaper,
> with simpler semantics, and to allow the same THP compound page to be
> mapped with PMD and PTEs. This is required to get a reasonable
> THP-pagecache implementation.
> 
> With the new refcounting design it's much easier to protect against
> split_huge_page(): a simple reference on a page is enough.
> It makes the gup_fast() implementation simpler and doesn't require a
> special case in the futex code to handle tail THP pages.
> 
> It should improve THP utilization across the system, since splitting a
> THP in one process doesn't necessarily lead to splitting the page in all
> other processes that have the page mapped.
> 
> The patchset drastically lowers the complexity of the
> get_page()/put_page() codepaths. I encourage people to look at this code
> before and after to justify the time budget for reviewing this patchset.
> 
> = Changelog =
> 
> v6:
>   - rebase to since-4.0;
>   - optimize mapcount handling: significantly reduce overhead for most
>     common cases;
>   - split pages on migrate_pages();
>   - remove infrastructure for handling splitting PMDs on all architectures;
>   - fix page_mapcount() for hugetlb pages;
> 

Hi Kirill,

I ran some LTP mm tests, and the hugemmap tests trigger the following:

[  438.749457] page:ffffea0000df8000 count:2 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
[  438.750089] flags: 0x3ffc0000004001(locked|head)
[  438.750089] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
[  438.750089] ------------[ cut here ]------------
[  438.768046] kernel BUG at mm/filemap.c:205!
[  438.768046] invalid opcode: 0000 [#1] SMP 
[  438.768046] Modules linked in: loop ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ppdev iosf_mbi crct10dif_pclmul crc32_pclmul crc32c_intel joydev ghash_clmulni_intel virtio_balloon pcspkr virtio_console nfsd parport_pc parport floppy pvpanic i2c_piix4 acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc virtio_net qxl virtio_blk drm_kms_helper ttm drm serio_raw ata_generic virtio_pci virtio_ring virtio pata_acpi
[  438.768046] CPU: 1 PID: 12918 Comm: hugemmap01 Not tainted 4.0.0thprfc-kasv6+ #247
[  438.768046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  438.768046] task: ffff88007b09cc40 ti: ffff880077b88000 task.ti: ffff880077b88000
[  438.768046] RIP: 0010:[<ffffffff811e2aac>]  [<ffffffff811e2aac>] __delete_from_page_cache+0x4bc/0x5a0
[  438.768046] RSP: 0018:ffff880077b8bc58  EFLAGS: 00010086
[  438.768046] RAX: 0000000000000036 RBX: ffffea0000df8000 RCX: 0000000000000006
[  438.768046] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007d5ce9c0
[  438.768046] RBP: ffff880077b8bcb8 R08: 0000000000000001 R09: 0000000000000001
[  438.768046] R10: 0000000000000001 R11: ffff880034e44210 R12: ffffea0000df8000
[  438.768046] R13: ffff88003562cac0 R14: 0000000000000000 R15: ffff88003562cac8
[  438.768046] FS:  00007fda9ccbb700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
[  438.768046] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  438.768046] CR2: 00007fda9ccc7000 CR3: 00000000785e6000 CR4: 00000000001407e0
[  438.768046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  438.768046] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  438.768046] Stack:
[  438.768046]  0000000000000246 ffff88003562cad8 ffff88003562caf0 0000000000000000
[  438.768046]  ffff88003562cad0 000000009bfc6d69 ffff880077b8bcb8 ffffea0000df8000
[  438.768046]  ffff88003562cad8 0000000000000000 ffffea0000df8000 0000000000000000
[  438.768046] Call Trace:
[  438.768046]  [<ffffffff811e2be5>] delete_from_page_cache+0x55/0xd0
[  438.768046]  [<ffffffff81380be5>] truncate_hugepages+0x135/0x290
[  438.768046]  [<ffffffff810e7df5>] ? local_clock+0x15/0x30
[  438.768046]  [<ffffffff8110647f>] ? lock_release_holdtime.part.31+0xf/0x190
[  438.768046]  [<ffffffff81380eb8>] hugetlbfs_evict_inode+0x18/0x40
[  438.768046]  [<ffffffff812982bb>] evict+0xab/0x180
[  438.768046]  [<ffffffff81298cee>] iput+0x1ce/0x390
[  438.768046]  [<ffffffff8128aba9>] do_unlinkat+0x209/0x330
[  438.768046]  [<ffffffff81884632>] ? ret_from_sys_call+0x24/0x5f
[  438.768046]  [<ffffffff811095ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[  438.768046]  [<ffffffff8128bf66>] SyS_unlink+0x16/0x20
[  438.768046]  [<ffffffff81884609>] system_call_fastpath+0x12/0x17
[  438.768046] Code: 49 8b 14 24 4c 89 e0 80 e6 80 74 08 4c 89 e7 e8 15 2e 69 00 8b 40 48 83 c0 01 74 25 48 c7 c6 28 fb c6 81 48 89 df e8 d4 43 03 00 <0f> 0b 48 89 df e8 f4 2d 69 00 48 f7 00 00 c0 00 00 49 89 c4 75 
[  438.768046] RIP  [<ffffffff811e2aac>] __delete_from_page_cache+0x4bc/0x5a0
[  438.768046]  RSP <ffff880077b8bc58>
[  438.768046] ---[ end trace 3903188dcb3f3d48 ]---
[  438.768046] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
[  438.768046] in_atomic(): 1, irqs_disabled(): 1, pid: 12918, name: hugemmap01
[  438.768046] INFO: lockdep is turned off.
[  438.768046] irq event stamp: 6218
[  438.768046] hardirqs last  enabled at (6217): [<ffffffff818812df>] __mutex_unlock_slowpath+0xbf/0x190
[  438.768046] hardirqs last disabled at (6218): [<ffffffff8188387f>] _raw_spin_lock_irq+0x1f/0x80
[  438.768046] softirqs last  enabled at (6042): [<ffffffff810b0df7>] __do_softirq+0x377/0x670
[  438.768046] softirqs last disabled at (6027): [<ffffffff810b14ad>] irq_exit+0x11d/0x130
[  438.768046] CPU: 1 PID: 12918 Comm: hugemmap01 Tainted: G      D         4.0.0thprfc-kasv6+ #247
[  438.768046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  438.768046]  0000000000000000 000000009bfc6d69 ffff880077b8b8a8 ffffffff81879afa
[  438.768046]  0000000000000000 ffff88007b09cc40 ffff880077b8b8d8 ffffffff810da0cc
[  438.768046]  0000000000000000 ffffffff81c68746 0000000000000029 0000000000000000
[  438.768046] Call Trace:
[  438.768046]  [<ffffffff81879afa>] dump_stack+0x4c/0x65
[  438.768046]  [<ffffffff810da0cc>] ___might_sleep+0x18c/0x250
[  438.768046]  [<ffffffff810da1dd>] __might_sleep+0x4d/0x90
[  438.768046]  [<ffffffff8188163a>] down_read+0x2a/0xa0
[  438.768046]  [<ffffffff810be6c3>] exit_signals+0x33/0x150
[  438.768046]  [<ffffffff810adc2f>] do_exit+0xcf/0xd20
[  438.768046]  [<ffffffff81121006>] ? kmsg_dump+0x166/0x220
[  438.768046]  [<ffffffff81120ed4>] ? kmsg_dump+0x34/0x220
[  438.768046]  [<ffffffff81021cce>] oops_end+0x9e/0xe0
[  438.768046]  [<ffffffff8102224b>] die+0x4b/0x70
[  438.768046]  [<ffffffff8101df80>] do_trap+0xb0/0x150
[  438.768046]  [<ffffffff8101e2f4>] do_error_trap+0xa4/0x180
[  438.768046]  [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[  438.768046]  [<ffffffff81120255>] ? vprintk_emit+0x285/0x620
[  438.768046]  [<ffffffff81435b9d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[  438.768046]  [<ffffffff8101ee90>] do_invalid_op+0x20/0x30
[  438.768046]  [<ffffffff818860de>] invalid_op+0x1e/0x30
[  438.768046]  [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[  438.768046]  [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[  438.768046]  [<ffffffff811e2be5>] delete_from_page_cache+0x55/0xd0
[  438.768046]  [<ffffffff81380be5>] truncate_hugepages+0x135/0x290
[  438.768046]  [<ffffffff810e7df5>] ? local_clock+0x15/0x30
[  438.768046]  [<ffffffff8110647f>] ? lock_release_holdtime.part.31+0xf/0x190
[  438.768046]  [<ffffffff81380eb8>] hugetlbfs_evict_inode+0x18/0x40
[  438.768046]  [<ffffffff812982bb>] evict+0xab/0x180
[  438.768046]  [<ffffffff81298cee>] iput+0x1ce/0x390
[  438.768046]  [<ffffffff8128aba9>] do_unlinkat+0x209/0x330
[  438.768046]  [<ffffffff81884632>] ? ret_from_sys_call+0x24/0x5f
[  438.768046]  [<ffffffff811095ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[  438.768046]  [<ffffffff8128bf66>] SyS_unlink+0x16/0x20
[  438.768046]  [<ffffffff81884609>] system_call_fastpath+0x12/0x17
[  438.768046] note: hugemmap01[12918] exited with preempt_count 1
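
A note for reading the page dump at the top of the log: mainline stores the per-page `_mapcount` field starting at -1, and page_mapcount() reports the stored value plus one. Assuming the dump prints page_mapcount(), `mapcount:0` there corresponds to a raw field value of -1. Below is a minimal userspace sketch of that biased encoding. The `toy_*` names are invented for illustration and are not kernel APIs, and the sketch is not a diagnosis of the BUG above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
	int mapcount_raw;	/* stored biased by -1: -1 means "no mappings" */
};

/* A freshly set up page has no mappings, so the raw value is -1. */
static void toy_page_init(struct toy_page *p)
{
	p->mapcount_raw = -1;
}

/* Reported mapcount: raw value plus one. */
static int toy_page_mapcount(const struct toy_page *p)
{
	return p->mapcount_raw + 1;
}

/* "Mapped" means at least one mapping, i.e. raw value >= 0. */
static bool toy_page_mapped(const struct toy_page *p)
{
	return p->mapcount_raw >= 0;
}

/* Account one new mapping. */
static void toy_page_map(struct toy_page *p)
{
	p->mapcount_raw++;
}

/* Drop one mapping; unmapping an unmapped page is a caller bug. */
static void toy_page_unmap(struct toy_page *p)
{
	assert(p->mapcount_raw >= 0);
	p->mapcount_raw--;
}
```

With this encoding, a counter that was never initialised to -1 (for example, left at 0 in zeroed memory) would read as "mapped once" even though nothing maps the page.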

Jerome


