linux-kernel.vger.kernel.org archive mirror
From: "Huang, Ying" <ying.huang@intel.com>
To: Bharata B Rao <bharata@amd.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Zi Yan <ziy@nvidia.com>, Yang Shi <shy828301@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Oscar Salvador <osalvador@suse.de>,
	"Matthew Wilcox" <willy@infradead.org>
Subject: Re: [RFC 0/6] migrate_pages(): batch TLB flushing
Date: Fri, 23 Sep 2022 15:52:52 +0800
Message-ID: <87bkr6jzmz.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <477e50ab-9045-0ca2-6979-e2dca71be263@amd.com> (Bharata B. Rao's message of "Thu, 22 Sep 2022 18:20:28 +0530")

Bharata B Rao <bharata@amd.com> writes:

> On 9/21/2022 11:36 AM, Huang Ying wrote:
>> From: "Huang, Ying" <ying.huang@intel.com>
>> 
>> Currently, migrate_pages() migrates pages one by one, as in the
>> following pseudocode:
>> 
>>   for each page
>>     unmap
>>     flush TLB
>>     copy
>>     restore map
>> 
>> If multiple pages are passed to migrate_pages(), there are
>> opportunities to batch the TLB flushing and copying.  That is, we can
>> change the code to something like the following:
>> 
>>   for each page
>>     unmap
>>   for each page
>>     flush TLB
>>   for each page
>>     copy
>>   for each page
>>     restore map
>> 
>> The total number of TLB flushing IPIs can be reduced considerably.  And
>> we may use a hardware accelerator such as DSA to accelerate the
>> page copying.
>> 
>> So in this patch, we refactor the migrate_pages() implementation and
>> implement the TLB flushing batching.  Based on this, hardware
>> accelerated page copying can be implemented.
>> 
>> If too many pages are passed to migrate_pages(), the naive batched
>> implementation may unmap too many pages at the same time.  A task is
>> then more likely to have to wait for the migrated pages to be mapped
>> again, so latency may suffer.  To deal with this issue, the maximum
>> number of pages unmapped in one batch is restricted to no more than
>> HPAGE_PMD_NR.  That is, the impact is at the same level as THP
>> migration.
>
> Thanks for the patchset. I find it hitting the following BUG() when
> running mmtests/autonumabench:
>
> kernel BUG at mm/migrate.c:2432!
> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 7150 Comm: numa01 Not tainted 6.0.0-rc5+ #171
> Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.5.6 10/06/2021
> RIP: 0010:migrate_misplaced_page+0x670/0x830 
> Code: 36 48 8b 3c c5 e0 7a 19 8d e8 dc 10 f7 ff 4c 89 e7 e8 f4 43 f5 ff 8b 55 bc 85 d2 75 6f 48 8b 45 c0 4c 39 e8 0f 84 b0 fb ff ff <0f> 0b 48 8b 7d 90 e9 ec fc ff ff 48 83 e8 01 e9 48 fa ff ff 48 83
> RSP: 0000:ffffb1b29ec3fd38 EFLAGS: 00010202
> RAX: ffffe946460f8248 RBX: 0000000000000001 RCX: ffffe946460f8248
> RDX: 0000000000000000 RSI: ffffe946460f8248 RDI: ffffb1b29ec3fce0
> RBP: ffffb1b29ec3fda8 R08: 0000000000000000 R09: 0000000000000005
> R10: 0000000000000001 R11: 0000000000000000 R12: ffffe946460f8240
> R13: ffffb1b29ec3fd68 R14: 0000000000000001 R15: ffff9698beed5000
> FS:  00007fcc31fee640(0000) GS:ffff9697b0000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fcc3a3a5000 CR3: 000000016e89c002 CR4: 0000000000770ee0
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  __handle_mm_fault+0xb87/0xff0
>  handle_mm_fault+0x126/0x3c0
>  do_user_addr_fault+0x1ed/0x690
>  exc_page_fault+0x84/0x2c0
>  asm_exc_page_fault+0x27/0x30 
> RIP: 0033:0x7fccfa1a1180
> Code: 81 fa 80 00 00 00 76 d2 c5 fe 7f 40 40 c5 fe 7f 40 60 48 83 c7 80 48 81 fa 00 01 00 00 76 2b 48 8d 90 80 00 00 00 48 83 e2 c0 <c5> fd 7f 02 c5 fd 7f 42 20 c5 fd 7f 42 40 c5 fd 7f 42 60 48 83 ea
> RSP: 002b:00007fcc31fede38 EFLAGS: 00010283
> RAX: 00007fcc39fff010 RBX: 000000000000002c RCX: 00007fccfa11ea3d
> RDX: 00007fcc3a3a5000 RSI: 0000000000000000 RDI: 00007fccf9ffef90
> RBP: 00007fcc39fff010 R08: 00007fcc31fee640 R09: 00007fcc31fee640
> R10: 00007ffdecef614f R11: 0000000000000246 R12: 00000000c0000000
> R13: 0000000000000000 R14: 00007fccfa094850 R15: 00007ffdecef6190
>
> This is BUG_ON(!list_empty(&migratepages)) in migrate_misplaced_page().

Thank you very much for reporting!  I haven't reproduced this yet.  But
I will pay special attention to it when developing the next version,
even if I ultimately cannot reproduce it.

Best Regards,
Huang, Ying
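
For readers following the thread, the two loop structures quoted from
the cover letter can be compared with a small user-space model.  This
is only a sketch and not the kernel implementation: struct sim_stats,
migrate_one_by_one(), migrate_batched() and BATCH_MAX are hypothetical
names, nothing here touches real page tables, and it assumes that the
TLB flushes of one batch collapse into a single flush operation.
BATCH_MAX stands in for HPAGE_PMD_NR and is set to 512, the value on
x86-64 with 4 KiB pages.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for HPAGE_PMD_NR (512 on x86-64 with 4 KiB pages). */
#define BATCH_MAX 512

/* Hypothetical counters; the model only counts operations. */
struct sim_stats {
	size_t unmapped;
	size_t tlb_flushes;
	size_t copied;
	size_t remapped;
};

/* Current scheme: unmap, flush, copy and restore the map page by page. */
static struct sim_stats migrate_one_by_one(size_t nr_pages)
{
	struct sim_stats s = {0};

	for (size_t i = 0; i < nr_pages; i++) {
		s.unmapped++;
		s.tlb_flushes++;	/* one flush per page */
		s.copied++;
		s.remapped++;
	}
	return s;
}

/*
 * Batched scheme: run each phase over a chunk of at most BATCH_MAX
 * pages.  Assumption of this model: one flush covers the whole chunk.
 */
static struct sim_stats migrate_batched(size_t nr_pages)
{
	struct sim_stats s = {0};
	size_t done = 0;

	while (done < nr_pages) {
		size_t chunk = nr_pages - done;

		if (chunk > BATCH_MAX)
			chunk = BATCH_MAX;

		for (size_t i = 0; i < chunk; i++)
			s.unmapped++;
		s.tlb_flushes++;	/* single flush for the chunk */
		for (size_t i = 0; i < chunk; i++)
			s.copied++;
		for (size_t i = 0; i < chunk; i++)
			s.remapped++;

		done += chunk;
	}

	/* Every page must have gone through every phase. */
	assert(s.unmapped == nr_pages && s.copied == nr_pages &&
	       s.remapped == nr_pages);
	return s;
}
```

In this model, migrating 1000 pages costs 1000 flushes one by one but
only 2 when batched (chunks of 512 and 488 pages).  The cap also bounds
how many pages are unmapped at the same time, which is the latency
concern raised in the cover letter.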
