Date: Mon, 21 Mar 2022 17:34:18 +0100
Subject: Re: VM_BUG_ON(!tlb->end) on munmap() with CONT hugetlb pages
To: Mike Kravetz, "linux-mm@kvack.org"
Cc: anshuman.khandual@arm.com, Will Deacon, "Aneesh Kumar K . V", 'Catalin Marinas', Peter Zijlstra
References: <811c5c8e-b3a2-85d2-049c-717f17c3a03a@redhat.com> <993f1258-6550-e5d7-1e6f-72e2a24b60f0@oracle.com> <3ba18a1d-d5d8-558f-9576-8119c210e98a@oracle.com>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <3ba18a1d-d5d8-558f-9576-8119c210e98a@oracle.com>

On 08.03.22 00:06, Mike Kravetz wrote:
> On 2/28/22 16:26, Mike Kravetz wrote:
>> On 2/28/22 07:39, David Hildenbrand wrote:
>>> Hi,
>>>
>>> playing with anonymous CONT hugetlb pages on aarch64, I stumbled over the following VM_BUG_ON:
>>>
>>> [ 124.770288] ------------[ cut here ]------------
>>> [ 124.774899] kernel BUG at mm/mmu_gather.c:70!
>>> [ 124.779244] Internal error: Oops - BUG: 0 [#1] SMP
>>> [ 124.784022] Modules linked in: mlx4_ib ib_uverbs ib_core mlx4_en rfkill vfat fat acpi_ipmi joydev ipmi_ssif igb mlx4_core ipmi_devintf ipmi_msghandler cppc_cpufreq fuse zram ip_tables xfs uas usb_storage dwc3 ulpi ast udc_core i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm crct10dif_ce drm ghash_ce sbsa_gwdt i2c_xgene_slimpro xgene_hwmon ahci_platform gpio_dwapb xhci_plat_hcd
>>> [ 124.823045] CPU: 16 PID: 1160 Comm: test Not tainted 5.16.11-200.fc35.aarch64 #1
>>> [ 124.830428] Hardware name: Lenovo HR350A 7X35CTO1WW /HR350A , BIOS hve104r-1.15 02/26/2021
>>> [ 124.840240] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [ 124.847189] pc : __tlb_remove_page_size+0x88/0xe4
>>> [ 124.851885] lr : __unmap_hugepage_range+0x260/0x504
>>> [ 124.856751] sp : ffff80000f6f3ae0
>>> [ 124.860053] x29: ffff80000f6f3ae0 x28: ffff00080b639d24 x27: ffff000802504080
>>> [ 124.867176] x26: fffffc00210f8000 x25: 0000000000000000 x24: ffff80000a9e8750
>>> [ 124.874299] x23: 0000ffff8da20000 x22: ffff000804f0c190 x21: 0000000000010000
>>> [ 124.881423] x20: ffff80000f6f3cb0 x19: ffff80000f6f3cb0 x18: 0000000000000000
>>> [ 124.888545] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>> [ 124.895668] x14: 0000000000000000 x13: 0008000000000000 x12: 0008000000000080
>>> [ 124.902791] x11: 0008000000000000 x10: 00f80008c3e00f43 x9 : ffff800008404e60
>>> [ 124.909914] x8 : 0846000000000000 x7 : 0000000000000000 x6 : ffff80000a8a4000
>>> [ 124.917038] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000001000
>>> [ 124.924161] x2 : 0000000000010000 x1 : fffffc00210f8000 x0 : 0000000000000000
>>> [ 124.931284] Call trace:
>>> [ 124.933718] __tlb_remove_page_size+0x88/0xe4
>>> [ 124.938062] __unmap_hugepage_range+0x260/0x504
>>> [ 124.942580] __unmap_hugepage_range_final+0x24/0x40
>>> [ 124.947445] unmap_single_vma+0x100/0x11c
>>> [ 124.951443] unmap_vmas+0x7c/0xf4
>>> [ 124.954746] unmap_region+0xa4/0xf0
>>> [ 124.958222] __do_munmap+0x1b8/0x50c
>>> [ 124.961785] __vm_munmap+0x74/0x120
>>> [ 124.965261] __arm64_sys_munmap+0x40/0x54
>>> [ 124.969257] invoke_syscall+0x50/0x120
>>> [ 124.972995] el0_svc_common.constprop.0+0x4c/0x100
>>> [ 124.977774] do_el0_svc+0x34/0xa0
>>> [ 124.981077] el0_svc+0x30/0xd0
>>> [ 124.984120] el0t_64_sync_handler+0xa4/0x130
>>> [ 124.988377] el0t_64_sync+0x1a4/0x1a8
>>> [ 124.992028] Code: b4000140 f9001660 29410402 17fffff4 (d4210000)
>>> [ 124.998109] ---[ end trace a74a76b89c9f2d88 ]---
>>> [ 125.002900] ------------[ cut here ]------------
>>>
>>>
>>> I'm running with 64k hugetlb on 4k aarch64. Reproducer:
>>>
>>> #define _GNU_SOURCE
>>> #include
>>> #include
>>> #include
>>> #include
>>>
>>> void main(void)
>>> {
>>> 	const size_t size = 64*1024;
>>> 	unsigned long cur;
>>> 	char *area;
>>> 	int fd;
>>>
>>> 	fd = memfd_create("test", MFD_HUGETLB | MFD_HUGE_64KB);
>>> 	ftruncate(fd, size);
>>> 	area = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
>>>
>>> 	memset(area, 0, size);
>>>
>>> 	munmap(area, size);
>>> }
>>>
>>>
>>>
>>> I assume __unmap_hugepage_range() does a
>>>
>>> a) tlb_remove_huge_tlb_entry()
>>>
>>>    -> for sz != PMD_SIZE and sz != PUD_SIZE, this calls __tlb_remove_tlb_entry()
>>>
>>>    -> __tlb_remove_tlb_entry() is a NOP on aarch64. __tlb_adjust_range() isn't called.
>>>
>>> b) tlb_remove_page_size()
>>>
>>>    -> __tlb_remove_page_size() runs into VM_BUG_ON(!tlb->end);
>>>
>>>
>>> Not sure if this is just "ok" and we don't have to adjust the range or if there is
>>> some tlb range adjustment missing.
>>>
>>
>> To me, it looks like we are missing range adjustment in the case where
>> hugetlb page size != PMD_SIZE and != PUD_SIZE. Not sure how those ranges
>> are being flushed because as you note tlb_remove_huge_tlb_entry is pretty
>> much a NOP in this case on aarch64.
>>
>> Cc'ing Will and Peter as they most recently changed this code. Commit
>> 2631ed00b049 "tlb: mmu_gather: add tlb_flush_*_range APIs" removed an
>> unconditional call to __tlb_adjust_range() in tlb_remove_huge_tlb_entry.
>> That might have taken care of range adjustments in earlier versions of
>> the code. Not exactly sure what is needed now.
>
> I verified that commit 2631ed00b049 caused the VM_BUG when it removed the
> unconditional call to __tlb_adjust_range(). However, I need some assistance
> on the proper solution.
>
> Just adding the __tlb_adjust_range() call to tlb_remove_huge_tlb_entry in
> the case where size != PMD_SIZE and != PUD_SIZE will silence the BUG.
> However, one outcome of 2631ed00b049 is that cleared_p* is set if
> __tlb_adjust_range is ever called.
>
> It 'seems' that tlb_flush_pte_range() should be called for the CONT PTE case
> on arm64, and tlb_flush_pmd_range() should be called for CONT PMD. But, this
> would require an arch specific version of tlb_remove_huge_tlb_entry.
>
> FYI - This same issue should exist on other architectures that support
> hugetlb page sizes != PMD_SIZE and != PUD_SIZE.
>
> Suggestions on how to proceed?

Unfortunately, I have absolutely no clue what would be the right thing to do. Any aarch64 CONT experts?

-- 
Thanks,

David / dhildenb