All of lore.kernel.org
 help / color / mirror / Atom feed
From: yuyufen <yuyufen@huawei.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
	Michal Hocko <mhocko@kernel.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH] hugetlbfs: move resv_map to hugetlbfs_inode_info
Date: Tue, 16 Apr 2019 20:57:08 +0800	[thread overview]
Message-ID: <a08b3f2a-e7a6-6f1a-7800-1939ad3316c3@huawei.com> (raw)
In-Reply-To: <f063c3e7-1b37-7592-14c2-78b494dbd825@oracle.com>



On 2019/4/16 1:11, Mike Kravetz wrote:
> On 4/15/19 2:15 AM, Michal Hocko wrote:
>> On Mon 15-04-19 06:16:15, Naoya Horiguchi wrote:
>>> On Fri, Apr 12, 2019 at 04:40:01PM -0700, Mike Kravetz wrote:
>>>> On 4/11/19 9:02 PM, Yufen Yu wrote:
>>>>> Commit 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map")
>>>> ...
>>>>> However, for inode mode that is 'S_ISBLK', hugetlbfs_evict_inode() may
>>>>> free or modify i_mapping->private_data that is owned by bdev inode,
>>>>> which is not expected!
>>>> ...
>>>>> We fix the problem by moving resv_map to hugetlbfs_inode_info. It may
>>>>> be more reasonable.
>>>> Your patches force me to consider these potential issues.  Thank you!
>>>>
>>>> The root of all these problems (including the original leak) is that the
>>>> open of a block special inode will result in bd_acquire() overwriting the
>>>> value of inode->i_mapping.  Since hugetlbfs inodes normally contain a
>>>> resv_map at inode->i_mapping->private_data, a memory leak occurs if we do
>>>> not free the initially allocated resv_map.  In addition, when the
>>>> inode is evicted/destroyed inode->i_mapping may point to an address space
>>>> not associated with the hugetlbfs inode.  If code assumes inode->i_mapping
>>>> points to hugetlbfs inode address space at evict time, there may be bad
>>>> data references or worse.
>>> Let me ask a kind of elementary question: is there any good reason/purpose
>>> to create and use block special files on hugetlbfs?  I never heard about
>>> such usecases.
> I am not aware of this as a common use case.  Yufen Yu may be able to provide
> more details about how the issue was discovered.  My guess is that it was
> discovered via code inspection.

In fact, we discover the issue by running syzkaller. The program like:

15:39:59 executing program 0:
r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0/file0\x00', 
0x44000, 0x1)
r1 = syz_open_dev$vcsn(&(0x7f00000000c0)='/dev/vcs#\x00', 0x3f, 0x202000)
renameat2(r0, &(0x7f0000000140)='./file0\x00', r0, 
&(0x7f0000000180)='./file0/file0/file0\x00', 0x4)
mkdir(&(0x7f0000000300)='./file0\x00', 0x0)
mount(0x0, &(0x7f0000000200)='./file0\x00', 
&(0x7f0000000240)='hugetlbfs\x00', 0x0, 0x0)
mknod$loop(&(0x7f0000000000)='./file0/file0\x00', 0x6000, 
0xffffffffffffffff)

Yufen
Thanks

>
>>>                  I guess that the conflict of the usage of ->i_mapping is
>>> discovered recently and that's because block special files on hugetlbfs are
>>> just not considered until recently or well defined.  So I think that we might
>>> be better to begin with defining it first.
> Unless I am mistaken, this is just like creating a device special file
> in any other filesystem.  Correct?  hugetlbfs is just some place for the
> inode/file to reside.  What happens when you open/ioctl/close/etc the file
> is really dependent on the vfs layer and underlying driver.
>
>> A absolutely agree. Hugetlbfs is overly complicated even without that.
>> So if this is merely "we have tried it and it has blown up" kinda thing
>> then just refuse the create blockdev files or document it as undefined.
>> You need a root to do so anyway.
> Can we just refuse to create device special files in hugetlbfs?  Do we need
> to worry about breaking any potential users?  I honestly do not know if anyone
> does this today.  However, if they did I believe things would "just work".
> The only known issue is leaking a resv_map structure when the inode is
> destroyed.  I doubt anyone would notice that leak today.
>
> Let me do a little more research.  I think this can all be cleaned up by
> making hugetlbfs always operate on the address space embedded in the inode.
> If nothing else, a change or explanation should be added as to why most code
> operates on inode->mapping and one place operates on &inode->i_data.



      parent reply	other threads:[~2019-04-16 12:57 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-12  4:02 [PATCH] hugetlbfs: move resv_map to hugetlbfs_inode_info Yufen Yu
2019-04-12 23:40 ` Mike Kravetz
2019-04-13 11:57   ` yuyufen
2019-04-15  6:16   ` Naoya Horiguchi
2019-04-15  9:15     ` Michal Hocko
2019-04-15 17:11       ` Mike Kravetz
2019-04-15 23:59         ` Naoya Horiguchi
2019-04-16  0:37           ` Mike Kravetz
2019-04-16  6:50         ` Michal Hocko
2019-04-19 20:44           ` [PATCH] hugetlbfs: always use address space in inode for resv_map pointer Mike Kravetz
2019-05-08  7:10             ` yuyufen
2019-05-08 20:16               ` Mike Kravetz
2019-05-09 23:11                 ` Andrew Morton
2019-05-09 23:32                   ` Mike Kravetz
2019-04-16 12:57         ` yuyufen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a08b3f2a-e7a6-6f1a-7800-1939ad3316c3@huawei.com \
    --to=yuyufen@huawei.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.