From: Dave Hansen <dave.hansen@intel.com>
To: Prakash Sangappa <prakash.sangappa@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH RFC] hugetlbfs 'noautofill' mount option
Date: Mon, 8 May 2017 08:58:05 -0700
Message-ID: <48a544c4-61b3-acaf-0386-649f073602b6@intel.com>
In-Reply-To: <7677d20e-5d53-1fb7-5dac-425edda70b7b@oracle.com>

On 05/03/2017 12:02 PM, Prakash Sangappa wrote:
>>> If we do consider a new madvise() option, will it be acceptable
>>> since this will be specifically for hugetlbfs file mappings?
>> Ideally, it would be something that is *not* specifically for
>> hugetlbfs. MADV_NOAUTOFILL, for instance, could be defined to
>> SIGSEGV whenever memory is touched that was not populated with
>> MADV_WILLNEED, mlock(), etc...
> 
> If this is a generic advice type, necessary support will have to be 
> implemented in various filesystems which can support this.

Yep.

> The proposed behavior for 'noautofill' was to not fill holes in
> files (like sparse files). In the page fault path, mm would not know
> if the mmapped address on which the fault occurred is over a hole in
> the file or just that the page is not available in the page cache.

It depends on how you define the feature.  I think you have three choices:

1. "Error" on page fault.  Require all access to be pre-faulted.
2. Allow faults, but "Error" if page cache has to be allocated.
3. Allow faults and page cache allocations, but "Error" on filesystem
   backing storage allocation.

All of those are useful in some cases.  But the implementations probably
happen in different places:

#1 can be implemented in core mm code
#2 can be implemented in the VFS
#3 needs filesystem involvement
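
To make the difference between the three definitions concrete, here is a
minimal userspace sketch in C.  It is a toy model, not kernel code: the
enum names and noautofill_fault_errors() are hypothetical and exist only
to show that each definition draws the "Error" line at a different depth
of the fault path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policies, numbered to match the three choices above. */
enum noautofill_policy {
	NOAUTOFILL_ANY_FAULT = 1,	/* #1: every page fault is an error */
	NOAUTOFILL_PAGE_CACHE_ALLOC,	/* #2: error if page cache must be allocated */
	NOAUTOFILL_BACKING_ALLOC,	/* #3: error if backing storage must be allocated */
};

/* What servicing a fault would require, ordered from least to most work. */
enum fault_work {
	FAULT_MAP_CACHED_PAGE = 1,	/* page is already in the page cache */
	FAULT_ALLOC_PAGE_CACHE,		/* a page cache page must be allocated */
	FAULT_ALLOC_BACKING,		/* hole: backing storage must be allocated too */
};

/*
 * Return true if the fault would be refused under the given policy.
 * Both enums are deliberately numbered so that a stricter policy has a
 * lower value: a fault errors once the work it needs reaches the
 * policy's threshold.
 */
bool noautofill_fault_errors(enum noautofill_policy policy,
			     enum fault_work work)
{
	assert(policy >= NOAUTOFILL_ANY_FAULT &&
	       policy <= NOAUTOFILL_BACKING_ALLOC);
	assert(work >= FAULT_MAP_CACHED_PAGE && work <= FAULT_ALLOC_BACKING);

	return (int)work >= (int)policy;
}
```

In this model a fault on a page that is already cached is refused only
under #1, while a fault over a hole is refused under all three.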

> The underlying filesystem would be called and it determines if it is
> a hole and that is where it would fail and not fill the hole, if this
> support is added. Normally, filesystems which support sparse
> files (holes in the file) automatically fill the hole when accessed.
> Then there is the issue of file system block size and page size. If
> the block sizes are smaller than the page size, it could mean the
> noautofill would only work if the hole size is equal to, or a
> multiple of, the page size?

Whether this is true depends on how you define the feature.
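
As a rough illustration of the block-size concern: if the feature were
defined per page, a hole that is smaller than a page, or not aligned to
one, could not be refused on its own, because the page covering it would
still contain data.  The helper below is a sketch under that assumption.
pages_fully_in_hole() is a hypothetical name, not an existing kernel or
libc function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the pages that lie entirely inside the hole covering the byte
 * range [hole_start, hole_end).  Hypothetical helper, for illustration.
 */
uint64_t pages_fully_in_hole(uint64_t hole_start, uint64_t hole_end,
			     uint64_t page_size)
{
	uint64_t first, last;

	assert(page_size != 0);
	assert(hole_start <= hole_end);

	/* Index of the first page that starts at or after hole_start. */
	first = hole_start / page_size + (hole_start % page_size != 0);
	/* Index one past the last page that ends at or before hole_end. */
	last = hole_end / page_size;

	return last > first ? last - first : 0;
}
```

With 1 KiB blocks and 4 KiB pages, a two-block hole at bytes 1024 to
3071 contains no whole page, so the helper returns 0.  A hole spanning
bytes 4096 to 12287 contains two whole pages.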

> In the case of hugetlbfs it is much more straightforward, since this
> filesystem is not like normal filesystems and the file sizes are
> multiples of the huge page size. The hole will be a multiple of the
> huge page size. For this reason, then, should the advice be specific
> to hugetlbfs?

Let me paraphrase: it's simpler to implement it if it's specific to
hugetlbfs, thus we should implement it only for hugetlbfs, and keep it
specific to hugetlbfs.

The bigger question is: do we want to continue adding to the complexity
of hugetlbfs and increase its divergence from the core mm?
