From: Johannes Thumshirn <jthumshirn@suse.de>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
mhocko@suse.cz, linux-nvdimm@lists.01.org
Subject: Re: Problems with VM_MIXEDMAP removal from /proc/<pid>/smaps
Date: Tue, 2 Oct 2018 16:20:10 +0200
Message-ID: <20181002142010.GB4963@linux-x5ow.site>
In-Reply-To: <20181002121039.GA3274@linux-x5ow.site>
On Tue, Oct 02, 2018 at 02:10:39PM +0200, Johannes Thumshirn wrote:
> On Tue, Oct 02, 2018 at 12:05:31PM +0200, Jan Kara wrote:
> > Hello,
> >
> > commit e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
> > removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
> > mean time certain customer of ours started poking into /proc/<pid>/smaps
> > and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
> > flags, the application just fails to start complaining that DAX support is
> > missing in the kernel. The question now is how do we go about this?
>
> OK naive question from me, how do we want an application to be able to
> check if it is running on a DAX mapping?
>
> AFAIU DAX is always associated with a file descriptor of some kind (be
> it a real file with filesystem dax or the /dev/dax device file for
> device dax). So could a new fcntl() be of any help here? IS_DAX() only
> checks for the S_DAX flag in inode::i_flags, so this should be doable
> for both fsdax and devdax.
>
> I haven't tried it yet but it should be fairly easy to come up with
> something like this.
OK, now I did. I tested it on a normal file on Btrfs (without DAX,
obviously) and on a file on XFS mounted with the -o dax option.
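
For reference, the kind of smaps check described at the top of this thread might look roughly like the sketch below. It assumes the application looks for the "mm" mnemonic (the one smaps prints for VM_MIXEDMAP) on the "VmFlags:" lines of /proc/self/smaps. The helper names vmflags_has() and any_vma_has_flag() are made up for illustration; this is not taken from the customer's code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if `flag` (e.g. "mm") appears as a whole token on a
 * "VmFlags:" line from smaps. Any other line yields false.
 */
bool vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of the next token */
		if (len > 0 && len == flen && strncmp(p, flag, len) == 0)
			return true;
		p += len;
	}
	return false;
}

/* Return true if any VMA of the calling process carries `flag`. */
bool any_vma_has_flag(const char *flag)
{
	char line[256];
	bool found = false;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return false;
	while (!found && fgets(line, sizeof(line), f))
		found = vmflags_has(line, flag);
	fclose(f);
	return found;
}
```

An application doing any_vma_has_flag("mm") after mmap()ing its file would see the answer change once VM_MIXEDMAP is no longer set on DAX VMAs, which is the failure mode reported.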
Here's the RFC:
commit 3a8f0d23c421e8c91bc9d8bd3a956e1ffe3f754b
Author: Johannes Thumshirn <jthumshirn@suse.de>
Date: Tue Oct 2 14:51:33 2018 +0200

fcntl: provide F_GETDAX for applications to query DAX capabilities

Provide an F_GETDAX fcntl(2) command so an application can query
whether it can make use of DAX or not.

Both file-system DAX as well as device DAX mark the DAX capability in
struct inode::i_flags using the S_DAX flag, so we can query it using
the IS_DAX() macro on a struct file's inode.

If the file descriptor is either device DAX or on a DAX capable
file-system, '1' is returned to user-space through the u64 pointed to
by the fcntl() argument; if DAX isn't usable for some reason, '0' is
returned.

This patch can be tested with the following small C program:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <libgen.h>

#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE 1024
#endif
#define F_GETDAX (F_LINUX_SPECIFIC_BASE + 15)

int main(int argc, char **argv)
{
	uint64_t dax = 0;	/* the kernel side copies out a u64 */
	int fd;
	int rc;

	if (argc != 2) {
		printf("Usage: %s file\n", basename(argv[0]));
		exit(EXIT_FAILURE);
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		exit(EXIT_FAILURE);
	}

	rc = fcntl(fd, F_GETDAX, &dax);
	if (rc < 0) {
		perror("fcntl");
		close(fd);
		exit(EXIT_FAILURE);
	}

	/* Either answer means the query itself succeeded. */
	if (dax)
		printf("fd %d is dax capable\n", fd);
	else
		printf("fd %d is not dax capable\n", fd);

	close(fd);
	return EXIT_SUCCESS;
}

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
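
Building on the test program above, an application would probably want to treat a failing fcntl() (for example EINVAL from a kernel without this patch) as "unknown" rather than as "no DAX". A minimal sketch of that, assuming the F_GETDAX number proposed in this RFC: the command exists only in this patch, the number could later be assigned to something else, and the names dax_state, dax_state_from_result() and query_dax() are hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>

#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE 1024
#endif
#ifndef F_GETDAX
#define F_GETDAX (F_LINUX_SPECIFIC_BASE + 15)	/* proposed by this RFC only */
#endif

enum dax_state { DAX_UNKNOWN = -1, DAX_NO = 0, DAX_YES = 1 };

/*
 * Map the fcntl() return code and the value it copied out to a
 * tri-state answer. Any error (EINVAL on an unpatched kernel, EBADF,
 * EFAULT, ...) is reported as "unknown" instead of "not DAX".
 */
enum dax_state dax_state_from_result(int rc, uint64_t value)
{
	if (rc < 0)
		return DAX_UNKNOWN;
	return value ? DAX_YES : DAX_NO;
}

/* Ask the kernel about `fd`; only meaningful with this patch applied. */
enum dax_state query_dax(int fd)
{
	uint64_t dax = 0;	/* matches the u64 written by the kernel side */
	int rc = fcntl(fd, F_GETDAX, &dax);

	return dax_state_from_result(rc, dax);
}
```

A caller can then refuse to start only on DAX_NO and fall back to whatever detection it used before on DAX_UNKNOWN.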
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 4137d96534a6..0b53f968f569 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -32,6 +32,22 @@
#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
+static int fcntl_get_dax(struct file *filp, unsigned long arg)
+{
+	struct inode *inode = file_inode(filp);
+	u64 __user *argp = (u64 __user *)arg;
+	u64 dax;
+
+	if (IS_DAX(inode))
+		dax = 1;
+	else
+		dax = 0;
+
+	if (copy_to_user(argp, &dax, sizeof(*argp)))
+		return -EFAULT;
+	return 0;
+}
+
static int setfl(int fd, struct file * filp, unsigned long arg)
{
struct inode * inode = file_inode(filp);
@@ -426,6 +442,9 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 	case F_SET_FILE_RW_HINT:
 		err = fcntl_rw_hint(filp, cmd, arg);
 		break;
+	case F_GETDAX:
+		err = fcntl_get_dax(filp, arg);
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6448cdd9a350..65a59c3cc46d 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -52,6 +52,7 @@
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#define F_GET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 13)
#define F_SET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 14)
+#define F_GETDAX (F_LINUX_SPECIFIC_BASE + 15)
/*
* Valid hint values for F_{GET,SET}_RW_HINT. 0 is "not set", or can be
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850