From: Johannes Thumshirn <jthumshirn@suse.de>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org, mhocko@suse.cz,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: Problems with VM_MIXEDMAP removal from /proc/<pid>/smaps
Date: Tue, 2 Oct 2018 16:20:10 +0200
Message-ID: <20181002142010.GB4963@linux-x5ow.site>
In-Reply-To: <20181002121039.GA3274@linux-x5ow.site>

On Tue, Oct 02, 2018 at 02:10:39PM +0200, Johannes Thumshirn wrote:
> On Tue, Oct 02, 2018 at 12:05:31PM +0200, Jan Kara wrote:
> > Hello,
> >
> > commit e1fb4a086495 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
> > removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
> > mean time certain customer of ours started poking into /proc/<pid>/smaps
> > and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
> > flags, the application just fails to start complaining that DAX support is
> > missing in the kernel. The question now is how do we go about this?
>
> OK naive question from me, how do we want an application to be able to
> check if it is running on a DAX mapping?
>
> AFAIU DAX is always associated with a file descriptor of some kind (be
> it a real file with filesystem dax or the /dev/dax device file for
> device dax). So could a new fcntl() be of any help here? IS_DAX() only
> checks for the S_DAX flag in inode::i_flags, so this should be doable
> for both fsdax and devdax.
>
> I haven't tried it yet but it should be fairly easy to come up with
> something like this.

OK, now I have tried it, on a normal file on btrfs (without DAX, obviously)
and on a file on XFS mounted with the -o dax option.

Here's the RFC:

commit 3a8f0d23c421e8c91bc9d8bd3a956e1ffe3f754b
Author: Johannes Thumshirn <jthumshirn@suse.de>
Date:   Tue Oct 2 14:51:33 2018 +0200

    fcntl: provide F_GETDAX for applications to query DAX capabilities

    Provide a F_GETDAX fcntl(2) command so an application can query
    whether it can make use of DAX or not.

    Both file-system DAX as well as device DAX mark the DAX capability in
    struct inode::i_flags using the S_DAX flag, so we can query it using
    the IS_DAX() macro on a struct file's inode.

    If the file descriptor is either device DAX or on a DAX capable
    file-system, '1' is returned to user-space through the pointer
    argument; if DAX isn't usable for some reason, '0' is returned.

    This patch can be tested with the following small C program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <libgen.h>

    #ifndef F_LINUX_SPECIFIC_BASE
    #define F_LINUX_SPECIFIC_BASE 1024
    #endif

    #define F_GETDAX (F_LINUX_SPECIFIC_BASE + 15)

    int main(int argc, char **argv)
    {
            /* The kernel side copies out a u64, so an int is too small. */
            uint64_t dax = 0;
            int fd;
            int rc;

            if (argc != 2) {
                    printf("Usage: %s file\n", basename(argv[0]));
                    exit(EXIT_FAILURE);
            }

            fd = open(argv[1], O_RDONLY);
            if (fd < 0) {
                    perror("open");
                    exit(EXIT_FAILURE);
            }

            rc = fcntl(fd, F_GETDAX, &dax);
            if (rc < 0) {
                    perror("fcntl");
                    close(fd);
                    exit(EXIT_FAILURE);
            }

            /* Exit 0 when DAX is usable, non-zero otherwise. */
            if (dax) {
                    printf("fd %d is dax capable\n", fd);
                    exit(EXIT_SUCCESS);
            } else {
                    printf("fd %d is not dax capable\n", fd);
                    exit(EXIT_FAILURE);
            }
    }

    Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Michal Hocko <mhocko@suse.cz>
    Cc: Dan Williams <dan.j.williams@intel.com>

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 4137d96534a6..0b53f968f569 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -32,6 +32,22 @@
 
 #define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT | O_NOATIME)
 
+static int fcntl_get_dax(struct file *filp, unsigned long arg)
+{
+	struct inode *inode = file_inode(filp);
+	u64 __user *argp = (u64 __user *)arg;
+	u64 dax;
+
+	if (IS_DAX(inode))
+		dax = 1;
+	else
+		dax = 0;
+
+	if (copy_to_user(argp, &dax, sizeof(*argp)))
+		return -EFAULT;
+	return 0;
+}
+
 static int setfl(int fd, struct file * filp, unsigned long arg)
 {
 	struct inode * inode = file_inode(filp);
@@ -426,6 +442,9 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 	case F_SET_FILE_RW_HINT:
 		err = fcntl_rw_hint(filp, cmd, arg);
 		break;
+	case F_GETDAX:
+		err = fcntl_get_dax(filp, arg);
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6448cdd9a350..65a59c3cc46d 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -52,6 +52,7 @@
 #define F_SET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 12)
 #define F_GET_FILE_RW_HINT	(F_LINUX_SPECIFIC_BASE + 13)
 #define F_SET_FILE_RW_HINT	(F_LINUX_SPECIFIC_BASE + 14)
+#define F_GETDAX		(F_LINUX_SPECIFIC_BASE + 15)
 
 /*
  * Valid hint values for F_{GET,SET}_RW_HINT. 0 is "not set", or can be

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600  D0D0 0393 969D 2D76 0850