* [PATCH RFC v3 0/6] Introduce nfsrahead
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Recent changes in the Linux kernel caused NFS readahead to default to
128 KiB from the previous default of 15 * rsize. This causes performance
penalties to some read-heavy workloads, which can be fixed by
tuning the readahead for that given mount.

Specifically, the read throughput on a sec=krb5p mount drops by 50-75%
when comparing the default readahead with a readahead of 15360 KiB.

Previous discussions:
https://lore.kernel.org/linux-nfs/20210803130717.2890565-1-trbecker@gmail.com/
I attempted to add a non-kernel option to mount.nfs, and it was
rejected.

https://lore.kernel.org/linux-nfs/20210811171402.947156-1-trbecker@gmail.com/
I attempted to add a mount option to the kernel; it was rejected as well.

I had started a separate tool to set the readahead of BDIs, but its
scope is specifically NFS, so I would like to gauge the community's
interest in having it in nfs-utils.

This patch series introduces nfsrahead, a utility that automatically
sets the NFS readahead when an NFS file system is mounted. The utility is
triggered by udev when a new BDI is added, and returns to udev the
readahead value that should be used.

The tool currently supports setting the readahead per mountpoint, per NFS
major version, or through a global default value.

v2:
    - explain the motivation

v3:
    - adopt already available facilities
    - nfsrahead is now configured in nfs.conf

Thiago Becker (6):
  Create nfsrahead
  nfsrahead: configure udev
  nfsrahead: only set readahead for nfs devices.
  nfsrahead: add logging
  nfsrahead: get the information from the config file.
  nfsrahead: User documentation

 .gitignore                          |   2 +
 configure.ac                        |   1 +
 systemd/nfs.conf.man                |  11 ++
 tools/Makefile.am                   |   2 +-
 tools/nfsrahead/99-nfs_bdi.rules.in |   1 +
 tools/nfsrahead/Makefile.am         |  15 +++
 tools/nfsrahead/main.c              | 179 ++++++++++++++++++++++++++++
 tools/nfsrahead/nfsrahead.man       |  72 +++++++++++
 8 files changed, 282 insertions(+), 1 deletion(-)
 create mode 100644 tools/nfsrahead/99-nfs_bdi.rules.in
 create mode 100644 tools/nfsrahead/Makefile.am
 create mode 100644 tools/nfsrahead/main.c
 create mode 100644 tools/nfsrahead/nfsrahead.man

-- 
2.35.1



* [PATCH RFC v3 1/6] Create nfsrahead
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

This tool is invoked by udev to find and set the readahead value for NFS
mounts.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 .gitignore                  | 1 +
 configure.ac                | 1 +
 tools/Makefile.am           | 2 +-
 tools/nfsrahead/Makefile.am | 3 +++
 tools/nfsrahead/main.c      | 7 +++++++
 5 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 tools/nfsrahead/Makefile.am
 create mode 100644 tools/nfsrahead/main.c

diff --git a/.gitignore b/.gitignore
index c89d1cd2..38ab1d39 100644
--- a/.gitignore
+++ b/.gitignore
@@ -61,6 +61,7 @@ utils/statd/statd
 tools/locktest/testlk
 tools/getiversion/getiversion
 tools/nfsconf/nfsconf
+tools/nfsrahead/nfsrahead
 support/export/mount.h
 support/export/mount_clnt.c
 support/export/mount_xdr.c
diff --git a/configure.ac b/configure.ac
index e0f5a930..3e1c183b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -737,6 +737,7 @@ AC_CONFIG_FILES([
 	tools/rpcgen/Makefile
 	tools/mountstats/Makefile
 	tools/nfs-iostat/Makefile
+	tools/nfsrahead/Makefile
 	tools/rpcctl/Makefile
 	tools/nfsdclnts/Makefile
 	tools/nfsconf/Makefile
diff --git a/tools/Makefile.am b/tools/Makefile.am
index c3feabbe..40c17c37 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -12,6 +12,6 @@ if CONFIG_NFSDCLD
 OPTDIRS += nfsdclddb
 endif
 
-SUBDIRS = locktest rpcdebug nlmtest mountstats nfs-iostat rpcctl nfsdclnts $(OPTDIRS)
+SUBDIRS = locktest rpcdebug nlmtest mountstats nfs-iostat rpcctl nfsdclnts nfsrahead $(OPTDIRS)
 
 MAINTAINERCLEANFILES = Makefile.in
diff --git a/tools/nfsrahead/Makefile.am b/tools/nfsrahead/Makefile.am
new file mode 100644
index 00000000..edff7921
--- /dev/null
+++ b/tools/nfsrahead/Makefile.am
@@ -0,0 +1,3 @@
+libexec_PROGRAMS = nfsrahead
+nfsrahead_SOURCES = main.c
+
diff --git a/tools/nfsrahead/main.c b/tools/nfsrahead/main.c
new file mode 100644
index 00000000..e454108e
--- /dev/null
+++ b/tools/nfsrahead/main.c
@@ -0,0 +1,7 @@
+#include <stdio.h>
+
+int main(int argc, char **argv, char **envp)
+{
+	unsigned int readahead = 128;
+	printf("%d\n", readahead);
+}
-- 
2.35.1



* [PATCH RFC v3 2/6] nfsrahead: configure udev
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Set the udev rule to call the readahead utility.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 .gitignore                          | 1 +
 tools/nfsrahead/99-nfs_bdi.rules.in | 1 +
 tools/nfsrahead/Makefile.am         | 8 ++++++++
 3 files changed, 10 insertions(+)
 create mode 100644 tools/nfsrahead/99-nfs_bdi.rules.in

diff --git a/.gitignore b/.gitignore
index 38ab1d39..df791a83 100644
--- a/.gitignore
+++ b/.gitignore
@@ -62,6 +62,7 @@ tools/locktest/testlk
 tools/getiversion/getiversion
 tools/nfsconf/nfsconf
 tools/nfsrahead/nfsrahead
+tools/nfsrahead/99-nfs_bdi.rules
 support/export/mount.h
 support/export/mount_clnt.c
 support/export/mount_xdr.c
diff --git a/tools/nfsrahead/99-nfs_bdi.rules.in b/tools/nfsrahead/99-nfs_bdi.rules.in
new file mode 100644
index 00000000..7d55b407
--- /dev/null
+++ b/tools/nfsrahead/99-nfs_bdi.rules.in
@@ -0,0 +1 @@
+SUBSYSTEM=="bdi", ACTION=="add", PROGRAM="_libexecdir_/nfsrahead", ATTR{read_ahead_kb}="%c"
diff --git a/tools/nfsrahead/Makefile.am b/tools/nfsrahead/Makefile.am
index edff7921..b598bec3 100644
--- a/tools/nfsrahead/Makefile.am
+++ b/tools/nfsrahead/Makefile.am
@@ -1,3 +1,11 @@
 libexec_PROGRAMS = nfsrahead
 nfsrahead_SOURCES = main.c
 
+udev_rulesdir = /etc/udev/rules.d
+udev_rules_DATA = 99-nfs_bdi.rules
+
+99-nfs_bdi.rules: 99-nfs_bdi.rules.in $(builddefs)
+	$(SED) "s|_libexecdir_|@libexecdir@|g" 99-nfs_bdi.rules.in > $@
+
+clean-local:
+	$(RM) 99-nfs_bdi.rules
-- 
2.35.1



* [PATCH RFC v3 3/6] nfsrahead: only set readahead for nfs devices.
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Only set the readahead for NFS devices.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 tools/nfsrahead/Makefile.am |   1 +
 tools/nfsrahead/main.c      | 130 ++++++++++++++++++++++++++++++++++++
 2 files changed, 131 insertions(+)

diff --git a/tools/nfsrahead/Makefile.am b/tools/nfsrahead/Makefile.am
index b598bec3..afccc520 100644
--- a/tools/nfsrahead/Makefile.am
+++ b/tools/nfsrahead/Makefile.am
@@ -1,5 +1,6 @@
 libexec_PROGRAMS = nfsrahead
 nfsrahead_SOURCES = main.c
+nfsrahead_LDFLAGS= -lmount
 
 udev_rulesdir = /etc/udev/rules.d
 udev_rules_DATA = 99-nfs_bdi.rules
diff --git a/tools/nfsrahead/main.c b/tools/nfsrahead/main.c
index e454108e..2cf77424 100644
--- a/tools/nfsrahead/main.c
+++ b/tools/nfsrahead/main.c
@@ -1,7 +1,137 @@
 #include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+
+#include <libmount/libmount.h>
+#include <sys/sysmacros.h>
+
+#ifndef MOUNTINFO_PATH
+#define MOUNTINFO_PATH "/proc/self/mountinfo"
+#endif
+
+/* Device information from the system */
+struct device_info {
+	char *device_number;
+	dev_t dev;
+	char *mountpoint;
+	char *fstype;
+};
+
+/* Convert a string in the format n:m to a device number; 0 on bad input */
+static dev_t dev_from_arg(const char *device_number)
+{
+	char *s = strdup(device_number), *p;
+	char *maj_s, *min_s;
+	unsigned int maj, min;
+	dev_t dev = 0;
+
+	if (s == NULL || (p = strchr(s, ':')) == NULL)
+		goto out;
+	maj_s = s;
+
+	*p = '\0';
+	min_s = p + 1;
+
+	maj = strtoul(maj_s, NULL, 10);
+	min = strtoul(min_s, NULL, 10);
+
+	dev = makedev(maj, min);
+out:
+	free(s);
+	return dev;
+}
+
+#define sfree(ptr) if (ptr) free(ptr)
+
+/* device_info maintenance */
+static void init_device_info(struct device_info *di, const char *device_number)
+{
+	di->device_number = strdup(device_number);
+	di->dev = dev_from_arg(device_number);
+	di->mountpoint = NULL;
+	di->fstype = NULL;
+}
+
+
+static void free_device_info(struct device_info *di)
+{
+	sfree(di->mountpoint);
+	sfree(di->fstype);
+	sfree(di->device_number);
+}
+
+static int get_mountinfo(const char *device_number, struct device_info *device_info, const char *mountinfo_path)
+{
+	int ret = 0;
+	struct libmnt_table *mnttbl;
+	struct libmnt_fs *fs;
+	char *target;
+
+	init_device_info(device_info, device_number);
+
+	mnttbl = mnt_new_table();
+
+	if ((ret = mnt_table_parse_file(mnttbl, mountinfo_path)) < 0)
+		goto out_free_tbl;
+
+	if ((fs = mnt_table_find_devno(mnttbl, device_info->dev, MNT_ITER_FORWARD)) == NULL) {
+		ret = ENOENT;
+		goto out_free_tbl;
+	}
+
+	if ((target = (char *)mnt_fs_get_target(fs)) == NULL) {
+		ret = ENOENT;
+		goto out_free_fs;
+	}
+
+	device_info->mountpoint = strdup(target);
+	target = (char *)mnt_fs_get_fstype(fs);
+	if (target)
+		device_info->fstype = strdup(target);
+
+out_free_fs:
+	mnt_free_fs(fs);
+out_free_tbl:
+	mnt_free_table(mnttbl);
+	free(device_info->device_number);
+	device_info->device_number = NULL;
+	return ret;
+}
+
+static int get_device_info(const char *device_number, struct device_info *device_info)
+{
+	int ret = ENOENT;
+	for (int retry_count = 0; retry_count < 10 && ret != 0; retry_count++)
+		ret = get_mountinfo(device_number, device_info, MOUNTINFO_PATH);
+
+	return ret;
+}
 
 int main(int argc, char **argv, char **envp)
 {
+	int ret = 0;
+	struct device_info device;
 	unsigned int readahead = 128;
+
+	if (argc != 2) {
+		return -EINVAL;
+	}
+
+	if ((ret = get_device_info(argv[1], &device)) != 0) {
+		goto out;
+	}
+
+	if (strncmp("nfs", device.fstype, 3) != 0) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	info("Setting %s readahead to 128\n", device.mountpoint);
+
 	printf("%d\n", readahead);
+
+out:
+	free_device_info(&device);
+	return ret;
 }
-- 
2.35.1



* [PATCH RFC v3 4/6] nfsrahead: add logging
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 tools/nfsrahead/Makefile.am |  1 +
 tools/nfsrahead/main.c      | 40 +++++++++++++++++++++++++++++++------
 2 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/tools/nfsrahead/Makefile.am b/tools/nfsrahead/Makefile.am
index afccc520..d0b5d170 100644
--- a/tools/nfsrahead/Makefile.am
+++ b/tools/nfsrahead/Makefile.am
@@ -1,6 +1,7 @@
 libexec_PROGRAMS = nfsrahead
 nfsrahead_SOURCES = main.c
 nfsrahead_LDFLAGS= -lmount
+nfsrahead_LDADD = ../../support/nfs/libnfsconf.la
 
 udev_rulesdir = /etc/udev/rules.d
 udev_rules_DATA = 99-nfs_bdi.rules
diff --git a/tools/nfsrahead/main.c b/tools/nfsrahead/main.c
index 2cf77424..86c71a67 100644
--- a/tools/nfsrahead/main.c
+++ b/tools/nfsrahead/main.c
@@ -2,14 +2,19 @@
 #include <string.h>
 #include <stdlib.h>
 #include <errno.h>
+#include <unistd.h>
 
 #include <libmount/libmount.h>
 #include <sys/sysmacros.h>
 
+#include "xlog.h"
+
 #ifndef MOUNTINFO_PATH
 #define MOUNTINFO_PATH "/proc/self/mountinfo"
 #endif
 
+#define CONF_NAME "nfsrahead"
+
 /* Device information from the system */
 struct device_info {
 	char *device_number;
@@ -108,26 +113,49 @@ static int get_device_info(const char *device_number, struct device_info *device
 	return ret;
 }
 
+#define L_DEFAULT (L_WARNING | L_ERROR | L_FATAL)
+
 int main(int argc, char **argv, char **envp)
 {
 	int ret = 0;
 	struct device_info device;
-	unsigned int readahead = 128;
-
-	if (argc != 2) {
-		return -EINVAL;
+	unsigned int readahead = 128, verbose = 0, log_stderr = 0;
+	int opt;
+
+	while((opt = getopt(argc, argv, "dF")) != -1) {
+		switch (opt) {
+		case 'd':
+			verbose = 1;
+			break;
+		case 'F':
+			log_stderr = 1;
+			break;
+		}
 	}
 
-	if ((ret = get_device_info(argv[1], &device)) != 0) {
+	xlog_stderr(log_stderr);
+	xlog_syslog(!log_stderr);
+	xlog_config(L_DEFAULT | (verbose ? L_NOTICE : 0), 1);
+	xlog_open(CONF_NAME);
+
+	// xlog_err causes the system to exit
+	if ((argc - optind) != 1)
+		xlog_err("expected the device number of a BDI; is udev ok?");
+
+	if ((ret = get_device_info(argv[optind], &device)) != 0) {
+		xlog(L_ERROR, "unable to find device %s\n", argv[optind]);
 		goto out;
 	}
 
 	if (strncmp("nfs", device.fstype, 3) != 0) {
+		xlog(L_NOTICE,
+			"not setting readahead for non supported fstype %s on device %s\n",
+			device.fstype, argv[optind]);
 		ret = -EINVAL;
 		goto out;
 	}
 
-	info("Setting %s readahead to 128\n", device.mountpoint);
+	xlog(L_WARNING, "setting %s readahead to %d\n", device.mountpoint, readahead);
 
 	printf("%d\n", readahead);
 
-- 
2.35.1



* [PATCH RFC v3 5/6] nfsrahead: get the information from the config file.
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 tools/nfsrahead/main.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/nfsrahead/main.c b/tools/nfsrahead/main.c
index 86c71a67..bead9f5c 100644
--- a/tools/nfsrahead/main.c
+++ b/tools/nfsrahead/main.c
@@ -8,12 +8,14 @@
 #include <sys/sysmacros.h>
 
 #include "xlog.h"
+#include "conffile.h"
 
 #ifndef MOUNTINFO_PATH
 #define MOUNTINFO_PATH "/proc/self/mountinfo"
 #endif
 
 #define CONF_NAME "nfsrahead"
+#define NFS_DEFAULT_READAHEAD 128
 
 /* Device information from the system */
 struct device_info {
@@ -113,6 +115,14 @@ static int get_device_info(const char *device_number, struct device_info *device
 	return ret;
 }
 
+static int conf_get_readahead(const char *kind) {
+	int readahead = 0;
+
+	if((readahead = conf_get_num(CONF_NAME, kind, -1)) == -1)
+		readahead = conf_get_num(CONF_NAME, "default", NFS_DEFAULT_READAHEAD);
+	
+	return readahead;
+}
 #define L_DEFAULT (L_WARNING | L_ERROR | L_FATAL)
 
 int main(int argc, char **argv, char **envp)
@@ -133,6 +143,8 @@ int main(int argc, char **argv, char **envp)
 		}
 	}
 
+	conf_init_file(NFS_CONFFILE);
+
 	xlog_stderr(log_stderr);
 	xlog_syslog(!log_stderr);
 	xlog_config(L_DEFAULT | (verbose ? L_NOTICE : 0), 1);
@@ -155,6 +167,8 @@ int main(int argc, char **argv, char **envp)
 		goto out;
 	}
 
+	readahead = conf_get_readahead(device.fstype);
+
 	xlog(L_WARNING, "setting %s readahead to %d\n", device.mountpoint, readahead);
 
 	printf("%d\n", readahead);
-- 
2.35.1



* [PATCH RFC v3 6/6] nfsrahead: User documentation
From: Thiago Becker @ 2022-03-23 20:18 UTC (permalink / raw)
  To: linux-nfs; +Cc: steved, trond.myklebust, anna.schumaker, kolga, Thiago Becker

Add the man page for nfsrahead, and add the new section to nfs.conf.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1946283
Signed-off-by: Thiago Becker <tbecker@redhat.com>
---
 systemd/nfs.conf.man          | 11 ++++++
 tools/nfsrahead/Makefile.am   |  2 +
 tools/nfsrahead/nfsrahead.man | 72 +++++++++++++++++++++++++++++++++++
 3 files changed, 85 insertions(+)
 create mode 100644 tools/nfsrahead/nfsrahead.man

diff --git a/systemd/nfs.conf.man b/systemd/nfs.conf.man
index be487a11..e74083e9 100644
--- a/systemd/nfs.conf.man
+++ b/systemd/nfs.conf.man
@@ -294,6 +294,17 @@ Only
 .B debug=
 is recognized.
 
+.TP
+.B nfsrahead
+Recognized values:
+.BR nfs ,
+.BR nfs4 ,
+.BR default .
+
+See
+.BR nfsrahead (5)
+for details.
+
 .SH FILES
 .TP 10n
 .I /etc/nfs.conf
diff --git a/tools/nfsrahead/Makefile.am b/tools/nfsrahead/Makefile.am
index d0b5d170..7342dcba 100644
--- a/tools/nfsrahead/Makefile.am
+++ b/tools/nfsrahead/Makefile.am
@@ -3,6 +3,8 @@ nfsrahead_SOURCES = main.c
 nfsrahead_LDFLAGS= -lmount
 nfsrahead_LDADD = ../../support/nfs/libnfsconf.la
 
+man5_MANS = nfsrahead.man
+
 udev_rulesdir = /etc/udev/rules.d
 udev_rules_DATA = 99-nfs_bdi.rules
 
diff --git a/tools/nfsrahead/nfsrahead.man b/tools/nfsrahead/nfsrahead.man
new file mode 100644
index 00000000..5488f633
--- /dev/null
+++ b/tools/nfsrahead/nfsrahead.man
@@ -0,0 +1,72 @@
+.\" Manpage for nfsrahead.
+.nh
+.ad l
+.TH nfsrahead 5 "08 Mar 2022" "1.0" "nfsrahead man page"
+.SH NAME
+
+nfsrahead \- Configure the readahead for NFS mounts
+
+.SH SYNOPSIS
+
+nfsrahead [-F] [-d] <device>
+
+.SH DESCRIPTION
+
+\fInfsrahead\fR is a tool intended to be used with udev to set the \fIread_ahead_kb\fR parameter of NFS mounts, according to the configuration file (see \fICONFIGURATION\fR). \fIdevice\fR is the device number for the NFS backing device as provided by the kernel.
+
+.SH OPTIONS
+.TP
+.B -F
+Send messages to
+.I stderr
+instead of
+.IR syslog .
+
+.TP
+.B -d
+Increase the debugging level.
+
+.SH CONFIGURATION
+.I nfsrahead
+is configured in
+.IR /etc/nfs.conf ,
+in the section titled
+.IR nfsrahead .
+It accepts the following configurations.
+
+.TP
+.B nfs=<value>
+The readahead value applied to NFSv3 mounts.
+
+.TP
+.B nfs4=<value>
+The readahead value applied to NFSv4 mounts.
+
+.TP
+.B default=<value>
+The default configuration when none of the configurations above is set.
+
+.SH EXAMPLE CONFIGURATION
+[nfsrahead]
+.br
+nfs=15000              # readahead of 15000 for NFSv3 mounts
+.br
+nfs4=16000             # readahead of 16000 for NFSv4 mounts
+.br
+default=128            # default is 128
+
+.SH SEE ALSO
+
+.BR mount.nfs (8),
+.BR nfs (5),
+.BR nfs.conf (5),
+.BR udev (7),
+.BR bcc-readahead (8)
+
+.SH BUGS
+
+No known bugs.
+
+.SH AUTHOR
+
+Thiago Rafael Becker <trbecker@gmail.com>
-- 
2.35.1



* Re: [PATCH RFC v3 0/6] Introduce nfsrahead
From: Matthew Wilcox @ 2022-03-23 21:32 UTC (permalink / raw)
  To: Thiago Becker
  Cc: linux-nfs, steved, trond.myklebust, anna.schumaker, kolga,
	linux-fsdevel, linux-mm

On Wed, Mar 23, 2022 at 05:18:35PM -0300, Thiago Becker wrote:
> Recent changes in the linux kernel caused NFS readahead to default to
> 128 from the previous default of 15 * rsize. This causes performance
> penalties to some read-heavy workloads, which can be fixed by
> tuning the readahead for that given mount.

Which recent changes?  Something in NFS or something in the VFS/MM?
Did you even think about asking a wider audience than the NFS mailing
list?  I only happened to notice this while I was looking for something
else, otherwise I would never have seen it.  The responses from other
people to your patches were right; you're trying to do this all wrong.

Let's start out with a bug report instead of a solution.  What changed
and when?


* Re: [PATCH RFC v3 0/6] Introduce nfsrahead
From: Trond Myklebust @ 2022-03-23 21:58 UTC (permalink / raw)
  To: tbecker, willy
  Cc: linux-mm, linux-nfs, kolga, steved, anna.schumaker, linux-fsdevel

On Wed, 2022-03-23 at 21:32 +0000, Matthew Wilcox wrote:
> On Wed, Mar 23, 2022 at 05:18:35PM -0300, Thiago Becker wrote:
> > Recent changes in the linux kernel caused NFS readahead to default to
> > 128 from the previous default of 15 * rsize. This causes performance
> > penalties to some read-heavy workloads, which can be fixed by
> > tuning the readahead for that given mount.
> 
> Which recent changes?  Something in NFS or something in the VFS/MM?
> Did you even think about asking a wider audience than the NFS mailing
> list?  I only happened to notice this while I was looking for something
> else, otherwise I would never have seen it.  The responses from other
> people to your patches were right; you're trying to do this all wrong.
> 
> Let's start out with a bug report instead of a solution.  What changed
> and when?

I believe Thiago is talking about the changes introduced by commit
c128e575514c "NFS: Optimise the default readahead size" (i.e. we're
talking about Linux 5.4).

...and yes, as the commit description notes, users who want to change
the default can do so using the standard sysfs mechanism.
AFAICS, all this is doing is providing a toolset to allow users to more
easily set up and edit the udev scripts that will automate these
settings.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: [PATCH RFC v3 0/6] Introduce nfsrahead
From: Thiago Becker @ 2022-03-25 12:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-nfs, Steved, trond.myklebust, anna.schumaker,
	Olga Kornievskaia, linux-fsdevel, linux-mm

Hello,

On Wed, Mar 23, 2022 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> Which recent changes?  Something in NFS or something in the VFS/MM?
> Did you even think about asking a wider audience than the NFS mailing
> list?  I only happened to notice this while I was looking for something
> else, otherwise I would never have seen it.  The responses from other
> people to your patches were right; you're trying to do this all wrong.
>
> Let's start out with a bug report instead of a solution.  What changed
> and when?
>

As Trond stated, c128e575514c ("NFS: Optimise the default readahead
size") changed the way readahead is calculated for NFS mounts. This
caused some read workloads to underperform compared to previous
revisions. To recap, the current policy is to adopt the system
default readahead of 128 KiB, and mounts with sec=krb5p take a
performance hit of 50-75% when the readahead is 128 KiB. I haven't
performed an exhaustive search for other workloads that might also be
affected, but I also noticed a meaningful drop in performance on
sec=sys mounts; see the notes at the end.

The previous policy was to calculate the readahead as a
multiple of rsize, so we advised the complaining party to increase the
value, and this fixed the issue. We are now trying to find a
solution that we can incorporate into the system.

thiago.

----- Tests
===== RAWHIDE (35% performance hit) =====
# uname -r
5.16.0-0.rc0.20211112git5833291ab6de.12.fc36.x86_64

# grep nfs /proc/self/mountinfo
601 60 0:55 / /mnt rw,relatime shared:332 - nfs4 192.168.122.225:/exports rw,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.122.83,local_lock=none,addr=192.168.122.225

# cat /sys/class/bdi/0\:55/read_ahead_kb
128

# for i in {0..3} ; do dd if=/mnt/testfile.bin of=/dev/null bs=1M 2>&1 | grep copied ; echo 3 > /proc/sys/vm/drop_caches ; done
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 16.5025 s, 260 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 16.4474 s, 261 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 18.0181 s, 238 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 18.2323 s, 236 MB/s

# echo 15360 > /sys/class/bdi/0\:55/read_ahead_kb

# for i in {0..3} ; do dd if=/mnt/testfile.bin of=/dev/null bs=1M 2>&1 | grep copied ; echo 3 > /proc/sys/vm/drop_caches ; done
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.2601 s, 381 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.1885 s, 384 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.5877 s, 371 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 10.9475 s, 392 MB/s

===== UPSTREAM (30% performance hit) =====
# uname -r
5.17.0+

# grep nfs /proc/self/mountinfo
583 60 0:55 / /mnt rw,relatime shared:302 - nfs4 192.168.122.225:/exports rw,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.122.83,local_lock=none,addr=192.168.122.225

# cat /sys/class/bdi/0\:55/read_ahead_kb
128

# for i in {0..3} ; do dd if=/mnt/testfile.bin of=/dev/null bs=1M 2>&1 | grep copied ; echo 3 > /proc/sys/vm/drop_caches ; done
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 17.056 s, 252 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 17.1258 s, 251 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 16.5981 s, 259 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 16.5487 s, 260 MB/s

# echo 15360 > /sys/class/bdi/0\:55/read_ahead_kb

# for i in {0..3} ; do dd if=/mnt/testfile.bin of=/dev/null bs=1M 2>&1 | grep copied ; echo 3 > /proc/sys/vm/drop_caches ; done
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 12.3855 s, 347 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.2528 s, 382 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.9849 s, 358 MB/s
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 11.2953 s, 380 MB/s

