* [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS)
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
@ 2017-03-21 19:43 ` Yan Li
2017-03-21 20:09 ` Ben Evans
2017-03-24 22:22 ` Dilger, Andreas
2017-03-21 19:43 ` [lustre-devel] [PATCH 2/6] Added fields to message for RLQOS support Yan Li
` (4 subsequent siblings)
5 siblings, 2 replies; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
This patch enables rate-limiting quality of service (RLQOS) support as
described in the ASCAR paper [1]. The purpose of RLQOS is to provide a
client-side rate-limiting mechanism that controls max_rpcs_in_flight
and the minimum gap between brw RPC requests (called tau in the code and
paper).
RLQOS can be enabled by passing --enable-rlqos to configure. It can
then be controlled through tunables in the procfs directory of each OSC.
[1] http://storageconference.us/2015/Papers/14.Li.pdf
Signed-off-by: Yan Li <yanli@ascar.io>
---
lustre/autoconf/lustre-core.m4 | 17 +++++++++++++++++
lustre/include/Makefile.am | 3 ++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/lustre/autoconf/lustre-core.m4 b/lustre/autoconf/lustre-core.m4
index 0578325..7f1828e 100644
--- a/lustre/autoconf/lustre-core.m4
+++ b/lustre/autoconf/lustre-core.m4
@@ -369,6 +369,22 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([
AC_MSG_RESULT([$enable_ssk])
]) # LC_OPENSSL_SSK
+#
+# LC_CONFIG_RLQOS
+#
+# Rate-limiting Quality of Service support
+#
+AC_DEFUN([LC_CONFIG_RLQOS], [
+AC_MSG_CHECKING([whether to enable rate-limiting quality of service support])
+AC_ARG_ENABLE([rlqos],
+ AC_HELP_STRING([--enable-rlqos],
+ [enable rate-limiting quality of service support]),
+ [], [enable_rlqos="no"])
+AC_MSG_RESULT([$enable_rlqos])
+AS_IF([test "x$enable_rlqos" != xno],
+ [AC_DEFINE(ENABLE_RLQOS, 1, [enable rate-limiting quality of service support])])
+]) # LC_CONFIG_RLQOS
+
# LC_INODE_PERMISION_2ARGS
#
# up to v2.6.27 had a 3 arg version (inode, mask, nameidata)
@@ -2241,6 +2257,7 @@ AC_DEFUN([LC_PROG_LINUX], [
LC_GLIBC_SUPPORT_FHANDLES
LC_CONFIG_GSS
LC_OPENSSL_SSK
+ LC_CONFIG_RLQOS
# 2.6.32
LC_BLK_QUEUE_MAX_SEGMENTS
diff --git a/lustre/include/Makefile.am b/lustre/include/Makefile.am
index 9074ca4..6d72b6e 100644
--- a/lustre/include/Makefile.am
+++ b/lustre/include/Makefile.am
@@ -98,4 +98,5 @@ EXTRA_DIST = \
upcall_cache.h \
lustre_kernelcomm.h \
seq_range.h \
- uapi_kernelcomm.h
+ uapi_kernelcomm.h \
+ rlqos.h
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
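The two controls named in the description above (max_rpcs_in_flight and the minimum gap tau) can be sketched as a small user-space model. This is only an illustration of the idea, not code from the series: struct rl_state, rl_try_send() and rl_reply_received() are hypothetical names, and the caller is assumed to supply timestamps from a monotonic clock in microseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a client-side rate limiter; not part of the patch. */
struct rl_state {
	unsigned int max_rpcs_in_flight; /* upper bound on outstanding RPCs */
	unsigned int rpcs_in_flight;     /* RPCs sent but not yet replied to */
	uint64_t tau_usec;               /* minimum gap between two sends */
	uint64_t last_send_usec;         /* timestamp of the previous send */
	bool sent_any;                   /* false until the first send */
};

/* Return true and record the send if an RPC may go out at now_usec. */
static bool rl_try_send(struct rl_state *s, uint64_t now_usec)
{
	assert(s != NULL);

	/* Limit 1: too many RPCs are already outstanding. */
	if (s->rpcs_in_flight >= s->max_rpcs_in_flight)
		return false;

	/* Limit 2: less than tau has passed since the previous send. */
	if (s->sent_any && now_usec - s->last_send_usec < s->tau_usec)
		return false;

	s->rpcs_in_flight++;
	s->last_send_usec = now_usec;
	s->sent_any = true;
	return true;
}

/* Account for one reply coming back from the server. */
static void rl_reply_received(struct rl_state *s)
{
	assert(s != NULL && s->rpcs_in_flight > 0);
	s->rpcs_in_flight--;
}
```

With max_rpcs_in_flight = 2 and tau_usec = 100, a send at t=1000 succeeds, one at t=1050 is refused because of the gap, and a third outstanding request is refused until a reply arrives.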
* [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS)
2017-03-21 19:43 ` [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS) Yan Li
@ 2017-03-21 20:09 ` Ben Evans
2017-03-22 14:19 ` Yan Li
2017-03-24 22:22 ` Dilger, Andreas
1 sibling, 1 reply; 17+ messages in thread
From: Ben Evans @ 2017-03-21 20:09 UTC (permalink / raw)
To: lustre-devel
I would remove the #ifdef ENABLE_RLQOS blocks, especially in lustre_idl.h
since you're proposing to add new fields and consume some of the padding
bits. It will cause a lot of headache for the next feature that comes
along and consumes some of those bits.
-Ben Evans
On 3/21/17, 3:43 PM, "lustre-devel on behalf of Yan Li"
<lustre-devel-bounces at lists.lustre.org on behalf of yanli@ascar.io> wrote:
>This patch enables rate-limiting quality of service (RLQOS) support as
>talked in the ASCAR paper [1]. The purpose of RLQOS is to provide a
>client-side rate limiting mechanism that controls max_rpcs_in_flight
>and minimal gap between brw RPC requests (called tau in the code and
>paper).
>
>RLQOS can be enabled by passing --enable-rlqos to configure. It then
>can be controlled by tunables in procfs of each osc.
>
>[1] http://storageconference.us/2015/Papers/14.Li.pdf
>
>Signed-off-by: Yan Li <yanli@ascar.io>
>---
> lustre/autoconf/lustre-core.m4 | 17 +++++++++++++++++
> lustre/include/Makefile.am | 3 ++-
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
>diff --git a/lustre/autoconf/lustre-core.m4
>b/lustre/autoconf/lustre-core.m4
>index 0578325..7f1828e 100644
>--- a/lustre/autoconf/lustre-core.m4
>+++ b/lustre/autoconf/lustre-core.m4
>@@ -369,6 +369,22 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([
> AC_MSG_RESULT([$enable_ssk])
> ]) # LC_OPENSSL_SSK
>
>+#
>+# LC_CONFIG_RLQOS
>+#
>+# Rate-limiting Quality of Service support
>+#
>+AC_DEFUN([LC_CONFIG_RLQOS], [
>+AC_MSG_CHECKING([whether to enable rate-limiting quality of service
>support])
>+AC_ARG_ENABLE([rlqos],
>+ AC_HELP_STRING([--enable-rlqos],
>+ [enable rate-limiting quality of service support]),
>+ [], [enable_rlqos="no"])
>+AC_MSG_RESULT([$enable_rlqos])
>+AS_IF([test "x$enable_rlqos" != xno],
>+ [AC_DEFINE(ENABLE_RLQOS, 1, [enable rate-limiting quality of service
>support])])
>+]) # LC_CONFIG_RLQOS
>+
> # LC_INODE_PERMISION_2ARGS
> #
> # up to v2.6.27 had a 3 arg version (inode, mask, nameidata)
>@@ -2241,6 +2257,7 @@ AC_DEFUN([LC_PROG_LINUX], [
> LC_GLIBC_SUPPORT_FHANDLES
> LC_CONFIG_GSS
> LC_OPENSSL_SSK
>+ LC_CONFIG_RLQOS
>
> # 2.6.32
> LC_BLK_QUEUE_MAX_SEGMENTS
>diff --git a/lustre/include/Makefile.am b/lustre/include/Makefile.am
>index 9074ca4..6d72b6e 100644
>--- a/lustre/include/Makefile.am
>+++ b/lustre/include/Makefile.am
>@@ -98,4 +98,5 @@ EXTRA_DIST = \
> upcall_cache.h \
> lustre_kernelcomm.h \
> seq_range.h \
>- uapi_kernelcomm.h
>+ uapi_kernelcomm.h \
>+ rlqos.h
>--
>1.8.3.1
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS)
2017-03-21 20:09 ` Ben Evans
@ 2017-03-22 14:19 ` Yan Li
2017-03-22 14:27 ` Ben Evans
0 siblings, 1 reply; 17+ messages in thread
From: Yan Li @ 2017-03-22 14:19 UTC (permalink / raw)
To: lustre-devel
On 03/21/2017 01:09 PM, Ben Evans wrote:
> I would remove the #ifdef ENABLE_RLQOS blocks, especially in lustre_idl.h
> since you're proposing to add new fields and consume some of the padding
> bits. It will cause a lot of headache for the next feature that comes
> along and consumes some of those bits.
Yeah, that's a good point. I'll remove it if all are ok with this.
Yan
^ permalink raw reply [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS)
2017-03-22 14:19 ` Yan Li
@ 2017-03-22 14:27 ` Ben Evans
0 siblings, 0 replies; 17+ messages in thread
From: Ben Evans @ 2017-03-22 14:27 UTC (permalink / raw)
To: lustre-devel
I'd get rid of all the ENABLE_RLQOS blocks myself, but minimally the
lustre_idl.h ones.
-Ben Evans
On 3/22/17, 10:19 AM, "Yan Li" <yanli@ascar.io> wrote:
>
>On 03/21/2017 01:09 PM, Ben Evans wrote:
>> I would remove the #ifdef ENABLE_RLQOS blocks, especially in
>>lustre_idl.h
>> since you're proposing to add new fields and consume some of the padding
>> bits. It will cause a lot of headache for the next feature that comes
>> along and consumes some of those bits.
>
>Yeah, that's a good point. I'll remove it if all are ok with this.
>
>Yan
^ permalink raw reply [flat|nested] 17+ messages in thread
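The concern raised in this sub-thread is that a wire structure should have one layout regardless of build options. A minimal sketch of the unconditional alternative follows, assuming a simplified stand-in for the tail of struct obdo: struct demo_obdo, its field names, and the opaque 184-byte head are placeholders, while the offsets 184 and 200 are the ones checked by wiretest.c in patch 2/6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified tail of a wire struct. The two former padding
 * words are always declared as fixed-width fields, so every build sees
 * the same layout and the next feature knows exactly what is still free.
 */
struct demo_obdo {
	uint8_t  o_head[184];      /* stand-in for the fields before the padding */
	uint64_t o_sent_time_sec;  /* occupies the slot of o_padding_4 */
	uint64_t o_sent_time_usec; /* occupies the slot of o_padding_5 */
	uint64_t o_padding_6;      /* still unused */
};

/* Compile-time layout checks in the spirit of lustre_assert_wire_constants(). */
static_assert(offsetof(struct demo_obdo, o_sent_time_sec) == 184,
	      "sent_time seconds must sit where o_padding_4 was");
static_assert(offsetof(struct demo_obdo, o_sent_time_usec) == 192,
	      "sent_time microseconds must sit where o_padding_5 was");
static_assert(offsetof(struct demo_obdo, o_padding_6) == 200,
	      "o_padding_6 must not move");
static_assert(sizeof(struct demo_obdo) == 208,
	      "total size must not change");
```

Using two explicit 64-bit fields instead of struct timeval also avoids depending on the platform's timeval size, which is a design choice of this sketch and not something the thread settles.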
* [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS)
2017-03-21 19:43 ` [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS) Yan Li
2017-03-21 20:09 ` Ben Evans
@ 2017-03-24 22:22 ` Dilger, Andreas
[not found] ` <3BE4A898-D944-41F9-84C8-FE8DA80D0D65@datadirectnet.com>
1 sibling, 1 reply; 17+ messages in thread
From: Dilger, Andreas @ 2017-03-24 22:22 UTC (permalink / raw)
To: lustre-devel
On Mar 21, 2017, at 13:43, Yan Li <yanli@ascar.io> wrote:
>
> This patch enables rate-limiting quality of service (RLQOS) support as
> talked in the ASCAR paper [1]. The purpose of RLQOS is to provide a
> client-side rate limiting mechanism that controls max_rpcs_in_flight
> and minimal gap between brw RPC requests (called tau in the code and
> paper).
>
> RLQOS can be enabled by passing --enable-rlqos to configure. It then
> can be controlled by tunables in procfs of each osc.
Hi Yan,
thanks for submitting the patch series. Two high level comments on the
patches, since I haven't had a chance to review them in detail (though
I see Alexey has commented on some of them):
- What external tools (if any) are needed in order to use this functionality?
Are these available for download, and is there documentation for using them?
- It is fine that you've submitted the patches here for discussion and to
raise awareness of your work. In order to get them landed you should submit
the patches to Gerrit (see https://wiki.hpdd.intel.com/display/PUB/Using+Gerrit).
I'll try to take a look at them when I get a chance. This may also be of
interest to Li Xi and Qian at DDN, who have been working on server-side NRS.
Cheers, Andreas
> [1] http://storageconference.us/2015/Papers/14.Li.pdf
>
> Signed-off-by: Yan Li <yanli@ascar.io>
> ---
> lustre/autoconf/lustre-core.m4 | 17 +++++++++++++++++
> lustre/include/Makefile.am | 3 ++-
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/lustre/autoconf/lustre-core.m4 b/lustre/autoconf/lustre-core.m4
> index 0578325..7f1828e 100644
> --- a/lustre/autoconf/lustre-core.m4
> +++ b/lustre/autoconf/lustre-core.m4
> @@ -369,6 +369,22 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([
> AC_MSG_RESULT([$enable_ssk])
> ]) # LC_OPENSSL_SSK
>
> +#
> +# LC_CONFIG_RLQOS
> +#
> +# Rate-limiting Quality of Service support
> +#
> +AC_DEFUN([LC_CONFIG_RLQOS], [
> +AC_MSG_CHECKING([whether to enable rate-limiting quality of service support])
> +AC_ARG_ENABLE([rlqos],
> + AC_HELP_STRING([--enable-rlqos],
> + [enable rate-limiting quality of service support]),
> + [], [enable_rlqos="no"])
> +AC_MSG_RESULT([$enable_rlqos])
> +AS_IF([test "x$enable_rlqos" != xno],
> + [AC_DEFINE(ENABLE_RLQOS, 1, [enable rate-limiting quality of service support])])
> +]) # LC_CONFIG_RLQOS
> +
> # LC_INODE_PERMISION_2ARGS
> #
> # up to v2.6.27 had a 3 arg version (inode, mask, nameidata)
> @@ -2241,6 +2257,7 @@ AC_DEFUN([LC_PROG_LINUX], [
> LC_GLIBC_SUPPORT_FHANDLES
> LC_CONFIG_GSS
> LC_OPENSSL_SSK
> + LC_CONFIG_RLQOS
>
> # 2.6.32
> LC_BLK_QUEUE_MAX_SEGMENTS
> diff --git a/lustre/include/Makefile.am b/lustre/include/Makefile.am
> index 9074ca4..6d72b6e 100644
> --- a/lustre/include/Makefile.am
> +++ b/lustre/include/Makefile.am
> @@ -98,4 +98,5 @@ EXTRA_DIST = \
> upcall_cache.h \
> lustre_kernelcomm.h \
> seq_range.h \
> - uapi_kernelcomm.h
> + uapi_kernelcomm.h \
> + rlqos.h
> --
> 1.8.3.1
>
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation
^ permalink raw reply [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 2/6] Added fields to message for RLQOS support
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS) Yan Li
@ 2017-03-21 19:43 ` Yan Li
2017-03-23 14:54 ` Alexey Lyashkov
2017-03-21 19:43 ` [lustre-devel] [PATCH 3/6] RLQOS main data structure Yan Li
` (3 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
Modify the request message to embed sent_time, which the server
returns in its reply. The client uses it to calculate the exponentially
weighted moving average of the sent_time gap between returned messages,
which serves as a metric for rate-limiting quality of service.
Signed-off-by: Yan Li <yanli@ascar.io>
---
lustre/include/lustre/lustre_idl.h | 4 ++++
lustre/ptlrpc/pack_generic.c | 5 +++++
lustre/ptlrpc/wiretest.c | 2 ++
lustre/utils/wiretest.c | 2 ++
4 files changed, 13 insertions(+)
diff --git a/lustre/include/lustre/lustre_idl.h b/lustre/include/lustre/lustre_idl.h
index bf23a47..7a200d1 100644
--- a/lustre/include/lustre/lustre_idl.h
+++ b/lustre/include/lustre/lustre_idl.h
@@ -3336,8 +3336,12 @@ struct obdo {
* each stripe.
* brw: grant space consumed on
* the client for the write */
+#ifdef ENABLE_RLQOS
+ struct timeval o_sent_time; /* timeval is 64x2 bits on Linux */
+#else
__u64 o_padding_4;
__u64 o_padding_5;
+#endif
__u64 o_padding_6;
};
diff --git a/lustre/ptlrpc/pack_generic.c b/lustre/ptlrpc/pack_generic.c
index 8df8ea8..d0bc87a 100644
--- a/lustre/ptlrpc/pack_generic.c
+++ b/lustre/ptlrpc/pack_generic.c
@@ -1722,8 +1722,13 @@ void lustre_swab_obdo (struct obdo *o)
__swab32s (&o->o_uid_h);
__swab32s (&o->o_gid_h);
__swab64s (&o->o_data_version);
+#ifdef ENABLE_RLQOS
+ __swab64s ((__u64*)&o->o_sent_time.tv_sec);
+ __swab64s ((__u64*)&o->o_sent_time.tv_usec);
+#else
CLASSERT(offsetof(typeof(*o), o_padding_4) != 0);
CLASSERT(offsetof(typeof(*o), o_padding_5) != 0);
+#endif
CLASSERT(offsetof(typeof(*o), o_padding_6) != 0);
}
diff --git a/lustre/ptlrpc/wiretest.c b/lustre/ptlrpc/wiretest.c
index 070ef91..0c909a6 100644
--- a/lustre/ptlrpc/wiretest.c
+++ b/lustre/ptlrpc/wiretest.c
@@ -1314,6 +1314,7 @@ void lustre_assert_wire_constants(void)
(long long)(int)offsetof(struct obdo, o_data_version));
LASSERTF((int)sizeof(((struct obdo *)0)->o_data_version) == 8, "found %lld\n",
(long long)(int)sizeof(((struct obdo *)0)->o_data_version));
+#ifndef ENABLE_RLQOS
LASSERTF((int)offsetof(struct obdo, o_padding_4) == 184, "found %lld\n",
(long long)(int)offsetof(struct obdo, o_padding_4));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_4) == 8, "found %lld\n",
@@ -1322,6 +1323,7 @@ void lustre_assert_wire_constants(void)
(long long)(int)offsetof(struct obdo, o_padding_5));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_5) == 8, "found %lld\n",
(long long)(int)sizeof(((struct obdo *)0)->o_padding_5));
+#endif
LASSERTF((int)offsetof(struct obdo, o_padding_6) == 200, "found %lld\n",
(long long)(int)offsetof(struct obdo, o_padding_6));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_6) == 8, "found %lld\n",
diff --git a/lustre/utils/wiretest.c b/lustre/utils/wiretest.c
index 233d7d8..47fbbf0 100644
--- a/lustre/utils/wiretest.c
+++ b/lustre/utils/wiretest.c
@@ -1329,6 +1329,7 @@ void lustre_assert_wire_constants(void)
(long long)(int)offsetof(struct obdo, o_data_version));
LASSERTF((int)sizeof(((struct obdo *)0)->o_data_version) == 8, "found %lld\n",
(long long)(int)sizeof(((struct obdo *)0)->o_data_version));
+#ifndef ENABLE_RLQOS
LASSERTF((int)offsetof(struct obdo, o_padding_4) == 184, "found %lld\n",
(long long)(int)offsetof(struct obdo, o_padding_4));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_4) == 8, "found %lld\n",
@@ -1337,6 +1338,7 @@ void lustre_assert_wire_constants(void)
(long long)(int)offsetof(struct obdo, o_padding_5));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_5) == 8, "found %lld\n",
(long long)(int)sizeof(((struct obdo *)0)->o_padding_5));
+#endif
LASSERTF((int)offsetof(struct obdo, o_padding_6) == 200, "found %lld\n",
(long long)(int)offsetof(struct obdo, o_padding_6));
LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_6) == 8, "found %lld\n",
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
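The metric described in this patch, an exponentially weighted moving average of the gap between timestamps, can be kept in integer arithmetic. The sketch below is an assumption about how such an update could look, since the update routine itself is not part of this message: demo_timeval, demo_time_ewma, the has_last field and the demo_* functions are hypothetical, and only the idea of storing ea = ewma * alpha_inv with alpha_inv = 8 follows rlqos.h from patch 3/6.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timeval with fixed-width fields. */
struct demo_timeval {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct demo_time_ewma {
	uint64_t alpha_inv;            /* 1 / alpha, e.g. 8 */
	uint64_t ea;                   /* ewma * alpha_inv, in microseconds */
	struct demo_timeval last_time; /* previous sample */
	int has_last;                  /* 0 until the first sample arrives */
};

/* Gap from a to b in microseconds; clamps to 0 if b is not after a. */
static uint64_t demo_gap_usec(const struct demo_timeval *a,
			      const struct demo_timeval *b)
{
	int64_t d = (b->tv_sec - a->tv_sec) * 1000000 +
		    (b->tv_usec - a->tv_usec);

	return d > 0 ? (uint64_t)d : 0;
}

/*
 * Feed one timestamp. The usual rule ewma += (gap - ewma) / alpha_inv,
 * multiplied through by alpha_inv, becomes ea = ea - ea / alpha_inv + gap.
 */
static void demo_ewma_update(struct demo_time_ewma *e,
			     const struct demo_timeval *now)
{
	assert(e->alpha_inv > 0);
	if (e->has_last) {
		uint64_t gap = demo_gap_usec(&e->last_time, now);

		e->ea = e->ea - e->ea / e->alpha_inv + gap;
	}
	e->last_time = *now;
	e->has_last = 1;
}

/* Recover the average in microseconds. */
static uint64_t demo_ewma_usec(const struct demo_time_ewma *e)
{
	return e->ea / e->alpha_inv;
}
```

Starting from ea = 0, two consecutive gaps of 800 usec move the average to 100 usec and then 187 usec, so with alpha_inv = 8 the value approaches the true gap gradually.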
* [lustre-devel] [PATCH 2/6] Added fields to message for RLQOS support
2017-03-21 19:43 ` [lustre-devel] [PATCH 2/6] Added fields to message for RLQOS support Yan Li
@ 2017-03-23 14:54 ` Alexey Lyashkov
0 siblings, 0 replies; 17+ messages in thread
From: Alexey Lyashkov @ 2017-03-23 14:54 UTC (permalink / raw)
To: lustre-devel
Rather than commenting out the asserts, you should introduce an additional
connect flag and use the new fields only when that flag is set.
As far as I can see, you have written the code to work between patched nodes,
but there is no guarantee that all nodes in a cluster run the same version at
all times.
On Tue, Mar 21, 2017 at 10:43 PM, Yan Li <yanli@ascar.io> wrote:
> Modified the request message to embed sent_time, which will be
> returned from the server and used to calculate the exponentially
> weighted moving average of sent_time gap in return messages. It is
> used as a metric for rate-limiting quality of service.
>
> Signed-off-by: Yan Li <yanli@ascar.io>
> ---
> lustre/include/lustre/lustre_idl.h | 4 ++++
> lustre/ptlrpc/pack_generic.c | 5 +++++
> lustre/ptlrpc/wiretest.c | 2 ++
> lustre/utils/wiretest.c | 2 ++
> 4 files changed, 13 insertions(+)
>
> diff --git a/lustre/include/lustre/lustre_idl.h b/lustre/include/lustre/
> lustre_idl.h
> index bf23a47..7a200d1 100644
> --- a/lustre/include/lustre/lustre_idl.h
> +++ b/lustre/include/lustre/lustre_idl.h
> @@ -3336,8 +3336,12 @@ struct obdo {
> * each stripe.
> * brw: grant space
> consumed on
> * the client for the
> write */
> +#ifdef ENABLE_RLQOS
> + struct timeval o_sent_time; /* timeval is 64x2 bits on
> Linux */
> +#else
> __u64 o_padding_4;
> __u64 o_padding_5;
> +#endif
> __u64 o_padding_6;
> };
>
> diff --git a/lustre/ptlrpc/pack_generic.c b/lustre/ptlrpc/pack_generic.c
> index 8df8ea8..d0bc87a 100644
> --- a/lustre/ptlrpc/pack_generic.c
> +++ b/lustre/ptlrpc/pack_generic.c
> @@ -1722,8 +1722,13 @@ void lustre_swab_obdo (struct obdo *o)
> __swab32s (&o->o_uid_h);
> __swab32s (&o->o_gid_h);
> __swab64s (&o->o_data_version);
> +#ifdef ENABLE_RLQOS
> + __swab64s ((__u64*)&o->o_sent_time.tv_sec);
> + __swab64s ((__u64*)&o->o_sent_time.tv_usec);
> +#else
> CLASSERT(offsetof(typeof(*o), o_padding_4) != 0);
> CLASSERT(offsetof(typeof(*o), o_padding_5) != 0);
> +#endif
> CLASSERT(offsetof(typeof(*o), o_padding_6) != 0);
>
> }
> diff --git a/lustre/ptlrpc/wiretest.c b/lustre/ptlrpc/wiretest.c
> index 070ef91..0c909a6 100644
> --- a/lustre/ptlrpc/wiretest.c
> +++ b/lustre/ptlrpc/wiretest.c
> @@ -1314,6 +1314,7 @@ void lustre_assert_wire_constants(void)
> (long long)(int)offsetof(struct obdo, o_data_version));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_data_version) == 8,
> "found %lld\n",
> (long long)(int)sizeof(((struct obdo
> *)0)->o_data_version));
> +#ifndef ENABLE_RLQOS
> LASSERTF((int)offsetof(struct obdo, o_padding_4) == 184, "found
> %lld\n",
> (long long)(int)offsetof(struct obdo, o_padding_4));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_4) == 8, "found
> %lld\n",
> @@ -1322,6 +1323,7 @@ void lustre_assert_wire_constants(void)
> (long long)(int)offsetof(struct obdo, o_padding_5));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_5) == 8, "found
> %lld\n",
> (long long)(int)sizeof(((struct obdo *)0)->o_padding_5));
> +#endif
> LASSERTF((int)offsetof(struct obdo, o_padding_6) == 200, "found
> %lld\n",
> (long long)(int)offsetof(struct obdo, o_padding_6));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_6) == 8, "found
> %lld\n",
> diff --git a/lustre/utils/wiretest.c b/lustre/utils/wiretest.c
> index 233d7d8..47fbbf0 100644
> --- a/lustre/utils/wiretest.c
> +++ b/lustre/utils/wiretest.c
> @@ -1329,6 +1329,7 @@ void lustre_assert_wire_constants(void)
> (long long)(int)offsetof(struct obdo, o_data_version));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_data_version) == 8,
> "found %lld\n",
> (long long)(int)sizeof(((struct obdo
> *)0)->o_data_version));
> +#ifndef ENABLE_RLQOS
> LASSERTF((int)offsetof(struct obdo, o_padding_4) == 184, "found
> %lld\n",
> (long long)(int)offsetof(struct obdo, o_padding_4));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_4) == 8, "found
> %lld\n",
> @@ -1337,6 +1338,7 @@ void lustre_assert_wire_constants(void)
> (long long)(int)offsetof(struct obdo, o_padding_5));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_5) == 8, "found
> %lld\n",
> (long long)(int)sizeof(((struct obdo *)0)->o_padding_5));
> +#endif
> LASSERTF((int)offsetof(struct obdo, o_padding_6) == 200, "found
> %lld\n",
> (long long)(int)offsetof(struct obdo, o_padding_6));
> LASSERTF((int)sizeof(((struct obdo *)0)->o_padding_6) == 8, "found
> %lld\n",
> --
> 1.8.3.1
>
--
Alexey Lyashkov, Technical lead for a Morpheus team
Seagate Technology, LLC
www.seagate.com
www.lustre.org
^ permalink raw reply [flat|nested] 17+ messages in thread
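The suggestion in this reply, gating the new fields on a negotiated connect flag, can be sketched as follows. Everything here is hypothetical: DEMO_CONNECT_RLQOS is not an existing Lustre connect flag, and demo_conn and demo_body only model the idea that a field is interpreted when both peers advertised the feature and is treated as padding otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; a real flag would be allocated in lustre_idl.h. */
#define DEMO_CONNECT_RLQOS (1ULL << 0)

struct demo_conn {
	uint64_t negotiated_flags; /* features both peers support */
};

/* The two former padding words as they appear on the wire. */
struct demo_body {
	uint64_t slot_a;
	uint64_t slot_b;
};

/* At connect time, keep only the features advertised by both sides. */
static void demo_connect(struct demo_conn *c, uint64_t client_flags,
			 uint64_t server_flags)
{
	c->negotiated_flags = client_flags & server_flags;
}

/*
 * Read sent_time only if the peer agreed to the feature. Otherwise the
 * words are plain padding and their contents must be ignored.
 */
static bool demo_read_sent_time(const struct demo_conn *c,
				const struct demo_body *b,
				uint64_t *sec, uint64_t *usec)
{
	assert(c != NULL && b != NULL && sec != NULL && usec != NULL);

	if (!(c->negotiated_flags & DEMO_CONNECT_RLQOS))
		return false;

	*sec = b->slot_a;
	*usec = b->slot_b;
	return true;
}
```

A patched client talking to an unpatched server ends up with the bit cleared after demo_connect(), so it neither sends nor trusts the timestamp, which covers the mixed-version case raised above.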
* [lustre-devel] [PATCH 3/6] RLQOS main data structure
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 1/6] Autoconf option for rate-limiting Quality of Service (RLQOS) Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 2/6] Added fields to message for RLQOS support Yan Li
@ 2017-03-21 19:43 ` Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 4/6] lprocfs interfaces for showing, parsing, and controlling rules Yan Li
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
Each client_obd maintains a qos data structure.
Signed-off-by: Yan Li <yanli@ascar.io>
---
lustre/include/obd.h | 8 +++
lustre/include/rlqos.h | 136 +++++++++++++++++++++++++++++++++++++++++++++++
lustre/obdclass/genops.c | 25 +++++++++
3 files changed, 169 insertions(+)
create mode 100644 lustre/include/rlqos.h
diff --git a/lustre/include/obd.h b/lustre/include/obd.h
index b4ee379..726493c 100644
--- a/lustre/include/obd.h
+++ b/lustre/include/obd.h
@@ -50,6 +50,9 @@
#include <lustre_intent.h>
#include <lvfs.h>
#include <lustre_quota.h>
+#ifdef ENABLE_RLQOS
+# include "rlqos.h"
+#endif
#define MAX_OBD_DEVICES 8192
@@ -331,6 +334,11 @@ struct client_obd {
void *cl_lru_work;
/* hash tables for osc_quota_info */
struct cfs_hash *cl_quota_hash[LL_MAXQUOTAS];
+
+#ifdef ENABLE_RLQOS
+ /* rate-limiting quality of service data */
+ struct qos_data_t qos;
+#endif
};
#define obd2cli_tgt(obd) ((char *)(obd)->u.cli.cl_target_uuid.uuid)
diff --git a/lustre/include/rlqos.h b/lustre/include/rlqos.h
new file mode 100644
index 0000000..d8e012b
--- /dev/null
+++ b/lustre/include/rlqos.h
@@ -0,0 +1,136 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.sun.com/software/products/lustre/docs/GPLv2.pdf
+ *
+ * Please contact Storage Systems Research Center, Computer Science Department,
+ * University of California, Santa Cruz (www.ssrc.ucsc.edu) if you need
+ * additional information or have any questions.
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2013-2017, University of California, Santa Cruz, CA, USA.
+ * All rights reserved.
+ */
+/*
+ * This file is part of Lustre, http://www.lustre.org/
+ * Lustre is a trademark of Sun Microsystems, Inc.
+ *
+ * lustre/include/rlqos.h
+ */
+
+#ifndef _RLQOS_H
+#define _RLQOS_H
+
+/* We work with kernel only */
+#ifdef __KERNEL__
+# include <linux/types.h>
+# include <linux/time.h>
+# include <asm/param.h>
+# include <libcfs/libcfs.h>
+# include <linux/delay.h>
+#else /* __KERNEL__ */
+# define HZ 100
+# define ONE_MILLION 1000000
+# include <liblustre.h>
+#endif
+
+#define EWMA_ALPHA_INV (8)
+
+/**
+ * For tracking the exponentially-weighted moving average of a timeval. Note
+ * that the kernel has no floating-point division, so we actually track
+ * ea = ewma * alpha_inv. Divide ea by alpha_inv to get the real ewma.
+ */
+struct time_ewma {
+ __u64 alpha_inv;
+ __u64 ea;
+ struct timeval last_time;
+};
+/* We can't do floating-point division, so we are tracking
+ * ea = ewma * alpha_inv = ewma / alpha
+ */
+
+struct qos_rule_t {
+ __u64 ack_ewma_lower;
+ __u64 ack_ewma_upper;
+ __u64 send_ewma_lower;
+ __u64 send_ewma_upper;
+ unsigned int rtt_ratio100_lower;
+ unsigned int rtt_ratio100_upper;
+ int m100;
+ int b100;
+ unsigned int tau;
+ int used_times;
+
+ __u64 ack_ewma_avg;
+ __u64 send_ewma_avg;
+ unsigned int rtt_ratio100_avg;
+};
+
+struct qos_data_t {
+ spinlock_t lock;
+ struct time_ewma ack_ewma;
+ struct time_ewma sent_ewma;
+ int rtt_ratio100;
+ long smallest_rtt;
+ int max_rpc_in_flight100;
+ struct timeval last_mrif_update_time;
+ int min_gap_between_updating_mrif;
+ int rule_no;
+ /* Following fields are for calculating I/O bandwidth,
+ * 0 for read, 1 for write */
+ long last_req_sec[2]; /* second of last request we received */
+ __u64 tp_last_sec[2]; /* throughput of last sec */
+ __u64 sum_bytes_this_sec[2]; /* cumulative bytes read within this sec */
+ /* For throttling support */
+ unsigned int min_usec_between_rpcs;
+ struct timeval last_rpc_time;
+ struct qos_rule_t *rules;
+};
+
+static inline __u64 qos_get_ewma_usec(const struct time_ewma *ewma) {
+ return ewma->ea / ewma->alpha_inv;
+}
+
+int parse_qos_rules(const char *buf, struct qos_data_t *qos);
+
+/* Lock of qos must be held. op == 0 for read, 1 for write */
+static inline void calc_throughput(struct qos_data_t *qos, int op, int bytes_transferred)
+{
+ struct timeval now;
+
+ if (op != 0 && op != 1)
+ return;
+
+ do_gettimeofday(&now);
+ if (likely(now.tv_sec == qos->last_req_sec[op])) {
+ qos->sum_bytes_this_sec[op] += bytes_transferred;
+ } else if (likely(now.tv_sec == qos->last_req_sec[op] + 1)) {
+ qos->tp_last_sec[op] = qos->sum_bytes_this_sec[op];
+ qos->last_req_sec[op] = now.tv_sec;
+ qos->sum_bytes_this_sec[op] = bytes_transferred;
+ } else if (likely(now.tv_sec > qos->last_req_sec[op] + 1)) {
+ qos->tp_last_sec[op] = 0;
+ qos->last_req_sec[op] = now.tv_sec;
+ qos->sum_bytes_this_sec[op] = bytes_transferred;
+ }
+ /* Ignore cases when now.tv_sec < qos->last_req_sec */
+}
+
+#endif /* _RLQOS_H */
diff --git a/lustre/obdclass/genops.c b/lustre/obdclass/genops.c
index a48f887..417c612 100644
--- a/lustre/obdclass/genops.c
+++ b/lustre/obdclass/genops.c
@@ -284,6 +284,28 @@ int class_unregister_type(const char *name)
} /* class_unregister_type */
EXPORT_SYMBOL(class_unregister_type);
+#ifdef ENABLE_RLQOS
+static void init_time_ewma(struct time_ewma *ewma)
+{
+ ewma->alpha_inv = 8;
+ ewma->ea = 0;
+ ewma->last_time.tv_sec = 0;
+ ewma->last_time.tv_usec = 0;
+}
+
+static void init_qos(struct client_obd *cli)
+{
+ struct qos_data_t *qos = &cli->qos;
+
+ init_time_ewma(&qos->ack_ewma);
+ init_time_ewma(&qos->sent_ewma);
+
+ spin_lock(&cli->cl_loi_list_lock);
+ qos->max_rpc_in_flight100 = cli->cl_max_rpcs_in_flight * 100;
+ spin_unlock(&cli->cl_loi_list_lock);
+}
+#endif
+
/**
* Create a new obd device.
*
@@ -349,6 +371,9 @@ struct obd_device *class_newdev(const char *type_name, const char *name)
result->obd_type = type;
strncpy(result->obd_name, name,
sizeof(result->obd_name) - 1);
+#ifdef ENABLE_RLQOS
+ init_qos(&result->u.cli);
+#endif
obd_devs[i] = result;
}
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
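The per-second accounting done by calc_throughput() in rlqos.h can be tried outside the kernel. The sketch below mirrors its three branches but takes the current second as a parameter instead of calling do_gettimeofday(); struct demo_tp and demo_calc_throughput() are hypothetical names, and the locking required by the original is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the throughput fields of qos_data_t. */
struct demo_tp {
	long     last_req_sec[2];       /* second of the last request seen */
	uint64_t tp_last_sec[2];        /* bytes moved in the previous second */
	uint64_t sum_bytes_this_sec[2]; /* bytes moved so far in this second */
};

/* op == 0 for read, 1 for write; now_sec is supplied by the caller. */
static void demo_calc_throughput(struct demo_tp *tp, int op, long now_sec,
				 uint64_t bytes)
{
	assert(tp != NULL);
	if (op != 0 && op != 1)
		return;

	if (now_sec == tp->last_req_sec[op]) {
		/* Same second: keep accumulating. */
		tp->sum_bytes_this_sec[op] += bytes;
	} else if (now_sec == tp->last_req_sec[op] + 1) {
		/* Next second: publish the finished bucket, start a new one. */
		tp->tp_last_sec[op] = tp->sum_bytes_this_sec[op];
		tp->last_req_sec[op] = now_sec;
		tp->sum_bytes_this_sec[op] = bytes;
	} else if (now_sec > tp->last_req_sec[op] + 1) {
		/* At least one idle second passed: last-second throughput is 0. */
		tp->tp_last_sec[op] = 0;
		tp->last_req_sec[op] = now_sec;
		tp->sum_bytes_this_sec[op] = bytes;
	}
	/* A clock that went backwards is ignored, as in the original. */
}
```

Calling it with bytes == 0, as the procfs show routine in patch 4/6 does, only rolls the buckets forward, so a stale value is reported as 0 after an idle period.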
* [lustre-devel] [PATCH 4/6] lprocfs interfaces for showing, parsing, and controlling rules
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
` (2 preceding siblings ...)
2017-03-21 19:43 ` [lustre-devel] [PATCH 3/6] RLQOS main data structure Yan Li
@ 2017-03-21 19:43 ` Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 5/6] Throttle the outgoing requests according to tau Yan Li
2017-03-21 19:43 ` [lustre-devel] [PATCH 6/6] Adjust max_rpcs_in_flight according to metrics Yan Li
5 siblings, 0 replies; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
Signed-off-by: Yan Li <yanli@ascar.io>
---
lustre/obdclass/lprocfs_status.c | 32 ++++++++
lustre/osc/Makefile.in | 2 +-
lustre/osc/lproc_osc.c | 157 ++++++++++++++++++++++++++++++++++-----
lustre/osc/qos_rules.c | 125 +++++++++++++++++++++++++++++++
4 files changed, 295 insertions(+), 21 deletions(-)
create mode 100644 lustre/osc/qos_rules.c
diff --git a/lustre/obdclass/lprocfs_status.c b/lustre/obdclass/lprocfs_status.c
index 08db676..841a3da 100644
--- a/lustre/obdclass/lprocfs_status.c
+++ b/lustre/obdclass/lprocfs_status.c
@@ -814,6 +814,14 @@ int lprocfs_import_seq_show(struct seq_file *m, void *data)
int j;
int k;
int rw = 0;
+#ifdef ENABLE_RLQOS
+ struct qos_data_t *qos;
+ __u64 ack_ewma;
+ __u64 sent_ewma;
+ int rtt_ratio100;
+ __u64 read_tp;
+ __u64 write_tp;
+#endif
LASSERT(obd != NULL);
LPROCFS_CLIMP_CHECK(obd);
@@ -884,6 +892,26 @@ int lprocfs_import_seq_show(struct seq_file *m, void *data)
atomic_read(&imp->imp_unregistering),
atomic_read(&imp->imp_timeouts),
ret.lc_sum, header->lc_units);
+#ifdef ENABLE_RLQOS
+ qos = &obd->u.cli.qos;
+ spin_lock(&qos->lock);
+ ack_ewma = qos_get_ewma_usec(&qos->ack_ewma);
+ sent_ewma = qos_get_ewma_usec(&qos->sent_ewma);
+ rtt_ratio100 = qos->rtt_ratio100;
+
+ /* Refresh throughput. If a long time has passed since we
+ received last req, throughput data is stale. */
+ calc_throughput(qos, OST_READ-OST_READ, 0);
+ calc_throughput(qos, OST_WRITE-OST_READ, 0);
+
+ read_tp = qos->tp_last_sec[0];
+ write_tp = qos->tp_last_sec[1];
+ spin_unlock(&qos->lock);
+ seq_printf(m, " ack_ewma: %llu usec\n"
+ " sent_ewma: %llu usec\n"
+ " rtt_ratio100: %d\n",
+ ack_ewma, sent_ewma, rtt_ratio100);
+#endif
k = 0;
for(j = 0; j < IMP_AT_MAX_PORTALS; j++) {
@@ -938,6 +966,10 @@ int lprocfs_import_seq_show(struct seq_file *m, void *data)
k / j, (100 * k / j) % 100);
}
}
+#ifdef ENABLE_RLQOS
+ seq_printf(m, " read_throughput: %llu\n", read_tp);
+ seq_printf(m, " write_throughput: %llu\n", write_tp);
+#endif
out_climp:
LPROCFS_CLIMP_EXIT(obd);
diff --git a/lustre/osc/Makefile.in b/lustre/osc/Makefile.in
index b1128bc..d6edab2 100644
--- a/lustre/osc/Makefile.in
+++ b/lustre/osc/Makefile.in
@@ -1,5 +1,5 @@
MODULES := osc
-osc-objs := osc_request.o lproc_osc.o osc_dev.o osc_object.o osc_page.o osc_lock.o osc_io.o osc_quota.o osc_cache.o
+osc-objs := osc_request.o lproc_osc.o osc_dev.o osc_object.o osc_page.o osc_lock.o osc_io.o osc_quota.o osc_cache.o qos_rules.o
EXTRA_DIST = $(osc-objs:%.o=%.c) osc_internal.h osc_cl_internal.h
diff --git a/lustre/osc/lproc_osc.c b/lustre/osc/lproc_osc.c
index de5a29c..653afc4 100644
--- a/lustre/osc/lproc_osc.c
+++ b/lustre/osc/lproc_osc.c
@@ -1,3 +1,4 @@
+
/*
* GPL HEADER START
*
@@ -38,6 +39,9 @@
#include <lprocfs_status.h>
#include <linux/seq_file.h>
#include "osc_internal.h"
+#ifdef ENABLE_RLQOS
+# include "../include/rlqos.h"
+#endif
#ifdef CONFIG_PROC_FS
static int osc_active_seq_show(struct seq_file *m, void *v)
@@ -92,8 +96,10 @@ static ssize_t osc_max_rpcs_in_flight_seq_write(struct file *file,
{
struct obd_device *dev = ((struct seq_file *)file->private_data)->private;
struct client_obd *cli = &dev->u.cli;
+#ifdef ENABLE_RLQOS
+ struct qos_data_t *qos = &cli->qos;
+#endif
int rc;
- int adding, added, req_count;
__s64 val;
rc = lprocfs_str_to_s64(buffer, count, &val);
@@ -103,31 +109,57 @@ static ssize_t osc_max_rpcs_in_flight_seq_write(struct file *file,
return -ERANGE;
LPROCFS_CLIMP_CHECK(dev);
+ set_max_rpcs_in_flight((int)val, cli);
+ LPROCFS_CLIMP_EXIT(dev);
- adding = (int)val - cli->cl_max_rpcs_in_flight;
- req_count = atomic_read(&osc_pool_req_count);
- if (adding > 0 && req_count < osc_reqpool_maxreqcount) {
- /*
- * There might be some race which will cause over-limit
- * allocation, but it is fine.
- */
- if (req_count + adding > osc_reqpool_maxreqcount)
- adding = osc_reqpool_maxreqcount - req_count;
-
- added = osc_rq_pool->prp_populate(osc_rq_pool, adding);
- atomic_add(added, &osc_pool_req_count);
- }
-
- spin_lock(&cli->cl_loi_list_lock);
- cli->cl_max_rpcs_in_flight = val;
- client_adjust_max_dirty(cli);
- spin_unlock(&cli->cl_loi_list_lock);
+#ifdef ENABLE_RLQOS
+ /* Update the value tracked by QoS routines too */
+ spin_lock(&qos->lock);
+ qos->max_rpc_in_flight100 = val * 100;
+ spin_unlock(&qos->lock);
+#endif
- LPROCFS_CLIMP_EXIT(dev);
return count;
}
LPROC_SEQ_FOPS(osc_max_rpcs_in_flight);
+#ifdef ENABLE_RLQOS
+static int osc_min_brw_rpc_gap_seq_show(struct seq_file *m, void *v)
+{
+ struct obd_device *dev = m->private;
+ struct client_obd *cli = &dev->u.cli;
+ struct qos_data_t *qos = &cli->qos;
+
+ spin_lock(&qos->lock);
+ seq_printf(m, "%u\n", qos->min_usec_between_rpcs);
+ spin_unlock(&qos->lock);
+ return 0;
+}
+
+static ssize_t osc_min_brw_rpc_gap_seq_write(struct file *file,
+ const char __user *buffer,
+ size_t count, loff_t *off)
+{
+ struct obd_device *dev = ((struct seq_file *)file->private_data)->private;
+ struct client_obd *cli = &dev->u.cli;
+ int rc;
+ __s64 val;
+ struct qos_data_t *qos = &cli->qos;
+
+ rc = lprocfs_str_to_s64(buffer, count, &val);
+ if (rc)
+ return rc;
+ if (val < 0)
+ return -ERANGE;
+
+ spin_lock(&qos->lock);
+ qos->min_usec_between_rpcs = val;
+ spin_unlock(&qos->lock);
+ return count;
+}
+LPROC_SEQ_FOPS(osc_min_brw_rpc_gap);
+#endif
+
static int osc_max_dirty_mb_seq_show(struct seq_file *m, void *v)
{
struct obd_device *dev = m->private;
@@ -599,6 +631,83 @@ static int osc_unstable_stats_seq_show(struct seq_file *m, void *v)
}
LPROC_SEQ_FOPS_RO(osc_unstable_stats);
+#ifdef ENABLE_RLQOS
+static int osc_qos_rules_seq_show(struct seq_file *m, void *data)
+{
+ struct obd_device *dev = m->private;
+ struct client_obd *cli = &dev->u.cli;
+ struct qos_data_t *qos = &cli->qos;
+ int i;
+ struct qos_rule_t *r;
+
+ spin_lock(&qos->lock);
+ if (0 == qos->rule_no || NULL == qos->rules || 0 == qos->min_gap_between_updating_mrif) {
+ seq_printf(m, "0\n");
+ /* Make sure the upcoming for loop doesn't run */
+ qos->rule_no = 0;
+ } else {
+ seq_printf(m, "%d,%d\n", qos->rule_no, 1000000 / qos->min_gap_between_updating_mrif);
+ }
+ for (i = 0; i < qos->rule_no; ++i) {
+ r = &qos->rules[i];
+ seq_printf(m, "%llu,%llu,%llu,%llu,%u,%u,%d,%d,%u,%d,%llu,%llu,%u\n",
+ r->ack_ewma_lower, r->ack_ewma_upper,
+ r->send_ewma_lower, r->send_ewma_upper,
+ r->rtt_ratio100_lower, r->rtt_ratio100_upper,
+ r->m100, r->b100, r->tau,
+ r->used_times,
+ r->ack_ewma_avg, r->send_ewma_avg, r->rtt_ratio100_avg);
+ }
+ spin_unlock(&qos->lock);
+ return 0;
+}
+
+static ssize_t osc_qos_rules_seq_write(struct file *file,
+ const char __user *buffer,
+ size_t count, loff_t *off)
+{
+ struct obd_device *dev = ((struct seq_file *)file->private_data)->private;
+ struct client_obd *cli = &dev->u.cli;
+ struct qos_data_t *qos = &cli->qos;
+ int rc;
+ char *kernbuf = NULL;
+
+ OBD_ALLOC(kernbuf, count + 1);
+ if (NULL == kernbuf) {
+ return -ENOMEM;
+ }
+ if (copy_from_user(kernbuf, buffer, count)) {
+ rc = -EFAULT;
+ goto out_free_kernbuf;
+ }
+ /* Make sure the buf ends with a null so that sscanf won't overread */
+ kernbuf[count] = '\0';
+
+ spin_lock(&qos->lock);
+ /* parse_qos_rules() will free existing rules in qos before starting parsing */
+ rc = parse_qos_rules(kernbuf, qos);
+ if (0 == rc) {
+ /* return the number of chars consumed after a successful parse */
+ rc = count;
+ }
+ qos->ack_ewma.ea = 0;
+ qos->ack_ewma.last_time.tv_sec = 0;
+ qos->ack_ewma.last_time.tv_usec = 0;
+ qos->sent_ewma.ea = 0;
+ qos->sent_ewma.last_time.tv_sec = 0;
+ qos->sent_ewma.last_time.tv_usec = 0;
+ qos->rtt_ratio100 = 0;
+ qos->smallest_rtt = 0;
+ qos->min_usec_between_rpcs = 0;
+ spin_unlock(&qos->lock);
+out_free_kernbuf:
+ OBD_FREE(kernbuf, count + 1);
+ return rc;
+
+}
+LPROC_SEQ_FOPS(osc_qos_rules);
+#endif
+
LPROC_SEQ_FOPS_RO_TYPE(osc, uuid);
LPROC_SEQ_FOPS_RO_TYPE(osc, connect_flags);
LPROC_SEQ_FOPS_RO_TYPE(osc, blksize);
@@ -647,6 +756,10 @@ struct lprocfs_vars lprocfs_osc_obd_vars[] = {
.fops = &osc_obd_max_pages_per_rpc_fops },
{ .name = "max_rpcs_in_flight",
.fops = &osc_max_rpcs_in_flight_fops },
+#ifdef ENABLE_RLQOS
+ { .name = "min_brw_rpc_gap",
+ .fops = &osc_min_brw_rpc_gap_fops },
+#endif
{ .name = "destroys_in_flight",
.fops = &osc_destroys_in_flight_fops },
{ .name = "max_dirty_mb",
@@ -683,6 +796,10 @@ struct lprocfs_vars lprocfs_osc_obd_vars[] = {
.fops = &osc_pinger_recov_fops },
{ .name = "unstable_stats",
.fops = &osc_unstable_stats_fops },
+#ifdef ENABLE_RLQOS
+ { .name = "qos_rules",
+ .fops = &osc_qos_rules_fops },
+#endif
{ NULL }
};
diff --git a/lustre/osc/qos_rules.c b/lustre/osc/qos_rules.c
new file mode 100644
index 0000000..8db24bd
--- /dev/null
+++ b/lustre/osc/qos_rules.c
@@ -0,0 +1,125 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.sun.com/software/products/lustre/docs/GPLv2.pdf
+ *
+ * Please contact Storage Systems Research Center, Computer Science Department,
+ * University of California, Santa Cruz (www.ssrc.ucsc.edu) if you need
+ * additional information or have any questions.
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2013-2017, University of California, Santa Cruz, CA, USA.
+ * All rights reserved.
+ */
+/*
+ * This file is part of Lustre, http://www.lustre.org/
+ * Lustre is a trademark of Sun Microsystems, Inc.
+ *
+ * qos_rules.c
+ */
+#ifndef __KERNEL__
+ #include <stdio.h>
+ #include "kernel-test-primitives.h"
+ #include <string.h>
+#endif
+#include "../include/rlqos.h"
+
+/* Parse qos_rules in buf and store the result to qos.
+ *
+ * Pre-condition:
+ * 1. qos must be initialized and qos->lock MUST be held before calling this function!
+ * 2. existing rules in qos->rules will be freed
+ * 3. buf must be NUL-terminated or sscanf may overread it.
+ *
+ * Return value:
+ * 0: success
+ * other value: error code. On error, qos->rules is NULL and qos->rule_no is 0.
+ */
+int parse_qos_rules(const char *buf, struct qos_data_t *qos)
+{
+ int new_rule_no = 0;
+ int rules_per_sec = 0;
+ int rc;
+ int i;
+ const char *p = buf;
+ int n;
+ const size_t rule_size = sizeof(*(qos->rules));
+ struct qos_rule_t *r;
+
+ /* handle "0\n" and "0" */
+ if (strlen(p) <= 2 && '0' == *p) {
+ if (qos->rules) {
+ LIBCFS_FREE(qos->rules, qos->rule_no * rule_size);
+ }
+ qos->rule_no = 0;
+ qos->rules = NULL;
+ return 0;
+ }
+
+ rc = sscanf(p, "%d,%d\n%n", &new_rule_no, &rules_per_sec, &n);
+ if (2 != rc) {
+ CWARN("Input data error, can't read new_rule_no\n");
+ return -EINVAL;
+ }
+ if (0 == new_rule_no || 0 == rules_per_sec) {
+ if (qos->rules) {
+ LIBCFS_FREE(qos->rules, qos->rule_no * rule_size);
+ }
+ qos->rule_no = 0;
+ qos->rules = NULL;
+ return 0;
+ }
+ p += n;
+ if (qos->rules) {
+ LIBCFS_FREE(qos->rules, qos->rule_no * rule_size);
+ }
+ qos->rule_no = new_rule_no;
+ qos->min_gap_between_updating_mrif = 1000000 / rules_per_sec;
+ LIBCFS_ALLOC_ATOMIC(qos->rules, new_rule_no * rule_size);
+ if (!qos->rules) {
+ CWARN("Can't allocate enough mem for %d rules\n", new_rule_no);
+ return -ENOMEM;
+ }
+ memset(qos->rules, 0, new_rule_no * rule_size);
+
+ for (i = 0; i < new_rule_no; i++) {
+ r = &qos->rules[i];
+ /* Don't put \n at the end of sscanf format str
+ because there may be other unknown fields there,
+ which will be discarded later */
+ rc = sscanf(p, "%llu,%llu,%llu,%llu,%u,%u,%d,%d,%u%n",
+ &r->ack_ewma_lower, &r->ack_ewma_upper,
+ &r->send_ewma_lower, &r->send_ewma_upper,
+ &r->rtt_ratio100_lower, &r->rtt_ratio100_upper,
+ &r->m100, &r->b100, &r->tau, &n);
+ p += n;
+ if (rc != 9) {
+ CWARN("QoS rule parsing error, rc = %d\n", rc);
+ LIBCFS_FREE(qos->rules, qos->rule_no * rule_size);
+ qos->rules = NULL;
+ qos->rule_no = 0;
+ return -EINVAL;
+ }
+ /* consume all other chars till \n or end-of-buffer */
+ while (*p != '\0' && *(p++) != '\n')
+ ;
+ }
+
+ return 0;
+}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 5/6] Throttle the outgoing requests according to tau
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
` (3 preceding siblings ...)
2017-03-21 19:43 ` [lustre-devel] [PATCH 4/6] lprocfs interfaces for showing, parsing, and controlling rules Yan Li
@ 2017-03-21 19:43 ` Yan Li
2017-03-23 14:03 ` Alexey Lyashkov
2017-03-21 19:43 ` [lustre-devel] [PATCH 6/6] Adjust max_rpcs_in_flight according to metrics Yan Li
5 siblings, 1 reply; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
Signed-off-by: Yan Li <yanli@ascar.io>
---
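
Reading aid, not part of the commit: the gap computation that qos_throttle()
performs under qos->lock can be sketched in userspace as below. The helper name
rlqos_wait_usec and the plain microsecond counters (instead of struct timeval)
are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): return how many microseconds
 * the caller still has to wait so that two consecutive RPCs are at least
 * min_gap_usec apart. A min_gap_usec of 0 disables throttling.
 */
static int64_t rlqos_wait_usec(int64_t last_rpc_usec, int64_t now_usec,
                               int64_t min_gap_usec)
{
    int64_t elapsed;

    assert(min_gap_usec >= 0);
    if (min_gap_usec == 0)
        return 0;

    elapsed = now_usec - last_rpc_usec;
    /* A clock that stepped backwards counts as "no time elapsed". */
    if (elapsed < 0)
        elapsed = 0;

    return elapsed < min_gap_usec ? min_gap_usec - elapsed : 0;
}
```

In the patch the result then selects between udelay(), usleep_range() and
msleep(); the sketch stops at the arithmetic so it can be checked in isolation.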
lustre/osc/osc_cache.c | 3 +++
lustre/osc/osc_internal.h | 66 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 69 insertions(+)
diff --git a/lustre/osc/osc_cache.c b/lustre/osc/osc_cache.c
index 236263c..2f9d4e1 100644
--- a/lustre/osc/osc_cache.c
+++ b/lustre/osc/osc_cache.c
@@ -2316,6 +2316,9 @@ static int osc_io_unplug0(const struct lu_env *env, struct client_obd *cli,
} else {
CDEBUG(D_CACHE, "Queue writeback work for client %p.\n", cli);
LASSERT(cli->cl_writeback_work != NULL);
+#ifdef ENABLE_RLQOS
+ qos_throttle(&cli->qos);
+#endif
rc = ptlrpcd_queue_work(cli->cl_writeback_work);
}
return rc;
diff --git a/lustre/osc/osc_internal.h b/lustre/osc/osc_internal.h
index 06c21b3..d31d5ba 100644
--- a/lustre/osc/osc_internal.h
+++ b/lustre/osc/osc_internal.h
@@ -245,4 +245,70 @@ extern unsigned long osc_cache_shrink_count(struct shrinker *sk,
extern unsigned long osc_cache_shrink_scan(struct shrinker *sk,
struct shrink_control *sc);
+#ifdef ENABLE_RLQOS
+static inline void qos_throttle(struct qos_data_t *qos)
+{
+ struct timeval now;
+ long usec_since_last_rpc;
+ long need_sleep_usec = 0;
+
+ spin_lock(&qos->lock);
+ if (0 == qos->min_usec_between_rpcs)
+ goto out;
+
+ do_gettimeofday(&now);
+ usec_since_last_rpc = cfs_timeval_sub(&now, &qos->last_rpc_time, NULL);
+ if (usec_since_last_rpc < 0) {
+ usec_since_last_rpc = 0;
+ }
+ if (usec_since_last_rpc < qos->min_usec_between_rpcs) {
+ need_sleep_usec = qos->min_usec_between_rpcs - usec_since_last_rpc;
+ }
+ qos->last_rpc_time = now;
+out:
+ spin_unlock(&qos->lock);
+ if (0 == need_sleep_usec) {
+ return;
+ }
+
+ /* About timer ranges:
+ Ref: https://www.kernel.org/doc/Documentation/timers/timers-howto.txt */
+ if (need_sleep_usec < 1000) {
+ udelay(need_sleep_usec);
+ } else if (need_sleep_usec < 20000) {
+ usleep_range(need_sleep_usec - 1, need_sleep_usec);
+ } else {
+ msleep(need_sleep_usec / 1000);
+ }
+}
+#endif /* ENABLE_RLQOS */
+
+/* You must call LPROCFS_CLIMP_CHECK() on the obd device before and
+ * LPROCFS_CLIMP_EXIT() after calling this function. They are not called inside
+ * this function, because they may return an error code.
+ */
+static inline void set_max_rpcs_in_flight(int val, struct client_obd *cli)
+{
+ int adding, added, req_count;
+
+ adding = val - cli->cl_max_rpcs_in_flight;
+ req_count = atomic_read(&osc_pool_req_count);
+ if (adding > 0 && req_count < osc_reqpool_maxreqcount) {
+ /*
+ * There might be some race which will cause over-limit
+ * allocation, but it is fine.
+ */
+ if (req_count + adding > osc_reqpool_maxreqcount)
+ adding = osc_reqpool_maxreqcount - req_count;
+
+ added = osc_rq_pool->prp_populate(osc_rq_pool, adding);
+ atomic_add(added, &osc_pool_req_count);
+ }
+
+ spin_lock(&cli->cl_loi_list_lock);
+ cli->cl_max_rpcs_in_flight = val;
+ client_adjust_max_dirty(cli);
+ spin_unlock(&cli->cl_loi_list_lock);
+}
+
#endif /* OSC_INTERNAL_H */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 5/6] Throttle the outgoing requests according to tau
2017-03-21 19:43 ` [lustre-devel] [PATCH 5/6] Throttle the outgoing requests according to tau Yan Li
@ 2017-03-23 14:03 ` Alexey Lyashkov
0 siblings, 0 replies; 17+ messages in thread
From: Alexey Lyashkov @ 2017-03-23 14:03 UTC (permalink / raw)
To: lustre-devel
I dislike the sleep in this code.
I think you should use the req->rq_sent time to introduce the delay, the same
way the osc redo code does.
ptlrpc_check_set()
..
/* delayed send - skip */
if (req->rq_phase == RQ_PHASE_NEW && req->rq_sent)
continue;
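
A minimal sketch of that idea, as a userspace model rather than real ptlrpc
code: each request is stamped with the earliest time it may go out, and the
polling pass skips requests that are not yet due instead of sleeping. The
struct sketch_req type, its not_before_usec field and both function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued request; not struct ptlrpc_request. */
struct sketch_req {
    int64_t not_before_usec; /* earliest allowed send time */
    int     sent;            /* set once the request has been handed off */
};

/*
 * Stamp a request so that it goes out no earlier than min_gap_usec after
 * the previously stamped one. *next_slot_usec carries the schedule forward.
 */
static void sketch_stamp(struct sketch_req *req, int64_t now_usec,
                         int64_t min_gap_usec, int64_t *next_slot_usec)
{
    int64_t when = *next_slot_usec > now_usec ? *next_slot_usec : now_usec;

    assert(min_gap_usec >= 0);
    req->not_before_usec = when;
    req->sent = 0;
    *next_slot_usec = when + min_gap_usec;
}

/*
 * One pass of the polling loop: send what is due and skip the rest
 * without blocking. Returns the number of requests sent in this pass.
 */
static size_t sketch_check_set(struct sketch_req *reqs, size_t n,
                               int64_t now_usec)
{
    size_t i, sent = 0;

    for (i = 0; i < n; i++) {
        if (reqs[i].sent)
            continue;
        /* delayed send - skip */
        if (reqs[i].not_before_usec > now_usec)
            continue;
        reqs[i].sent = 1;
        sent++;
    }
    return sent;
}
```

The caller never blocks; the spacing comes from the stamps alone, which is the
property the sleep-based version lacks.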
On Tue, Mar 21, 2017 at 10:43 PM, Yan Li <yanli@ascar.io> wrote:
> Signed-off-by: Yan Li <yanli@ascar.io>
> ---
> lustre/osc/osc_cache.c | 3 +++
> lustre/osc/osc_internal.h | 66 ++++++++++++++++++++++++++++++
> +++++++++++++++++
> 2 files changed, 69 insertions(+)
>
> diff --git a/lustre/osc/osc_cache.c b/lustre/osc/osc_cache.c
> index 236263c..2f9d4e1 100644
> --- a/lustre/osc/osc_cache.c
> +++ b/lustre/osc/osc_cache.c
> @@ -2316,6 +2316,9 @@ static int osc_io_unplug0(const struct lu_env *env,
> struct client_obd *cli,
> } else {
> CDEBUG(D_CACHE, "Queue writeback work for client %p.\n",
> cli);
> LASSERT(cli->cl_writeback_work != NULL);
> +#ifdef ENABLE_RLQOS
> + qos_throttle(&cli->qos);
> +#endif
> rc = ptlrpcd_queue_work(cli->cl_writeback_work);
> }
> return rc;
> diff --git a/lustre/osc/osc_internal.h b/lustre/osc/osc_internal.h
> index 06c21b3..d31d5ba 100644
> --- a/lustre/osc/osc_internal.h
> +++ b/lustre/osc/osc_internal.h
> @@ -245,4 +245,70 @@ extern unsigned long osc_cache_shrink_count(struct
> shrinker *sk,
> extern unsigned long osc_cache_shrink_scan(struct shrinker *sk,
> struct shrink_control *sc);
>
> +#ifdef ENABLE_RLQOS
> +static inline void qos_throttle(struct qos_data_t *qos)
> +{
> + struct timeval now;
> + long usec_since_last_rpc;
> + long need_sleep_usec = 0;
> +
> + spin_lock(&qos->lock);
> + if (0 == qos->min_usec_between_rpcs)
> + goto out;
> +
> + do_gettimeofday(&now);
> + usec_since_last_rpc = cfs_timeval_sub(&now, &qos->last_rpc_time,
> NULL);
> + if (usec_since_last_rpc < 0) {
> + usec_since_last_rpc = 0;
> + }
> + if (usec_since_last_rpc < qos->min_usec_between_rpcs) {
> + need_sleep_usec = qos->min_usec_between_rpcs -
> usec_since_last_rpc;
> + }
> + qos->last_rpc_time = now;
> +out:
> + spin_unlock(&qos->lock);
> + if (0 == need_sleep_usec) {
> + return;
> + }
> +
> + /* About timer ranges:
> + Ref: https://www.kernel.org/doc/Documentation/timers/timers-howto.txt */
> + if (need_sleep_usec < 1000) {
> + udelay(need_sleep_usec);
> + } else if (need_sleep_usec < 20000) {
> + usleep_range(need_sleep_usec - 1, need_sleep_usec);
> + } else {
> + msleep(need_sleep_usec / 1000);
> + }
> +}
> +#endif /* ENABLE_RLQOS */
> +
> +/* You must call LPROCFS_CLIMP_CHECK() on the obd device before and
> + * LPROCFS_CLIMP_EXIT() after calling this function. They are not called
> inside
> + * this function, because they may return an error code.
> + */
> +static inline void set_max_rpcs_in_flight(int val, struct client_obd *cli)
> +{
> + int adding, added, req_count;
> +
> + adding = val - cli->cl_max_rpcs_in_flight;
> + req_count = atomic_read(&osc_pool_req_count);
> + if (adding > 0 && req_count < osc_reqpool_maxreqcount) {
> + /*
> + * There might be some race which will cause over-limit
> + * allocation, but it is fine.
> + */
> + if (req_count + adding > osc_reqpool_maxreqcount)
> + adding = osc_reqpool_maxreqcount - req_count;
> +
> + added = osc_rq_pool->prp_populate(osc_rq_pool, adding);
> + atomic_add(added, &osc_pool_req_count);
> + }
> +
> + spin_lock(&cli->cl_loi_list_lock);
> + cli->cl_max_rpcs_in_flight = val;
> + client_adjust_max_dirty(cli);
> + spin_unlock(&cli->cl_loi_list_lock);
> +}
> +
> #endif /* OSC_INTERNAL_H */
> --
> 1.8.3.1
>
--
Alexey Lyashkov · Technical lead for a Morpheus team
Seagate Technology, LLC
www.seagate.com
www.lustre.org
^ permalink raw reply [flat|nested] 17+ messages in thread
* [lustre-devel] [PATCH 6/6] Adjust max_rpcs_in_flight according to metrics
2017-03-21 19:43 [lustre-devel] [PATCH 0/6] Rate-limiting Quality of Service Yan Li
` (4 preceding siblings ...)
2017-03-21 19:43 ` [lustre-devel] [PATCH 5/6] Throttle the outgoing requests according to tau Yan Li
@ 2017-03-21 19:43 ` Yan Li
5 siblings, 0 replies; 17+ messages in thread
From: Yan Li @ 2017-03-21 19:43 UTC (permalink / raw)
To: lustre-devel
Signed-off-by: Yan Li <yanli@ascar.io>
---
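
Reading aid, not part of the commit: the integer-only EWMA update used by
time_ewma_add_extlock() can be modelled in userspace as below. The struct and
function names are hypothetical, and the getter assumes that
qos_get_ewma_usec() (defined in an earlier patch of this series) divides ea by
alpha_inv.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of struct time_ewma; ea holds ewma / alpha. */
struct sketch_ewma {
    uint64_t ea;        /* scaled average: ewma * alpha_inv */
    uint64_t alpha_inv; /* 1 / alpha, e.g. 8 for alpha = 0.125 */
};

/* Fold one new gap (in usec) into the scaled average. */
static void sketch_ewma_add(struct sketch_ewma *e, uint64_t gap_usec)
{
    assert(e->alpha_inv > 0);
    /* ea = ea * (1 - alpha) + gap, using integer arithmetic only */
    e->ea = e->ea / e->alpha_inv * (e->alpha_inv - 1) + gap_usec;
}

/* Recover the average itself: ewma = ea * alpha = ea / alpha_inv. */
static uint64_t sketch_ewma_get(const struct sketch_ewma *e)
{
    return e->ea / e->alpha_inv;
}
```

With alpha_inv = 8, a steady stream of 800 usec gaps settles at or slightly
below 800 because the division truncates on every step.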
lustre/osc/osc_request.c | 165 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 165 insertions(+)
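
Reading aid, not part of the commit: the fixed-point update that qos_adjust()
applies to max_rpc_in_flight100 when a rule matches can be sketched as below.
The function name and the max_mrif parameter (standing in for OSC_MAX_RIF_MAX)
are hypothetical, and the sketch widens the intermediate value to long long,
which the patch itself does not do.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the rule application in qos_adjust().
 * *mrif100 is max_rpcs_in_flight scaled by 100, m100 is a multiplier
 * scaled by 100 (a negative value skips the multiplication), and b100 is
 * an offset scaled by 100.
 * Returns the new integer max_rpcs_in_flight, never less than 1.
 */
static int sketch_next_mrif(int *mrif100, int m100, int b100, int max_mrif)
{
    long long v = *mrif100;
    int mrif;

    assert(max_mrif >= 1);
    if (m100 >= 0)
        v = v * m100 / 100; /* multiply first to keep precision */
    v += b100;
    if (v < 0)
        v = 0;
    if (v > (long long)max_mrif * 100)
        v = (long long)max_mrif * 100;
    *mrif100 = (int)v;

    mrif = *mrif100 / 100;
    return mrif < 1 ? 1 : mrif;
}
```

Keeping the state scaled by 100 lets a rule such as m100 = 105 grow the window
by 5% per update even though the value handed to set_max_rpcs_in_flight() is a
whole number.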
diff --git a/lustre/osc/osc_request.c b/lustre/osc/osc_request.c
index c59c281..8efaf5a 100644
--- a/lustre/osc/osc_request.c
+++ b/lustre/osc/osc_request.c
@@ -1613,6 +1613,156 @@ static void osc_release_ppga(struct brw_page **ppga, size_t count)
OBD_FREE(ppga, sizeof(*ppga) * count);
}
+
+#ifdef ENABLE_RLQOS
+/**
+ * te's lock should be acquired beforehand
+ */
+static void time_ewma_add_extlock(struct time_ewma *te, struct timeval *new_time) {
+ __u64 old_ea = te->ea;
+ long timediff;
+
+ if (te->last_time.tv_sec != 0) {
+ timediff = cfs_timeval_sub(new_time, &te->last_time, NULL);
+ if (timediff < 0) {
+ CDEBUG(D_INFO,
+ "(te: %p) negative timediff %ld detected, using abs value\n",
+ te, timediff);
+ timediff = -timediff;
+ }
+
+ /* Reset ea to 0 if a long gap (>10min) is detected */
+ if (timediff > 10 * 60 * ONE_MILLION) {
+ CWARN("(te: %p) Long gap detected\n", te);
+ te->ea = 0;
+ } else {
+ /* ewma = ewma * (1-alpha) + amount * alpha
+ * ea = ewma / alpha, alpha_inv = 1/alpha
+ *
+ * ea = ea / alpha_inv * (alpha_inv - 1) + timediff
+ */
+ do_div(te->ea, te->alpha_inv);
+ te->ea = te->ea * (te->alpha_inv - 1) + timediff;
+ if (te->ea > 1000000) {
+ CDEBUG(D_INFO,
+ "(te: %p) old_ea = %llu, "
+ "old_time = %ld.%ld, "
+ "new_time = %ld.%ld, new ea = %llu\n",
+ te, old_ea,
+ te->last_time.tv_sec,
+ te->last_time.tv_usec,
+ new_time->tv_sec,
+ new_time->tv_usec, te->ea);
+ }
+ }
+ } else {
+ CDEBUG(D_INFO, "(te: %p) first call\n", te);
+ }
+ te->last_time = *new_time;
+}
+
+/**
+ * Update the ack/sent EWMAs, the RTT ratio and the throughput, then apply the first matching rule.
+ */
+static int qos_adjust(struct obd_device *obd, struct timeval *new_ack_time,
+ struct timeval *new_sent_time, int op, int bytes_transferred)
+{
+ struct client_obd *cli = &obd->u.cli;
+ struct qos_data_t *qos = &cli->qos;
+ struct time_ewma *ack_ewma_p = &qos->ack_ewma;
+ struct time_ewma *sent_ewma_p = &qos->sent_ewma;
+ __u64 ack_ewma;
+ __u64 sent_ewma;
+ struct qos_rule_t *r;
+ int new_mrif = -1; /* -1 means no change needed */
+ int i;
+ struct timeval now;
+ long rtt;
+ int rtt_ratio100;
+ long usec_since_last_mrif_update;
+
+ spin_lock(&qos->lock);
+ time_ewma_add_extlock(ack_ewma_p, new_ack_time);
+ ack_ewma = qos_get_ewma_usec(ack_ewma_p);
+
+ time_ewma_add_extlock(sent_ewma_p, new_sent_time);
+ sent_ewma = qos_get_ewma_usec(sent_ewma_p);
+
+ /* calculate rtt */
+ do_gettimeofday(&now);
+ rtt = cfs_timeval_sub(&now, new_sent_time, NULL);
+ if (0 == qos->smallest_rtt || rtt < qos->smallest_rtt) {
+ qos->smallest_rtt = rtt;
+ }
+ /* a non-positive smallest_rtt would divide by zero or flip the sign */
+ rtt_ratio100 = qos->smallest_rtt > 0 ? rtt * 100 / qos->smallest_rtt : 100;
+ qos->rtt_ratio100 = rtt_ratio100;
+
+ /* Calculate throughput */
+ calc_throughput(qos, op, bytes_transferred);
+
+ /* Adjust max_rpc_in_flight according to ack_ewma and send_ewma */
+ if (NULL == qos->rules) goto out;
+ if (NULL == cli->cl_import) goto out; /* or else LPROCFS_CLIMP_CHECK may return from this function, leaving qos->lock locked */
+ for(i = 0; i < qos->rule_no; ++i) {
+ r = &qos->rules[i];
+ if (ack_ewma >= r->ack_ewma_lower &&
+ ack_ewma < r->ack_ewma_upper &&
+ sent_ewma >= r->send_ewma_lower &&
+ sent_ewma < r->send_ewma_upper &&
+ rtt_ratio100 >= r->rtt_ratio100_lower &&
+ rtt_ratio100 < r->rtt_ratio100_upper)
+ {
+ r->used_times++;
+ r->ack_ewma_avg += ((__s64)ack_ewma - (__s64)r->ack_ewma_avg) / r->used_times;
+ r->send_ewma_avg += ((__s64)sent_ewma - (__s64)r->send_ewma_avg) / r->used_times;
+ r->rtt_ratio100_avg += (rtt_ratio100 - (int)r->rtt_ratio100_avg) / r->used_times;
+
+ usec_since_last_mrif_update = cfs_timeval_sub(&now, &qos->last_mrif_update_time, NULL);
+ if (usec_since_last_mrif_update > 0 &&
+ usec_since_last_mrif_update >= qos->min_gap_between_updating_mrif) {
+ qos->last_mrif_update_time = now;
+ /* m100 is disabled when assigned negative values */
+ if (r->m100 >= 0) {
+ /* Must multiply m100 first, then div by 100 to avoid
+ * losing precision */
+ qos->max_rpc_in_flight100 *= r->m100;
+ qos->max_rpc_in_flight100 /= 100;
+ }
+ qos->max_rpc_in_flight100 += r->b100;
+ CDEBUG(D_INFO, "New max_rpc_in_flight100 = %d\n", qos->max_rpc_in_flight100);
+ if (qos->max_rpc_in_flight100 < 0) {
+ CDEBUG(D_INFO, "New max_rpc_in_flight100 is negative, reset it to 0\n");
+ qos->max_rpc_in_flight100 = 0;
+ }
+ if (qos->max_rpc_in_flight100 > OSC_MAX_RIF_MAX * 100) {
+ CDEBUG(D_INFO, "New max_rpc_in_flight100 is larger than %d, reset it to max allowed value\n", OSC_MAX_RIF_MAX * 100);
+ qos->max_rpc_in_flight100 = OSC_MAX_RIF_MAX * 100;
+ }
+ new_mrif = qos->max_rpc_in_flight100 / 100;
+ if (new_mrif < 1) {
+ CDEBUG(D_INFO, "New max_rpc_in_flight is smaller than 1, reset it to 1\n");
+ new_mrif = 1;
+ }
+ }
+ /* Update min_usec_between_rpcs to tau */
+ qos->min_usec_between_rpcs = r->tau;
+ /* set MRIF after unlocking qos->lock to prevent deadlocking */
+ break;
+ }
+ }
+out:
+ spin_unlock(&qos->lock);
+
+ if (-1 != new_mrif) { /* -1 means no change needed */
+ LPROCFS_CLIMP_CHECK(obd);
+ set_max_rpcs_in_flight(new_mrif, cli);
+ LPROCFS_CLIMP_EXIT(obd);
+ }
+ return 0;
+}
+#endif /* ENABLE_RLQOS */
+
static int brw_interpret(const struct lu_env *env,
struct ptlrpc_request *req, void *data, int rc)
{
@@ -1622,6 +1772,14 @@ static int brw_interpret(const struct lu_env *env,
struct client_obd *cli = aa->aa_cli;
ENTRY;
+#ifdef ENABLE_RLQOS
+ qos_adjust(req->rq_import->imp_obd,
+ &req->rq_arrival_time,
+ &aa->aa_oa->o_sent_time,
+ lustre_msg_get_opc(req->rq_reqmsg) - OST_READ,
+ req->rq_bulk->bd_nob_transferred);
+#endif
+
rc = osc_brw_fini_request(req, rc);
CDEBUG(D_INODE, "request %p aa %p rc %d\n", req, aa, rc);
/* When server return -EINPROGRESS, client should always retry
@@ -1874,6 +2032,10 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
list_splice_init(&rpc_list, &aa->aa_oaps);
INIT_LIST_HEAD(&aa->aa_exts);
list_splice_init(ext_list, &aa->aa_exts);
+#ifdef ENABLE_RLQOS
+ /* sent_time is used by RLQoS */
+ do_gettimeofday(&aa->aa_oa->o_sent_time);
+#endif
spin_lock(&cli->cl_loi_list_lock);
starting_offset >>= PAGE_SHIFT;
@@ -1897,6 +2059,9 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
cli->cl_w_in_flight);
OBD_FAIL_TIMEOUT(OBD_FAIL_OSC_DELAY_IO, cfs_fail_val);
+#ifdef ENABLE_RLQOS
+ qos_throttle(&cli->qos);
+#endif
ptlrpcd_add_req(req);
rc = 0;
EXIT;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 17+ messages in thread