* [patch 15/15] mm: add strictlimit knob
@ 2017-11-30 22:15 akpm
2017-12-01 12:29 ` Jan Kara
0 siblings, 1 reply; 8+ messages in thread
From: akpm @ 2017-11-30 22:15 UTC (permalink / raw)
To: linux-mm, akpm, MPatlasov, fengguang.wu, hmh, jack, mel, t.artem, tytso
From: Maxim Patlasov <MPatlasov@parallels.com>
Subject: mm: add strictlimit knob
The "strictlimit" feature was introduced to enforce per-bdi dirty limits
for FUSE which sets bdi max_ratio to 1% by default:
http://article.gmane.org/gmane.linux.kernel.mm/105809
However the feature can be useful for other relatively slow or untrusted
BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
the feature:
echo 1 > /sys/class/bdi/X:Y/strictlimit
Being enabled, the feature enforces bdi max_ratio limit even if global
(10%) dirty limit is not reached. Of course, the effect is not visible
until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Artem S. Tashkinov" <t.artem@lycos.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/ABI/testing/sysfs-class-bdi | 8 ++++
mm/backing-dev.c | 35 ++++++++++++++++++++
2 files changed, 43 insertions(+)
diff -puN Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2 Documentation/ABI/testing/sysfs-class-bdi
--- a/Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2
+++ a/Documentation/ABI/testing/sysfs-class-bdi
@@ -53,3 +53,11 @@ stable_pages_required (read-only)
If set, the backing device requires that all pages comprising a write
request must not be changed until writeout is complete.
+
+strictlimit (read-write)
+
+ Forces per-BDI checks for the share of given device in the write-back
+ cache even before the global background dirty limit is reached. This
+ is useful in situations where the global limit is much higher than
+ affordable for given relatively slow (or untrusted) device. Turning
+ strictlimit on has no visible effect if max_ratio is equal to 100%.
diff -puN mm/backing-dev.c~mm-add-strictlimit-knob-v2 mm/backing-dev.c
--- a/mm/backing-dev.c~mm-add-strictlimit-knob-v2
+++ a/mm/backing-dev.c
@@ -218,11 +218,46 @@ static ssize_t stable_pages_required_sho
}
static DEVICE_ATTR_RO(stable_pages_required);
+static ssize_t strictlimit_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ unsigned int val;
+ ssize_t ret;
+
+ ret = kstrtouint(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ switch (val) {
+ case 0:
+ bdi->capabilities &= ~BDI_CAP_STRICTLIMIT;
+ break;
+ case 1:
+ bdi->capabilities |= BDI_CAP_STRICTLIMIT;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return count;
+}
+static ssize_t strictlimit_show(struct device *dev,
+ struct device_attribute *attr, char *page)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+
+ return snprintf(page, PAGE_SIZE-1, "%d\n",
+ !!(bdi->capabilities & BDI_CAP_STRICTLIMIT));
+}
+static DEVICE_ATTR_RW(strictlimit);
+
static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
&dev_attr_stable_pages_required.attr,
+ &dev_attr_strictlimit.attr,
NULL,
};
ATTRIBUTE_GROUPS(bdi_dev);
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-11-30 22:15 [patch 15/15] mm: add strictlimit knob akpm
@ 2017-12-01 12:29 ` Jan Kara
2017-12-07 1:09 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2017-12-01 12:29 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, MPatlasov, fengguang.wu, hmh, jack, mel, t.artem, tytso
On Thu 30-11-17 14:15:58, Andrew Morton wrote:
> From: Maxim Patlasov <MPatlasov@parallels.com>
> Subject: mm: add strictlimit knob
>
> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
> for FUSE which sets bdi max_ratio to 1% by default:
>
> http://article.gmane.org/gmane.linux.kernel.mm/105809
>
> However the feature can be useful for other relatively slow or untrusted
> BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
> the feature:
>
> echo 1 > /sys/class/bdi/X:Y/strictlimit
>
> Being enabled, the feature enforces bdi max_ratio limit even if global
> (10%) dirty limit is not reached. Of course, the effect is not visible
> until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
In principle I have nothing against this and the usecase sounds reasonable
(in fact I believe the lack of a feature like this is one of reasons why
desktop automounters usually mount USB devices with 'sync' mount option).
So feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: "Artem S. Tashkinov" <t.artem@lycos.com>
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Wu Fengguang <fengguang.wu@intel.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> Documentation/ABI/testing/sysfs-class-bdi | 8 ++++
> mm/backing-dev.c | 35 ++++++++++++++++++++
> 2 files changed, 43 insertions(+)
>
> diff -puN Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2 Documentation/ABI/testing/sysfs-class-bdi
> --- a/Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2
> +++ a/Documentation/ABI/testing/sysfs-class-bdi
> @@ -53,3 +53,11 @@ stable_pages_required (read-only)
>
> If set, the backing device requires that all pages comprising a write
> request must not be changed until writeout is complete.
> +
> +strictlimit (read-write)
> +
> + Forces per-BDI checks for the share of given device in the write-back
> + cache even before the global background dirty limit is reached. This
> + is useful in situations where the global limit is much higher than
> + affordable for given relatively slow (or untrusted) device. Turning
> + strictlimit on has no visible effect if max_ratio is equal to 100%.
> diff -puN mm/backing-dev.c~mm-add-strictlimit-knob-v2 mm/backing-dev.c
> --- a/mm/backing-dev.c~mm-add-strictlimit-knob-v2
> +++ a/mm/backing-dev.c
> @@ -218,11 +218,46 @@ static ssize_t stable_pages_required_sho
> }
> static DEVICE_ATTR_RO(stable_pages_required);
>
> +static ssize_t strictlimit_store(struct device *dev,
> + struct device_attribute *attr, const char *buf, size_t count)
> +{
> + struct backing_dev_info *bdi = dev_get_drvdata(dev);
> + unsigned int val;
> + ssize_t ret;
> +
> + ret = kstrtouint(buf, 10, &val);
> + if (ret < 0)
> + return ret;
> +
> + switch (val) {
> + case 0:
> + bdi->capabilities &= ~BDI_CAP_STRICTLIMIT;
> + break;
> + case 1:
> + bdi->capabilities |= BDI_CAP_STRICTLIMIT;
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + return count;
> +}
> +static ssize_t strictlimit_show(struct device *dev,
> + struct device_attribute *attr, char *page)
> +{
> + struct backing_dev_info *bdi = dev_get_drvdata(dev);
> +
> + return snprintf(page, PAGE_SIZE-1, "%d\n",
> + !!(bdi->capabilities & BDI_CAP_STRICTLIMIT));
> +}
> +static DEVICE_ATTR_RW(strictlimit);
> +
> static struct attribute *bdi_dev_attrs[] = {
> &dev_attr_read_ahead_kb.attr,
> &dev_attr_min_ratio.attr,
> &dev_attr_max_ratio.attr,
> &dev_attr_stable_pages_required.attr,
> + &dev_attr_strictlimit.attr,
> NULL,
> };
> ATTRIBUTE_GROUPS(bdi_dev);
> _
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-01 12:29 ` Jan Kara
@ 2017-12-07 1:09 ` Andrew Morton
2017-12-07 4:14 ` Fengguang Wu
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2017-12-07 1:09 UTC (permalink / raw)
To: Jan Kara
Cc: linux-mm, MPatlasov, fengguang.wu, hmh, mel, t.artem, tytso, Jens Axboe
On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
> > From: Maxim Patlasov <MPatlasov@parallels.com>
> > Subject: mm: add strictlimit knob
> >
> > The "strictlimit" feature was introduced to enforce per-bdi dirty limits
> > for FUSE which sets bdi max_ratio to 1% by default:
> >
> > http://article.gmane.org/gmane.linux.kernel.mm/105809
> >
> > However the feature can be useful for other relatively slow or untrusted
> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
> > the feature:
> >
> > echo 1 > /sys/class/bdi/X:Y/strictlimit
> >
> > Being enabled, the feature enforces bdi max_ratio limit even if global
> > (10%) dirty limit is not reached. Of course, the effect is not visible
> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
>
> In principle I have nothing against this and the usecase sounds reasonable
> (in fact I believe the lack of a feature like this is one of reasons why
> desktop automounters usually mount USB devices with 'sync' mount option).
> So feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
Cc Jens, who may be vaguely interested in plans to finally merge this
three-year-old patch?
From: Maxim Patlasov <MPatlasov@parallels.com>
Subject: mm: add strictlimit knob
The "strictlimit" feature was introduced to enforce per-bdi dirty limits
for FUSE which sets bdi max_ratio to 1% by default:
http://article.gmane.org/gmane.linux.kernel.mm/105809
However the feature can be useful for other relatively slow or untrusted
BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
the feature:
echo 1 > /sys/class/bdi/X:Y/strictlimit
Being enabled, the feature enforces bdi max_ratio limit even if global
(10%) dirty limit is not reached. Of course, the effect is not visible
until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
Jan said:
: In principle I have nothing against this and the usecase sounds reasonable
: (in fact I believe the lack of a feature like this is one of reasons why
: desktop automounters usually mount USB devices with 'sync' mount option).
: So feel free to add:
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Artem S. Tashkinov" <t.artem@lycos.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Documentation/ABI/testing/sysfs-class-bdi | 8 ++++
mm/backing-dev.c | 35 ++++++++++++++++++++
2 files changed, 43 insertions(+)
diff -puN Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2 Documentation/ABI/testing/sysfs-class-bdi
--- a/Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2
+++ a/Documentation/ABI/testing/sysfs-class-bdi
@@ -53,3 +53,11 @@ stable_pages_required (read-only)
If set, the backing device requires that all pages comprising a write
request must not be changed until writeout is complete.
+
+strictlimit (read-write)
+
+ Forces per-BDI checks for the share of given device in the write-back
+ cache even before the global background dirty limit is reached. This
+ is useful in situations where the global limit is much higher than
+ affordable for given relatively slow (or untrusted) device. Turning
+ strictlimit on has no visible effect if max_ratio is equal to 100%.
diff -puN mm/backing-dev.c~mm-add-strictlimit-knob-v2 mm/backing-dev.c
--- a/mm/backing-dev.c~mm-add-strictlimit-knob-v2
+++ a/mm/backing-dev.c
@@ -231,11 +231,46 @@ static ssize_t stable_pages_required_sho
}
static DEVICE_ATTR_RO(stable_pages_required);
+static ssize_t strictlimit_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ unsigned int val;
+ ssize_t ret;
+
+ ret = kstrtouint(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ switch (val) {
+ case 0:
+ bdi->capabilities &= ~BDI_CAP_STRICTLIMIT;
+ break;
+ case 1:
+ bdi->capabilities |= BDI_CAP_STRICTLIMIT;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return count;
+}
+static ssize_t strictlimit_show(struct device *dev,
+ struct device_attribute *attr, char *page)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+
+ return snprintf(page, PAGE_SIZE-1, "%d\n",
+ !!(bdi->capabilities & BDI_CAP_STRICTLIMIT));
+}
+static DEVICE_ATTR_RW(strictlimit);
+
static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
&dev_attr_stable_pages_required.attr,
+ &dev_attr_strictlimit.attr,
NULL,
};
ATTRIBUTE_GROUPS(bdi_dev);
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-07 1:09 ` Andrew Morton
@ 2017-12-07 4:14 ` Fengguang Wu
2017-12-07 8:50 ` Miklos Szeredi
0 siblings, 1 reply; 8+ messages in thread
From: Fengguang Wu @ 2017-12-07 4:14 UTC (permalink / raw)
To: Andrew Morton
Cc: Jan Kara, linux-mm, MPatlasov, hmh, mel, t.artem, tytso,
Jens Axboe, Miklos Szeredi, linux-fsdevel
CC fuse maintainer, too.
On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
>On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
>
>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
>> > From: Maxim Patlasov <MPatlasov@parallels.com>
>> > Subject: mm: add strictlimit knob
>> >
>> > The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>> > for FUSE which sets bdi max_ratio to 1% by default:
>> >
>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
>> >
>> > However the feature can be useful for other relatively slow or untrusted
>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
>> > the feature:
>> >
>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
>> >
>> > Being enabled, the feature enforces bdi max_ratio limit even if global
>> > (10%) dirty limit is not reached. Of course, the effect is not visible
>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
>>
>> In principle I have nothing against this and the usecase sounds reasonable
>> (in fact I believe the lack of a feature like this is one of reasons why
>> desktop automounters usually mount USB devices with 'sync' mount option).
>> So feel free to add:
>>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>>
>
>Cc Jens, who may be vaguely interested in plans to finally merge this
>three-year-old patch?
>
>
>
>From: Maxim Patlasov <MPatlasov@parallels.com>
>Subject: mm: add strictlimit knob
>
>The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>for FUSE which sets bdi max_ratio to 1% by default:
>
>http://article.gmane.org/gmane.linux.kernel.mm/105809
That link is invalid for now, possibly due to the gmane site rebuild.
I find an email thread here which looks relevant:
https://sourceforge.net/p/fuse/mailman/message/35254883/
Where Maxim has an interesting point:
> Did any one try increasing the limit and did see any better/worse
> performance ?
We've used 20% as default value in OpenVZ kernel for a long while (1%
was not enough to saturate our distributed parallel storage).
So the knob will also enable people to _disable_ the 1% fuse limit to
increase performance.
So people can use the exposed knob in 2 ways to fit their needs, which
is in general a good thing.
However the comment in wb_position_ratio() says
Without strictlimit feature, fuse writeback may
* consume arbitrary amount of RAM because it is accounted in
* NR_WRITEBACK_TEMP which is not involved in calculating "nr_dirty".
How dangerous would that be if some user disabled the 1% fuse limit
through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
beyond the user's expectation (20% max dirty limit)?
Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
WB_WRITEBACK, which should be throttled when bdi_write_congested().
The congested flag will be set on
fuse_conn.num_background >= fuse_conn.congestion_threshold
So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
it's not included in the 20% dirty limit.
Other than that concern, the patch looks good to me.
Thanks,
Fengguang
>However the feature can be useful for other relatively slow or untrusted
>BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
>the feature:
>
>echo 1 > /sys/class/bdi/X:Y/strictlimit
>
>Being enabled, the feature enforces bdi max_ratio limit even if global
>(10%) dirty limit is not reached. Of course, the effect is not visible
>until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable value.
>
>Jan said:
>
>: In principle I have nothing against this and the usecase sounds reasonable
>: (in fact I believe the lack of a feature like this is one of reasons why
>: desktop automounters usually mount USB devices with 'sync' mount option).
>: So feel free to add:
>
>Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
>Reviewed-by: Jan Kara <jack@suse.cz>
>Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
>Cc: Theodore Ts'o <tytso@mit.edu>
>Cc: "Artem S. Tashkinov" <t.artem@lycos.com>
>Cc: Mel Gorman <mel@csn.ul.ie>
>Cc: Jan Kara <jack@suse.cz>
>Cc: Wu Fengguang <fengguang.wu@intel.com>
>Cc: Jens Axboe <axboe@kernel.dk>
>Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>---
>
> Documentation/ABI/testing/sysfs-class-bdi | 8 ++++
> mm/backing-dev.c | 35 ++++++++++++++++++++
> 2 files changed, 43 insertions(+)
>
>diff -puN Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2 Documentation/ABI/testing/sysfs-class-bdi
>--- a/Documentation/ABI/testing/sysfs-class-bdi~mm-add-strictlimit-knob-v2
>+++ a/Documentation/ABI/testing/sysfs-class-bdi
>@@ -53,3 +53,11 @@ stable_pages_required (read-only)
>
> If set, the backing device requires that all pages comprising a write
> request must not be changed until writeout is complete.
>+
>+strictlimit (read-write)
>+
>+ Forces per-BDI checks for the share of given device in the write-back
>+ cache even before the global background dirty limit is reached. This
>+ is useful in situations where the global limit is much higher than
>+ affordable for given relatively slow (or untrusted) device. Turning
>+ strictlimit on has no visible effect if max_ratio is equal to 100%.
>diff -puN mm/backing-dev.c~mm-add-strictlimit-knob-v2 mm/backing-dev.c
>--- a/mm/backing-dev.c~mm-add-strictlimit-knob-v2
>+++ a/mm/backing-dev.c
>@@ -231,11 +231,46 @@ static ssize_t stable_pages_required_sho
> }
> static DEVICE_ATTR_RO(stable_pages_required);
>
>+static ssize_t strictlimit_store(struct device *dev,
>+ struct device_attribute *attr, const char *buf, size_t count)
>+{
>+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
>+ unsigned int val;
>+ ssize_t ret;
>+
>+ ret = kstrtouint(buf, 10, &val);
>+ if (ret < 0)
>+ return ret;
>+
>+ switch (val) {
>+ case 0:
>+ bdi->capabilities &= ~BDI_CAP_STRICTLIMIT;
>+ break;
>+ case 1:
>+ bdi->capabilities |= BDI_CAP_STRICTLIMIT;
>+ break;
>+ default:
>+ return -EINVAL;
>+ }
>+
>+ return count;
>+}
>+static ssize_t strictlimit_show(struct device *dev,
>+ struct device_attribute *attr, char *page)
>+{
>+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
>+
>+ return snprintf(page, PAGE_SIZE-1, "%d\n",
>+ !!(bdi->capabilities & BDI_CAP_STRICTLIMIT));
>+}
>+static DEVICE_ATTR_RW(strictlimit);
>+
> static struct attribute *bdi_dev_attrs[] = {
> &dev_attr_read_ahead_kb.attr,
> &dev_attr_min_ratio.attr,
> &dev_attr_max_ratio.attr,
> &dev_attr_stable_pages_required.attr,
>+ &dev_attr_strictlimit.attr,
> NULL,
> };
> ATTRIBUTE_GROUPS(bdi_dev);
>_
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-07 4:14 ` Fengguang Wu
@ 2017-12-07 8:50 ` Miklos Szeredi
2017-12-07 10:15 ` Fengguang Wu
0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2017-12-07 8:50 UTC (permalink / raw)
To: Fengguang Wu
Cc: Andrew Morton, Jan Kara, linux-mm, Maxim Patlasov, hmh, mel,
t.artem, Theodore Ts'o, Jens Axboe, linux-fsdevel
On Thu, Dec 7, 2017 at 5:14 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> CC fuse maintainer, too.
>
> On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
>>
>> On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
>>
>>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
>>> > From: Maxim Patlasov <MPatlasov@parallels.com>
>>> > Subject: mm: add strictlimit knob
>>> >
>>> > The "strictlimit" feature was introduced to enforce per-bdi dirty
>>> > limits
>>> > for FUSE which sets bdi max_ratio to 1% by default:
>>> >
>>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
>>> >
>>> > However the feature can be useful for other relatively slow or
>>> > untrusted
>>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
>>> > the feature:
>>> >
>>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
>>> >
>>> > Being enabled, the feature enforces bdi max_ratio limit even if global
>>> > (10%) dirty limit is not reached. Of course, the effect is not visible
>>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable
>>> > value.
>>>
>>> In principle I have nothing against this and the usecase sounds
>>> reasonable
>>> (in fact I believe the lack of a feature like this is one of reasons why
>>> desktop automounters usually mount USB devices with 'sync' mount option).
>>> So feel free to add:
>>>
>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>
>>
>> Cc Jens, who may be vaguely interested in plans to finally merge this
>> three-year-old patch?
>>
>>
>>
>> From: Maxim Patlasov <MPatlasov@parallels.com>
>> Subject: mm: add strictlimit knob
>>
>> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>> for FUSE which sets bdi max_ratio to 1% by default:
>>
>> http://article.gmane.org/gmane.linux.kernel.mm/105809
>
>
> That link is invalid for now, possibly due to the gmane site rebuild.
> I find an email thread here which looks relevant:
>
> https://sourceforge.net/p/fuse/mailman/message/35254883/
>
> Where Maxim has an interesting point:
>
> > Did any one try increasing the limit and did see any better/worse
>> performance ?
>
> We've used 20% as default value in OpenVZ kernel for a long while (1%
> was not enough to saturate our distributed parallel storage).
>
> So the knob will also enable people to _disable_ the 1% fuse limit to
> increase performance.
>
> So people can use the exposed knob in 2 ways to fit their needs, which
> is in general a good thing.
>
> However the comment in wb_position_ratio() says
>
> Without strictlimit feature, fuse writeback may
> * consume arbitrary amount of RAM because it is accounted in
> * NR_WRITEBACK_TEMP which is not involved in calculating
> "nr_dirty".
>
> How dangerous would that be if some user disabled the 1% fuse limit
> through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
> beyond the user's expectation (20% max dirty limit)?
>
> Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
> WB_WRITEBACK, which should be throttled when bdi_write_congested().
> The congested flag will be set on
>
> fuse_conn.num_background >= fuse_conn.congestion_threshold
> So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
> it's not included in the 20% dirty limit.
Only balance_dirty_pages_ratelimited() is going to limit the
generation of dirty pages, I don't think congestion flags will do
that. And (AFAICS) for fuse only BDI_CAP_STRICTLIMIT will allow
accounting temp writeback pages when throttling dirty page generation.
So without BDI_CAP_STRICTLIMIT kernel memory use of fuse may explode.
So we probably need a way to force BDI_CAP_STRICTLIMIT (i.e. do not
permit disabling it for fuse).
Please correct me if I'm wrong in any of the above statements, it's
been a long time I've taken a detailed look at the page writeback
mechanisms.
Thanks,
Miklos
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-07 8:50 ` Miklos Szeredi
@ 2017-12-07 10:15 ` Fengguang Wu
2017-12-07 10:32 ` Miklos Szeredi
0 siblings, 1 reply; 8+ messages in thread
From: Fengguang Wu @ 2017-12-07 10:15 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Andrew Morton, Jan Kara, linux-mm, Maxim Patlasov, hmh, mel,
t.artem, Theodore Ts'o, Jens Axboe, linux-fsdevel
On Thu, Dec 07, 2017 at 09:50:23AM +0100, Miklos Szeredi wrote:
>On Thu, Dec 7, 2017 at 5:14 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>> CC fuse maintainer, too.
>>
>> On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
>>>
>>> On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
>>>
>>>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
>>>> > From: Maxim Patlasov <MPatlasov@parallels.com>
>>>> > Subject: mm: add strictlimit knob
>>>> >
>>>> > The "strictlimit" feature was introduced to enforce per-bdi dirty
>>>> > limits
>>>> > for FUSE which sets bdi max_ratio to 1% by default:
>>>> >
>>>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
>>>> >
>>>> > However the feature can be useful for other relatively slow or
>>>> > untrusted
>>>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to enable
>>>> > the feature:
>>>> >
>>>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
>>>> >
>>>> > Being enabled, the feature enforces bdi max_ratio limit even if global
>>>> > (10%) dirty limit is not reached. Of course, the effect is not visible
>>>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable
>>>> > value.
>>>>
>>>> In principle I have nothing against this and the usecase sounds
>>>> reasonable
>>>> (in fact I believe the lack of a feature like this is one of reasons why
>>>> desktop automounters usually mount USB devices with 'sync' mount option).
>>>> So feel free to add:
>>>>
>>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>>
>>>
>>> Cc Jens, who may be vaguely interested in plans to finally merge this
>>> three-year-old patch?
>>>
>>>
>>>
>>> From: Maxim Patlasov <MPatlasov@parallels.com>
>>> Subject: mm: add strictlimit knob
>>>
>>> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>>> for FUSE which sets bdi max_ratio to 1% by default:
>>>
>>> http://article.gmane.org/gmane.linux.kernel.mm/105809
>>
>>
>> That link is invalid for now, possibly due to the gmane site rebuild.
>> I find an email thread here which looks relevant:
>>
>> https://sourceforge.net/p/fuse/mailman/message/35254883/
>>
>> Where Maxim has an interesting point:
>>
>> > Did any one try increasing the limit and did see any better/worse
>>> performance ?
>>
>> We've used 20% as default value in OpenVZ kernel for a long while (1%
>> was not enough to saturate our distributed parallel storage).
>>
>> So the knob will also enable people to _disable_ the 1% fuse limit to
>> increase performance.
>>
>> So people can use the exposed knob in 2 ways to fit their needs, which
>> is in general a good thing.
>>
>> However the comment in wb_position_ratio() says
>>
>> Without strictlimit feature, fuse writeback may
>> * consume arbitrary amount of RAM because it is accounted in
>> * NR_WRITEBACK_TEMP which is not involved in calculating
>> "nr_dirty".
>>
>> How dangerous would that be if some user disabled the 1% fuse limit
>> through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
>> beyond the user's expectation (20% max dirty limit)?
>>
>> Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
>> WB_WRITEBACK, which should be throttled when bdi_write_congested().
>> The congested flag will be set on
>>
>> fuse_conn.num_background >= fuse_conn.congestion_threshold
>> So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
>> it's not included in the 20% dirty limit.
>
>Only balance_dirty_pages_ratelimited() is going to limit the
>generation of dirty pages, I don't think congestion flags will do
>that.
Right. However my concern is something to limit the generation of
fuse's _writeback_ pages.
The normal writeback pages are limited in 2 ways:
- balance_dirty_pages_ratelimited()'s dirty throttling:
nr_dirty + nr_writeback + nr_unstable < global and/or bdi dirty limit
- block layer's nr_requests queue limit
However fuse's NR_WRITEBACK_TEMP looks special and has none of such
limits. The congested bit merely affect the vmscan pageout path.
pageout
may_write_to_inode
inode_write_congested
wb_congested
I wonder if fuse has its own approach to limit NR_WRITEBACK_TEMP?
Either explicitly or implicitly, there has to be some hard limit.
>And (AFAICS) for fuse only BDI_CAP_STRICTLIMIT will allow
>accounting temp writeback pages when throttling dirty page generation.
>So without BDI_CAP_STRICTLIMIT kernel memory use of fuse may explode.
>So we probably need a way to force BDI_CAP_STRICTLIMIT (i.e. do not
>permit disabling it for fuse).
So fuse relies on small nr_dirty. Does fuse impose any explicit or
implicit rule that NR_WRITEBACK_TEMP will never exceed (N * nr_dirty)?
Otherwise the size of NR_WRITEBACK_TEMP cannot be guaranteed.
For example, is it possible for some process (eg. dd) to dirty pages
as fast as possible while some other kernel logic to convert PG_dirty
to NR_WRITEBACK_TEMP as fast as possible, so that even the 1% bdi
strictlimit (which limits PG_dirty rather than NR_WRITEBACK_TEMP)
cannot stop all memory being eat up by ever growing NR_WRITEBACK_TEMP?
Thanks,
Fengguang
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-07 10:15 ` Fengguang Wu
@ 2017-12-07 10:32 ` Miklos Szeredi
2018-01-31 22:58 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2017-12-07 10:32 UTC (permalink / raw)
To: Fengguang Wu
Cc: Andrew Morton, Jan Kara, linux-mm, Maxim Patlasov, hmh, mel,
t.artem, Theodore Ts'o, Jens Axboe, linux-fsdevel
On Thu, Dec 7, 2017 at 11:15 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Thu, Dec 07, 2017 at 09:50:23AM +0100, Miklos Szeredi wrote:
>>
>> On Thu, Dec 7, 2017 at 5:14 AM, Fengguang Wu <fengguang.wu@intel.com>
>> wrote:
>>>
>>> CC fuse maintainer, too.
>>>
>>> On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
>>>>
>>>>
>>>> On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
>>>>
>>>>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
>>>>> > From: Maxim Patlasov <MPatlasov@parallels.com>
>>>>> > Subject: mm: add strictlimit knob
>>>>> >
>>>>> > The "strictlimit" feature was introduced to enforce per-bdi dirty
>>>>> > limits
>>>>> > for FUSE which sets bdi max_ratio to 1% by default:
>>>>> >
>>>>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
>>>>> >
>>>>> > However the feature can be useful for other relatively slow or
>>>>> > untrusted
>>>>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to
>>>>> > enable
>>>>> > the feature:
>>>>> >
>>>>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
>>>>> >
>>>>> > Being enabled, the feature enforces bdi max_ratio limit even if
>>>>> > global
>>>>> > (10%) dirty limit is not reached. Of course, the effect is not
>>>>> > visible
>>>>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable
>>>>> > value.
>>>>>
>>>>> In principle I have nothing against this and the usecase sounds
>>>>> reasonable
>>>>> (in fact I believe the lack of a feature like this is one of reasons
>>>>> why
>>>>> desktop automounters usually mount USB devices with 'sync' mount
>>>>> option).
>>>>> So feel free to add:
>>>>>
>>>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>>>
>>>>
>>>> Cc Jens, who may be vaguely interested in plans to finally merge this
>>>> three-year-old patch?
>>>>
>>>>
>>>>
>>>> From: Maxim Patlasov <MPatlasov@parallels.com>
>>>> Subject: mm: add strictlimit knob
>>>>
>>>> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
>>>> for FUSE which sets bdi max_ratio to 1% by default:
>>>>
>>>> http://article.gmane.org/gmane.linux.kernel.mm/105809
>>>
>>>
>>>
>>> That link is invalid for now, possibly due to the gmane site rebuild.
>>> I find an email thread here which looks relevant:
>>>
>>> https://sourceforge.net/p/fuse/mailman/message/35254883/
>>>
>>> Where Maxim has an interesting point:
>>>
>>> > Did any one try increasing the limit and did see any
>>> better/worse
>>>>
>>>> performance ?
>>>
>>>
>>> We've used 20% as default value in OpenVZ kernel for a long while
>>> (1%
>>> was not enough to saturate our distributed parallel storage).
>>>
>>> So the knob will also enable people to _disable_ the 1% fuse limit to
>>> increase performance.
>>>
>>> So people can use the exposed knob in 2 ways to fit their needs, which
>>> is in general a good thing.
>>>
>>> However the comment in wb_position_ratio() says
>>>
>>> Without strictlimit feature, fuse writeback may
>>> * consume arbitrary amount of RAM because it is accounted in
>>> * NR_WRITEBACK_TEMP which is not involved in calculating
>>> "nr_dirty".
>>>
>>> How dangerous would that be if some user disabled the 1% fuse limit
>>> through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
>>> beyond the user's expectation (20% max dirty limit)?
>>>
>>> Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
>>> WB_WRITEBACK, which should be throttled when bdi_write_congested().
>>> The congested flag will be set on
>>>
>>> fuse_conn.num_background >= fuse_conn.congestion_threshold
>>> So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
>>> it's not included in the 20% dirty limit.
>>
>>
>> Only balance_dirty_pages_ratelimited() is going to limit the
>> generation of dirty pages, I don't think congestion flags will do
>> that.
>
>
> Right. However my concern is something to limit the generation of
> fuse's _writeback_ pages.
>
> The normal writeback pages are limited in 2 ways:
>
> - balance_dirty_pages_ratelimited()'s dirty throttling:
>
> nr_dirty + nr_writeback + nr_unstable < global and/or bdi dirty limit
>
> - block layer's nr_requests queue limit
>
> However fuse's NR_WRITEBACK_TEMP looks special and has none of such
> limits. The congested bit merely affect the vmscan pageout path.
>
> pageout
> may_write_to_inode
> inode_write_congested
> wb_congested
>
> I wonder if fuse has its own approach to limit NR_WRITEBACK_TEMP?
> Either explicitly or implicitly, there has to be some hard limit.
>
>> And (AFAICS) for fuse only BDI_CAP_STRICTLIMIT will allow
>> accounting temp writeback pages when throttling dirty page generation.
>> So without BDI_CAP_STRICTLIMIT kernel memory use of fuse may explode.
>> So we probably need a way to force BDI_CAP_STRICTLIMIT (i.e. do not
>> permit disabling it for fuse).
>
>
> So fuse relies on small nr_dirty. Does fuse impose any explicit or
> implicit rule that NR_WRITEBACK_TEMP will never exceed (N * nr_dirty)?
> Otherwise the size of NR_WRITEBACK_TEMP cannot be guaranteed.
>
> For example, is it possible for some process (eg. dd) to dirty pages
> as fast as possible while some other kernel logic to convert PG_dirty
> to NR_WRITEBACK_TEMP as fast as possible, so that even the 1% bdi
> strictlimit (which limits PG_dirty rather than NR_WRITEBACK_TEMP)
> cannot stop all memory being eat up by ever growing NR_WRITEBACK_TEMP?
Hmm, temp pages are still accounted as WB_WRITEBACK until writeback
finishes. Does that not count towards the dirty limit?
Thanks,
Miklos
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch 15/15] mm: add strictlimit knob
2017-12-07 10:32 ` Miklos Szeredi
@ 2018-01-31 22:58 ` Andrew Morton
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2018-01-31 22:58 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Fengguang Wu, Jan Kara, linux-mm, Maxim Patlasov, hmh, mel,
t.artem, Theodore Ts'o, Jens Axboe, linux-fsdevel
On Thu, 7 Dec 2017 11:32:43 +0100 Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Dec 7, 2017 at 11:15 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> > On Thu, Dec 07, 2017 at 09:50:23AM +0100, Miklos Szeredi wrote:
> >>
> >> On Thu, Dec 7, 2017 at 5:14 AM, Fengguang Wu <fengguang.wu@intel.com>
> >> wrote:
> >>>
> >>> CC fuse maintainer, too.
> >>>
> >>> On Wed, Dec 06, 2017 at 05:09:27PM -0800, Andrew Morton wrote:
> >>>>
> >>>>
> >>>> On Fri, 1 Dec 2017 13:29:28 +0100 Jan Kara <jack@suse.cz> wrote:
> >>>>
> >>>>> On Thu 30-11-17 14:15:58, Andrew Morton wrote:
> >>>>> > From: Maxim Patlasov <MPatlasov@parallels.com>
> >>>>> > Subject: mm: add strictlimit knob
> >>>>> >
> >>>>> > The "strictlimit" feature was introduced to enforce per-bdi dirty
> >>>>> > limits
> >>>>> > for FUSE which sets bdi max_ratio to 1% by default:
> >>>>> >
> >>>>> > http://article.gmane.org/gmane.linux.kernel.mm/105809
> >>>>> >
> >>>>> > However the feature can be useful for other relatively slow or
> >>>>> > untrusted
> >>>>> > BDIs like USB flash drives and DVD+RW. The patch adds a knob to
> >>>>> > enable
> >>>>> > the feature:
> >>>>> >
> >>>>> > echo 1 > /sys/class/bdi/X:Y/strictlimit
> >>>>> >
> >>>>> > Being enabled, the feature enforces bdi max_ratio limit even if
> >>>>> > global
> >>>>> > (10%) dirty limit is not reached. Of course, the effect is not
> >>>>> > visible
> >>>>> > until /sys/class/bdi/X:Y/max_ratio is decreased to some reasonable
> >>>>> > value.
> >>>>>
> >>>>> In principle I have nothing against this and the usecase sounds
> >>>>> reasonable
> >>>>> (in fact I believe the lack of a feature like this is one of reasons
> >>>>> why
> >>>>> desktop automounters usually mount USB devices with 'sync' mount
> >>>>> option).
> >>>>> So feel free to add:
> >>>>>
> >>>>> Reviewed-by: Jan Kara <jack@suse.cz>
> >>>>>
> >>>>
> >>>> Cc Jens, who may be vaguely interested in plans to finally merge this
> >>>> three-year-old patch?
> >>>>
> >>>>
> >>>>
> >>>> From: Maxim Patlasov <MPatlasov@parallels.com>
> >>>> Subject: mm: add strictlimit knob
> >>>>
> >>>> The "strictlimit" feature was introduced to enforce per-bdi dirty limits
> >>>> for FUSE which sets bdi max_ratio to 1% by default:
> >>>>
> >>>> http://article.gmane.org/gmane.linux.kernel.mm/105809
> >>>
> >>>
> >>>
> >>> That link is invalid for now, possibly due to the gmane site rebuild.
> >>> I find an email thread here which looks relevant:
> >>>
> >>> https://sourceforge.net/p/fuse/mailman/message/35254883/
> >>>
> >>> Where Maxim has an interesting point:
> >>>
> >>> > Did any one try increasing the limit and did see any
> >>> better/worse
> >>>>
> >>>> performance ?
> >>>
> >>>
> >>> We've used 20% as default value in OpenVZ kernel for a long while
> >>> (1%
> >>> was not enough to saturate our distributed parallel storage).
> >>>
> >>> So the knob will also enable people to _disable_ the 1% fuse limit to
> >>> increase performance.
> >>>
> >>> So people can use the exposed knob in 2 ways to fit their needs, which
> >>> is in general a good thing.
> >>>
> >>> However the comment in wb_position_ratio() says
> >>>
> >>> Without strictlimit feature, fuse writeback may
> >>> * consume arbitrary amount of RAM because it is accounted in
> >>> * NR_WRITEBACK_TEMP which is not involved in calculating
> >>> "nr_dirty".
> >>>
> >>> How dangerous would that be if some user disabled the 1% fuse limit
> >>> through the exposed knob? Will the NR_WRITEBACK_TEMP effect go far
> >>> beyond the user's expectation (20% max dirty limit)?
> >>>
> >>> Looking at the fuse code, NR_WRITEBACK_TEMP will grow proportional to
> >>> WB_WRITEBACK, which should be throttled when bdi_write_congested().
> >>> The congested flag will be set on
> >>>
> >>> fuse_conn.num_background >= fuse_conn.congestion_threshold
> >>> So it looks NR_WRITEBACK_TEMP will somehow be throttled. Just that
> >>> it's not included in the 20% dirty limit.
> >>
> >>
> >> Only balance_dirty_pages_ratelimited() is going to limit the
> >> generation of dirty pages, I don't think congestion flags will do
> >> that.
> >
> >
> > Right. However my concern is something to limit the generation of
> > fuse's _writeback_ pages.
> >
> > The normal writeback pages are limited in 2 ways:
> >
> > - balance_dirty_pages_ratelimited()'s dirty throttling:
> >
> > nr_dirty + nr_writeback + nr_unstable < global and/or bdi dirty limit
> >
> > - block layer's nr_requests queue limit
> >
> > However fuse's NR_WRITEBACK_TEMP looks special and has none of such
> > limits. The congested bit merely affect the vmscan pageout path.
> >
> > pageout
> > may_write_to_inode
> > inode_write_congested
> > wb_congested
> >
> > I wonder if fuse has its own approach to limit NR_WRITEBACK_TEMP?
> > Either explicitly or implicitly, there has to be some hard limit.
> >
> >> And (AFAICS) for fuse only BDI_CAP_STRICTLIMIT will allow
> >> accounting temp writeback pages when throttling dirty page generation.
> >> So without BDI_CAP_STRICTLIMIT kernel memory use of fuse may explode.
> >> So we probably need a way to force BDI_CAP_STRICTLIMIT (i.e. do not
> >> permit disabling it for fuse).
> >
> >
> > So fuse relies on small nr_dirty. Does fuse impose any explicit or
> > implicit rule that NR_WRITEBACK_TEMP will never exceed (N * nr_dirty)?
> > Otherwise the size of NR_WRITEBACK_TEMP cannot be guaranteed.
> >
> > For example, is it possible for some process (eg. dd) to dirty pages
> > as fast as possible while some other kernel logic to convert PG_dirty
> > to NR_WRITEBACK_TEMP as fast as possible, so that even the 1% bdi
> > strictlimit (which limits PG_dirty rather than NR_WRITEBACK_TEMP)
> > cannot stop all memory being eat up by ever growing NR_WRITEBACK_TEMP?
>
> Hmm, temp pages are still accounted as WB_WRITEBACK until writeback
> finishes. Does that not count towards the dirty limit?
>
This discussion died out and the patch is still "stuck" :(
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-01-31 22:58 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-30 22:15 [patch 15/15] mm: add strictlimit knob akpm
2017-12-01 12:29 ` Jan Kara
2017-12-07 1:09 ` Andrew Morton
2017-12-07 4:14 ` Fengguang Wu
2017-12-07 8:50 ` Miklos Szeredi
2017-12-07 10:15 ` Fengguang Wu
2017-12-07 10:32 ` Miklos Szeredi
2018-01-31 22:58 ` Andrew Morton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.