linux-kernel.vger.kernel.org archive mirror
* zram/zsmalloc issues in very low memory conditions
@ 2013-10-23 21:51 Olav Haugan
  2013-10-23 22:17 ` Luigi Semenzato
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Olav Haugan @ 2013-10-23 21:51 UTC (permalink / raw)
  To: minchan, sjenning; +Cc: linux-kernel, linux-mm

I am trying to use zram in very low memory conditions and I am having
some issues. zram is in the reclaim path. So if the system is very low
on memory the system is trying to reclaim pages by swapping out (in this
case to zram). However, since we are very low on memory zram fails to
get a page from zsmalloc and thus zram fails to store the page. We get
into a cycle where the system is low on memory so it tries to swap out
to get more memory but swap out fails because there is not enough memory
in the system! The major problem I am seeing is that there does not seem
to be a way for zram to tell the upper layers to stop swapping out
because the swap device is essentially "full" (since there is no more
memory available for zram pages). Has anyone thought about this issue
already and have ideas how to solve this or am I missing something and I
should not be seeing this issue?

I am also seeing a couple other issues that I was wondering whether
folks have already thought about:

1) The size of a swap device is statically computed when the swap device
is turned on (nr_swap_pages). The size of zram swap device is dynamic
since we are compressing the pages and thus the swap subsystem thinks
that the zram swap device is full when it is not really full. Any
plans/thoughts about the possibility of being able to update the size
and/or the # of available pages in a swap device on the fly?

2) zsmalloc fails when the page allocated is at physical address 0 (pfn
= 0) since the handle returned from zsmalloc is encoded as (<PFN>,
<obj_idx>) and thus the resulting handle will be 0 (since obj_idx starts
at 0). zs_malloc returns the handle but does not distinguish between a
valid handle of 0 and a failure to allocate. A possible solution to this
would be to start the obj_idx at 1. Is this feasible?
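
To make the collision concrete, here is a minimal user-space sketch of
the encoding described above. The 12-bit index width and all function
names are illustrative assumptions, not the actual zsmalloc definitions;
the sketch only shows why (pfn = 0, obj_idx = 0) collapses to 0 and how a
one-based index would avoid that.

```c
/* Hypothetical field width; real zsmalloc derives it per architecture. */
#define OBJ_INDEX_BITS 12
#define OBJ_INDEX_MASK ((1UL << OBJ_INDEX_BITS) - 1)

/* Encoding as described in the text: obj_idx starts at 0. */
unsigned long encode_handle_zero_based(unsigned long pfn,
                                       unsigned long obj_idx)
{
    return (pfn << OBJ_INDEX_BITS) | (obj_idx & OBJ_INDEX_MASK);
}

/*
 * Proposed variant: store obj_idx + 1 so that no valid handle is 0.
 * The caller must ensure obj_idx + 1 <= OBJ_INDEX_MASK.
 */
unsigned long encode_handle_one_based(unsigned long pfn,
                                      unsigned long obj_idx)
{
    return (pfn << OBJ_INDEX_BITS) | ((obj_idx + 1) & OBJ_INDEX_MASK);
}

/* Inverse of encode_handle_one_based(). */
void decode_handle_one_based(unsigned long handle,
                             unsigned long *pfn, unsigned long *obj_idx)
{
    *pfn = handle >> OBJ_INDEX_BITS;
    *obj_idx = (handle & OBJ_INDEX_MASK) - 1;
}
```

The cost of the one-based variant is that the largest usable obj_idx
shrinks by one, so the index field needs one spare value.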

Thanks,

Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-23 21:51 zram/zsmalloc issues in very low memory conditions Olav Haugan
@ 2013-10-23 22:17 ` Luigi Semenzato
  2013-10-24  0:55 ` Bob Liu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Luigi Semenzato @ 2013-10-23 22:17 UTC (permalink / raw)
  To: Olav Haugan; +Cc: Minchan Kim, Seth Jennings, linux-kernel, linux-mm

(sorry about the HTML in the previous message)

On Wed, Oct 23, 2013 at 2:51 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
> I am trying to use zram in very low memory conditions and I am having
> some issues. zram is in the reclaim path. So if the system is very low
> on memory the system is trying to reclaim pages by swapping out (in this
> case to zram). However, since we are very low on memory zram fails to
> get a page from zsmalloc and thus zram fails to store the page. We get
> into a cycle where the system is low on memory so it tries to swap out
> to get more memory but swap out fails because there is not enough memory
> in the system! The major problem I am seeing is that there does not seem
> to be a way for zram to tell the upper layers to stop swapping out
> because the swap device is essentially "full" (since there is no more
> memory available for zram pages). Has anyone thought about this issue
> already and have ideas how to solve this or am I missing something and I
> should not be seeing this issue?

What do you want the system to do at this point?  OOM kill?  Also, if
you are that low on memory, how are you preventing thrashing on the
code pages?

I am asking because we also use zram but haven't run into this
problem---however we had to deal with other problems that motivate
these questions.

>
> I am also seeing a couple other issues that I was wondering whether
> folks have already thought about:
>
> 1) The size of a swap device is statically computed when the swap device
> is turned on (nr_swap_pages). The size of zram swap device is dynamic
> since we are compressing the pages and thus the swap subsystem thinks
> that the zram swap device is full when it is not really full. Any
> plans/thoughts about the possibility of being able to update the size
> and/or the # of available pages in a swap device on the fly?

That is a known limitation of zram.  If you can predict your
compression ratio and your working set size, it's not a big problem:
allocate a swap device which, based on the expected compression ratio,
will use up RAM until what's left is just enough for the working set.
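
As a rough worked example of that sizing rule (a sketch under my own
assumptions, ignoring zram metadata and allocator fragmentation): the RAM
left over after the working set is what the compressed pool may consume,
and multiplying it by the expected compression ratio gives the disksize.
The function name and the ratio-times-100 convention are hypothetical.

```c
#include <stdint.h>

/*
 * Suggested zram disksize in bytes.
 * ratio_x100 is the expected compression ratio times 100
 * (300 means 3:1). Returns 0 if the working set needs all of RAM.
 */
uint64_t zram_disksize_bytes(uint64_t ram_bytes,
                             uint64_t working_set_bytes,
                             uint64_t ratio_x100)
{
    if (working_set_bytes >= ram_bytes)
        return 0;
    /* RAM available for compressed data, scaled to uncompressed size */
    return (ram_bytes - working_set_bytes) * ratio_x100 / 100;
}
```

For instance, with 1024 MiB of RAM, a 640 MiB working set, and an
expected 3:1 ratio, this gives (1024 - 640) * 3 = 1152 MiB.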

> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
> = 0) since the handle returned from zsmalloc is encoded as (<PFN>,
> <obj_idx>) and thus the resulting handle will be 0 (since obj_idx starts
> at 0). zs_malloc returns the handle but does not distinguish between a
> valid handle of 0 and a failure to allocate. A possible solution to this
> would be to start the obj_idx at 1. Is this feasible?
>
> Thanks,
>
> Olav Haugan
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
>


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-23 21:51 zram/zsmalloc issues in very low memory conditions Olav Haugan
  2013-10-23 22:17 ` Luigi Semenzato
@ 2013-10-24  0:55 ` Bob Liu
  2013-10-25  0:35   ` Olav Haugan
  2013-10-24 10:42 ` Weijie Yang
  2013-10-25  9:19 ` Minchan Kim
  3 siblings, 1 reply; 14+ messages in thread
From: Bob Liu @ 2013-10-24  0:55 UTC (permalink / raw)
  To: Olav Haugan; +Cc: minchan, sjenning, linux-kernel, linux-mm


On 10/24/2013 05:51 AM, Olav Haugan wrote:
> I am trying to use zram in very low memory conditions and I am having
> some issues. zram is in the reclaim path. So if the system is very low
> on memory the system is trying to reclaim pages by swapping out (in this
> case to zram). However, since we are very low on memory zram fails to
> get a page from zsmalloc and thus zram fails to store the page. We get
> into a cycle where the system is low on memory so it tries to swap out
> to get more memory but swap out fails because there is not enough memory
> in the system! The major problem I am seeing is that there does not seem
> to be a way for zram to tell the upper layers to stop swapping out
> because the swap device is essentially "full" (since there is no more
> memory available for zram pages). Has anyone thought about this issue
> already and have ideas how to solve this or am I missing something and I
> should not be seeing this issue?
> 

I have the same question as Luigi: "What do you want the system to do at
this point?"

If swap fails, the OOM killer will be triggered, so I don't think this
will be an issue.

By the way, could you give zswap a try? It can write pages to a
real swap device if the compressed pool is full.

> I am also seeing a couple other issues that I was wondering whether
> folks have already thought about:
> 
> 1) The size of a swap device is statically computed when the swap device
> is turned on (nr_swap_pages). The size of zram swap device is dynamic
> since we are compressing the pages and thus the swap subsystem thinks
> that the zram swap device is full when it is not really full. Any
> plans/thoughts about the possibility of being able to update the size
> and/or the # of available pages in a swap device on the fly?
> 
> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn

AFAIK, this will never happen.

> = 0) since the handle returned from zsmalloc is encoded as (<PFN>,
> <obj_idx>) and thus the resulting handle will be 0 (since obj_idx starts
> at 0). zs_malloc returns the handle but does not distinguish between a
> valid handle of 0 and a failure to allocate. A possible solution to this
> would be to start the obj_idx at 1. Is this feasible?
> 
> Thanks,
> 
> Olav Haugan
> 

-- 
Regards,
-Bob


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-23 21:51 zram/zsmalloc issues in very low memory conditions Olav Haugan
  2013-10-23 22:17 ` Luigi Semenzato
  2013-10-24  0:55 ` Bob Liu
@ 2013-10-24 10:42 ` Weijie Yang
  2013-10-25  9:19 ` Minchan Kim
  3 siblings, 0 replies; 14+ messages in thread
From: Weijie Yang @ 2013-10-24 10:42 UTC (permalink / raw)
  To: Olav Haugan; +Cc: minchan, sjenning, linux-kernel, linux-mm, semenzato, bob.liu

On Thu, Oct 24, 2013 at 5:51 AM, Olav Haugan <ohaugan@codeaurora.org> wrote:
> I am trying to use zram in very low memory conditions and I am having
> some issues. zram is in the reclaim path. So if the system is very low
> on memory the system is trying to reclaim pages by swapping out (in this
> case to zram). However, since we are very low on memory zram fails to
> get a page from zsmalloc and thus zram fails to store the page. We get
> into a cycle where the system is low on memory so it tries to swap out
> to get more memory but swap out fails because there is not enough memory
> in the system! The major problem I am seeing is that there does not seem
> to be a way for zram to tell the upper layers to stop swapping out
> because the swap device is essentially "full" (since there is no more
> memory available for zram pages). Has anyone thought about this issue
> already and have ideas how to solve this or am I missing something and I
> should not be seeing this issue?

I agree with Luigi and Bob.

zram's size is based on how much free memory you expect to use for zram.
In my test, the compression ratio is about 1:4; of course, your working
set may differ from mine.

Furthermore, maybe you can modify vm_swap_full() to let the kernel free
swap entries more aggressively.


> I am also seeing a couple other issues that I was wondering whether
> folks have already thought about:
>
> 1) The size of a swap device is statically computed when the swap device
> is turned on (nr_swap_pages). The size of zram swap device is dynamic
> since we are compressing the pages and thus the swap subsystem thinks
> that the zram swap device is full when it is not really full. Any
> plans/thoughts about the possibility of being able to update the size
> and/or the # of available pages in a swap device on the fly?
>
> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
> = 0) since the handle returned from zsmalloc is encoded as (<PFN>,
> <obj_idx>) and thus the resulting handle will be 0 (since obj_idx starts
> at 0). zs_malloc returns the handle but does not distinguish between a
> valid handle of 0 and a failure to allocate. A possible solution to this
> would be to start the obj_idx at 1. Is this feasible?
>
> Thanks,
>
> Olav Haugan
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-24  0:55 ` Bob Liu
@ 2013-10-25  0:35   ` Olav Haugan
  2013-10-25  1:12     ` Luigi Semenzato
  2013-10-25  2:59     ` Bob Liu
  0 siblings, 2 replies; 14+ messages in thread
From: Olav Haugan @ 2013-10-25  0:35 UTC (permalink / raw)
  To: Bob Liu; +Cc: minchan, sjenning, linux-kernel, linux-mm, semenzato

Hi Bob, Luigi,

On 10/23/2013 5:55 PM, Bob Liu wrote:
> 
> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>> I am trying to use zram in very low memory conditions and I am having
>> some issues. zram is in the reclaim path. So if the system is very low
>> on memory the system is trying to reclaim pages by swapping out (in this
>> case to zram). However, since we are very low on memory zram fails to
>> get a page from zsmalloc and thus zram fails to store the page. We get
>> into a cycle where the system is low on memory so it tries to swap out
>> to get more memory but swap out fails because there is not enough memory
>> in the system! The major problem I am seeing is that there does not seem
>> to be a way for zram to tell the upper layers to stop swapping out
>> because the swap device is essentially "full" (since there is no more
>> memory available for zram pages). Has anyone thought about this issue
>> already and have ideas how to solve this or am I missing something and I
>> should not be seeing this issue?
>>
> 
> The same question as Luigi "What do you want the system to do at this
> point?"
> 
> If swap fails then OOM killer will be triggered, I don't think this will
> be a issue.

I definitely don't want OOM killer to run since OOM killer can kill
critical processes (this is on Android so we have Android LMK to handle
the killing in a more "safe" way). However, what I am seeing is that
when I run low on memory zram fails to swap out and returns error but
the swap subsystem just continues to try to swap out even when this
error occurs (it retries over and over very rapidly, filling the
kernel log with error messages [at least two error messages per
failure, btw]).

What I expected to happen is for the swap subsystem to stop trying to
swap out until memory is available to swap out. I guess this could be
handled several ways. Either 1) the swap subsystem, upon encountering an
error to swap out, backs off from trying to swap out for some time or 2)
zram informs the swap subsystem that the swap device is full.
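
To illustrate what I mean by option 1, here is a small user-space model
of an exponential backoff. It is only a sketch: the structure, the
function names, and the millisecond delays are all hypothetical, and
nothing like this exists in the swap code today as far as I know.

```c
/* Hypothetical backoff state; not a kernel structure. */
struct swapout_backoff {
    unsigned int delay_ms;     /* wait before the next attempt; 0 = none */
    unsigned int min_delay_ms; /* first delay after a failure */
    unsigned int max_delay_ms; /* upper bound for the delay */
};

void backoff_init(struct swapout_backoff *b,
                  unsigned int min_delay_ms, unsigned int max_delay_ms)
{
    if (min_delay_ms > max_delay_ms)
        min_delay_ms = max_delay_ms;
    b->delay_ms = 0;
    b->min_delay_ms = min_delay_ms;
    b->max_delay_ms = max_delay_ms;
}

/*
 * Record the result of one swap-out attempt and return how long the
 * caller should wait (in ms) before trying again.
 */
unsigned int backoff_update(struct swapout_backoff *b, int write_failed)
{
    if (!write_failed)
        b->delay_ms = 0;                  /* success: resume at full speed */
    else if (b->delay_ms == 0)
        b->delay_ms = b->min_delay_ms;    /* first failure in a row */
    else if (b->delay_ms > b->max_delay_ms / 2)
        b->delay_ms = b->max_delay_ms;    /* clamp instead of overflowing */
    else
        b->delay_ms *= 2;                 /* repeated failure: double */
    return b->delay_ms;
}
```

A single successful write resets the delay, so swap-out would resume
immediately once zram can allocate memory again.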

Could this be handled by congestion control? However, I found the
following comment in the code in vmscan.c:

* If the page is swapcache, write it back even if that would
* block, for some throttling. This happens by accident, because
* swap_backing_dev_info is bust: it doesn't reflect the
* congestion state of the swapdevs.  Easy to fix, if needed.

However, how would one update the congested state of zram when it
becomes un-congested?

> By the way, could you take a try with zswap? Which can write pages to
> real swap device if compressed pool is full.

zswap might not be feasible in all cases if you only have flash as
backing storage.

>> I am also seeing a couple other issues that I was wondering whether
>> folks have already thought about:
>>
>> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
> 
> AFAIK, this will never happen.

I can easily get this to happen since I have memory starting at physical
address 0.

Thanks,

Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-25  0:35   ` Olav Haugan
@ 2013-10-25  1:12     ` Luigi Semenzato
  2013-10-31 23:34       ` Olav Haugan
  2013-10-25  2:59     ` Bob Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Luigi Semenzato @ 2013-10-25  1:12 UTC (permalink / raw)
  To: Olav Haugan; +Cc: Bob Liu, Minchan Kim, Seth Jennings, linux-kernel, linux-mm

On Thu, Oct 24, 2013 at 5:35 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
> Hi Bob, Luigi,
>
> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>
>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>> I am trying to use zram in very low memory conditions and I am having
>>> some issues. zram is in the reclaim path. So if the system is very low
>>> on memory the system is trying to reclaim pages by swapping out (in this
>>> case to zram). However, since we are very low on memory zram fails to
>>> get a page from zsmalloc and thus zram fails to store the page. We get
>>> into a cycle where the system is low on memory so it tries to swap out
>>> to get more memory but swap out fails because there is not enough memory
>>> in the system! The major problem I am seeing is that there does not seem
>>> to be a way for zram to tell the upper layers to stop swapping out
>>> because the swap device is essentially "full" (since there is no more
>>> memory available for zram pages). Has anyone thought about this issue
>>> already and have ideas how to solve this or am I missing something and I
>>> should not be seeing this issue?
>>>
>>
>> The same question as Luigi "What do you want the system to do at this
>> point?"
>>
>> If swap fails then OOM killer will be triggered, I don't think this will
>> be a issue.
>
> I definitely don't want OOM killer to run since OOM killer can kill
> critical processes (this is on Android so we have Android LMK to handle
> the killing in a more "safe" way). However, what I am seeing is that
> when I run low on memory zram fails to swap out and returns error but
> the swap subsystem just continues to try to swap out even when this
> error occurs (it tries over and over again very rapidly causing the
> kernel to be filled with error messages [at least two error messages per
> failure btw]).
>
> What I expected to happen is for the swap subsystem to stop trying to
> swap out until memory is available to swap out. I guess this could be
> handled several ways. Either 1) the swap subsystem, upon encountering an
> error to swap out, backs off from trying to swap out for some time or 2)
> zram informs the swap subsystem that the swap device is full.

There is a lot I don't know, both about the specifics of your case and
the MM subsystem in general, so I'll make some guesses.  Don't trust
anything I say here (as if you would anyway :-).

As the system gets low on memory, the MM tries to reclaim it in
various ways.  The biggest (I think) sources of reclaim come from
swapping out anonymous pages (process data), and discarding
file-backed pages (code, for instance).  The "swappiness" parameter
decides how much attention to give to each of these types of memory.

It's possible that you get in a situation in which attempts to swap
out anonymous pages with zram fail because you're out of memory at
that point, but then some memory is reclaimed by discarding file
pages, and that's why you don't see OOM kills or kernel panic.  Either
way you should be really close to that moment, unless something funny
is going on.

You could try to snapshot the memory usage when those messages are
produced.  You can, for instance, use SysRq-M to dump a bunch of data
in the syslog.  You may also want to monitor the zram device
utilization through sysfs.
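
For the sysfs part, a sketch along these lines might do. It assumes the
device exposes two single-value attributes holding the original and
compressed data sizes; I believe kernels of this era call them
orig_data_size and compr_data_size under /sys/block/zram0/, but treat
those paths as an assumption and check your kernel. The function names
are mine.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style file.
 * Returns 0 on success, -1 on failure.
 */
int read_u64_attr(const char *path, unsigned long long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%llu", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Compression ratio times 100 (400 means 4:1), computed as
 * original size / compressed size. Returns -1 if either value is
 * unavailable or nothing has been stored yet.
 */
long zram_ratio_x100(const char *orig_path, const char *compr_path)
{
    unsigned long long orig, compr;

    if (read_u64_attr(orig_path, &orig) != 0 ||
        read_u64_attr(compr_path, &compr) != 0 || compr == 0)
        return -1;
    return (long)(orig * 100 / compr);
}
```

Sampling this periodically while the errors appear should show whether
the device is actually filling up or whether allocations fail earlier.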

It's possible that by the time you see those messages you're already
thrashing badly and that things slowed down so much that you aren't
quite getting to the OOM killer.  You could try to reduce the size of
your swap device, and/or change the swappiness.

By the way, I am under the impression that Android uses the OOM killer
as part of their memory management strategy.


> Could this be handled by congestion control? However, I found the
> following comment in the code in vmscan.c:
>
> * If the page is swapcache, write it back even if that would
> * block, for some throttling. This happens by accident, because
> * swap_backing_dev_info is bust: it doesn't reflect the
> * congestion state of the swapdevs.  Easy to fix, if needed.
>
> However, how would one update the congested state of zram when it
> becomes un-congested?
>
>> By the way, could you take a try with zswap? Which can write pages to
>> real swap device if compressed pool is full.
>
> zswap might not be feasible in all cases if you only have flash as
> backing storage.

Zswap can be configured to run without a backing storage.

>
>>> I am also seeing a couple other issues that I was wondering whether
>>> folks have already thought about:
>>>
>>> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
>>
>> AFAIK, this will never happen.
>
> I can easily get this to happen since I have memory starting at physical
> address 0.
>
> Thanks,
>
> Olav Haugan
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-25  0:35   ` Olav Haugan
  2013-10-25  1:12     ` Luigi Semenzato
@ 2013-10-25  2:59     ` Bob Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Bob Liu @ 2013-10-25  2:59 UTC (permalink / raw)
  To: Olav Haugan; +Cc: minchan, sjenning, linux-kernel, linux-mm, semenzato

On 10/25/2013 08:35 AM, Olav Haugan wrote:
> Hi Bob, Luigi,
> 
> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>
>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>> I am trying to use zram in very low memory conditions and I am having
>>> some issues. zram is in the reclaim path. So if the system is very low
>>> on memory the system is trying to reclaim pages by swapping out (in this
>>> case to zram). However, since we are very low on memory zram fails to
>>> get a page from zsmalloc and thus zram fails to store the page. We get
>>> into a cycle where the system is low on memory so it tries to swap out
>>> to get more memory but swap out fails because there is not enough memory
>>> in the system! The major problem I am seeing is that there does not seem
>>> to be a way for zram to tell the upper layers to stop swapping out
>>> because the swap device is essentially "full" (since there is no more
>>> memory available for zram pages). Has anyone thought about this issue
>>> already and have ideas how to solve this or am I missing something and I
>>> should not be seeing this issue?
>>>
>>
>> The same question as Luigi "What do you want the system to do at this
>> point?"
>>
>> If swap fails then OOM killer will be triggered, I don't think this will
>> be a issue.
> 
> I definitely don't want OOM killer to run since OOM killer can kill
> critical processes (this is on Android so we have Android LMK to handle
> the killing in a more "safe" way). However, what I am seeing is that
> when I run low on memory zram fails to swap out and returns error but
> the swap subsystem just continues to try to swap out even when this
> error occurs (it tries over and over again very rapidly causing the
> kernel to be filled with error messages [at least two error messages per
> failure btw]).
> 

A simple way to disable the error messages is to delete the printk line
in the zram source code.

> What I expected to happen is for the swap subsystem to stop trying to
> swap out until memory is available to swap out. I guess this could be

In my opinion, the system is already under heavy memory pressure
when zram fails to allocate a page.

In this case, the OOM killer or Low Memory Killer should be woken up to
kill some processes and free some memory pages.

After this happens, the system's free memory may rise above the
watermark, and no swapping will happen. Even if swapping happens again,
it's unlikely that zram will fail to allocate a page. If it fails again,
the OOM killer or LMK should be triggered once more.

> handled several ways. Either 1) the swap subsystem, upon encountering an
> error to swap out, backs off from trying to swap out for some time or 2)
> zram informs the swap subsystem that the swap device is full.
> 
> Could this be handled by congestion control? However, I found the
> following comment in the code in vmscan.c:
> 
> * If the page is swapcache, write it back even if that would
> * block, for some throttling. This happens by accident, because
> * swap_backing_dev_info is bust: it doesn't reflect the
> * congestion state of the swapdevs.  Easy to fix, if needed.
> 
> However, how would one update the congested state of zram when it
> becomes un-congested?
> 
>> By the way, could you take a try with zswap? Which can write pages to
>> real swap device if compressed pool is full.
> 
> zswap might not be feasible in all cases if you only have flash as
> backing storage.
> 

Yes, that's still a problem with zswap. Perhaps you can create a swap
file on the backing storage.

I'll try to add a fake swap device for zswap, so that users have one
more choice besides zram.

>>> I am also seeing a couple other issues that I was wondering whether
>>> folks have already thought about:
>>>
>>> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
>>
>> AFAIK, this will never happen.
> 
> I can easily get this to happen since I have memory starting at physical
> address 0.
> 

Could you confirm this? AFAIR, physical memory starting at 0 is usually
reserved for some special usage.

I ran 'cat /proc/zoneinfo' on x86 and ARM, and in neither case was
'start_pfn' 0.
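
If it helps with checking, the same thing can be done programmatically.
This is only a sketch: it assumes the "start_pfn:" line format that
/proc/zoneinfo prints, and the function name is made up. Note that it
reports where zones start, which is not necessarily the same as the
lowest page the allocator can hand out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan zoneinfo-formatted text for "start_pfn:" lines and store the
 * lowest value in *min_pfn. Returns the number of such lines found;
 * *min_pfn is left untouched when the result is 0.
 */
int lowest_start_pfn(FILE *f, unsigned long *min_pfn)
{
    char line[256];
    int found = 0;

    while (fgets(line, sizeof(line), f)) {
        const char *p = strstr(line, "start_pfn:");
        unsigned long pfn;

        if (p && sscanf(p, "start_pfn: %lu", &pfn) == 1) {
            if (!found || pfn < *min_pfn)
                *min_pfn = pfn;
            found++;
        }
    }
    return found;
}
```

Pass it a stream opened on /proc/zoneinfo; a lowest value of 0 would
mean at least one zone starts at physical address 0.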

-- 
Regards,
-Bob


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-23 21:51 zram/zsmalloc issues in very low memory conditions Olav Haugan
                   ` (2 preceding siblings ...)
  2013-10-24 10:42 ` Weijie Yang
@ 2013-10-25  9:19 ` Minchan Kim
  2013-11-02  0:59   ` Olav Haugan
  3 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2013-10-25  9:19 UTC (permalink / raw)
  To: Olav Haugan; +Cc: sjenning, linux-kernel, linux-mm

Hello,

I haven't had enough time to think over your great questions since I have
been busy enjoying Edinburgh, so if I missed something, sorry!

On Wed, Oct 23, 2013 at 02:51:34PM -0700, Olav Haugan wrote:
> I am trying to use zram in very low memory conditions and I am having
> some issues. zram is in the reclaim path. So if the system is very low
> on memory the system is trying to reclaim pages by swapping out (in this
> case to zram). However, since we are very low on memory zram fails to
> get a page from zsmalloc and thus zram fails to store the page. We get
> into a cycle where the system is low on memory so it tries to swap out
> to get more memory but swap out fails because there is not enough memory
> in the system! The major problem I am seeing is that there does not seem
> to be a way for zram to tell the upper layers to stop swapping out

True. zram is a block device, so for the moment I don't want to make zram
swap-specific if possible.

> because the swap device is essentially "full" (since there is no more
> memory available for zram pages). Has anyone thought about this issue
> already and have ideas how to solve this or am I missing something and I
> should not be seeing this issue?

That's true. We might need a feedback loop, and it shouldn't be specific
to zram-swap. One thing I can imagine is moving the failed victim
pages onto the LRU active list when the swapout fails, so the VM gives
more weight to file pages than to anon ones. For details, see
AOP_WRITEPAGE_ACTIVATE and get_scan_count.

The problem is that this lives in the fs layer while zram is in the block
layer, so what I can think of at the moment is the following:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8ed1b77..c80b0b4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -502,6 +502,8 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
                if (!PageWriteback(page)) {
                        /* synchronous write or broken a_ops? */
                        ClearPageReclaim(page);
+                       if (PageError(page))
+                               return PAGE_ACTIVATE;
                }
                trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
                inc_zone_page_state(page, NR_VMSCAN_WRITE);


It doesn't prevent swapout entirely, but it should throttle the selection
of anonymous pages for reclaim, so the VM will prefer file-backed pages;
eventually zsmalloc will succeed in allocating a free page and swapout
will resume.
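
The decision the patch adds can be modelled outside the kernel roughly
like this. It is a simplified sketch, not the real pageout(): the enum
reuses the kernel's pageout_t names for readability, while struct
mock_page and the function name are hypothetical stand-ins for the page
flags involved.

```c
/* Mirrors the names of the kernel's pageout_t values. */
enum pageout_result { PAGE_KEEP, PAGE_ACTIVATE, PAGE_SUCCESS, PAGE_CLEAN };

/* Hypothetical stand-in for the page flags the patch looks at. */
struct mock_page {
    int writeback; /* models PageWriteback() */
    int error;     /* models PageError() */
    int reclaim;   /* models PageReclaim() */
};

/* Models the tail of pageout() after writepage has been called. */
enum pageout_result pageout_after_writepage(struct mock_page *page)
{
    if (!page->writeback) {
        /* Synchronous write: the I/O has already completed. */
        page->reclaim = 0;
        /* New with the patch: a failed write re-activates the page. */
        if (page->error)
            return PAGE_ACTIVATE;
    }
    return PAGE_SUCCESS;
}
```

Only a write that has already completed with an error takes the new
path; pages still under writeback behave as before.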

> 
> I am also seeing a couple other issues that I was wondering whether
> folks have already thought about:
> 
> 1) The size of a swap device is statically computed when the swap device
> is turned on (nr_swap_pages). The size of zram swap device is dynamic
> since we are compressing the pages and thus the swap subsystem thinks
> that the zram swap device is full when it is not really full. Any
> plans/thoughts about the possibility of being able to update the size
> and/or the # of available pages in a swap device on the fly?

That's a really good question. We could make zram's disksize bigger from
the beginning to prevent such a problem, but in that case
zram's metadata (i.e., struct table) size will increase a bit. Is that
memory overhead critical for you?
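
To put a rough number on it: the table needs one entry per page of
disksize, so the overhead scales linearly. The sketch below takes the
page size and per-entry size as parameters because both depend on the
architecture and kernel version; the function name and the example
values are assumptions for illustration only.

```c
#include <stdint.h>

/*
 * Approximate size of zram's per-page table for a given disksize.
 * entry_size is the size of one table entry in bytes.
 */
uint64_t zram_table_bytes(uint64_t disksize_bytes, uint64_t page_size,
                          uint64_t entry_size)
{
    uint64_t entries;

    if (page_size == 0)
        return 0;
    /* one entry per page, rounding a partial last page up */
    entries = (disksize_bytes + page_size - 1) / page_size;
    return entries * entry_size;
}
```

For example, a 1 GiB disksize with 4 KiB pages and a hypothetical
16-byte entry gives 262144 entries, or 4 MiB of table.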

> 
> 2) zsmalloc fails when the page allocated is at physical address 0 (pfn
> = 0) since the handle returned from zsmalloc is encoded as (<PFN>,
> <obj_idx>) and thus the resulting handle will be 0 (since obj_idx starts
> at 0). zs_malloc returns the handle but does not distinguish between a
> valid handle of 0 and a failure to allocate. A possible solution to this
> would be to start the obj_idx at 1. Is this feasible?

I think it's doable.

> 
> Thanks,
> 
> Olav Haugan
> 
> -- 
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation

-- 
Kind regards,
Minchan Kim


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-25  1:12     ` Luigi Semenzato
@ 2013-10-31 23:34       ` Olav Haugan
  2013-11-01  0:27         ` Luigi Semenzato
  2013-11-01  0:35         ` Bob Liu
  0 siblings, 2 replies; 14+ messages in thread
From: Olav Haugan @ 2013-10-31 23:34 UTC (permalink / raw)
  To: Luigi Semenzato
  Cc: Bob Liu, Minchan Kim, Seth Jennings, linux-kernel, linux-mm

Hi Luigi,

On 10/24/2013 6:12 PM, Luigi Semenzato wrote:
> On Thu, Oct 24, 2013 at 5:35 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
>> Hi Bob, Luigi,
>>
>> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>>
>>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>
>>> By the way, could you take a try with zswap? Which can write pages to
>>> real swap device if compressed pool is full.
>>
>> zswap might not be feasible in all cases if you only have flash as
>> backing storage.
> 
> Zswap can be configured to run without a backing storage.
> 

I was under the impression that zswap requires a backing storage. Can
you elaborate on how to configure zswap to not need a backing storage?


Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-31 23:34       ` Olav Haugan
@ 2013-11-01  0:27         ` Luigi Semenzato
  2013-11-02  7:40           ` Stephen Barber
  2013-11-01  0:35         ` Bob Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Luigi Semenzato @ 2013-11-01  0:27 UTC (permalink / raw)
  To: Olav Haugan, Stephen Barber
  Cc: Bob Liu, Minchan Kim, Seth Jennings, linux-kernel, linux-mm

[apologies for the previous HTML email]

Hi Olav,

I haven't personally done it.  Seth outlines the configuration in this thread:

http://thread.gmane.org/gmane.linux.kernel.mm/105378/focus=105543

Stephen, can you add more detail from your experience?

Thanks!


On Thu, Oct 31, 2013 at 4:34 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
> Hi Luigi,
>
> On 10/24/2013 6:12 PM, Luigi Semenzato wrote:
>> On Thu, Oct 24, 2013 at 5:35 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
>>> Hi Bob, Luigi,
>>>
>>> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>>>
>>>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>>
>>>> By the way, could you take a try with zswap? Which can write pages to
>>>> real swap device if compressed pool is full.
>>>
>>> zswap might not be feasible in all cases if you only have flash as
>>> backing storage.
>>
>> Zswap can be configured to run without a backing storage.
>>
>
> I was under the impression that zswap requires a backing storage. Can
> you elaborate on how to configure zswap to not need a backing storage?
>
>
> Olav Haugan
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-31 23:34       ` Olav Haugan
  2013-11-01  0:27         ` Luigi Semenzato
@ 2013-11-01  0:35         ` Bob Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Bob Liu @ 2013-11-01  0:35 UTC (permalink / raw)
  To: Olav Haugan
  Cc: Luigi Semenzato, Minchan Kim, Seth Jennings, linux-kernel, linux-mm

Hi Olav,

On 11/01/2013 07:34 AM, Olav Haugan wrote:
> Hi Luigi,
> 
> On 10/24/2013 6:12 PM, Luigi Semenzato wrote:
>> On Thu, Oct 24, 2013 at 5:35 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
>>> Hi Bob, Luigi,
>>>
>>> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>>>
>>>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>>
>>>> By the way, could you take a try with zswap? Which can write pages to
>>>> real swap device if compressed pool is full.
>>>
>>> zswap might not be feasible in all cases if you only have flash as
>>> backing storage.
>>
>> Zswap can be configured to run without a backing storage.
>>
> 
> I was under the impression that zswap requires a backing storage. Can
> you elaborate on how to configure zswap to not need a backing storage?
> 

AFAIK, zswap currently can't be used without backing storage.
Perhaps you could try creating a swap device backed by a file on
storage.

-- 
Regards,
-Bob


* Re: zram/zsmalloc issues in very low memory conditions
  2013-10-25  9:19 ` Minchan Kim
@ 2013-11-02  0:59   ` Olav Haugan
  2013-11-02  2:50     ` Bob Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Olav Haugan @ 2013-11-02  0:59 UTC (permalink / raw)
  To: Minchan Kim; +Cc: sjenning, linux-kernel, linux-mm

On 10/25/2013 2:19 AM, Minchan Kim wrote:
> Hello,
> 
> I have not had enough time to think over your great questions because I am
> busy enjoying Edinburgh, so if I miss something, sorry!
> 
> On Wed, Oct 23, 2013 at 02:51:34PM -0700, Olav Haugan wrote:
>> I am trying to use zram in very low memory conditions and I am having
>> some issues. zram is in the reclaim path. So if the system is very low
>> on memory the system is trying to reclaim pages by swapping out (in this
>> case to zram). However, since we are very low on memory zram fails to
>> get a page from zsmalloc and thus zram fails to store the page. We get
>> into a cycle where the system is low on memory so it tries to swap out
>> to get more memory but swap out fails because there is not enough memory
>> in the system! The major problem I am seeing is that there does not seem
>> to be a way for zram to tell the upper layers to stop swapping out
> 
> True. zram is a block device, so for the moment I would rather not make zram
> swap-specific if possible.
> 
>> because the swap device is essentially "full" (since there is no more
>> memory available for zram pages). Has anyone thought about this issue
>> already and have ideas how to solve this or am I missing something and I
>> should not be seeing this issue?
> 
> It's true. We might need a feedback loop, and it shouldn't be specific to
> zram-swap. One thing I can imagine is that we could move failed victim
> pages onto the LRU active list when the swapout fails, so the VM gives more
> weight to file pages than to anon ones. For details, see
> AOP_WRITEPAGE_ACTIVATE and get_scan_count.
> 
> The problem is that this lives in the fs layer while zram is in the block
> layer, so what I can think of at the moment is the following:
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 8ed1b77..c80b0b4 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -502,6 +502,8 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
>                 if (!PageWriteback(page)) {
>                         /* synchronous write or broken a_ops? */
>                         ClearPageReclaim(page);
> +                       if (PageError(page))
> +                               return PAGE_ACTIVATE;
>                 }
>                 trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
>                 inc_zone_page_state(page, NR_VMSCAN_WRITE);
> 
> 
> It doesn't prevent swapout entirely, but it should throttle the selection of
> anonymous pages for reclaim, so the VM will prefer file-backed pages; at some
> point zsmalloc should succeed in allocating a free page and swapout will resume.

I tried the above suggestion but it does not seem to have any noticeable
impact. The system is still trying to swap out at a very high rate after
zram reported failure to swap out. The error logging is so heavy
that my system crashed (we have a watchdog that
is not getting pet because the kernel is busy logging kernel messages).

Is there anything that can be set to tell the fs layer to back off
completely for a while (congestion control)?
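
To make the logging problem described above concrete, here is a minimal user-space sketch of interval/burst rate limiting, the same general idea the kernel exposes through helpers such as pr_info_ratelimited(). The struct ratelimit type and ratelimit_allow() function below are hypothetical names invented for this illustration; they are not the kernel API and not part of any patch in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of interval/burst rate limiting.
 * At most `burst` messages are allowed per `interval_ms` window;
 * the rest are counted as missed instead of being printed.
 */
struct ratelimit {
	uint64_t interval_ms;   /* length of one window */
	unsigned burst;         /* messages allowed per window */
	uint64_t window_start;  /* timestamp at which the current window began */
	unsigned printed;       /* messages allowed in the current window */
	unsigned missed;        /* messages suppressed in the current window */
};

/* Return true if a message arriving at time now_ms may be emitted. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
	/* Open a new window once the current one has expired. */
	if (now_ms - rl->window_start >= rl->interval_ms) {
		rl->window_start = now_ms;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}
```

With a limiter like this in front of the allocation-failure message, a storm of failed swap-out attempts would produce a bounded number of log lines per window. It would only address the logging overhead, not the underlying swap-out retry loop.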


Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


* Re: zram/zsmalloc issues in very low memory conditions
  2013-11-02  0:59   ` Olav Haugan
@ 2013-11-02  2:50     ` Bob Liu
  0 siblings, 0 replies; 14+ messages in thread
From: Bob Liu @ 2013-11-02  2:50 UTC (permalink / raw)
  To: Olav Haugan; +Cc: Minchan Kim, sjenning, linux-kernel, linux-mm

Hi Olav,

On 11/02/2013 08:59 AM, Olav Haugan wrote:

> 
> I tried the above suggestion but it does not seem to have any noticeable
> impact. The system is still trying to swap out at a very high rate after
> zram reported failure to swap out. The error logging is actually so much
> that my system crashed due to excessive logging (we have a watchdog that
> is not getting pet because the kernel is busy logging kernel messages).
> 

One question: why didn't the low memory killer get triggered in
this situation?
Would it be possible to make the LMK a bit more aggressive?

> There isn't anything that can be set to tell the fs layer to back off
> completely for a while (congestion control)?
> 

Another way that I think might fix your issue is the one you mentioned
in your previous email:
set the congested bit for the swap device as well.
For example:

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 91d94b5..c4fc63e 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -474,6 +474,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
        if (!handle) {
                pr_info("Error allocating memory for compressed page: %u, size=%zu\n",
                        index, clen);
+               blk_set_queue_congested(zram->disk->queue, BLK_RW_ASYNC);
                ret = -ENOMEM;
                goto out;
        }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8ed1b77..1c790ee 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -394,8 +394,6 @@ static inline int is_page_cache_freeable(struct page *page)
 static int may_write_to_queue(struct backing_dev_info *bdi,
                              struct scan_control *sc)
 {
-       if (current->flags & PF_SWAPWRITE)
-               return 1;

--------------------------------------------------------------

As for updating the congested state of zram, I think you can clear it
from user space, e.g. after the LMK has triggered and reclaimed some memory.

Of course, this depends on the zram driver exporting a sysfs node such as
"/sys/block/zram0/clear_congested".
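
As a rough illustration of the user-space side, the sketch below writes "1" to such an attribute file. Note that "/sys/block/zram0/clear_congested" is only the node proposed above and does not exist in the zram driver; the helper names write_attr() and zram_clear_congested() are likewise made up for this example, and the value "1" is an assumed convention.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a string to a sysfs-style attribute file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_attr(const char *path, const char *value)
{
	assert(path != NULL && value != NULL);

	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	size_t len = strlen(value);
	ssize_t n = write(fd, value, len);
	int err = 0;

	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;	/* treat a short write as an error */

	close(fd);
	return err;
}

/*
 * Hypothetical helper: ask zram to clear its congested state.
 * attr_path would be the proposed (not existing) node, e.g.
 * "/sys/block/zram0/clear_congested".
 */
static int zram_clear_congested(const char *attr_path)
{
	return write_attr(attr_path, "1");
}
```

A user-space low memory killer could call zram_clear_congested() once it has freed memory; a return value of -ENOENT would indicate that the running kernel does not provide the node.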

-- 
Regards,
-Bob


* Re: zram/zsmalloc issues in very low memory conditions
  2013-11-01  0:27         ` Luigi Semenzato
@ 2013-11-02  7:40           ` Stephen Barber
  0 siblings, 0 replies; 14+ messages in thread
From: Stephen Barber @ 2013-11-02  7:40 UTC (permalink / raw)
  To: Luigi Semenzato
  Cc: Olav Haugan, Bob Liu, Minchan Kim, Seth Jennings, linux-kernel, linux-mm

I did get zswap working without a backing store, using Bob's patches
from http://thread.gmane.org/gmane.linux.kernel.mm/105627/focus=105642
applied to 3.11-rc6. The major issue we had with it was that i915
graphics buffers were getting corrupted somehow, which wasn't a
problem with zram on the same kernel version. I'm sure that problem
could be fixed with a little more investigation, though, and I don't
think there were any other major problems with it in our (admittedly
not very exhaustive) testing of those patches.

Hope that helps,
Steve Barber

On Thu, Oct 31, 2013 at 5:27 PM, Luigi Semenzato <semenzato@google.com> wrote:
> [apologies for the previous HTML email]
>
> Hi Olav,
>
> I haven't personally done it.  Seth outlines the configuration in this thread:
>
> http://thread.gmane.org/gmane.linux.kernel.mm/105378/focus=105543
>
> Stephen, can you add more detail from your experience?
>
> Thanks!
>
>
> On Thu, Oct 31, 2013 at 4:34 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
>> Hi Luigi,
>>
>> On 10/24/2013 6:12 PM, Luigi Semenzato wrote:
>>> On Thu, Oct 24, 2013 at 5:35 PM, Olav Haugan <ohaugan@codeaurora.org> wrote:
>>>> Hi Bob, Luigi,
>>>>
>>>> On 10/23/2013 5:55 PM, Bob Liu wrote:
>>>>>
>>>>> On 10/24/2013 05:51 AM, Olav Haugan wrote:
>>>>
>>>>> By the way, could you take a try with zswap? Which can write pages to
>>>>> real swap device if compressed pool is full.
>>>>
>>>> zswap might not be feasible in all cases if you only have flash as
>>>> backing storage.
>>>
>>> Zswap can be configured to run without a backing storage.
>>>
>>
>> I was under the impression that zswap requires a backing storage. Can
>> you elaborate on how to configure zswap to not need a backing storage?
>>
>>
>> Olav Haugan
>>
>> --
>> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation


end of thread, other threads:[~2013-11-02  7:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-23 21:51 zram/zsmalloc issues in very low memory conditions Olav Haugan
2013-10-23 22:17 ` Luigi Semenzato
2013-10-24  0:55 ` Bob Liu
2013-10-25  0:35   ` Olav Haugan
2013-10-25  1:12     ` Luigi Semenzato
2013-10-31 23:34       ` Olav Haugan
2013-11-01  0:27         ` Luigi Semenzato
2013-11-02  7:40           ` Stephen Barber
2013-11-01  0:35         ` Bob Liu
2013-10-25  2:59     ` Bob Liu
2013-10-24 10:42 ` Weijie Yang
2013-10-25  9:19 ` Minchan Kim
2013-11-02  0:59   ` Olav Haugan
2013-11-02  2:50     ` Bob Liu
