* [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
From: Dan Magenheimer @ 2010-05-28 17:40 UTC
  To: linux-kernel, linux-mm, jeremy, hugh.dickins, ngupta, JBeulich,
	chris.mason, kurt.hackel, dave.mccracken, npiggin, akpm, riel,
	avi, pavel, konrad.wilk, dan.magenheimer

[PATCH V2 0/4] Frontswap (was Transcendent Memory): overview

Changes since V1:
- Rebased to 2.6.34 (no functional changes)
- Convert to sane types (per Al Viro comment in cleancache thread)
- Define some raw constants (Konrad Wilk)
- Performance analysis shows a significant advantage for frontswap's
  synchronous page-at-a-time design (vs. a batched asynchronous design
  speculated on as an alternative).  See http://lkml.org/lkml/2010/5/20/314

In previous patch postings, frontswap was part of the Transcendent
Memory ("tmem") patchset.  This patchset refocuses not on the underlying
technology (tmem) but on the useful functionality provided for Linux,
and supplies a clean API so that frontswap can provide this functionality
via a Xen tmem driver OR completely independently of tmem.  For example,
an in-kernel compression "backend" for frontswap can be implemented;
some believe frontswap will be a very nice interface for building
RAM-like functionality for pseudo-RAM devices such as an on-memory-bus
SSD or phase-change memory; and a Pune University team is looking at a
backend for virtio (see OLS'2010).

A more complete description of frontswap can be found in the introductory
comment in mm/frontswap.c (in PATCH 2/4) which is included below
for convenience.

Note that an earlier version of this patch is now shipping in openSUSE 11.2
and will soon ship in a release of Oracle Enterprise Linux.  The underlying
tmem technology is now shipping in Oracle VM 2.2 and was just released
in Xen 4.0 on April 15, 2010.  (Search news.google.com for "Transcendent
Memory".)

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>

 include/linux/frontswap.h |   98 ++++++++++++++
 include/linux/swap.h      |    2 
 include/linux/swapfile.h  |   13 +
 mm/Kconfig                |   16 ++
 mm/Makefile               |    1 
 mm/frontswap.c            |  301 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/page_io.c              |   10 +
 mm/swapfile.c             |   59 +++++++--
 8 files changed, 491 insertions(+), 9 deletions(-)

Frontswap is so named because it can be thought of as the opposite of
a "backing" store for a swap device.  The storage is assumed to be
a synchronous, concurrency-safe, page-oriented pseudo-RAM device (such as
Xen's Transcendent Memory, aka "tmem", or in-kernel compressed memory,
aka "zmem", or other RAM-like devices) which is not directly accessible
or addressable by the kernel and is of unknown and possibly time-varying
size.  This pseudo-RAM device links itself to frontswap by setting the
frontswap_ops pointer appropriately, and the functions it provides must
conform to certain policies, as follows:

An "init" prepares the pseudo-RAM to receive frontswap pages and returns
a non-negative pool id, used for all swap device numbers (aka "type").
A "put_page" will copy the page to pseudo-RAM and associate it with
the type and offset associated with the page. A "get_page" will copy the
page, if found, from pseudo-RAM into kernel memory, but will NOT remove
the page from pseudo-RAM.  A "flush_page" will remove the page from
pseudo-RAM and a "flush_area" will remove ALL pages associated with the
swap type (e.g., like swapoff) and notify the pseudo-RAM device to refuse
further puts with that swap type.
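
To make this concrete, a minimal sketch of the ops is shown below; the
signatures are inferred only from the description above, so treat them as
illustrative and see include/linux/frontswap.h (PATCH 1/4) for the real
definitions.

    /*
     * Illustrative sketch only: signatures are inferred from the
     * description above, not copied from the patch.
     */
    struct frontswap_ops {
            int  (*init)(unsigned type);      /* returns a non-negative pool id */
            int  (*put_page)(unsigned type, pgoff_t offset, struct page *page);
            int  (*get_page)(unsigned type, pgoff_t offset, struct page *page);
            void (*flush_page)(unsigned type, pgoff_t offset);
            void (*flush_area)(unsigned type);
    };

    extern struct frontswap_ops *frontswap_ops;

    /* a backend (e.g. a Xen tmem driver) links itself in by pointing
     * frontswap_ops at its own ops; my_backend_ops is hypothetical */
    static void my_backend_register(struct frontswap_ops *my_backend_ops)
    {
            frontswap_ops = my_backend_ops;
    }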

Once a page is successfully put, a matching get on the page will always
succeed.  So when the kernel finds itself in a situation where it needs
to swap out a page, it first attempts to use frontswap.  If the put returns
non-zero, the data has been successfully saved to pseudo-RAM, so a disk
write is avoided (and, if the data is later read back, a disk read is
avoided as well).  If a put returns zero, pseudo-RAM has rejected the data,
and the page can be written to swap as usual.
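
In code terms, the swap-out decision might be sketched as follows;
frontswap_put_page() and write_to_swap_device() are hypothetical helper
names standing in for the actual hooks added to mm/page_io.c and
mm/swapfile.c.

    /* illustrative sketch only -- helper names are hypothetical */
    static int swap_out_page(struct page *page, swp_entry_t entry)
    {
            /* non-zero return: pseudo-RAM accepted the page, no disk I/O */
            if (frontswap_put_page(swp_type(entry), swp_offset(entry), page))
                    return 0;

            /* zero return: pseudo-RAM rejected it; write to swap as usual */
            return write_to_swap_device(page, entry);
    }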

Note that if a page is put and the page already exists in pseudo-RAM
(a "duplicate" put), either the put succeeds and the data is overwritten,
or the put fails AND the page is flushed.  This ensures that stale data
can never be obtained from pseudo-RAM.
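
From a backend's point of view, this duplicate-put rule might be
implemented roughly as below; every helper name here is hypothetical.

    /* illustrative sketch of the duplicate-put rule; helpers are hypothetical */
    int my_put_page(unsigned type, pgoff_t offset, struct page *page)
    {
            int stored = backend_store_copy(type, offset, page); /* overwrites */

            /* if an old copy exists and the new store failed, flush the old
             * copy so a later get can never return stale data */
            if (!stored && backend_has_page(type, offset))
                    my_flush_page(type, offset);

            return stored;  /* non-zero means success */
    }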

* Re: [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
From: Nitin Gupta @ 2010-05-30 18:15 UTC
  To: Dan Magenheimer
  Cc: linux-kernel, linux-mm, jeremy, hugh.dickins, JBeulich,
	chris.mason, kurt.hackel, dave.mccracken, npiggin, akpm, riel,
	avi, pavel, konrad.wilk

On 05/28/2010 11:10 PM, Dan Magenheimer wrote:
> [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
> 
> Changes since V1:
> - Rebased to 2.6.34 (no functional changes)
> - Convert to sane types (per Al Viro comment in cleancache thread)
> - Define some raw constants (Konrad Wilk)
> - Performance analysis shows significant advantage for frontswap's
>   synchronous page-at-a-time design (vs batched asynchronous speculated
>   as an alternative design).  See http://lkml.org/lkml/2010/5/20/314
> 

I think zram (http://lwn.net/Articles/388889/) is a more generic solution
and can also achieve swap-to-hypervisor as a special case.

zram is a generic in-memory compressed block device. To get frontswap
functionality, such a device (/dev/zram0) can be exposed to a VM as
a 'raw disk'. Such a disk can be used for _any_ purpose by the guest,
including use as a swap disk.

This method even works for Windows guests. Please see:
http://www.vflare.org/2010/05/compressed-ram-disk-for-windows-virtual.html

Here, a /dev/zram0 device of size 2GB was created and exposed to a Windows
VM as a 'raw disk' (using VirtualBox).  The disk was detected in the guest
and an NTFS filesystem was created on it (Windows cannot swap directly to a
partition; it always uses swap file(s)).  Windows was then configured to
swap to a file on this drive.

Obviously, the same can be done with Linux guests. Thus, zram is useful
in both native and virtualized environments with different use cases.


Thanks,
Nitin

* RE: [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
From: Dan Magenheimer @ 2010-05-31 17:14 UTC
  To: ngupta
  Cc: linux-kernel, linux-mm, jeremy, hugh.dickins, JBeulich,
	chris.mason, kurt.hackel, dave.mccracken, npiggin, akpm, riel,
	avi, pavel, konrad.wilk

> On 05/28/2010 11:10 PM, Dan Magenheimer wrote:
> > [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
> >
> > Changes since V1:
> > - Rebased to 2.6.34 (no functional changes)
> > - Convert to sane types (per Al Viro comment in cleancache thread)
> > - Define some raw constants (Konrad Wilk)
> > - Performance analysis shows significant advantage for frontswap's
> >   synchronous page-at-a-time design (vs batched asynchronous
> speculated
> >   as an alternative design).  See http://lkml.org/lkml/2010/5/20/314
> >
> 
> I think zram (http://lwn.net/Articles/388889/) is a more generic
> solution
> and can also achieve swap-to-hypervisor as a special case.
> 
> zram is a generic in-memory compressed block device. To get frontswap
> functionality, such a device (/dev/zram0) can be exposed to a VM as
> a 'raw disk'. Such a disk can be used for _any_ purpose by the guest,
> including use as a swap disk.

Hi Nitin --

Though I agree zram is cool inside Linux, I don't see that it can
be used to get the critical value of frontswap functionality in a
virtual environment, specifically 100% dynamic control by the
hypervisor over every single page a guest attempts to "put" to
frontswap.  This is the key to the "intelligent overcommit" discussed
in the previous long thread about frontswap.

Further, by doing "guest-side compression" you are eliminating
possibilities for KSM-style sharing, right?

So while zram may be a great feature, it is NOT a more generic
solution than frontswap, just a different solution that has a
different set of objectives.

Dan

* Re: [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
From: Nitin Gupta @ 2010-05-31 19:09 UTC
  To: Dan Magenheimer
  Cc: linux-kernel, linux-mm, jeremy, hugh.dickins, JBeulich,
	chris.mason, kurt.hackel, dave.mccracken, npiggin, akpm, riel,
	avi, pavel, konrad.wilk

Hi Dan,

On 05/31/2010 10:44 PM, Dan Magenheimer wrote:
>> On 05/28/2010 11:10 PM, Dan Magenheimer wrote:
>>> [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
>>>
>>> Changes since V1:
>>> - Rebased to 2.6.34 (no functional changes)
>>> - Convert to sane types (per Al Viro comment in cleancache thread)
>>> - Define some raw constants (Konrad Wilk)
>>> - Performance analysis shows significant advantage for frontswap's
>>>   synchronous page-at-a-time design (vs batched asynchronous
>> speculated
>>>   as an alternative design).  See http://lkml.org/lkml/2010/5/20/314
>>>
>>
>> I think zram (http://lwn.net/Articles/388889/) is a more generic
>> solution
>> and can also achieve swap-to-hypervisor as a special case.
>>
>> zram is a generic in-memory compressed block device. To get frontswap
>> functionality, such a device (/dev/zram0) can be exposed to a VM as
>> a 'raw disk'. Such a disk can be used for _any_ purpose by the guest,
>> including use as a swap disk.
> 

> 
> Though I agree zram is cool inside Linux, I don't see that it can
> be used to get the critical value of frontswap functionality in a
> virtual environment, specifically the 100% dynamic control by the
> hypervisor of every single page attempted to be "put" to frontswap.
> This is the key to the "intelligent overcommit" discussed in the
> previous long thread about frontswap.
>

Yes, zram cannot return a write/put failure for arbitrary pages, but other
than that, what additional benefits does frontswap bring?  Even with
frontswap, whatever pages are given out to the hypervisor just stay there
until the guest reads them back.  Unlike cleancache, you cannot free them
at any point, so it does not seem any more flexible than zram.

One point I can see is the additional block-layer overhead in the case of
zram; I have not yet done detailed measurements of this.

 
> Further, by doing "guest-side compression" you are eliminating
> possibilities for KSM-style sharing, right?
> 

With zram, whether compression happens within the guest or on the host
depends on how it is used.

When zram device(s) are exported as raw disk(s) to a guest, pages
written to them are sent to the host and compressed on the host, not
within the guest.  Also, I'm planning to include de-duplication support
for zram (which will be separate from KSM).

> So while zram may be a great feature, it is NOT a more generic
> solution than frontswap, just a different solution that has a
> different set of objectives.
> 

frontswap is a particular use case of zram disks.  However, we still
need to work on some issues with zram:
 - zram cannot return write/put failures for arbitrary pages.  OTOH,
frontswap can consult the host before every put and may forward pages to
the in-guest swap device when a put fails.
 - When a swap slot is freed, the notification from the guest does
not reach zram device(s) exported from the host.  OTOH, frontswap calls
frontswap_flush(), which frees the corresponding page from host memory
(roughly as in the sketch below).
 - Being a block device, it is potentially slower than the frontswap
approach.  But being a generic device, it is useful for all kinds of
guest OSes (including Windows, etc.).
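
Roughly, that second point could be sketched as below; the hook
placement, the function name, and the assumed signature of
frontswap_flush() are all illustrative, not taken from the patches.

    /* illustrative only: when the guest frees a swap slot, frontswap can
     * tell the backend to drop its copy -- a zram raw disk never learns
     * about this.  Name, placement, and signature are assumptions. */
    static void swap_slot_freed(unsigned type, pgoff_t offset)
    {
            frontswap_flush(type, offset);  /* host frees its copy of the page */
    }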

Thanks,
Nitin

* RE: [PATCH V2 0/4] Frontswap (was Transcendent Memory): overview
From: Dan Magenheimer @ 2010-06-01  0:23 UTC
  To: ngupta
  Cc: linux-kernel, linux-mm, jeremy, hugh.dickins, JBeulich,
	chris.mason, kurt.hackel, dave.mccracken, npiggin, akpm, riel,
	avi, pavel, konrad.wilk

> From: Nitin Gupta [mailto:ngupta@vflare.org]

> frontswap is a particular use case of zram disks. However, we still
> need to work on some issues with zram:
>  - zram cannot return write/put failures for arbitrary pages. OTOH,
> frontswap can consult host before every put and may forward pages to
> in-guest swap device when put fails.
>  - When a swap slot is freed, the notification from guest does
> not reach zram device(s) as exported from host. OTOH, frontswap calls
> frontswap_flush() which frees corresponding page from host memory.
>  - Being a block device, it is potentially slower than frontswap
> approach. But being a generic device, its useful for all kinds
> of guest OS (including windows etc).

Hi Nitin --

This is a good list (though I'm not sure offhand whether it is complete)
of the key differences between zram and frontswap.  Unless/until
zram solves each of these issues -- which are critical to the
primary objective of frontswap (namely intelligent overcommit) --
I simply can't agree that frontswap is a particular use case
of zram.  Zram is just batched asynchronous I/O to a fixed-size
device with a bonus of on-the-fly compression.  Cool, yes.
Useful, yes.  Useful in some cases in a virtualized environment,
yes.  But a superset/replacement of frontswap, no.

> Yes, zram cannot return write/put failure for arbitrary pages but other
> than that what additional benefits does frontswap bring? Even with
> frontswap,
> whatever pages are once given out to hypervisor just stay there till
> guest
> reads them back. Unlike cleancache, you cannot free them at any point.
> So,
> it does not seem anyway more flexible than zram.

The flexibility is that the hypervisor can make admittance
decisions on each individual page... this is exactly what
allows for intelligent overcommit.  Since the pages "just
stay there until the guest reads them back", the hypervisor
must be very careful about which and how many pages it accepts,
and its admittance decisions must be very dynamic, depending
on many factors that are not visible to any individual guest
and that cannot be determined in a timely way by the asynchronous
"backend I/O" subsystem of a host or dom0.

Thanks,
Dan
