[1/4] zsmalloc: remove x86 dependency

Message ID 1341263752-10210-2-git-send-email-sjenning@linux.vnet.ibm.com
State New, archived
Series
  • zsmalloc improvements

Commit Message

Seth Jennings July 2, 2012, 9:15 p.m. UTC
This patch replaces the page table assisted object mapping
method, which has x86 dependencies, with an arch-independent
method that does a simple copy into a temporary per-cpu
buffer.

While a copy seems like it would be worse than mapping the pages,
tests demonstrate the copying is always faster and, in the case of
running inside a KVM guest, roughly 4x faster.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
---
 drivers/staging/zsmalloc/Kconfig         |    4 --
 drivers/staging/zsmalloc/zsmalloc-main.c |   99 +++++++++++++++++++++---------
 drivers/staging/zsmalloc/zsmalloc_int.h  |    5 +-
 3 files changed, 72 insertions(+), 36 deletions(-)

Comments

Minchan Kim July 10, 2012, 2:21 a.m. UTC | #1
On 07/03/2012 06:15 AM, Seth Jennings wrote:
> This patch replaces the page table assisted object mapping
> method, which has x86 dependencies, with a arch-independent
> method that does a simple copy into a temporary per-cpu
> buffer.
> 
> While a copy seems like it would be worse than mapping the pages,
> tests demonstrate the copying is always faster and, in the case of
> running inside a KVM guest, roughly 4x faster.
> 
> Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
> ---
>  drivers/staging/zsmalloc/Kconfig         |    4 --
>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +++++++++++++++++++++---------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    5 +-
>  3 files changed, 72 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index a5ab720..9084565 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -1,9 +1,5 @@
>  config ZSMALLOC
>  	tristate "Memory allocator for compressed pages"
> -	# X86 dependency is because of the use of __flush_tlb_one and set_pte
> -	# in zsmalloc-main.c.
> -	# TODO: convert these to portable functions
> -	depends on X86
>  	default n
>  	help
>  	  zsmalloc is a slab-based memory allocator designed to store
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 10b0d60..a7a6f22 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -470,6 +470,57 @@ static struct page *find_get_zspage(struct size_class *class)
>  	return page;
>  }
>  
> +static void zs_copy_map_object(char *buf, struct page *firstpage,
> +				int off, int size)

firstpage is rather misleading.
As you know, we use the firstpage term for the real first page of a zspage,
but in the case of zs_copy_map_object it could be a middle page of the zspage.
So I would like to use "page" instead of firstpage.

> +{
> +	struct page *pages[2];
> +	int sizes[2];
> +	void *addr;
> +
> +	pages[0] = firstpage;
> +	pages[1] = get_next_page(firstpage);
> +	BUG_ON(!pages[1]);
> +
> +	sizes[0] = PAGE_SIZE - off;
> +	sizes[1] = size - sizes[0];
> +
> +	/* disable page faults to match kmap_atomic() return conditions */
> +	pagefault_disable();

If I understand your intention correctly, you want to prevent calling
this function on non-atomic context. Right?
Please write the description more clearly, from the point of view of what
would happen if we didn't.

> +
> +	/* copy object to per-cpu buffer */
> +	addr = kmap_atomic(pages[0]);
> +	memcpy(buf, addr + off, sizes[0]);
> +	kunmap_atomic(addr);
> +	addr = kmap_atomic(pages[1]);
> +	memcpy(buf + sizes[0], addr, sizes[1]);
> +	kunmap_atomic(addr);
> +}
> +
> +static void zs_copy_unmap_object(char *buf, struct page *firstpage,
> +				int off, int size)

Ditto firstpage.

Otherwise, Looks good to me.
Seth Jennings July 10, 2012, 3:29 p.m. UTC | #2
On 07/09/2012 09:21 PM, Minchan Kim wrote:
> On 07/03/2012 06:15 AM, Seth Jennings wrote:
<snip>
>> +static void zs_copy_map_object(char *buf, struct page *firstpage,
>> +				int off, int size)
> 
> firstpage is rather misleading.
> As you know, we use firstpage term for real firstpage of zspage but
> in case of zs_copy_map_object, it could be a middle page of zspage.
> So I would like to use "page" instead of firstpage.

Accepted.

>> +{
>> +	struct page *pages[2];
>> +	int sizes[2];
>> +	void *addr;
>> +
>> +	pages[0] = firstpage;
>> +	pages[1] = get_next_page(firstpage);
>> +	BUG_ON(!pages[1]);
>> +
>> +	sizes[0] = PAGE_SIZE - off;
>> +	sizes[1] = size - sizes[0];
>> +
>> +	/* disable page faults to match kmap_atomic() return conditions */
>> +	pagefault_disable();
> 
> If I understand your intention correctly, you want to prevent calling
> this function on non-atomic context. Right?

This is moved to zs_map_object() in a later patch, but the
point is to provide uniform return conditions, regardless of
whether the object to be mapped is contained in a single
page or spans two pages.  kmap_atomic() disables page
faults, so I did it here to create symmetry.  The result is
that zs_map_object always returns with preemption and page
faults disabled.
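
As a minimal sketch of that symmetry (not the merged code, and assuming
the helpers introduced by this patch: zs_map_area, struct mapping_area
and zs_copy_map_object()), both paths leave the caller in the same state:

static void *map_object_sketch(struct page *page, int off, int size)
{
        /* get_cpu_var() disables preemption on both paths */
        struct mapping_area *area = &get_cpu_var(zs_map_area);

        if (off + size <= PAGE_SIZE) {
                /* kmap_atomic() also disables page faults */
                area->vm_addr = kmap_atomic(page);
                return area->vm_addr + off;
        }

        /*
         * The copy path holds no atomic kmap on return, so page faults
         * are disabled explicitly (inside zs_copy_map_object() in this
         * patch, moved up to zs_map_object() later) to match the
         * single-page path.
         */
        zs_copy_map_object(area->vm_buf, page, off, size);
        return area->vm_buf;
}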

Also, Greg already merged these patches so I'll have to
incorporate these changes as a separate patch.

Thanks,
Seth

Minchan Kim July 11, 2012, 7:27 a.m. UTC | #3
On 07/11/2012 12:29 AM, Seth Jennings wrote:
> On 07/09/2012 09:21 PM, Minchan Kim wrote:
>> On 07/03/2012 06:15 AM, Seth Jennings wrote:
> <snip>
>>> +static void zs_copy_map_object(char *buf, struct page *firstpage,
>>> +				int off, int size)
>>
>> firstpage is rather misleading.
>> As you know, we use firstpage term for real firstpage of zspage but
>> in case of zs_copy_map_object, it could be a middle page of zspage.
>> So I would like to use "page" instead of firstpage.
> 
> Accepted.
> 
>>> +{
>>> +	struct page *pages[2];
>>> +	int sizes[2];
>>> +	void *addr;
>>> +
>>> +	pages[0] = firstpage;
>>> +	pages[1] = get_next_page(firstpage);
>>> +	BUG_ON(!pages[1]);
>>> +
>>> +	sizes[0] = PAGE_SIZE - off;
>>> +	sizes[1] = size - sizes[0];
>>> +
>>> +	/* disable page faults to match kmap_atomic() return conditions */
>>> +	pagefault_disable();
>>
>> If I understand your intention correctly, you want to prevent calling
>> this function on non-atomic context. Right?
> 
> This is moved to zs_map_object() in a later patch, but the
> point is to provide uniform return conditions, regardless of
> whether the object to be mapped is contained in a single
> page or spans two pages.  kmap_atomic() disables page
> faults, so I did it here to create symmetry.  The result is

The thing I want commented is why we should disable page faults,
i.e., if we don't disable page faults, what happens?

As I read the comment of kmap_atomic about pagefault_disable,
it seems it is for preventing a re-entrance bug in a preemptive kernel
while it catches a page fault during atomic context in a non-preemptive kernel.
But I'm not sure, so Cc'ing Peter.


> that zs_map_object always returns with preemption and page
> faults disabled.
> 
> Also, Greg already merged these patches so I'll have to
> incorporate these changes as a separate patch.
> 
> Thanks,
> Seth
>
Nitin Gupta July 11, 2012, 6:26 p.m. UTC | #4
On 07/02/2012 02:15 PM, Seth Jennings wrote:
> This patch replaces the page table assisted object mapping
> method, which has x86 dependencies, with a arch-independent
> method that does a simple copy into a temporary per-cpu
> buffer.
> 
> While a copy seems like it would be worse than mapping the pages,
> tests demonstrate the copying is always faster and, in the case of
> running inside a KVM guest, roughly 4x faster.
>
> Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
> ---
>  drivers/staging/zsmalloc/Kconfig         |    4 --
>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +++++++++++++++++++++---------
>  drivers/staging/zsmalloc/zsmalloc_int.h  |    5 +-
>  3 files changed, 72 insertions(+), 36 deletions(-)
> 


>  struct mapping_area {
> -	struct vm_struct *vm;
> -	pte_t *vm_ptes[2];
> -	char *vm_addr;
> +	char *vm_buf; /* copy buffer for objects that span pages */
> +	char *vm_addr; /* address of kmap_atomic()'ed pages */
>  };
>  

I think we can reduce the copying overhead by not copying an entire
compressed object to another (per-cpu) buffer. The basic idea of the
method below is to:
 - Copy only the amount of data that spills over into the next page
 - No need for a separate buffer to copy into

Currently, we store objects that split across pages as:

+-Page1-+
|	|
|	|
|-------| <-- obj-1 off: 0
|<ob1'>	|
+-------+ <-- obj-1 off: s'

+-Page2-+ <-- obj-1 off: s'
|<ob1''>|
|-------| <-- obj-1 off: obj1_size, obj-2 off: 0
|<ob2>	|
|-------| <-- obj-2 off: obj2_size
+-------+

But now we would store it as:

+-Page1-+
|	|
|-------| <-- obj-1 off: s''
|	|
|<ob1'>	|
+-------+ <-- obj-1 off: obj1_size

+-Page2-+ <-- obj-1 off: 0
|<ob1''>|
|-------| <-- obj-1 off: s'', obj-2 off: 0
|<ob2>	|
|-------| <-- obj-2 off: obj2_size
+-------+

When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
be swapped with ob1'. This swapping can be done in-place using simple
xor swap algorithm. So, after swap, page-1 and page-2 will look like:

+-Page1-+
|	|
|-------| <-- obj-2 off: 0
|	|
|<ob2''>|
+-------+ <-- obj-2 off: (obj1_size - s'')

+-Page2-+ <-- obj-1 off: 0
|	|
|<ob1>	|
|-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
|<ob2'>	|
+-------+ <-- obj-2 off: obj2_size

Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
zs_unmap_object() we would just do the reverse and restore objects as in
figure-1.

We can reduce the overhead even further by removing the need to
"restore" the object. For this, we would have to remove the assumption
in zsmalloc that the spilled part begins at offset 0 in the next page.
Not sure how feasible this would be.

Note that doing this transform is always possible since we ensure that
an object can never exceed system page size.
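
As a rough illustration of the in-place swap (a hypothetical helper, not
part of this patch; offsets and length as in the figures above), the
spilled bytes could be exchanged with an XOR swap while both pages are
kmapped:

static void swap_spilled_bytes(struct page *page1, unsigned int off1,
                               struct page *page2, unsigned int off2,
                               unsigned int len)
{
        char *p1 = kmap_atomic(page1);
        char *p2 = kmap_atomic(page2);
        unsigned int i;

        /*
         * In-place XOR swap; the two regions live in different pages,
         * so they never alias and the swap cannot zero any byte.
         */
        for (i = 0; i < len; i++) {
                p1[off1 + i] ^= p2[off2 + i];
                p2[off2 + i] ^= p1[off1 + i];
                p1[off1 + i] ^= p2[off2 + i];
        }

        kunmap_atomic(p2);
        kunmap_atomic(p1);
}

zs_map_object() would presumably call this once to pull obj-1 together in
page-2, and zs_unmap_object() would call it again with the same arguments
to restore the original layout.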

Thanks,
Nitin


Seth Jennings July 11, 2012, 8:32 p.m. UTC | #5
On 07/11/2012 01:26 PM, Nitin Gupta wrote:
> On 07/02/2012 02:15 PM, Seth Jennings wrote:
>> This patch replaces the page table assisted object mapping
>> method, which has x86 dependencies, with a arch-independent
>> method that does a simple copy into a temporary per-cpu
>> buffer.
>>
>> While a copy seems like it would be worse than mapping the pages,
>> tests demonstrate the copying is always faster and, in the case of
>> running inside a KVM guest, roughly 4x faster.
>>
>> Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
>> ---
>>  drivers/staging/zsmalloc/Kconfig         |    4 --
>>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +++++++++++++++++++++---------
>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    5 +-
>>  3 files changed, 72 insertions(+), 36 deletions(-)
>>
> 
> 
>>  struct mapping_area {
>> -	struct vm_struct *vm;
>> -	pte_t *vm_ptes[2];
>> -	char *vm_addr;
>> +	char *vm_buf; /* copy buffer for objects that span pages */
>> +	char *vm_addr; /* address of kmap_atomic()'ed pages */
>>  };
>>  
> 
> I think we can reduce the copying overhead by not copying an entire
> compressed object to another (per-cpu) buffer. The basic idea of the
> method below is to:
>  - Copy only the amount of data that spills over into the next page
>  - No need for a separate buffer to copy into
> 
> Currently, we store objects that split across pages as:
> 
> +-Page1-+
> |	|
> |	|
> |-------| <-- obj-1 off: 0
> |<ob1'>	|
> +-------+ <-- obj-1 off: s'
> 
> +-Page2-+ <-- obj-1 off: s'
> |<ob1''>|
> |-------| <-- obj-1 off: obj1_size, obj-2 off: 0
> |<ob2>	|
> |-------| <-- obj-2 off: obj2_size
> +-------+
> 
> But now we would store it as:
> 
> +-Page1-+
> |	|
> |-------| <-- obj-1 off: s''
> |	|
> |<ob1'>	|
> +-------+ <-- obj-1 off: obj1_size
> 
> +-Page2-+ <-- obj-1 off: 0
> |<ob1''>|
> |-------| <-- obj-1 off: s'', obj-2 off: 0
> |<ob2>	|
> |-------| <-- obj-2 off: obj2_size
> +-------+
> 
> When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
> be swapped with ob1'. This swapping can be done in-place using simple
> xor swap algorithm. So, after swap, page-1 and page-2 will look like:
> 
> +-Page1-+
> |	|
> |-------| <-- obj-2 off: 0
> |	|
> |<ob2''>|
> +-------+ <-- obj-2 off: (obj1_size - s'')
> 
> +-Page2-+ <-- obj-1 off: 0
> |	|
> |<ob1>	|
> |-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
> |<ob2'>	|
> +-------+ <-- obj-2 off: obj2_size
> 
> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
> zs_unmap_object() we would just do the reverse and restore objects as in
> figure-1.

Hey Nitin, thanks for the feedback.

Correct me if I'm wrong, but it seems like you wouldn't be able to map
ob2 while ob1 was mapped with this design.  You'd need some sort of
zspage level protection against concurrent object mappings.  The
code for that protection might cancel any benefit you would gain by
doing it this way.

Thanks,
Seth

Nitin Gupta July 11, 2012, 10:42 p.m. UTC | #6
On Wed, Jul 11, 2012 at 1:32 PM, Seth Jennings
<sjenning@linux.vnet.ibm.com> wrote:
> On 07/11/2012 01:26 PM, Nitin Gupta wrote:
>> On 07/02/2012 02:15 PM, Seth Jennings wrote:
>>> This patch replaces the page table assisted object mapping
>>> method, which has x86 dependencies, with a arch-independent
>>> method that does a simple copy into a temporary per-cpu
>>> buffer.
>>>
>>> While a copy seems like it would be worse than mapping the pages,
>>> tests demonstrate the copying is always faster and, in the case of
>>> running inside a KVM guest, roughly 4x faster.
>>>
>>> Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
>>> ---
>>>  drivers/staging/zsmalloc/Kconfig         |    4 --
>>>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +++++++++++++++++++++---------
>>>  drivers/staging/zsmalloc/zsmalloc_int.h  |    5 +-
>>>  3 files changed, 72 insertions(+), 36 deletions(-)
>>>
>>
>>
>>>  struct mapping_area {
>>> -    struct vm_struct *vm;
>>> -    pte_t *vm_ptes[2];
>>> -    char *vm_addr;
>>> +    char *vm_buf; /* copy buffer for objects that span pages */
>>> +    char *vm_addr; /* address of kmap_atomic()'ed pages */
>>>  };
>>>
>>
>> I think we can reduce the copying overhead by not copying an entire
>> compressed object to another (per-cpu) buffer. The basic idea of the
>> method below is to:
>>  - Copy only the amount of data that spills over into the next page
>>  - No need for a separate buffer to copy into
>>
>> Currently, we store objects that split across pages as:
>>
>> +-Page1-+
>> |       |
>> |       |
>> |-------| <-- obj-1 off: 0
>> |<ob1'> |
>> +-------+ <-- obj-1 off: s'
>>
>> +-Page2-+ <-- obj-1 off: s'
>> |<ob1''>|
>> |-------| <-- obj-1 off: obj1_size, obj-2 off: 0
>> |<ob2>  |
>> |-------| <-- obj-2 off: obj2_size
>> +-------+
>>
>> But now we would store it as:
>>
>> +-Page1-+
>> |       |
>> |-------| <-- obj-1 off: s''
>> |       |
>> |<ob1'> |
>> +-------+ <-- obj-1 off: obj1_size
>>
>> +-Page2-+ <-- obj-1 off: 0
>> |<ob1''>|
>> |-------| <-- obj-1 off: s'', obj-2 off: 0
>> |<ob2>  |
>> |-------| <-- obj-2 off: obj2_size
>> +-------+
>>
>> When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
>> be swapped with ob1'. This swapping can be done in-place using simple
>> xor swap algorithm. So, after swap, page-1 and page-2 will look like:
>>
>> +-Page1-+
>> |       |
>> |-------| <-- obj-2 off: 0
>> |       |
>> |<ob2''>|
>> +-------+ <-- obj-2 off: (obj1_size - s'')
>>
>> +-Page2-+ <-- obj-1 off: 0
>> |       |
>> |<ob1>  |
>> |-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
>> |<ob2'> |
>> +-------+ <-- obj-2 off: obj2_size
>>
>> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
>> zs_unmap_object() we would just do the reverse and restore objects as in
>> figure-1.
>
> Hey Nitin, thanks for the feedback.
>
> Correct me if I'm wrong, but it seems like you wouldn't be able to map
> ob2 while ob1 was mapped with this design.  You'd need some sort of
> zspage level protection against concurrent object mappings.  The
> code for that protection might cancel any benefit you would gain by
> doing it this way.
>

Do you think blocking access of just one particular object (or
blocking an entire zspage, for simplicity) for a short time would be
an issue, apart from the complexity of implementing per zspage
locking?

Thanks,
Nitin
Seth Jennings July 12, 2012, 12:23 a.m. UTC | #7
On 07/11/2012 05:42 PM, Nitin Gupta wrote:
> On Wed, Jul 11, 2012 at 1:32 PM, Seth Jennings
> <sjenning@linux.vnet.ibm.com> wrote:
>> On 07/11/2012 01:26 PM, Nitin Gupta wrote:
<snip>
>>> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
>>> zs_unmap_object() we would just do the reverse and restore objects as in
>>> figure-1.
>>
>> Hey Nitin, thanks for the feedback.
>>
>> Correct me if I'm wrong, but it seems like you wouldn't be able to map
>> ob2 while ob1 was mapped with this design.  You'd need some sort of
>> zspage level protection against concurrent object mappings.  The
>> code for that protection might cancel any benefit you would gain by
>> doing it this way.
>>
> 
> Do you think blocking access of just one particular object (or
> blocking an entire zspage, for simplicity) for a short time would be
> an issue, apart from the complexity of implementing per zspage
> locking?

It would only need to prevent the mapping of the temporarily displaced
object, but I said zspage because I don't know how we would do
per-object locking.  I actually don't know how we would do zspage
locking either unless there is a lock in the struct page we can use.
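
One hypothetical way to do it (a sketch only, not part of this patch)
would be a bit spinlock on the zspage's first page, assuming PG_locked is
otherwise unused on zsmalloc's pages:

#include <linux/bit_spinlock.h>
#include <linux/mm.h>

static void lock_zspage(struct page *first_page)
{
        /* spin on a flag bit of the zspage's first page */
        bit_spin_lock(PG_locked, &first_page->flags);
}

static void unlock_zspage(struct page *first_page)
{
        bit_spin_unlock(PG_locked, &first_page->flags);
}

Every zs_map_object()/zs_unmap_object() pair would then have to take and
release this lock, which is exactly the kind of extra cost in question.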

Either way, I think it is a complexity we'd be better off avoiding for
now.  I'm trying to get zsmalloc in shape to bring into mainline, so
I'm really focusing on portability first and low-hanging performance
fruit second.  This optimization would be more like top-of-the-tree
performance fruit :-/

However, if you want to try it out, don't let me stop you :)

Thanks,
Seth


Patch

diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
index a5ab720..9084565 100644
--- a/drivers/staging/zsmalloc/Kconfig
+++ b/drivers/staging/zsmalloc/Kconfig
@@ -1,9 +1,5 @@ 
 config ZSMALLOC
 	tristate "Memory allocator for compressed pages"
-	# X86 dependency is because of the use of __flush_tlb_one and set_pte
-	# in zsmalloc-main.c.
-	# TODO: convert these to portable functions
-	depends on X86
 	default n
 	help
 	  zsmalloc is a slab-based memory allocator designed to store
diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
index 10b0d60..a7a6f22 100644
--- a/drivers/staging/zsmalloc/zsmalloc-main.c
+++ b/drivers/staging/zsmalloc/zsmalloc-main.c
@@ -470,6 +470,57 @@  static struct page *find_get_zspage(struct size_class *class)
 	return page;
 }
 
+static void zs_copy_map_object(char *buf, struct page *firstpage,
+				int off, int size)
+{
+	struct page *pages[2];
+	int sizes[2];
+	void *addr;
+
+	pages[0] = firstpage;
+	pages[1] = get_next_page(firstpage);
+	BUG_ON(!pages[1]);
+
+	sizes[0] = PAGE_SIZE - off;
+	sizes[1] = size - sizes[0];
+
+	/* disable page faults to match kmap_atomic() return conditions */
+	pagefault_disable();
+
+	/* copy object to per-cpu buffer */
+	addr = kmap_atomic(pages[0]);
+	memcpy(buf, addr + off, sizes[0]);
+	kunmap_atomic(addr);
+	addr = kmap_atomic(pages[1]);
+	memcpy(buf + sizes[0], addr, sizes[1]);
+	kunmap_atomic(addr);
+}
+
+static void zs_copy_unmap_object(char *buf, struct page *firstpage,
+				int off, int size)
+{
+	struct page *pages[2];
+	int sizes[2];
+	void *addr;
+
+	pages[0] = firstpage;
+	pages[1] = get_next_page(firstpage);
+	BUG_ON(!pages[1]);
+
+	sizes[0] = PAGE_SIZE - off;
+	sizes[1] = size - sizes[0];
+
+	/* copy per-cpu buffer to object */
+	addr = kmap_atomic(pages[0]);
+	memcpy(addr + off, buf, sizes[0]);
+	kunmap_atomic(addr);
+	addr = kmap_atomic(pages[1]);
+	memcpy(addr, buf + sizes[0], sizes[1]);
+	kunmap_atomic(addr);
+
+	/* enable page faults to match kunmap_atomic() return conditions */
+	pagefault_enable();
+}
 
 static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
 				void *pcpu)
@@ -480,18 +531,23 @@  static int zs_cpu_notifier(struct notifier_block *nb, unsigned long action,
 	switch (action) {
 	case CPU_UP_PREPARE:
 		area = &per_cpu(zs_map_area, cpu);
-		if (area->vm)
-			break;
-		area->vm = alloc_vm_area(2 * PAGE_SIZE, area->vm_ptes);
-		if (!area->vm)
-			return notifier_from_errno(-ENOMEM);
+		/*
+		 * Make sure we don't leak memory if a cpu UP notification
+		 * and zs_init() race and both call zs_cpu_up() on the same cpu
+		 */
+		if (area->vm_buf)
+			return 0;
+		area->vm_buf = (char *)__get_free_page(GFP_KERNEL);
+		if (!area->vm_buf)
+			return -ENOMEM;
+		return 0;
 		break;
 	case CPU_DEAD:
 	case CPU_UP_CANCELED:
 		area = &per_cpu(zs_map_area, cpu);
-		if (area->vm)
-			free_vm_area(area->vm);
-		area->vm = NULL;
+		if (area->vm_buf)
+			free_page((unsigned long)area->vm_buf);
+		area->vm_buf = NULL;
 		break;
 	}
 
@@ -714,22 +770,11 @@  void *zs_map_object(struct zs_pool *pool, unsigned long handle)
 	if (off + class->size <= PAGE_SIZE) {
 		/* this object is contained entirely within a page */
 		area->vm_addr = kmap_atomic(page);
-	} else {
-		/* this object spans two pages */
-		struct page *nextp;
-
-		nextp = get_next_page(page);
-		BUG_ON(!nextp);
-
-
-		set_pte(area->vm_ptes[0], mk_pte(page, PAGE_KERNEL));
-		set_pte(area->vm_ptes[1], mk_pte(nextp, PAGE_KERNEL));
-
-		/* We pre-allocated VM area so mapping can never fail */
-		area->vm_addr = area->vm->addr;
+		return area->vm_addr + off;
 	}
 
-	return area->vm_addr + off;
+	zs_copy_map_object(area->vm_buf, page, off, class->size);
+	return area->vm_buf;
 }
 EXPORT_SYMBOL_GPL(zs_map_object);
 
@@ -751,14 +796,10 @@  void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 	off = obj_idx_to_offset(page, obj_idx, class->size);
 
 	area = &__get_cpu_var(zs_map_area);
-	if (off + class->size <= PAGE_SIZE) {
+	if (off + class->size <= PAGE_SIZE)
 		kunmap_atomic(area->vm_addr);
-	} else {
-		set_pte(area->vm_ptes[0], __pte(0));
-		set_pte(area->vm_ptes[1], __pte(0));
-		__flush_tlb_one((unsigned long)area->vm_addr);
-		__flush_tlb_one((unsigned long)area->vm_addr + PAGE_SIZE);
-	}
+	else
+		zs_copy_unmap_object(area->vm_buf, page, off, class->size);
 	put_cpu_var(zs_map_area);
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h
index 6fd32a9..f760dae 100644
--- a/drivers/staging/zsmalloc/zsmalloc_int.h
+++ b/drivers/staging/zsmalloc/zsmalloc_int.h
@@ -110,9 +110,8 @@  enum fullness_group {
 static const int fullness_threshold_frac = 4;
 
 struct mapping_area {
-	struct vm_struct *vm;
-	pte_t *vm_ptes[2];
-	char *vm_addr;
+	char *vm_buf; /* copy buffer for objects that span pages */
+	char *vm_addr; /* address of kmap_atomic()'ed pages */
 };
 
 struct size_class {