linux-kernel.vger.kernel.org archive mirror
From: Seth Jennings <sjenning@linux.vnet.ibm.com>
To: Ric Mason <ric.masonn@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Nitin Gupta <ngupta@vflare.org>, Minchan Kim <minchan@kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@us.ibm.com>, Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <jweiner@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Larry Woodman <lwoodman@redhat.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Joe Perches <joe@perches.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	devel@driverdev.osuosl.org
Subject: Re: [PATCHv5 2/8] zsmalloc: add documentation
Date: Mon, 25 Feb 2013 09:18:14 -0600
Message-ID: <512B80B6.6090401@linux.vnet.ibm.com>
In-Reply-To: <512960CD.4080008@gmail.com>

On 02/23/2013 06:37 PM, Ric Mason wrote:
> On 02/23/2013 05:02 AM, Seth Jennings wrote:
>> On 02/21/2013 08:56 PM, Ric Mason wrote:
>>> On 02/21/2013 11:50 PM, Seth Jennings wrote:
>>>> On 02/21/2013 02:49 AM, Ric Mason wrote:
>>>>> On 02/19/2013 03:16 AM, Seth Jennings wrote:
>>>>>> On 02/16/2013 12:21 AM, Ric Mason wrote:
>>>>>>> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>>>>>>>> This patch adds a documentation file for zsmalloc at
>>>>>>>> Documentation/vm/zsmalloc.txt
>>>>>>>>
>>>>>>>> Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
>>>>>>>> ---
>>>>>>>>      Documentation/vm/zsmalloc.txt |   68
>>>>>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>>>>>      1 file changed, 68 insertions(+)
>>>>>>>>      create mode 100644 Documentation/vm/zsmalloc.txt
>>>>>>>>
>>>>>>>> diff --git a/Documentation/vm/zsmalloc.txt
>>>>>>>> b/Documentation/vm/zsmalloc.txt
>>>>>>>> new file mode 100644
>>>>>>>> index 0000000..85aa617
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/Documentation/vm/zsmalloc.txt
>>>>>>>> @@ -0,0 +1,68 @@
>>>>>>>> +zsmalloc Memory Allocator
>>>>>>>> +
>>>>>>>> +Overview
>>>>>>>> +
>>>>>>>> +zsmalloc is a new slab-based memory allocator for storing
>>>>>>>> +compressed pages.  It is designed for low fragmentation and a
>>>>>>>> +high allocation success rate on large, but <= PAGE_SIZE,
>>>>>>>> +allocations.
>>>>>>>> +
>>>>>>>> +zsmalloc differs from the kernel slab allocator in two primary
>>>>>>>> +ways to achieve these design goals.
>>>>>>>> +
>>>>>>>> +zsmalloc never requires high order page allocations to back
>>>>>>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>>>>>>>> +multiple single-order pages to be stitched together into a
>>>>>>>> +"zspage" which backs the slab.  This allows for higher
>>>>>>>> +allocation success rate under memory pressure.
>>>>>>>> +
>>>>>>>> +Also, zsmalloc allows objects to span page boundaries within the
>>>>>>>> +zspage.  This allows for lower fragmentation than could be had
>>>>>>>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>>>>>>>> +and PAGE_SIZE.  With the kernel slab allocator, if a page
>>>>>>>> +compresses to 60% of its original size, the memory savings
>>>>>>>> +gained through compression are lost in fragmentation because
>>>>>>>> +another object of
>>>>>>>> +the same size can't be stored in the leftover space.
>>>>>>>> +
>>>>>>>> +This ability to span pages results in zsmalloc allocations not
>>>>>>>> +being directly addressable by the user.  The user is given a
>>>>>>>> +non-dereferenceable handle in response to an allocation request.
>>>>>>>> +That handle must be mapped, using zs_map_object(), which returns
>>>>>>>> +a pointer to the mapped region that can be used.  The mapping is
>>>>>>>> +necessary since the object data may reside in two different
>>>>>>>> +noncontiguous pages.
>>>>>>> Do you mean that a zsmalloc object must be mapped after
>>>>>>> allocation because the object data may reside in two different
>>>>>>> noncontiguous pages?
>>>>>> Yes, that is one reason for the mapping.  The other reason (more
>>>>>> of an added bonus) is below.
>>>>>>
>>>>>>>> +
>>>>>>>> +For 32-bit systems, zsmalloc has the added benefit of being
>>>>>>>> +able to back slabs with HIGHMEM pages, something not possible
>>>>>>> What's the meaning of "back slabs with HIGHMEM pages"?
>>>>>> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit
>>>>>> systems with more than 1GB (actually a little less) of RAM.  The
>>>>>> upper 3GB of the 4GB address space, depending on kernel build
>>>>>> options, is not directly addressable by the kernel, but can be
>>>>>> mapped into the kernel address space with functions like kmap()
>>>>>> or kmap_atomic().
>>>>>>
>>>>>> These pages can't be used by slab/slub because they are not
>>>>>> continuously mapped into the kernel address space.  However, since
>>>>>> zsmalloc requires a mapping anyway to handle objects that span
>>>>>> non-contiguous page boundaries, we do the kernel mapping as part of
>>>>>> the process.
>>>>>>
>>>>>> So zspages, the conceptual slabs in zsmalloc backed by single-order
>>>>>> pages, can include pages from the HIGHMEM zone as well.
>>>>> Thanks for the clarification.
>>>>> From http://lwn.net/Articles/537422/, your article about zswap on LWN:
>>>>>    "Additionally, the kernel slab allocator does not allow objects
>>>>> that are less than a page in size to span a page boundary. This
>>>>> means that if an object is PAGE_SIZE/2 + 1 bytes in size, it
>>>>> effectively uses an entire page, resulting in ~50% waste. Hence
>>>>> there are *no kmalloc() cache sizes* between PAGE_SIZE/2 and
>>>>> PAGE_SIZE."
>>>>> Are you sure? It seems that the kmalloc caches support big sizes;
>>>>> you can check in include/linux/kmalloc_sizes.h
>>>> Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
>>>> cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE.  For example, on a
>>>> system with 4k pages, there are no caches between kmalloc-2048 and
>>>> kmalloc-4096.
>>> A kmalloc object > PAGE_SIZE/2 or > PAGE_SIZE should also be
>>> allocated from a slab cache, correct? Then how can an object be
>>> allocated without a slab cache that holds objects of that size?
>> I have to admit, I didn't understand the question.
> 
> Objects are allocated from a slab cache, correct? There are two kinds
> of slab cache: one is general purpose, e.g. the kmalloc slab caches;
> the other is special purpose, e.g. mm_struct, task_struct. A kmalloc
> object > PAGE_SIZE/2 or > PAGE_SIZE should also be allocated from a
> slab cache, correct? Then why did you say that there are no caches
> between kmalloc-2048 and kmalloc-4096?

Ok, now I get it.  Yes, I guess I should have qualified here that there
are no _kmalloc_ caches between PAGE_SIZE/2 and PAGE_SIZE.
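
To make the gap concrete, here is a rough userspace sketch (not kernel
code) of power-of-two bucket rounding.  The helper names
kmalloc_bucket() and wasted_bytes() are made up for illustration, and
the sketch assumes 4k pages and ignores the real kmalloc-96 and
kmalloc-192 caches, which don't matter for sizes above 2048.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest power-of-two bucket that fits size.
 * This only roughly models the general-purpose kmalloc caches; the
 * real kmalloc-96 and kmalloc-192 caches are ignored.
 */
static size_t kmalloc_bucket(size_t size)
{
	size_t bucket = 8; /* assumed smallest bucket for this sketch */

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Bytes lost to internal fragmentation for a single object. */
static size_t wasted_bytes(size_t size)
{
	return kmalloc_bucket(size) - size;
}
```

Under these assumptions, kmalloc_bucket(2049) comes out as 4096, so a
PAGE_SIZE/2 + 1 byte object leaves 2047 bytes unused, just under 50%.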

Yes, one can create caches of a particular size.  However, that doesn't
work well for zswap because the compressed pages vary widely in size
and, imo, it doesn't make sense to create a bunch of caches very
granular in size.

Plus, having granular caches doesn't solve the fragmentation issue
caused by the storage of large objects.
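
A rough userspace sketch of the fragmentation point, not zsmalloc code:
objs_no_spanning() and objs_spanning() are hypothetical helpers,
PAGE_SZ is assumed to be 4096, and the group of pages is modelled as a
flat run of bytes with no per-object or per-page metadata.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u /* assumed page size for this sketch */

/* Objects that fit when no object may cross a page boundary. */
static size_t objs_no_spanning(size_t obj_size, size_t nr_pages)
{
	assert(obj_size > 0 && obj_size <= PAGE_SZ);
	return nr_pages * (PAGE_SZ / obj_size);
}

/* Objects that fit when objects may span pages within the group. */
static size_t objs_spanning(size_t obj_size, size_t nr_pages)
{
	assert(obj_size > 0 && obj_size <= PAGE_SZ);
	return (nr_pages * PAGE_SZ) / obj_size;
}
```

For a 2457-byte object (just under 60% of a 4k page) and three pages,
objs_no_spanning() gives 3 while objs_spanning() gives 5.  A cache
sized exactly for 2457-byte objects would still hold only one object
per page unless objects are allowed to cross page boundaries.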

Thanks,
Seth

