Re: [PATCH 1/4] compcache: xvmalloc memory allocator

From: Nitin Gupta <ngupta@vflare.org>
To: ngupta@vflare.org
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-mm-cc@laptop.org
Subject: Re: [PATCH 1/4] compcache: xvmalloc memory allocator
Date: Wed, 26 Aug 2009 00:33:40 +0530
Message-ID: <4A94358C.6060708@vflare.org>
In-Reply-To: <4A93FAA5.5000001@vflare.org>

On 08/25/2009 08:22 PM, Nitin Gupta wrote:
> On 08/25/2009 03:16 AM, Hugh Dickins wrote:
>> On Tue, 25 Aug 2009, Nitin Gupta wrote:
>>> On 08/25/2009 02:09 AM, Hugh Dickins wrote:
>>>> On Tue, 25 Aug 2009, Nitin Gupta wrote:
>>>>> On 08/24/2009 11:03 PM, Pekka Enberg wrote:
>>>>>>
>>>>>> What's the purpose of passing PFNs around? There's quite a lot of PFN
>>>>>> to struct page conversion going on because of it. Wouldn't it make
>>>>>> more sense to return (and pass) a pointer to struct page instead?
>>>>>
>>>>> PFNs are 32-bit on all archs
>>>>
>>>> Are you sure? If it happens to be so for all machines built today,
>>>> I think it can easily change tomorrow. We consistently use unsigned long
>>>> for pfn (there, now I've said that, I bet you'll find somewhere we don't!)
>>>>
>>>> x86_64 says MAX_PHYSMEM_BITS 46 and ia64 says MAX_PHYSMEM_BITS 50 and
>>>> mm/sparse.c says
>>>> unsigned long max_sparsemem_pfn = 1UL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
>>>>
>>>
>>> For PFN to exceed 32-bit we need to have physical memory > 16TB
>>> (2^32 * 4KB). So, maybe I can simply add a check in ramzswap module load
>>> to make sure that RAM is indeed < 16TB and then safely use 32-bit for PFN?
>>
>> Others know much more about it, but I believe that with sparsemem you
>> may be handling vast holes in physical memory: so a relatively small
>> amount of physical memory might in part be mapped with gigantic pfns.
>>
>> So if you go that route, I think you'd rather have to refuse pages
>> with oversized pfns (or refuse configurations with any oversized pfns),
>> than base it upon the quantity of physical memory in the machine.
>>
>> Seems ugly to me, as it did to Pekka; but I can understand that you're
>> very much in the business of saving memory, so doubling the size of some
>> of your tables (I may be oversimplifying) would be repugnant to you.
>>
>> You could add a CONFIG option, rather like CONFIG_LBDAF, to switch on
>> u64-sized pfns; but you'd still have to handle what happens when the
>> pfn is too big to fit in u32 without that option; and if distros always
>> switch the option on, to accommodate the larger machines, then there may
>> have been no point to adding it.
>>
>
> Thanks for these details.
>
> Now I understand that use of 32-bit PFN on 64-bit archs is unsafe. So,
> there is no option but to include extra bits for PFNs or use struct page.
>
> * Solution of ramzswap block device:
>
> Use 48 bit PFNs (32 + 8) and have a compile time error to make sure that
> MAX_PHYSMEM_BITS is < 48 + PAGE_SHIFT. The ramzswap table can accommodate
> 48-bits without any increase in table size.
>

I went crazy. I meant 40 bits for PFN (32 + 8) -- not 48. This 40-bit PFN
should be sufficient for all archs. For any arch where
40 + PAGE_SHIFT < MAX_PHYSMEM_BITS, ramzswap will just issue a compile-time
error.
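
For illustration only, here is a rough userspace sketch of the 40-bit PFN
idea described above. It is not the actual ramzswap code: the struct layout,
the rzs_* names and RZS_PFN_BITS are hypothetical, and PAGE_SHIFT and
MAX_PHYSMEM_BITS are hard-coded to x86_64-like values instead of coming from
the arch headers. In the kernel the check would more likely use
BUILD_BUG_ON() than C11 _Static_assert.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the per-arch kernel constants (x86_64-like values). */
#define PAGE_SHIFT        12
#define MAX_PHYSMEM_BITS  46

/* Hypothetical name: PFN width kept in the table (32 + 8 bits). */
#define RZS_PFN_BITS      40

/* Build fails if a PFN on this arch could need more than 40 bits. */
_Static_assert(MAX_PHYSMEM_BITS <= RZS_PFN_BITS + PAGE_SHIFT,
               "ramzswap: 40-bit PFN is too small for this arch");

/* Hypothetical table entry: the PFN is split into 32 low + 8 high bits. */
struct rzs_entry {
	uint32_t pfn_lo;	/* low 32 bits of the PFN */
	uint8_t  pfn_hi;	/* high 8 bits of the PFN */
	uint8_t  flags;
	uint16_t offset;
};

/* Store a PFN; the caller must pass a value that fits in 40 bits. */
static void rzs_set_pfn(struct rzs_entry *e, uint64_t pfn)
{
	assert(pfn < (1ULL << RZS_PFN_BITS));
	e->pfn_lo = (uint32_t)pfn;
	e->pfn_hi = (uint8_t)(pfn >> 32);
}

/* Rebuild the full 40-bit PFN from its two halves. */
static uint64_t rzs_get_pfn(const struct rzs_entry *e)
{
	return ((uint64_t)e->pfn_hi << 32) | e->pfn_lo;
}
```

With this approach the check costs nothing at run time: a build for an arch
whose MAX_PHYSMEM_BITS exceeds 40 + PAGE_SHIFT simply fails to compile.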

Thanks,
Nitin