linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
@ 2018-10-25  1:27 Rafael David Tinoco
  2018-10-25  1:27 ` [PATCH 2/2] mm/zsmalloc.c: fix zsmalloc ARM LPAE support Rafael David Tinoco
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Rafael David Tinoco @ 2018-10-25  1:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-kernel, linux-mm, Rafael David Tinoco, Russell King,
	Mark Brown, Sergey Senozhatsky, Nitin Gupta, Minchan Kim

On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
physical frame number might be so big that zsmalloc obj encoding (to
location) will break IF architecture does not re-define
MAX_PHYSMEM_BITS, causing:

[  497.097843] ==================================================================
[  497.102365] BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
[  497.105933] Read of size 4 at addr 00000000 by task mkfs.ext4/623
[  497.109684]
[  497.110722] CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 4.19.0-rc8-00017-g8239bc6d3307-dirty #15
[  497.116098] Hardware name: Generic DT based system
[  497.119094] [<c0418f7c>] (unwind_backtrace) from [<c0410ca4>] (show_stack+0x20/0x24)
[  497.123819] [<c0410ca4>] (show_stack) from [<c16bd540>] (dump_stack+0xbc/0xe8)
[  497.128299] [<c16bd540>] (dump_stack) from [<c06cab74>] (kasan_report+0x248/0x390)
[  497.132928] [<c06cab74>] (kasan_report) from [<c06c94e8>] (__asan_load4+0x78/0xb4)
[  497.137530] [<c06c94e8>] (__asan_load4) from [<c06ddc24>] (zs_map_object+0xa4/0x2bc)
[  497.142335] [<c06ddc24>] (zs_map_object) from [<bf0bbbd8>] (zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
[  497.148294] [<bf0bbbd8>] (zram_bvec_rw.constprop.2 [zram]) from [<bf0bc3cc>] (zram_make_request+0x234/0x46c [zram])
[  497.154653] [<bf0bc3cc>] (zram_make_request [zram]) from [<c09aff9c>] (generic_make_request+0x304/0x63c)
[  497.160413] [<c09aff9c>] (generic_make_request) from [<c09b0320>] (submit_bio+0x4c/0x1c8)
[  497.165379] [<c09b0320>] (submit_bio) from [<c0743570>] (submit_bh_wbc.constprop.15+0x238/0x26c)
[  497.170775] [<c0743570>] (submit_bh_wbc.constprop.15) from [<c0746cf8>] (__block_write_full_page+0x524/0x76c)
[  497.176776] [<c0746cf8>] (__block_write_full_page) from [<c07472c4>] (block_write_full_page+0x1bc/0x1d4)
[  497.182549] [<c07472c4>] (block_write_full_page) from [<c0748eb4>] (blkdev_writepage+0x24/0x28)
[  497.187849] [<c0748eb4>] (blkdev_writepage) from [<c064a780>] (__writepage+0x44/0x78)
[  497.192633] [<c064a780>] (__writepage) from [<c064b50c>] (write_cache_pages+0x3b8/0x800)
[  497.197616] [<c064b50c>] (write_cache_pages) from [<c064dd78>] (generic_writepages+0x74/0xa0)
[  497.202807] [<c064dd78>] (generic_writepages) from [<c0748e64>] (blkdev_writepages+0x18/0x1c)
[  497.208022] [<c0748e64>] (blkdev_writepages) from [<c064e890>] (do_writepages+0x68/0x134)
[  497.213002] [<c064e890>] (do_writepages) from [<c06368a4>] (__filemap_fdatawrite_range+0xb0/0xf4)
[  497.218447] [<c06368a4>] (__filemap_fdatawrite_range) from [<c0636b68>] (file_write_and_wait_range+0x64/0xd0)
[  497.224416] [<c0636b68>] (file_write_and_wait_range) from [<c0747af8>] (blkdev_fsync+0x54/0x84)
[  497.229749] [<c0747af8>] (blkdev_fsync) from [<c0739dac>] (vfs_fsync_range+0x70/0xd4)
[  497.234549] [<c0739dac>] (vfs_fsync_range) from [<c0739e98>] (do_fsync+0x4c/0x80)
[  497.239159] [<c0739e98>] (do_fsync) from [<c073a26c>] (sys_fsync+0x1c/0x20)
[  497.243407] [<c073a26c>] (sys_fsync) from [<c0401000>] (ret_fast_syscall+0x0/0x2c)

like in this ARM 32-bit (LPAE enabled), example.

That occurs because, if not set, MAX_POSSIBLE_PHYSMEM_BITS will default
to BITS_PER_LONG (32) in most cases, and, for PAE, _PFN_BITS will be
wrong: which may cause obj variable to overflow if PFN is HIGHMEM and
referencing any page above the 4GB watermark.

This commit exposes the BUG IF the architecture supports PAE AND does
not explicitly set MAX_POSSIBLE_PHYSMEM_BITS to supported value.

Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 mm/zsmalloc.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 9da65552e7ca..9c3ff8c2ccbc 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -119,6 +119,15 @@
 #define OBJ_INDEX_BITS	(BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
 #define OBJ_INDEX_MASK	((_AC(1, UL) << OBJ_INDEX_BITS) - 1)
 
+/*
+ * When using PAE, the obj encoding might overflow if arch does
+ * not re-define MAX_PHYSMEM_BITS, since zsmalloc uses HIGHMEM.
+ * This checks for a future bad page access, when de-coding obj.
+ */
+#define OBJ_OVERFLOW(_pfn) \
+	(((unsigned long long) _pfn << (OBJ_INDEX_BITS + OBJ_TAG_BITS)) >= \
+	((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0)
+
 #define FULLNESS_BITS	2
 #define CLASS_BITS	8
 #define ISOLATED_BITS	3
@@ -871,9 +880,14 @@ static void obj_to_location(unsigned long obj, struct page **page,
  */
 static unsigned long location_to_obj(struct page *page, unsigned int obj_idx)
 {
-	unsigned long obj;
+	unsigned long obj, pfn;
+
+	pfn = page_to_pfn(page);
+
+	if (unlikely(OBJ_OVERFLOW(pfn)))
+		BUG();
 
-	obj = page_to_pfn(page) << OBJ_INDEX_BITS;
+	obj = pfn << OBJ_INDEX_BITS;
 	obj |= obj_idx & OBJ_INDEX_MASK;
 	obj <<= OBJ_TAG_BITS;
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] mm/zsmalloc.c: fix zsmalloc ARM LPAE support
  2018-10-25  1:27 [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Rafael David Tinoco
@ 2018-10-25  1:27 ` Rafael David Tinoco
  2018-10-25  5:29 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Sergey Senozhatsky
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Rafael David Tinoco @ 2018-10-25  1:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-kernel, linux-mm, Rafael David Tinoco, Russell King,
	Mark Brown, Sergey Senozhatsky, Nitin Gupta, Minchan Kim

Since commit 02390b87a945 ("mm/zsmalloc: Prepare to variable
MAX_PHYSMEM_BITS"), an architecture has to define this value in order to
guarantee that zsmalloc will be able to encode and decode the obj value
properly.

Similar to that change, this one sets the value for ARM LPAE, fixing a
possible null-ptr-deref in zs_map_object() when using ARM LPAE and
HIGHMEM pages located above the 4GB watermark.

Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 arch/arm/include/asm/pgtable-3level-types.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/include/asm/pgtable-3level-types.h b/arch/arm/include/asm/pgtable-3level-types.h
index 921aa30259c4..bd4994f98700 100644
--- a/arch/arm/include/asm/pgtable-3level-types.h
+++ b/arch/arm/include/asm/pgtable-3level-types.h
@@ -67,4 +67,6 @@ typedef pteval_t pgprot_t;
 
 #endif	/* STRICT_MM_TYPECHECKS */
 
+#define MAX_POSSIBLE_PHYSMEM_BITS	36
+
 #endif	/* _ASM_PGTABLE_3LEVEL_TYPES_H */
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25  1:27 [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Rafael David Tinoco
  2018-10-25  1:27 ` [PATCH 2/2] mm/zsmalloc.c: fix zsmalloc ARM LPAE support Rafael David Tinoco
@ 2018-10-25  5:29 ` Sergey Senozhatsky
  2018-10-25 11:03   ` Rafael David Tinoco
  2018-10-25 12:00 ` Russell King - ARM Linux
  2018-10-25 12:42 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE kbuild test robot
  3 siblings, 1 reply; 12+ messages in thread
From: Sergey Senozhatsky @ 2018-10-25  5:29 UTC (permalink / raw)
  To: Rafael David Tinoco
  Cc: linux-kernel, linux-arm-kernel, linux-mm, Russell King,
	Mark Brown, Sergey Senozhatsky, Nitin Gupta, Minchan Kim,
	Andrew Morton

On (10/24/18 22:27), Rafael David Tinoco wrote:
>  static unsigned long location_to_obj(struct page *page, unsigned int obj_idx)
>  {
> -	unsigned long obj;
> +	unsigned long obj, pfn;
> +
> +	pfn = page_to_pfn(page);
> +
> +	if (unlikely(OBJ_OVERFLOW(pfn)))
> +		BUG();

The trend these days is to have less BUG/BUG_ON-s in the kernel.

	-ss

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25  5:29 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Sergey Senozhatsky
@ 2018-10-25 11:03   ` Rafael David Tinoco
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael David Tinoco @ 2018-10-25 11:03 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Rafael David Tinoco, linux-kernel, linux-arm-kernel, linux-mm,
	Russell King, Mark Brown, Nitin Gupta, Minchan Kim,
	Andrew Morton

On Thu, Oct 25, 2018 at 2:29 AM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> On (10/24/18 22:27), Rafael David Tinoco wrote:
>>  static unsigned long location_to_obj(struct page *page, unsigned int obj_idx)
>>  {
>> -     unsigned long obj;
>> +     unsigned long obj, pfn;
>> +
>> +     pfn = page_to_pfn(page);
>> +
>> +     if (unlikely(OBJ_OVERFLOW(pfn)))
>> +             BUG();
>
> The trend these days is to have less BUG/BUG_ON-s in the kernel.
>
>         -ss

For this case, IMHO, it is worth.

It will avoid a investigation like:
https://bugs.linaro.org/show_bug.cgi?id=3765#c7 and and #c8, where I
had to poison slab allocation - to force both zs_handle and zspage
slabs not to be merged - and to make sure the zspage slab had a good
magic number AND to identify why the bad paging request happened.

If this happens again, for any other arch supporting PAE that does not
declare MAX_POSSIBLE_PHYSMEM_BITS or MAX_PHYSMEM_BITS appropriately,
the kernel will panic, no matter what, by the time it reaches
obj_to_location(). Things can be more complicated about declarations
for PAE if we consider ARM can declare MAX_PHYSMEM_BITS differently in
arch/arm/mach-XXX and/or, for this case, when having, or not SPARSEMEM
set (if I had SPARSEMEM set I would not face this, for example).

If this occurs, the kernel will panic, no matter what, by the time it
reaches obj_to_location()... so why not to BUG() here and let user to
know exactly where it panic-ed and why ? Other option would be to
WARN() here and let it panic naturally because of bad paging request
in a very near future... please advise.

Thanks,
Best Rgds
-Rafael

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25  1:27 [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Rafael David Tinoco
  2018-10-25  1:27 ` [PATCH 2/2] mm/zsmalloc.c: fix zsmalloc ARM LPAE support Rafael David Tinoco
  2018-10-25  5:29 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Sergey Senozhatsky
@ 2018-10-25 12:00 ` Russell King - ARM Linux
  2018-10-25 12:37   ` Rafael David Tinoco
  2018-10-25 12:42 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE kbuild test robot
  3 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2018-10-25 12:00 UTC (permalink / raw)
  To: Rafael David Tinoco
  Cc: linux-kernel, linux-arm-kernel, linux-mm, Mark Brown,
	Sergey Senozhatsky, Nitin Gupta, Minchan Kim

On Wed, Oct 24, 2018 at 10:27:44PM -0300, Rafael David Tinoco wrote:
> On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
> physical frame number might be so big that zsmalloc obj encoding (to
> location) will break IF architecture does not re-define
> MAX_PHYSMEM_BITS, causing:

I think there's a deeper problem here - a misunderstanding of what
MAX_PHYSMEM_BITS is.

MAX_PHYSMEM_BITS is a definition for sparsemem, and is only visible
when sparsemem is enabled.  When sparsemem is disabled, asm/sparsemem.h
is not included (and should not be included) which means there is no
MAX_PHYSMEM_BITS definition.

I don't think zsmalloc.c should be (ab)using MAX_PHYSMEM_BITS, and
your description above makes it sound like you expect it to always be
defined.

If we want to have a definition for this, we shouldn't be playing
fragile games like:

#ifndef MAX_POSSIBLE_PHYSMEM_BITS
#ifdef MAX_PHYSMEM_BITS
#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
#else
/*
 * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just
 * be PAGE_SHIFT
 */
#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
#endif
#endif

but instead insist that MAX_PHYSMEM_BITS is defined _everywhere_.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25 12:00 ` Russell King - ARM Linux
@ 2018-10-25 12:37   ` Rafael David Tinoco
  2018-10-25 13:43     ` Russell King - ARM Linux
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael David Tinoco @ 2018-10-25 12:37 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael David Tinoco, linux-kernel, linux-arm-kernel, linux-mm,
	Mark Brown, Sergey Senozhatsky, Nitin Gupta, Minchan Kim

> MAX_PHYSMEM_BITS is a definition for sparsemem, and is only visible
> when sparsemem is enabled.  When sparsemem is disabled, asm/sparsemem.h
> is not included (and should not be included) which means there is no
> MAX_PHYSMEM_BITS definition.

Missed that part :\, tks.

> I don't think zsmalloc.c should be (ab)using MAX_PHYSMEM_BITS, and
> your description above makes it sound like you expect it to always be
> defined.
>
> If we want to have a definition for this, we shouldn't be playing
> fragile games like:
>
> #ifndef MAX_POSSIBLE_PHYSMEM_BITS
> #ifdef MAX_PHYSMEM_BITS
> #define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
> #else
> /*
>  * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just
>  * be PAGE_SHIFT
>  */
> #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
> #endif
> #endif
>
> but instead insist that MAX_PHYSMEM_BITS is defined _everywhere_.

Is it okay to propose using only MAX_PHYSMEM_BITS for zsmalloc (like
it was before commit 02390b87) instead, and make sure *at least* ARM
32/64 and x86/x64, for now, have it defined outside sparsemem headers
as well ? This way I can WARN_ONCE(), instead of BUG(), when specific
arch does not define it - enforcing behavior - showing BITS_PER_LONG
is being used instead of MAX_PHYSMEM_BITS (warning, at least once, for
the possibility of an overflow, like the issue showed in here).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25  1:27 [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Rafael David Tinoco
                   ` (2 preceding siblings ...)
  2018-10-25 12:00 ` Russell King - ARM Linux
@ 2018-10-25 12:42 ` kbuild test robot
  3 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2018-10-25 12:42 UTC (permalink / raw)
  To: Rafael David Tinoco
  Cc: kbuild-all, linux-kernel, linux-arm-kernel, linux-mm,
	Rafael David Tinoco, Russell King, Mark Brown,
	Sergey Senozhatsky, Nitin Gupta, Minchan Kim

[-- Attachment #1: Type: text/plain, Size: 12017 bytes --]

Hi Rafael,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linux-sof-driver/master]
[also build test WARNING on v4.19 next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Rafael-David-Tinoco/mm-zsmalloc-c-check-encoded-object-value-overflow-for-PAE/20181025-110258
base:   https://github.com/thesofproject/linux master
config: um-allyesconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=um 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:10:0,
                    from include/linux/list.h:9,
                    from include/linux/module.h:9,
                    from mm/zsmalloc.c:33:
   mm/zsmalloc.c: In function 'location_to_obj':
>> mm/zsmalloc.c:129:17: warning: left shift count >= width of type [-Wshift-count-overflow]
     ((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0)
                    ^
   include/linux/compiler.h:77:42: note: in definition of macro 'unlikely'
    # define unlikely(x) __builtin_expect(!!(x), 0)
                                             ^
>> mm/zsmalloc.c:886:15: note: in expansion of macro 'OBJ_OVERFLOW'
     if (unlikely(OBJ_OVERFLOW(pfn)))
                  ^~~~~~~~~~~~
   Cyclomatic Complexity 5 include/linux/compiler.h:__read_once_size
   Cyclomatic Complexity 5 include/linux/compiler.h:__write_once_size
   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_read
   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_write
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:set_bit
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:clear_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:clear_bit_unlock
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit_lock
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls64
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u64
   Cyclomatic Complexity 1 include/linux/kernel.h:___might_sleep
   Cyclomatic Complexity 1 include/linux/list.h:INIT_LIST_HEAD
   Cyclomatic Complexity 2 include/linux/list.h:__list_add
   Cyclomatic Complexity 1 include/linux/list.h:list_add
   Cyclomatic Complexity 1 include/linux/list.h:__list_del
   Cyclomatic Complexity 2 include/linux/list.h:__list_del_entry
   Cyclomatic Complexity 1 include/linux/list.h:list_del
   Cyclomatic Complexity 1 include/linux/list.h:list_del_init
   Cyclomatic Complexity 1 include/linux/list.h:list_empty
   Cyclomatic Complexity 1 include/linux/list.h:__list_splice
   Cyclomatic Complexity 2 include/linux/list.h:list_splice_init
   Cyclomatic Complexity 1 arch/um/include/shared/mem.h:to_virt
   Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order
   Cyclomatic Complexity 1 arch/um/include/asm/thread_info.h:current_thread_info
   Cyclomatic Complexity 1 include/asm-generic/preempt.h:preempt_count
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_set
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_dec_and_test
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_add
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_sub
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:arch_atomic64_dec
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_set
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_dec
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_add
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic64_sub
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_dec
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_add
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_sub
   Cyclomatic Complexity 1 arch/x86/um/asm/processor.h:rep_nop
   Cyclomatic Complexity 1 include/linux/spinlock.h:spinlock_check
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_lock
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock
   Cyclomatic Complexity 1 include/linux/jump_label.h:static_key_count
   Cyclomatic Complexity 2 include/linux/jump_label.h:static_key_false
   Cyclomatic Complexity 1 include/linux/nodemask.h:node_state
   Cyclomatic Complexity 1 include/linux/err.h:PTR_ERR
   Cyclomatic Complexity 1 include/linux/err.h:IS_ERR
   Cyclomatic Complexity 1 include/linux/workqueue.h:queue_work
   Cyclomatic Complexity 1 include/linux/workqueue.h:schedule_work
   Cyclomatic Complexity 1 include/linux/topology.h:numa_node_id
   Cyclomatic Complexity 1 include/linux/topology.h:numa_mem_id
   Cyclomatic Complexity 1 include/linux/gfp.h:__alloc_pages
   Cyclomatic Complexity 4 include/linux/gfp.h:__alloc_pages_node
   Cyclomatic Complexity 2 include/linux/gfp.h:alloc_pages_node
   Cyclomatic Complexity 4 include/linux/bit_spinlock.h:bit_spin_lock
   Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_trylock
   Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_unlock
   Cyclomatic Complexity 2 include/linux/bit_spinlock.h:bit_spin_is_locked
   Cyclomatic Complexity 1 include/linux/fs.h:mount_pseudo
   Cyclomatic Complexity 2 include/linux/page-flags.h:compound_head
   Cyclomatic Complexity 1 include/linux/page-flags.h:PagePoisoned
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageLocked
   Cyclomatic Complexity 1 include/linux/page-flags.h:PagePrivate
   Cyclomatic Complexity 1 include/linux/page-flags.h:SetPagePrivate
   Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPagePrivate
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageOwnerPriv1
   Cyclomatic Complexity 1 include/linux/page-flags.h:SetPageOwnerPriv1
   Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPageOwnerPriv1
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageIsolated
   Cyclomatic Complexity 1 include/linux/page_ref.h:page_ref_count
   Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_inc
   Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_dec_and_test
   Cyclomatic Complexity 1 include/linux/mm.h:put_page_testzero
   Cyclomatic Complexity 1 include/linux/mm.h:page_mapcount_reset
   Cyclomatic Complexity 1 include/linux/mm.h:page_zonenum
   Cyclomatic Complexity 1 include/linux/mm.h:get_page
   Cyclomatic Complexity 2 include/linux/mm.h:put_page
   Cyclomatic Complexity 1 include/linux/mm.h:page_zone
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_page_state
   Cyclomatic Complexity 1 include/linux/mm.h:lowmem_page_address
   Cyclomatic Complexity 1 include/linux/uaccess.h:pagefault_disabled_inc
   Cyclomatic Complexity 1 include/linux/uaccess.h:pagefault_disabled_dec

vim +129 mm/zsmalloc.c

    32	
  > 33	#include <linux/module.h>
    34	#include <linux/kernel.h>
    35	#include <linux/sched.h>
    36	#include <linux/magic.h>
    37	#include <linux/bitops.h>
    38	#include <linux/errno.h>
    39	#include <linux/highmem.h>
    40	#include <linux/string.h>
    41	#include <linux/slab.h>
    42	#include <asm/tlbflush.h>
    43	#include <asm/pgtable.h>
    44	#include <linux/cpumask.h>
    45	#include <linux/cpu.h>
    46	#include <linux/vmalloc.h>
    47	#include <linux/preempt.h>
    48	#include <linux/spinlock.h>
    49	#include <linux/shrinker.h>
    50	#include <linux/types.h>
    51	#include <linux/debugfs.h>
    52	#include <linux/zsmalloc.h>
    53	#include <linux/zpool.h>
    54	#include <linux/mount.h>
    55	#include <linux/migrate.h>
    56	#include <linux/pagemap.h>
    57	#include <linux/fs.h>
    58	
    59	#define ZSPAGE_MAGIC	0x58
    60	
    61	/*
    62	 * This must be power of 2 and greater than of equal to sizeof(link_free).
    63	 * These two conditions ensure that any 'struct link_free' itself doesn't
    64	 * span more than 1 page which avoids complex case of mapping 2 pages simply
    65	 * to restore link_free pointer values.
    66	 */
    67	#define ZS_ALIGN		8
    68	
    69	/*
    70	 * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
    71	 * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
    72	 */
    73	#define ZS_MAX_ZSPAGE_ORDER 2
    74	#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
    75	
    76	#define ZS_HANDLE_SIZE (sizeof(unsigned long))
    77	
    78	/*
    79	 * Object location (<PFN>, <obj_idx>) is encoded as
    80	 * as single (unsigned long) handle value.
    81	 *
    82	 * Note that object index <obj_idx> starts from 0.
    83	 *
    84	 * This is made more complicated by various memory models and PAE.
    85	 */
    86	
    87	#ifndef MAX_POSSIBLE_PHYSMEM_BITS
    88	#ifdef MAX_PHYSMEM_BITS
    89	#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
    90	#else
    91	/*
    92	 * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just
    93	 * be PAGE_SHIFT
    94	 */
    95	#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
    96	#endif
    97	#endif
    98	
    99	#define _PFN_BITS		(MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)
   100	
   101	/*
   102	 * Memory for allocating for handle keeps object position by
   103	 * encoding <page, obj_idx> and the encoded value has a room
   104	 * in least bit(ie, look at obj_to_location).
   105	 * We use the bit to synchronize between object access by
   106	 * user and migration.
   107	 */
   108	#define HANDLE_PIN_BIT	0
   109	
   110	/*
   111	 * Head in allocated object should have OBJ_ALLOCATED_TAG
   112	 * to identify the object was allocated or not.
   113	 * It's okay to add the status bit in the least bit because
   114	 * header keeps handle which is 4byte-aligned address so we
   115	 * have room for two bit at least.
   116	 */
   117	#define OBJ_ALLOCATED_TAG 1
   118	#define OBJ_TAG_BITS 1
   119	#define OBJ_INDEX_BITS	(BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
   120	#define OBJ_INDEX_MASK	((_AC(1, UL) << OBJ_INDEX_BITS) - 1)
   121	
   122	/*
   123	 * When using PAE, the obj encoding might overflow if arch does
   124	 * not re-define MAX_PHYSMEM_BITS, since zsmalloc uses HIGHMEM.
   125	 * This checks for a future bad page access, when de-coding obj.
   126	 */
   127	#define OBJ_OVERFLOW(_pfn) \
   128		(((unsigned long long) _pfn << (OBJ_INDEX_BITS + OBJ_TAG_BITS)) >= \
 > 129		((_AC(1, ULL)) << MAX_POSSIBLE_PHYSMEM_BITS) ? 1 : 0)
   130	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 19285 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE
  2018-10-25 12:37   ` Rafael David Tinoco
@ 2018-10-25 13:43     ` Russell King - ARM Linux
  2018-11-21  0:11       ` [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support Rafael David Tinoco
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux @ 2018-10-25 13:43 UTC (permalink / raw)
  To: Rafael David Tinoco
  Cc: linux-kernel, linux-arm-kernel, linux-mm, Mark Brown,
	Sergey Senozhatsky, Nitin Gupta, Minchan Kim

On Thu, Oct 25, 2018 at 09:37:59AM -0300, Rafael David Tinoco wrote:
> Is it okay to propose using only MAX_PHYSMEM_BITS for zsmalloc (like
> it was before commit 02390b87) instead, and make sure *at least* ARM
> 32/64 and x86/x64, for now, have it defined outside sparsemem headers
> as well ?

It looks to me like this has been broken on ARM for quite some time,
predating that commit.  The original was:

#ifndef MAX_PHYSMEM_BITS
#ifdef CONFIG_HIGHMEM64G
#define MAX_PHYSMEM_BITS 36
#else /* !CONFIG_HIGHMEM64G */
#define MAX_PHYSMEM_BITS BITS_PER_LONG
#endif
#endif
#define _PFN_BITS              (MAX_PHYSMEM_BITS - PAGE_SHIFT)

On ARM, CONFIG_HIGHMEM64G is never defined (it's an x86 private symbol)
which means that the above sets MAX_PHYSMEM_BITS to 32 on non-sparsemem
ARM LPAE platforms.  So commit 02390b87 hasn't really changed anything
as far as ARM LPAE is concerned - and this looks to be a bug that goes
all the way back to when zsmalloc.c was moved out of staging in 2014.

Digging further back, it seems this brokenness was introduced with:

commit 6e00ec00b1a76a199b8c0acae401757b795daf57
Author: Seth Jennings <sjenning@linux.vnet.ibm.com>
Date:   Mon Mar 5 11:33:22 2012 -0600

    staging: zsmalloc: calculate MAX_PHYSMEM_BITS if not defined

    This patch provides a way to determine or "set a
    reasonable value for" MAX_PHYSMEM_BITS in the case that
    it is not defined (i.e. !SPARSEMEM)

    Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
    Acked-by: Nitin Gupta <ngupta@vflare.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

which, at the time, realised the problem with SPARSEMEM, but decided
that in the absense of SPARSEMEM, that MAX_PHYSMEM_BITS shall be
BITS_PER_LONG which seems absurd (see below.)

> This way I can WARN_ONCE(), instead of BUG(), when specific
> arch does not define it - enforcing behavior - showing BITS_PER_LONG
> is being used instead of MAX_PHYSMEM_BITS (warning, at least once, for
> the possibility of an overflow, like the issue showed in here).

Assuming that the maximum number of physical memory bits are
BITS_PER_LONG in the absense of MAX_POSSIBLE_PHYSMEM_BITS is a nonsense
- we have had the potential for PAE systems for a long time, and to
introduce new code that makes this assumption was plainly wrong.

We know when there's the potential for PAE, and thus more than
BITS_PER_LONG bits of physical memory address, through
CONFIG_PHYS_ADDR_T_64BIT.  So if we have the situation where
MAX_POSSIBLE_PHYSMEM_BITS (or the older case of MAX_PHYSMEM_BITS) not
being defined, but CONFIG_PHYS_ADDR_T_64BIT set, we should've been
erroring or something based on not knowing how many physical memory
bits are possible - it would be more than BITS_PER_LONG but less
than some unknown number of bits.

This is why I think any fallback here to BITS_PER_LONG is wrong.

What I suggested is to not fall back to BITS_PER_LONG in any case, but
always define MAX_PHYSMEM_BITS.  However, I now see that won't work for
x86 because MAX_PHYSMEM_BITS is not a constant anymore.

So I suggest everything that uses zsmalloc.c should instead define
MAX_POSSIBLE_PHYSMEM_BITS.

Note that there should _also_ be some protection in zsmalloc.c against
MAX_POSSIBLE_PHYSMEM_BITS being too large:

#define OBJ_INDEX_BITS  (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
#define OBJ_TAG_BITS 1
#define _PFN_BITS               (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)

which means there's an implicit limitation on _PFN_BITS being less than
BITS_PER_LONG - OBJ_TAG_BITS (where, if it's equal to this, and hence
OBJ_INDEX_BITS will be zero.)  This imples that MAX_POSSIBLE_PHYSMEM_BITS
must be smaller than BITS_PER_LONG + PAGE_SHIFT - OBJ_TAG_BITS, or
43 bits on a 32 bit system.  If you want to guarantee a minimum number
of objects, then that limitation needs to be reduced further.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support
  2018-10-25 13:43     ` Russell King - ARM Linux
@ 2018-11-21  0:11       ` Rafael David Tinoco
  2018-11-21  0:18         ` Rafael David Tinoco
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael David Tinoco @ 2018-11-21  0:11 UTC (permalink / raw)
  To: linux
  Cc: broonie, linux-arm-kernel, linux-kernel, linux-mm, minchan,
	ngupta, rafael.tinoco, sergey.senozhatsky.work

On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
physical frame number might be so big that zsmalloc obj encoding (to
location) will break, causing:

BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
Read of size 4 at addr 00000000 by task mkfs.ext4/623
CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 4.19.0-rc8-00017-g8239bc6d3307-dirty #15
Hardware name: Generic DT based system
[<c0418f7c>] (unwind_backtrace) from [<c0410ca4>] (show_stack+0x20/0x24)
[<c0410ca4>] (show_stack) from [<c16bd540>] (dump_stack+0xbc/0xe8)
[<c16bd540>] (dump_stack) from [<c06cab74>] (kasan_report+0x248/0x390)
[<c06cab74>] (kasan_report) from [<c06c94e8>] (__asan_load4+0x78/0xb4)
[<c06c94e8>] (__asan_load4) from [<c06ddc24>] (zs_map_object+0xa4/0x2bc)
[<c06ddc24>] (zs_map_object) from [<bf0bbbd8>] (zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
[<bf0bbbd8>] (zram_bvec_rw.constprop.2 [zram]) from [<bf0bc3cc>] (zram_make_request+0x234/0x46c [zram])
[<bf0bc3cc>] (zram_make_request [zram]) from [<c09aff9c>] (generic_make_request+0x304/0x63c)
[<c09aff9c>] (generic_make_request) from [<c09b0320>] (submit_bio+0x4c/0x1c8)
[<c09b0320>] (submit_bio) from [<c0743570>] (submit_bh_wbc.constprop.15+0x238/0x26c)
[<c0743570>] (submit_bh_wbc.constprop.15) from [<c0746cf8>] (__block_write_full_page+0x524/0x76c)
[<c0746cf8>] (__block_write_full_page) from [<c07472c4>] (block_write_full_page+0x1bc/0x1d4)
[<c07472c4>] (block_write_full_page) from [<c0748eb4>] (blkdev_writepage+0x24/0x28)
[<c0748eb4>] (blkdev_writepage) from [<c064a780>] (__writepage+0x44/0x78)
[<c064a780>] (__writepage) from [<c064b50c>] (write_cache_pages+0x3b8/0x800)
[<c064b50c>] (write_cache_pages) from [<c064dd78>] (generic_writepages+0x74/0xa0)
[<c064dd78>] (generic_writepages) from [<c0748e64>] (blkdev_writepages+0x18/0x1c)
[<c0748e64>] (blkdev_writepages) from [<c064e890>] (do_writepages+0x68/0x134)
[<c064e890>] (do_writepages) from [<c06368a4>] (__filemap_fdatawrite_range+0xb0/0xf4)
[<c06368a4>] (__filemap_fdatawrite_range) from [<c0636b68>] (file_write_and_wait_range+0x64/0xd0)
[<c0636b68>] (file_write_and_wait_range) from [<c0747af8>] (blkdev_fsync+0x54/0x84)
[<c0747af8>] (blkdev_fsync) from [<c0739dac>] (vfs_fsync_range+0x70/0xd4)
[<c0739dac>] (vfs_fsync_range) from [<c0739e98>] (do_fsync+0x4c/0x80)
[<c0739e98>] (do_fsync) from [<c073a26c>] (sys_fsync+0x1c/0x20)
[<c073a26c>] (sys_fsync) from [<c0401000>] (ret_fast_syscall+0x0/0x2c)

when trying to decode (the pfn) and map the object.

That happens because one architecture might not re-define
MAX_PHYSMEM_BITS, like in this ARM 32-bit w/ LPAE enabled example. For
32-bit systems, if not re-defined, MAX_POSSIBLE_PHYSMEM_BITS will
default to BITS_PER_LONG (32) in most cases, and, with PAE enabled,
_PFN_BITS might be wrong: which may cause obj variable to overflow if
frame number is in HIGHMEM and referencing a page above the 4GB
watermark.

commit 6e00ec00b1a7 ("staging: zsmalloc: calculate MAX_PHYSMEM_BITS if
not defined") realized MAX_PHYSMEM_BITS depended on SPARSEMEM headers
and "fixed" it by calculating it using BITS_PER_LONG if SPARSEMEM wasn't
used, like in the example given above.

Systems with potential for PAE exist for a long time and assuming
BITS_PER_LONG seems inadequate. Defining MAX_PHYSMEM_BITS looks better,
however it is NOT a constant anymore for x86.

SO, instead, MAX_POSSIBLE_PHYSMEM_BITS should be defined by every
architecture using zsmalloc, together with a sanity check for
MAX_POSSIBLE_PHYSMEM_BITS being too big on 32-bit systems.

Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
---
 arch/arm/include/asm/pgtable-2level-types.h |  2 ++
 arch/arm/include/asm/pgtable-3level-types.h |  2 ++
 arch/arm64/include/asm/pgtable-types.h      |  2 ++
 arch/ia64/include/asm/page.h                |  2 ++
 arch/mips/include/asm/page.h                |  2 ++
 arch/powerpc/include/asm/mmu.h              |  2 ++
 arch/s390/include/asm/page.h                |  2 ++
 arch/sh/include/asm/page.h                  |  2 ++
 arch/sparc/include/asm/page_32.h            |  2 ++
 arch/sparc/include/asm/page_64.h            |  2 ++
 arch/x86/include/asm/pgtable-2level_types.h |  2 ++
 arch/x86/include/asm/pgtable-3level_types.h |  3 +-
 arch/x86/include/asm/pgtable_64_types.h     |  4 +--
 mm/zsmalloc.c                               | 35 +++++++++++----------
 14 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
index 66cb5b0e89c5..552dba411324 100644
--- a/arch/arm/include/asm/pgtable-2level-types.h
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -64,4 +64,6 @@ typedef pteval_t pgprot_t;
 
 #endif /* STRICT_MM_TYPECHECKS */
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif	/* _ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-3level-types.h b/arch/arm/include/asm/pgtable-3level-types.h
index 921aa30259c4..664c39e6517c 100644
--- a/arch/arm/include/asm/pgtable-3level-types.h
+++ b/arch/arm/include/asm/pgtable-3level-types.h
@@ -67,4 +67,6 @@ typedef pteval_t pgprot_t;
 
 #endif	/* STRICT_MM_TYPECHECKS */
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 36
+
 #endif	/* _ASM_PGTABLE_3LEVEL_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h
index 345a072b5856..45c3834eb4c8 100644
--- a/arch/arm64/include/asm/pgtable-types.h
+++ b/arch/arm64/include/asm/pgtable-types.h
@@ -64,4 +64,6 @@ typedef struct { pteval_t pgprot; } pgprot_t;
 #include <asm-generic/5level-fixup.h>
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
+
 #endif	/* __ASM_PGTABLE_TYPES_H */
diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h
index 5798bd2b462c..a3e055979e46 100644
--- a/arch/ia64/include/asm/page.h
+++ b/arch/ia64/include/asm/page.h
@@ -235,4 +235,6 @@ get_order (unsigned long size)
 
 #define __HAVE_ARCH_GATE_AREA	1
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 50
+
 #endif /* _ASM_IA64_PAGE_H */
diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index e8cc328fce2d..f6a5dea1a66c 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -263,4 +263,6 @@ extern int __virt_addr_valid(const volatile void *kaddr);
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 48
+
 #endif /* _ASM_PAGE_H */
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index eb20eb3b8fb0..2ebc1d2d9a5c 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -324,6 +324,8 @@ static inline u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
 #define MAX_PHYSMEM_BITS        46
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
+
 #ifdef CONFIG_PPC_BOOK3S_64
 #include <asm/book3s/64/mmu.h>
 #else /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index a4d38092530a..8abec1461bf7 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -180,4 +180,6 @@ static inline int devmem_is_allowed(unsigned long pfn)
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
+#define MAX_POSSIBLE_PHYSMEM_BITS CONFIG_MAX_PHYSMEM_BITS
+
 #endif /* _S390_PAGE_H */
diff --git a/arch/sh/include/asm/page.h b/arch/sh/include/asm/page.h
index 5eef8be3e59f..40c7e12cf09e 100644
--- a/arch/sh/include/asm/page.h
+++ b/arch/sh/include/asm/page.h
@@ -205,4 +205,6 @@ typedef struct page *pgtable_t;
 #define ARCH_SLAB_MINALIGN	8
 #endif
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* __ASM_SH_PAGE_H */
diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index b76d59edec8c..14e9ca4659d7 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -139,4 +139,6 @@ extern unsigned long pfn_base;
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* _SPARC_PAGE_H */
diff --git a/arch/sparc/include/asm/page_64.h b/arch/sparc/include/asm/page_64.h
index e80f2d5bf62f..6d6f3654ead1 100644
--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -163,4 +163,6 @@ extern unsigned long PAGE_OFFSET;
 
 #include <asm-generic/getorder.h>
 
+#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYS_ADDRESS_BITS
+
 #endif /* _SPARC64_PAGE_H */
diff --git a/arch/x86/include/asm/pgtable-2level_types.h b/arch/x86/include/asm/pgtable-2level_types.h
index 6deb6cd236e3..c2eae59e6505 100644
--- a/arch/x86/include/asm/pgtable-2level_types.h
+++ b/arch/x86/include/asm/pgtable-2level_types.h
@@ -38,4 +38,6 @@ typedef union {
 /* This covers all VMSPLIT_* and VMSPLIT_*_OPT variants */
 #define PGD_KERNEL_START	(CONFIG_PAGE_OFFSET >> PGDIR_SHIFT)
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 32
+
 #endif /* _ASM_X86_PGTABLE_2LEVEL_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
index 33845d36897c..5fce514a49a0 100644
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -45,7 +45,8 @@ typedef union {
  */
 #define PTRS_PER_PTE	512
 
-#define MAX_POSSIBLE_PHYSMEM_BITS	36
 #define PGD_KERNEL_START	(CONFIG_PAGE_OFFSET >> PGDIR_SHIFT)
 
+#define MAX_POSSIBLE_PHYSMEM_BITS 36
+
 #endif /* _ASM_X86_PGTABLE_3LEVEL_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 84bd9bdc1987..d808cfde3d19 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -64,8 +64,6 @@ extern unsigned int ptrs_per_p4d;
 #define P4D_SIZE		(_AC(1, UL) << P4D_SHIFT)
 #define P4D_MASK		(~(P4D_SIZE - 1))
 
-#define MAX_POSSIBLE_PHYSMEM_BITS	52
-
 #else /* CONFIG_X86_5LEVEL */
 
 /*
@@ -154,4 +152,6 @@ extern unsigned int ptrs_per_p4d;
 
 #define PGD_KERNEL_START	((PAGE_SIZE / 2) / sizeof(pgd_t))
 
+#define MAX_POSSIBLE_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
+
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0787d33b80d8..132c20b6fd4f 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -80,23 +80,7 @@
  * as single (unsigned long) handle value.
  *
  * Note that object index <obj_idx> starts from 0.
- *
- * This is made more complicated by various memory models and PAE.
- */
-
-#ifndef MAX_POSSIBLE_PHYSMEM_BITS
-#ifdef MAX_PHYSMEM_BITS
-#define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
-#else
-/*
- * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just
- * be PAGE_SHIFT
  */
-#define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
-#endif
-#endif
-
-#define _PFN_BITS		(MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)
 
 /*
  * Memory for allocating for handle keeps object position by
@@ -116,6 +100,25 @@
  */
 #define OBJ_ALLOCATED_TAG 1
 #define OBJ_TAG_BITS 1
+
+/*
+ * MAX_POSSIBLE_PHYSMEM_BITS should be defined by all archs using zsmalloc:
+ * Trying to guess it from MAX_PHYSMEM_BITS, or considering it BITS_PER_LONG,
+ * proved to be wrong by not considering PAE capabilities, or using SPARSEMEM
+ * only headers, leading to bad object encoding due to object index overflow.
+ */
+#ifndef MAX_POSSIBLE_PHYSMEM_BITS
+ #define MAX_POSSIBLE_PHYSMEM_BITS BITS_PER_LONG
+ #error "MAX_POSSIBLE_PHYSMEM_BITS HAS to be defined by arch using zsmalloc";
+#else
+ #ifndef CONFIG_64BIT
+  #if (MAX_POSSIBLE_PHYSMEM_BITS >= (BITS_PER_LONG + PAGE_SHIFT - OBJ_TAG_BITS))
+   #error "MAX_POSSIBLE_PHYSMEM_BITS is wrong for this arch";
+  #endif
+ #endif
+#endif
+
+#define _PFN_BITS (MAX_POSSIBLE_PHYSMEM_BITS - PAGE_SHIFT)
 #define OBJ_INDEX_BITS	(BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
 #define OBJ_INDEX_MASK	((_AC(1, UL) << OBJ_INDEX_BITS) - 1)
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support
  2018-11-21  0:11       ` [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support Rafael David Tinoco
@ 2018-11-21  0:18         ` Rafael David Tinoco
  2018-11-27 20:33           ` Rafael David Tinoco
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael David Tinoco @ 2018-11-21  0:18 UTC (permalink / raw)
  To: linux
  Cc: Rafael David Tinoco, broonie, linux-arm-kernel, linux-kernel,
	linux-mm, minchan, ngupta, sergey.senozhatsky.work

On 11/20/18 10:11 PM, Rafael David Tinoco wrote:
> On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
> physical frame number might be so big that zsmalloc obj encoding (to
> location) will break, causing:
> 
> BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
> Read of size 4 at addr 00000000 by task mkfs.ext4/623
> CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 4.19.0-rc8-00017-g8239bc6d3307-dirty #15
> Hardware name: Generic DT based system
> [<c0418f7c>] (unwind_backtrace) from [<c0410ca4>] (show_stack+0x20/0x24)
> [<c0410ca4>] (show_stack) from [<c16bd540>] (dump_stack+0xbc/0xe8)
> [<c16bd540>] (dump_stack) from [<c06cab74>] (kasan_report+0x248/0x390)
> [<c06cab74>] (kasan_report) from [<c06c94e8>] (__asan_load4+0x78/0xb4)
> [<c06c94e8>] (__asan_load4) from [<c06ddc24>] (zs_map_object+0xa4/0x2bc)
> [<c06ddc24>] (zs_map_object) from [<bf0bbbd8>] (zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
> [<bf0bbbd8>] (zram_bvec_rw.constprop.2 [zram]) from [<bf0bc3cc>] (zram_make_request+0x234/0x46c [zram])
> [<bf0bc3cc>] (zram_make_request [zram]) from [<c09aff9c>] (generic_make_request+0x304/0x63c)
> [<c09aff9c>] (generic_make_request) from [<c09b0320>] (submit_bio+0x4c/0x1c8)
> [<c09b0320>] (submit_bio) from [<c0743570>] (submit_bh_wbc.constprop.15+0x238/0x26c)
> [<c0743570>] (submit_bh_wbc.constprop.15) from [<c0746cf8>] (__block_write_full_page+0x524/0x76c)
> [<c0746cf8>] (__block_write_full_page) from [<c07472c4>] (block_write_full_page+0x1bc/0x1d4)
> [<c07472c4>] (block_write_full_page) from [<c0748eb4>] (blkdev_writepage+0x24/0x28)
> [<c0748eb4>] (blkdev_writepage) from [<c064a780>] (__writepage+0x44/0x78)
> [<c064a780>] (__writepage) from [<c064b50c>] (write_cache_pages+0x3b8/0x800)
> [<c064b50c>] (write_cache_pages) from [<c064dd78>] (generic_writepages+0x74/0xa0)
> [<c064dd78>] (generic_writepages) from [<c0748e64>] (blkdev_writepages+0x18/0x1c)
> [<c0748e64>] (blkdev_writepages) from [<c064e890>] (do_writepages+0x68/0x134)
> [<c064e890>] (do_writepages) from [<c06368a4>] (__filemap_fdatawrite_range+0xb0/0xf4)
> [<c06368a4>] (__filemap_fdatawrite_range) from [<c0636b68>] (file_write_and_wait_range+0x64/0xd0)
> [<c0636b68>] (file_write_and_wait_range) from [<c0747af8>] (blkdev_fsync+0x54/0x84)
> [<c0747af8>] (blkdev_fsync) from [<c0739dac>] (vfs_fsync_range+0x70/0xd4)
> [<c0739dac>] (vfs_fsync_range) from [<c0739e98>] (do_fsync+0x4c/0x80)
> [<c0739e98>] (do_fsync) from [<c073a26c>] (sys_fsync+0x1c/0x20)
> [<c073a26c>] (sys_fsync) from [<c0401000>] (ret_fast_syscall+0x0/0x2c)
> 
> when trying to decode (the pfn) and map the object.
> 
> That happens because one architecture might not re-define
> MAX_PHYSMEM_BITS, like in this ARM 32-bit w/ LPAE enabled example. For
> 32-bit systems, if not re-defined, MAX_POSSIBLE_PHYSMEM_BITS will
> default to BITS_PER_LONG (32) in most cases, and, with PAE enabled,
> _PFN_BITS might be wrong: which may cause obj variable to overflow if
> frame number is in HIGHMEM and referencing a page above the 4GB
> watermark.
> 
> commit 6e00ec00b1a7 ("staging: zsmalloc: calculate MAX_PHYSMEM_BITS if
> not defined") realized MAX_PHYSMEM_BITS depended on SPARSEMEM headers
> and "fixed" it by calculating it using BITS_PER_LONG if SPARSEMEM wasn't
> used, like in the example given above.
> 
> Systems with potential for PAE exist for a long time and assuming
> BITS_PER_LONG seems inadequate. Defining MAX_PHYSMEM_BITS looks better,
> however it is NOT a constant anymore for x86.
> 
> SO, instead, MAX_POSSIBLE_PHYSMEM_BITS should be defined by every
> architecture using zsmalloc, together with a sanity check for
> MAX_POSSIBLE_PHYSMEM_BITS being too big on 32-bit systems.
> 
> Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
> Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
> ---
>   arch/arm/include/asm/pgtable-2level-types.h |  2 ++
>   arch/arm/include/asm/pgtable-3level-types.h |  2 ++
>   arch/arm64/include/asm/pgtable-types.h      |  2 ++
>   arch/ia64/include/asm/page.h                |  2 ++
>   arch/mips/include/asm/page.h                |  2 ++
>   arch/powerpc/include/asm/mmu.h              |  2 ++
>   arch/s390/include/asm/page.h                |  2 ++
>   arch/sh/include/asm/page.h                  |  2 ++
>   arch/sparc/include/asm/page_32.h            |  2 ++
>   arch/sparc/include/asm/page_64.h            |  2 ++
>   arch/x86/include/asm/pgtable-2level_types.h |  2 ++
>   arch/x86/include/asm/pgtable-3level_types.h |  3 +-
>   arch/x86/include/asm/pgtable_64_types.h     |  4 +--
>   mm/zsmalloc.c                               | 35 +++++++++++----------
>   14 files changed, 45 insertions(+), 19 deletions(-)

Russel,

I have tried to place MAX_POSSIBLE_PHYSMEM_BITS in the best available 
header for each architecture, considering different paging levels, PAE 
existence, and existing similar definitions. Also, I have only 
considered those architectures already having "sparsemem.h" header.

Would you mind reviewing it ?

Tks
--
Rafael D. Tinoco
Linaro Kernel Validation

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support
  2018-11-21  0:18         ` Rafael David Tinoco
@ 2018-11-27 20:33           ` Rafael David Tinoco
  2018-11-29  2:53             ` Sergey Senozhatsky
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael David Tinoco @ 2018-11-27 20:33 UTC (permalink / raw)
  To: linux
  Cc: Rafael David Tinoco, broonie, linux-arm-kernel, linux-kernel,
	linux-mm, minchan, ngupta, sergey.senozhatsky.work

On 11/20/18 10:18 PM, Rafael David Tinoco wrote:
> 
> Russell,
> 
> I have tried to place MAX_POSSIBLE_PHYSMEM_BITS in the best available
> header for each architecture, considering different paging levels, PAE
> existence, and existing similar definitions. Also, I have only
> considered those architectures already having "sparsemem.h" header.
> 
> Would you mind reviewing it ?

Should I re-send the this v2 (as v3) with complete list of
get_maintainer.pl ? I was in doubt because I'm touching headers from
several archs and I'm not sure who, if it is accepted, would merge it.

Thank you
-- 
Rafael D. Tinoco
Linaro Kernel Validation

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support
  2018-11-27 20:33           ` Rafael David Tinoco
@ 2018-11-29  2:53             ` Sergey Senozhatsky
  0 siblings, 0 replies; 12+ messages in thread
From: Sergey Senozhatsky @ 2018-11-29  2:53 UTC (permalink / raw)
  To: Rafael David Tinoco
  Cc: linux, broonie, linux-arm-kernel, linux-kernel, linux-mm,
	minchan, ngupta, sergey.senozhatsky.work

On (11/27/18 18:33), Rafael David Tinoco wrote:
> On 11/20/18 10:18 PM, Rafael David Tinoco wrote:
> > 
> > Russell,
> > 
> > I have tried to place MAX_POSSIBLE_PHYSMEM_BITS in the best available
> > header for each architecture, considering different paging levels, PAE
> > existence, and existing similar definitions. Also, I have only
> > considered those architectures already having "sparsemem.h" header.
> > 
> > Would you mind reviewing it ?
> 
> Should I re-send the this v2 (as v3) with complete list of
> get_maintainer.pl ? I was in doubt because I'm touching headers from
> several archs and I'm not sure who, if it is accepted, would merge it.

Yes, resending and Cc-ing archs' maintainers if the right thing to do.
It's also possible that they will ask to split the patch and do a
per-arch change.

	-ss

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-11-29  2:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-25  1:27 [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Rafael David Tinoco
2018-10-25  1:27 ` [PATCH 2/2] mm/zsmalloc.c: fix zsmalloc ARM LPAE support Rafael David Tinoco
2018-10-25  5:29 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE Sergey Senozhatsky
2018-10-25 11:03   ` Rafael David Tinoco
2018-10-25 12:00 ` Russell King - ARM Linux
2018-10-25 12:37   ` Rafael David Tinoco
2018-10-25 13:43     ` Russell King - ARM Linux
2018-11-21  0:11       ` [PATCH v2] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support Rafael David Tinoco
2018-11-21  0:18         ` Rafael David Tinoco
2018-11-27 20:33           ` Rafael David Tinoco
2018-11-29  2:53             ` Sergey Senozhatsky
2018-10-25 12:42 ` [PATCH 1/2] mm/zsmalloc.c: check encoded object value overflow for PAE kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).