exfat: Avoid allocating upcase table using kcalloc()
diff mbox series

Message ID 20201124194749.4041176-1-123321artyom@gmail.com
State New, archived
Headers show
Series
  • exfat: Avoid allocating upcase table using kcalloc()
Related show

Commit Message

Artem Labazov Nov. 24, 2020, 7:47 p.m. UTC
The table for Unicode upcase conversion requires an order-5 allocation,
which may fail on a highly-fragmented system:

 pool-udisksd: page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
 CPU: 4 PID: 3756880 Comm: pool-udisksd Tainted: G     U            5.8.10-200.fc32.x86_64 #1
 Hardware name: Dell Inc. XPS 13 9360/0PVG6D, BIOS 2.13.0 11/14/2019
 Call Trace:
  dump_stack+0x6b/0x88
  warn_alloc.cold+0x75/0xd9
  ? _cond_resched+0x16/0x40
  ? __alloc_pages_direct_compact+0x144/0x150
  __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
  ? __schedule+0x28a/0x840
  ? __wait_on_bit_lock+0x92/0xa0
  __alloc_pages_nodemask+0x2df/0x320
  kmalloc_order+0x1b/0x80
  kmalloc_order_trace+0x1d/0xa0
  exfat_create_upcase_table+0x115/0x390 [exfat]
  exfat_fill_super+0x3ef/0x7f0 [exfat]
  ? sget_fc+0x1d0/0x240
  ? exfat_init_fs_context+0x120/0x120 [exfat]
  get_tree_bdev+0x15c/0x250
  vfs_get_tree+0x25/0xb0
  do_mount+0x7c3/0xaf0
  ? copy_mount_options+0xab/0x180
  __x64_sys_mount+0x8e/0xd0
  do_syscall_64+0x4d/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Make the driver use vmalloc() to eliminate the issue.

Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Artem Labazov <123321artyom@gmail.com>
---
 fs/exfat/nls.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

David Laight Nov. 24, 2020, 10:17 p.m. UTC | #1
From: Artem Labazov
> Sent: 24 November 2020 19:48
> 
> The table for Unicode upcase conversion requires an order-5 allocation,
> which may fail on a highly-fragmented system:

ISTM that is the wrong way to do the case conversion.
It is also why having to do it is bloody stupid.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Sungjong Seo Dec. 2, 2020, 4:58 a.m. UTC | #2
> The table for Unicode upcase conversion requires an order-5 allocation,
> which may fail on a highly-fragmented system:
> 
>  pool-udisksd: page allocation failure: order:5,
> mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
> nodemask=(null),cpuset=/,mems_allowed=0
>  CPU: 4 PID: 3756880 Comm: pool-udisksd Tainted: G     U
5.8.10-
> 200.fc32.x86_64 #1
>  Hardware name: Dell Inc. XPS 13 9360/0PVG6D, BIOS 2.13.0 11/14/2019  Call
> Trace:
>   dump_stack+0x6b/0x88
>   warn_alloc.cold+0x75/0xd9
>   ? _cond_resched+0x16/0x40
>   ? __alloc_pages_direct_compact+0x144/0x150
>   __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
>   ? __schedule+0x28a/0x840
>   ? __wait_on_bit_lock+0x92/0xa0
>   __alloc_pages_nodemask+0x2df/0x320
>   kmalloc_order+0x1b/0x80
>   kmalloc_order_trace+0x1d/0xa0
>   exfat_create_upcase_table+0x115/0x390 [exfat]
>   exfat_fill_super+0x3ef/0x7f0 [exfat]
>   ? sget_fc+0x1d0/0x240
>   ? exfat_init_fs_context+0x120/0x120 [exfat]
>   get_tree_bdev+0x15c/0x250
>   vfs_get_tree+0x25/0xb0
>   do_mount+0x7c3/0xaf0
>   ? copy_mount_options+0xab/0x180
>   __x64_sys_mount+0x8e/0xd0
>   do_syscall_64+0x4d/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Make the driver use vmalloc() to eliminate the issue.

I have not yet received a report of the same issue.
But I agree that this problem is likely to occur even if it is low
probability.

I think it would be more appropriate to use kvcalloc and kvfree instead.
Could you send me v2 patch?
Artem Labazov Dec. 4, 2020, 12:57 p.m. UTC | #3
> I have not yet received a report of the same issue.
> But I agree that this problem is likely to occur even if it is low
> probability.

Perhaps I should clarify my setup a little bit more.
The issue can be reliably reproduced on my laptop. It has 8 GBs of RAM
(pretty common amount nowadays) and runs an unmodified Fedora 32 kernel.
Also, I use zswap, which seems to be contributing to fragmentation as well.

> I think it would be more appropriate to use kvcalloc and kvfree instead.

I do not think this is really needed.
Upcase table allocation is relatively large (32 pages of 4KB size) and
happens only once, when the drive is being mounted. Also, exfat driver
does not rely on the fact that the table is physically contiguous.
That said, vmalloc/vfree seems to be the best option, according to kernel's
"Memory Allocation Guide".

--
Artem
Sungjong Seo Dec. 7, 2020, 6:23 a.m. UTC | #4
> > I have not yet received a report of the same issue.
> > But I agree that this problem is likely to occur even if it is low
> > probability.
> 
> Perhaps I should clarify my setup a little bit more.
> The issue can be reliably reproduced on my laptop. It has 8 GBs of RAM
> (pretty common amount nowadays) and runs an unmodified Fedora 32 kernel.
> Also, I use zswap, which seems to be contributing to fragmentation as
well.
> 
> > I think it would be more appropriate to use kvcalloc and kvfree instead.
> 
> I do not think this is really needed.
> Upcase table allocation is relatively large (32 pages of 4KB size) and
> happens only once, when the drive is being mounted. Also, exfat driver
> does not rely on the fact that the table is physically contiguous.
> That said, vmalloc/vfree seems to be the best option, according to
> kernel's "Memory Allocation Guide".

The address range available for vmalloc() allocations is limited on 32-bit
systems. If all kernel codes that need non-contiguous memory of the size
order 1 or larger try to allocate it by only vmalloc(), the address range
for vmalloc() could be insufficient.
So, I think it would be better to give kmalloc() "one" chance.

I know that kvmalloc() only tries kmalloc() once (noretry, nowarn) and if it
fails, it immediately falls back to vmalloc(). Therefore, I think kvmalloc()
and kvfree() are the best solution for solving the problem you are facing
and
the problem I mentioned above.

Could you send me patch v3 that uses kvcalloc() and kvfree()?

> 
> --
> Artem

Patch
diff mbox series

diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c
index 675d0e7058c5..0582faf8de77 100644
--- a/fs/exfat/nls.c
+++ b/fs/exfat/nls.c
@@ -6,6 +6,7 @@ 
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/buffer_head.h>
+#include <linux/vmalloc.h>
 #include <asm/unaligned.h>
 
 #include "exfat_raw.h"
@@ -659,7 +660,7 @@  static int exfat_load_upcase_table(struct super_block *sb,
 	unsigned char skip = false;
 	unsigned short *upcase_table;
 
-	upcase_table = kcalloc(UTBL_COUNT, sizeof(unsigned short), GFP_KERNEL);
+	upcase_table = vmalloc(UTBL_COUNT*sizeof(unsigned short));
 	if (!upcase_table)
 		return -ENOMEM;
 
@@ -715,7 +716,7 @@  static int exfat_load_default_upcase_table(struct super_block *sb)
 	unsigned short uni = 0, *upcase_table;
 	unsigned int index = 0;
 
-	upcase_table = kcalloc(UTBL_COUNT, sizeof(unsigned short), GFP_KERNEL);
+	upcase_table = vmalloc(UTBL_COUNT*sizeof(unsigned short));
 	if (!upcase_table)
 		return -ENOMEM;
 
@@ -803,5 +804,5 @@  int exfat_create_upcase_table(struct super_block *sb)
 
 void exfat_free_upcase_table(struct exfat_sb_info *sbi)
 {
-	kfree(sbi->vol_utbl);
+	vfree(sbi->vol_utbl);
 }