* Re: [Bug 205937] New: BUG: unable to handle page fault for address: f3170000
[not found] <bug-205937-27@https.bugzilla.kernel.org/>
@ 2019-12-23 22:14 ` Andrew Morton
2019-12-24 3:45 ` Dennis Clarke
2019-12-26 21:41 ` Christopher Lameter
0 siblings, 2 replies; 4+ messages in thread
From: Andrew Morton @ 2019-12-23 22:14 UTC
To: dclarke
Cc: bugzilla-daemon, penberg, Christopher Lameter, David Rientjes,
Joonsoo Kim, linux-mm, Qian Cai
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
Thanks.
On Sat, 21 Dec 2019 03:08:17 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=205937
>
> Bug ID: 205937
> Summary: BUG: unable to handle page fault for address: f3170000
> Product: Memory Management
> Version: 2.5
> Kernel Version: 5.5-rc2
> Hardware: i386
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Page Allocator
> Assignee: akpm@linux-foundation.org
> Reporter: dclarke@blastwave.org
> Regression: No
"yes"
Looks like the asynchronous sysfs file removal code is failing.
sysfs_slab_remove_workfn().
Guys, did we make recent changes in this area?
> Created attachment 286393
> --> https://bugzilla.kernel.org/attachment.cgi?id=286393&action=edit
> kernel config for 5.5.0-rc2
>
> Testing a system under excessive memory pressure with some trivial
> code wherein a set of 16 pthreads are dispatched and each merely fills
> an array :
>
>
> void *big_array_fill(void *recv_parm)
> {
>     thread_parm_t *p = (thread_parm_t *)recv_parm;
>
>     printf("TRD : %d filling the big_array.\n", p->tnum);
>     for ( p->loop0 = 0; p->loop0 < BIG_ARRAY_DIM0; p->loop0++ ) {
>         for ( p->loop1 = 0; p->loop1 < BIG_ARRAY_DIM1; p->loop1++ ) {
>             p->big_array[p->loop0][p->loop1] = (uint64_t)(p->loop0 * p->loop1);
>         }
>     }
>     printf("TRD : %d big_array full.\n", p->tnum);
>
>     /* return some random data */
>     p->ret_val = drand48();
>
>     return (NULL);
> }
>
> The received parameters for each thread were in a struct thus :
>
> titan$ cat p0.h
>
> #define NUM_THREADS 16
> #define BIG_ARRAY_DIM0 384
> #define BIG_ARRAY_DIM1 65536
>
> /*
> * struct to pass parameters to a dispatched thread
> */
> typedef struct {
>     uint32_t tnum; /* thread number */
>     int sleep_time, loop0, loop1;
>     double ret_val; /* some sort of a return data value */
>     uint64_t big_array[BIG_ARRAY_DIM0][BIG_ARRAY_DIM1]; /* memory abuse */
> } thread_parm_t;
>
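For scale, a rough check of what this struct asks of calloc(): big_array
alone is 384 * 65536 * 8 bytes = 192 MiB, so 16 threads request a little
over 3 GiB in total, against the 937412 kB MemTotal reported further down.
The sketch below only redoes that arithmetic with the definitions from
p0.h; the helper names big_array_bytes() and total_request_bytes() are
made up for illustration and are not part of the reporter's program.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS    16
#define BIG_ARRAY_DIM0 384
#define BIG_ARRAY_DIM1 65536

/* Same member layout as thread_parm_t in the report. */
typedef struct {
    uint32_t tnum;
    int sleep_time, loop0, loop1;
    double ret_val;
    uint64_t big_array[BIG_ARRAY_DIM0][BIG_ARRAY_DIM1];
} thread_parm_t;

/* Hypothetical helper: bytes taken by big_array in one thread's block. */
static uint64_t big_array_bytes(void)
{
    return (uint64_t)BIG_ARRAY_DIM0 * BIG_ARRAY_DIM1 * sizeof(uint64_t);
}

/* Hypothetical helper: bytes requested from calloc() over all threads. */
static uint64_t total_request_bytes(void)
{
    return (uint64_t)NUM_THREADS * sizeof(thread_parm_t);
}
```

On a 32-bit build with the common 3G/1G split that total is about the
size of the whole user address space, so it would not be surprising if
some of the later calloc() calls fail and the bail-out path runs. The
report does not say whether that happened.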
> These threads were fired off as a test while doing a teaching demo :
>
> printf("\n-------------- begin dispatch -----------------------\n");
> for ( i = 0; i < NUM_THREADS; i++) {
>     parm[i] = calloc( (size_t) 1 , (size_t) sizeof(thread_parm_t) );
>
>     if ( parm[i] == NULL ) {
>         if ( errno == ENOMEM ) {
>             fprintf(stderr,"FAIL : calloc returns ENOMEM at %s:%d\n",
>                     __FILE__, __LINE__ );
>         } else {
>             fprintf(stderr,"FAIL : calloc fails at %s:%d\n",
>                     __FILE__, __LINE__ );
>         }
>         perror("FAIL ");
>         /* gee .. before we bail out did we allocate any of the
>          * previous thread parameter memory regions? If so then
>          * clean up before bailing out. In fact we may have
>          * already dispatched out threads. */
>
>         if (i == 0 ) return ( EXIT_FAILURE );
>
>         for ( j = 0; j < i; j++ ) {
>             /* lets ask those threads to just be nice and
>              * we call them in with a join */
>             pthread_join(tid[j], NULL);
>             fprintf(stderr,"BAIL : pthread_join(%i) done.\n", j);
>             free(parm[j]);
>             parm[j] = NULL;
>         }
>         fprintf(stderr,"BAIL : cleanup done.\n");
>         ru();
>
>         return ( EXIT_FAILURE );
>
>     }
>
>     parm[i]->tnum = i;
>     parm[i]->sleep_time = 1 + (int)( drand48() * 10.0 );
>
>     pthread_create( &tid[i], NULL, big_array_fill, (void *)parm[i] );
>
>     printf("INFO : pthread_create %2i called for %2i secs.\n",
>            i, parm[i]->sleep_time );
> }
> printf("\n-------------- end dispatch -------------------------\n");
>
>
> All very nice and does what it does on most systems and even with a very
> old and slow pentium II with very little memory we see everything just
> works fine so long as there is some swap.
>
> However on linux 5.5-rc2 I see a warning that the CPU is busy, and
> that is fine, however the process seems to merely get "stuck" for lack
> of a better word. A kill -HUP on the pid has no effect. A kill -9 also
> seems to have no effect. A kill -9 of the PPID merely shifts the new
> parent to be number 1 and I see a zombie that won't go away.
>
> esther#
> esther# ps -efl | grep -E "UID|dclarke|init"
> F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
> 4 S root 1 0 0 80 0 - 9079 do_epo Dec20 ? 00:01:02
> /sbin/init verbose
> 4 S dclarke 382 1 0 80 0 - 4320 do_epo Dec20 ? 00:00:03
> /lib/systemd/systemd --user
> 5 S dclarke 384 382 0 80 0 - 9424 do_sig Dec20 ? 00:00:00
> (sd-pam)
> 0 Z dclarke 914 1 3 95 15 - 0 - 01:13 ? 00:03:13 [p0]
> <defunct>
> 4 S root 959 338 0 80 0 - 3256 poll_s 01:55 ? 00:00:01
> sshd: dclarke [priv]
> 5 S dclarke 965 959 0 80 0 - 3256 poll_s 01:55 ? 00:00:03
> sshd: dclarke@pts/2
> 0 S dclarke 966 965 0 80 0 - 2458 do_wai 01:55 pts/2 00:00:02
> -bash
> 0 S root 1188 1107 6 80 0 - 1958 pipe_r 02:57 pts/2 00:00:00 grep
> -E UID|dclarke|init
> esther#
>
> Looking in /proc I see :
>
> esther#
> esther# cat /proc/914/status
> Name: p0
> State: Z (zombie)
> Tgid: 914
> Ngid: 0
> Pid: 914
> PPid: 1
> TracerPid: 0
> Uid: 16411 16411 16411 16411
> Gid: 20002 20002 20002 20002
> FDSize: 0
> Groups: 20002
> NStgid: 914
> NSpid: 914
> NSpgid: 913
> NSsid: 398
> Threads: 2
> SigQ: 2/7323
> SigPnd: 0000000000000000
> ShdPnd: 0000000000000103
> SigBlk: 0000000000000000
> SigIgn: 0000000000000000
> SigCgt: 0000000180000000
> CapInh: 0000000000000000
> CapPrm: 0000000000000000
> CapEff: 0000000000000000
> CapBnd: 0000003fffffffff
> CapAmb: 0000000000000000
> NoNewPrivs: 0
> Seccomp: 0
> Speculation_Store_Bypass: vulnerable
> Cpus_allowed: 1
> Cpus_allowed_list: 0
> voluntary_ctxt_switches: 13
> nonvoluntary_ctxt_switches: 74
> esther#
>
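The ShdPnd mask above can be read bit by bit: in /proc/<pid>/status, bit
(n - 1) stands for signal n, so 0x103 has signals 1, 2 and 9 pending,
which on Linux i386 are SIGHUP, SIGINT and SIGKILL. That fits the
kill -HUP and kill -9 attempts described earlier (the SIGINT is
presumably a ^C, though the report does not say). A small sketch of the
decoding follows; the helper names are chosen for illustration only.

```c
#include <stdint.h>

/* Returns 1 if signal number sig (1-based) is set in a mask as printed
 * in /proc/<pid>/status, where bit (sig - 1) represents signal sig. */
static int mask_has_signal(uint64_t mask, int sig)
{
    if (sig < 1 || sig > 64)
        return 0;
    return (int)((mask >> (sig - 1)) & 1u);
}

/* Writes the signal numbers set in mask into out[] (capacity cap)
 * in ascending order and returns how many were written. */
static int mask_to_signals(uint64_t mask, int *out, int cap)
{
    int n = 0;

    for (int sig = 1; sig <= 64 && n < cap; sig++) {
        if (mask_has_signal(mask, sig))
            out[n++] = sig;
    }
    return n;
}
```

Calling mask_to_signals(0x103, buf, 64) fills buf with 1, 2, 9. The
signals are queued but never acted on, which is what one would expect if
the remaining thread never gets back to a point where it can handle them.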
> However dmesg reveals far more information :
>
> .
> .
> .
> [44540.046308] kobject: '(null)' (5fcda702): kobject_cleanup, parent 2a0c29d5
> [44540.060815] kobject: '(null)' (5fcda702): calling ktype release
> [44540.230679] kobject: '(null)' (0cf40105): kobject_cleanup, parent 2a0c29d5
> [44540.244669] kobject: '(null)' (0cf40105): calling ktype release
> [44540.430165] kobject: '(null)' (1eed3f2a): kobject_cleanup, parent 2a0c29d5
> [44540.444359] kobject: '(null)' (1eed3f2a): calling ktype release
> [44540.612080] kobject: '(null)' (b9893805): kobject_cleanup, parent 2a0c29d5
> [44540.625521] kobject: '(null)' (b9893805): calling ktype release
> [44540.777358] kobject: '(null)' (6e8d4424): kobject_cleanup, parent 2a0c29d5
> [44540.792340] kobject: '(null)' (6e8d4424): calling ktype release
> [44540.902623] kobject: '(null)' (07ba38b5): kobject_cleanup, parent 2a0c29d5
> [44540.916637] kobject: '(null)' (07ba38b5): calling ktype release
> [44545.033382] kobject: '(null)' (dbf42766): kobject_cleanup, parent 2a0c29d5
> [44545.048144] kobject: '(null)' (dbf42766): calling ktype release
> [44545.242257] kobject: '(null)' (e64a3d73): kobject_cleanup, parent 2a0c29d5
> [44545.255661] kobject: '(null)' (e64a3d73): calling ktype release
> [44545.402036] kobject: '(null)' (e43ef4d7): kobject_cleanup, parent 2a0c29d5
> [44545.415573] kobject: '(null)' (e43ef4d7): calling ktype release
> [44545.566126] kobject: '(null)' (2c27ba6b): kobject_cleanup, parent 2a0c29d5
> [44545.579740] kobject: '(null)' (2c27ba6b): calling ktype release
> [44546.186101] kobject: '(null)' (da4ac031): kobject_cleanup, parent 2a0c29d5
> [44546.188957] BUG: unable to handle page fault for address: f3170000
> [44546.188965] #PF: supervisor read access in kernel mode
> [44546.188973] #PF: error_code(0x0000) - not-present page
> [44546.188979] *pde = 36f4a067 *pte = 33170060
> [44546.188995] Oops: 0000 [#1] DEBUG_PAGEALLOC
> [44546.189004] CPU: 0 PID: 680 Comm: kworker/0:1 Not tainted 5.5.0-rc2-genunix
> #1
> [44546.189072] Hardware name: /CN700-8237, BIOS 6.00 PG 11/13/2006
> [44546.189079] Workqueue: events sysfs_slab_remove_workfn
> [44546.189090] EIP: hw_bitblt_1+0x240/0x310 [viafb]
> [44546.189108] Code: 08 80 fa 02 0f 84 d8 00 00 00 0f b6 55 ec c0 ea 03 0f b6
> d2 0f af ca 83 c1 03 c1 e9 02 74 17 81 c3 00 00 20 00 8d 74 26 00
> 90 <8b> 14 87 89 13 83 c0 01 39 c8 72 f4 8d 65 f4 31 c0 5b 5e 5f 5d c3
> [44546.189116] EAX: 00000994 EBX: f8600000 ECX: 000009a0 EDX: 00000000
> [44546.189124] ESI: 00000002 EDI: f316d9b0 EBP: eb98bc70 ESP: eb98bc50
> [44546.189193] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 EFLAGS: 00010083
> [44546.189201] CR0: 80050033 CR2: f3170000 CR3: 2b25b000 CR4: 00000690
> [44546.189206] Call Trace:
> [44546.189213] ? hw_bitblt_2+0x2b0/0x2b0 [viafb]
> [44546.189219] viafb_imageblit+0x90/0xf0 [viafb]
> [44546.189225] bit_putcs+0x215/0x430
> [44546.189231] ? bit_clear+0x120/0x120
> [44546.189236] fbcon_putcs+0xcb/0xe0
> [44546.189242] ? bit_clear+0x120/0x120
> [44546.189248] ? fb_flashcursor+0x100/0x100
> [44546.189315] vt_console_print+0x353/0x400
> [44546.189321] ? insert_char+0xd0/0xd0
> [44546.189327] console_unlock+0x35e/0x4e0
> [44546.189333] vprintk_emit+0x23a/0x2f0
> [44546.189339] vprintk_default+0x17/0x20
> [44546.189345] vprintk_func+0x36/0xb7
> [44546.189350] printk+0x13/0x15
> [44546.189356] __dynamic_pr_debug+0x46/0x70
> [44546.189363] ? __lock_acquire.isra.0+0xfe/0x4e0
> [44546.189369] kobject_put+0x7b/0x190
> [44546.189376] sysfs_slab_remove_workfn+0x30/0x40
> [44546.189382] process_one_work+0x1e4/0x3c0
> [44546.189388] worker_thread+0x14e/0x3b0
> [44546.189395] ? process_one_work+0x3c0/0x3c0
> [44546.189401] kthread+0xdb/0x110
> [44546.189407] ? process_one_work+0x3c0/0x3c0
> [44546.189414] ? kthread_create_on_node+0x20/0x20
> [44546.189419] ret_from_fork+0x2e/0x38
> [44546.189424] Modules linked in: via_camera videobuf2_dma_sg videobuf2_memops
> videobuf2_v4l2 videobuf2_common videodev mc evdev padlock_sha
> padlock_aes snd_pcm uhci_hcd via_cputemp ehci_pci hwmon_vid ehci_hcd
> snd_timer via_rng viafb snd rng_core usbcore soundcore serio_raw pcspkr
> i2c_viapro sg i2c_algo_bit acpi_cpufreq button ip_tables x_tables
> autofs4 sd_mod ata_generic fan
> [44546.189481] CR2: 00000000f3170000
> [44546.189481] ---[ end trace 5d021d89c9f5c08d ]---
> [44546.189481] EIP: hw_bitblt_1+0x240/0x310 [viafb]
> [44546.189481] Code: 08 80 fa 02 0f 84 d8 00 00 00 0f b6 55 ec c0 ea 03 0f b6
> d2 0f af ca 83 c1 03 c1 e9 02 74 17 81 c3 00 00 20 00 8d 74 26 00
> 90 <8b> 14 87 89 13 83 c0 01 39 c8 72 f4 8d 65 f4 31 c0 5b 5e 5f 5d c3
> [44546.189481] EAX: 00000994 EBX: f8600000 ECX: 000009a0 EDX: 00000000
> [44546.189481] ESI: 00000002 EDI: f316d9b0 EBP: eb98bc70 ESP: eb98bc50
> [44546.189481] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 EFLAGS: 00010083
> [44546.189481] CR0: 80050033 CR2: f3170000 CR3: 2b25b000 CR4: 00000690
> [44571.433760] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [p0:918]
> [44571.433768] Modules linked in: via_camera videobuf2_dma_sg videobuf2_memops
> videobuf2_v4l2 videobuf2_common videodev mc evdev padlock_sha
> padlock_aes snd_pcm uhci_hcd via_cputemp ehci_pci hwmon_vid ehci_hcd
> snd_timer via_rng viafb snd rng_core usbcore soundcore serio_raw pcspkr
> i2c_viapro sg i2c_algo_bit acpi_cpufreq button ip_tables x_tables
> autofs4 sd_mod ata_generic fan
> [44571.434034] CPU: 0 PID: 918 Comm: p0 Tainted: G D
> 5.5.0-rc2-genunix #1
> [44571.434042] Hardware name: /CN700-8237, BIOS 6.00 PG 11/13/2006
> [44571.434047] EIP: 0x437636
> [44571.434066] Code: 83 c4 10 8b 45 e4 c7 40 08 00 00 00 00 eb 6a 8b 45 e4 c7
> 40 0c 00 00 00 00 eb 42 8b 45 e4 8b 50 08 8b 45 e4 8b 40 0c 0f af
> c2 <8b> 55 e4 8b 7a 08 8b 55 e4 8b 72 0c 89 c2 c1 fa 1f 8b 4d e4 c1 e7
> [44571.434074] EAX: 003f2551 EBX: 0043a000 ECX: 9652a010 EDX: 00000079
> [44571.434082] ESI: 0079859a EDI: 00790000 EBP: 96529358 ESP: 96529330
> [44571.434151] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
> [44599.433696] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [htop:893]
> [44599.433765] Modules linked in: via_camera videobuf2_dma_sg videobuf2_memops
> videobuf2_v4l2 videobuf2_common videodev mc evdev padlock_sha
> padlock_aes snd_pcm uhci_hcd via_cputemp ehci_pci hwmon_vid ehci_hcd
> snd_timer via_rng viafb snd rng_core usbcore soundcore serio_raw pcspkr
> i2c_viapro sg i2c_algo_bit acpi_cpufreq button ip_tables x_tables
> autofs4 sd_mod ata_generic fan
> [44599.434032] CPU: 0 PID: 893 Comm: htop Tainted: G D L
> 5.5.0-rc2-genunix #1
> [44599.434040] Hardware name: /CN700-8237, BIOS 6.00 PG 11/13/2006
> [44599.434045] EIP: 0xb7f564a7
> [44599.434063] Code: 24 04 89 1a 89 6a 08 89 42 04 8b 44 24 0c 89 4a 18 89 42
> 0c 8b 44 24 10 89 42 10 8b 44 24 08 89 42 14 83 c4 18 89 d0 5b 5e
> 5f <5d> c2 04 00 8d 74 26 00 90 f7 c7 00 ff 00 00 74 10 c6 44 24 17 00
> [44599.434132] EAX: bfa3e8a0 EBX: b7f8dafc ECX: 00000000 EDX: bfa3e8a0
> [44599.434140] ESI: bfa4146c EDI: 000004b4 EBP: 00000000 ESP: bfa3e828
> [44599.434148] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000282
> esther#
>
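The oops registers are enough to reconstruct the faulting access. The
bytes marked <8b> 14 87 in the Code: line decode to
"mov (%edi,%eax,4),%edx", and the bytes after it store to (%ebx),
increment %eax and loop while %eax is below %ecx. With EDI = f316d9b0
and EAX = 0x994 the load address is exactly CR2 = f3170000, the first
byte of a new page, with 12 dwords still to go before the bound in ECX.
Note also that the call trace reaches hw_bitblt_1 through printk and the
framebuffer console, called from the debug message in kobject_put(), so
the faulting read is in the viafb blit path rather than in the slab code
itself. The sketch below just redoes the address arithmetic; the function
names are illustrative and are not kernel symbols.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 4096u

/* Effective address of "mov (%edi,%eax,4),%edx": base + index * 4. */
static uint32_t fault_address(uint32_t edi, uint32_t eax)
{
    return edi + eax * 4u;
}

/* True if addr is the first byte of a 4 KiB page. */
static int is_page_aligned(uint32_t addr)
{
    return (addr % PAGE_SIZE_4K) == 0;
}

/* Dwords the copy loop still had to read when it faulted:
 * the loop runs while the index (eax) is below the bound (ecx). */
static uint32_t dwords_remaining(uint32_t eax, uint32_t ecx)
{
    return ecx - eax;
}
```

With the values from the dump, fault_address(0xf316d9b0, 0x994) gives
0xf3170000 and dwords_remaining(0x994, 0x9a0) gives 12. Under
DEBUG_PAGEALLOC that pattern would be consistent with the blit reading
past the end of its source buffer onto an unmapped page, but that is an
inference from the dump, not something the thread confirms.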
> Not sure what other information to include however :
>
> esther#
> esther# cat /proc/version
> Linux version 5.5.0-rc2-genunix (root@esther) (gcc version 9.2.1 20191130
> (Debian 9.2.1-21)) #1 Tue Dec 17 01:57:17 UTC 2019
> esther#
> esther# cat /proc/cpuinfo
> processor : 0
> vendor_id : CentaurHauls
> cpu family : 6
> model : 10
> model name : VIA Esther processor 1200MHz
> stepping : 9
> cpu MHz : 400.000
> cache size : 128 KB
> fdiv_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 1
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge cmov pat
> clflush acpi mmx fxsr sse sse2 tm nx cpuid pni est tm2 rng rng_en ace ace_en
> ace2 ace2_en phe phe_en pmm pmm_en
> bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
> swapgs itlb_multihit
> bogomips : 800.02
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 32 bits virtual
> power management:
>
> esther#
> esther# cat /proc/meminfo
> MemTotal: 937412 kB
> MemFree: 70200 kB
> MemAvailable: 31728 kB
> Buffers: 11400 kB
> Cached: 43532 kB
> SwapCached: 55872 kB
> Active: 385352 kB
> Inactive: 400988 kB
> Active(anon): 352888 kB
> Inactive(anon): 379860 kB
> Active(file): 32464 kB
> Inactive(file): 21128 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> HighTotal: 76680 kB
> HighFree: 1552 kB
> LowTotal: 860732 kB
> LowFree: 68648 kB
> SwapTotal: 31250428 kB
> SwapFree: 29862396 kB
> Dirty: 16 kB
> Writeback: 0 kB
> AnonPages: 676316 kB
> Mapped: 16560 kB
> Shmem: 1340 kB
> KReclaimable: 13748 kB
> Slab: 54152 kB
> SReclaimable: 13748 kB
> SUnreclaim: 40404 kB
> KernelStack: 632 kB
> PageTables: 2932 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 31719132 kB
> Committed_AS: 2333036 kB
> VmallocTotal: 122880 kB
> VmallocUsed: 11532 kB
> VmallocChunk: 0 kB
> Percpu: 192 kB
> HardwareCorrupted: 0 kB
> AnonHugePages: 0 kB
> ShmemHugePages: 0 kB
> ShmemPmdMapped: 0 kB
> FileHugePages: 0 kB
> FilePmdMapped: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 4096 kB
> Hugetlb: 0 kB
> DirectMap4k: 905208 kB
> DirectMap4M: 0 kB
> esther#
> esther# swapon
> NAME TYPE SIZE USED PRIO
> /dev/sda2 partition 29.8G 1.3G -2
> esther#
>
> Also I will attach the kernel config from /boot for 5.5.0-rc2-genunix.
>
>
> --
> Dennis Clarke
> RISC-V/SPARC/PPC/ARM/CISC
> UNIX and Linux spoken
> GreyBeard and suspenders optional
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
* Re: [Bug 205937] New: BUG: unable to handle page fault for address: f3170000
2019-12-23 22:14 ` [Bug 205937] New: BUG: unable to handle page fault for address: f3170000 Andrew Morton
@ 2019-12-24 3:45 ` Dennis Clarke
2019-12-26 21:41 ` Christopher Lameter
1 sibling, 0 replies; 4+ messages in thread
From: Dennis Clarke @ 2019-12-24 3:45 UTC
To: Andrew Morton
Cc: bugzilla-daemon, penberg, Christopher Lameter, David Rientjes,
Joonsoo Kim, linux-mm, Qian Cai
On 12/23/19 5:14 PM, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> Thanks.
>
If someone can think of a specific code test I can run it on the
same hardware here. However I have switched over to 5.5-rc3 as a gift
for Christmas. :)
Thank you for the follow up and I will dig into what changed recently in
the asynchronous sysfs_slab_remove_workfn() because it isn't new and it
was working. Seems to have been around a few years now :
https://github.com/torvalds/linux/commit/3b7b314053d021601940c50b07f5f1423ae67e21
--
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional
* Re: [Bug 205937] New: BUG: unable to handle page fault for address: f3170000
2019-12-23 22:14 ` [Bug 205937] New: BUG: unable to handle page fault for address: f3170000 Andrew Morton
2019-12-24 3:45 ` Dennis Clarke
@ 2019-12-26 21:41 ` Christopher Lameter
2019-12-26 23:58 ` Roman Gushchin
1 sibling, 1 reply; 4+ messages in thread
From: Christopher Lameter @ 2019-12-26 21:41 UTC
To: Andrew Morton
Cc: dclarke, bugzilla-daemon, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm, Qian Cai, longman
On Mon, 23 Dec 2019, Andrew Morton wrote:
> Guys, did we make recent changes in this area?
Yes the cgroup folks did. f.e.
commit 04f768a39d55967246c002aa66b407b3bfdd8269
Author: Waiman Long <longman@redhat.com>
Date: Mon Sep 23 15:33:46 2019 -0700
mm, slab: extend slab/shrink to shrink all memcg caches
Currently, a value of '1' is written to /sys/kernel/slab/<slab>/shrink
file to shrink the slab by flushing out all the per-cpu slabs and free
slabs in partial lists. This can be useful to squeeze out a bit more
memory under extreme condition as well as making the active object
counts in /proc/slabinfo more accurate.
This usually applies only to the root caches, as the SLUB_ME
* Re: [Bug 205937] New: BUG: unable to handle page fault for address: f3170000
2019-12-26 21:41 ` Christopher Lameter
@ 2019-12-26 23:58 ` Roman Gushchin
0 siblings, 0 replies; 4+ messages in thread
From: Roman Gushchin @ 2019-12-26 23:58 UTC
To: Christopher Lameter
Cc: Andrew Morton, dclarke, bugzilla-daemon, Pekka Enberg,
David Rientjes, Joonsoo Kim, linux-mm, Qian Cai, longman
On Thu, Dec 26, 2019 at 09:41:30PM +0000, Christopher Lameter wrote:
> On Mon, 23 Dec 2019, Andrew Morton wrote:
>
> > Guys, did we make recent changes in this area?
>
> Yes the cgroup folks did. f.e.
>
> commit 04f768a39d55967246c002aa66b407b3bfdd8269
> Author: Waiman Long <longman@redhat.com>
> Date: Mon Sep 23 15:33:46 2019 -0700
>
> mm, slab: extend slab/shrink to shrink all memcg caches
I'd be surprised if the issue is caused by the use of this very
new interface. But it would be an easy thing...
>
> Currently, a value of '1' is written to /sys/kernel/slab/<slab>/shrink
> file to shrink the slab by flushing out all the per-cpu slabs and free
> slabs in partial lists. This can be useful to squeeze out a bit more
> memory under extreme condition as well as making the active object
> counts in /proc/slabinfo more accurate.
>
> This usually applies only to the root caches, as the SLUB_ME
>
Otherwise I bet on some race around s->kobj.state_in_sysfs.
It's interesting that it reproduces on a single-CPU i386 machine.
I really wonder what makes it different.
Dennis, can you please insert some debug printing into
sysfs_slab_remove_workfn()? I wonder whether it's really called twice,
and where exactly it panics.
Thanks!