From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753544Ab3FRHmT (ORCPT ); Tue, 18 Jun 2013 03:42:19 -0400
Received: from cantor2.suse.de ([195.135.220.15]:49707 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752186Ab3FRHmR (ORCPT ); Tue, 18 Jun 2013 03:42:17 -0400
Date: Tue, 18 Jun 2013 09:42:15 +0200
From: Michal Hocko
To: Glauber Costa
Cc: Dave Chinner, Andrew Morton, linux-mm@kvack.org, LKML
Subject: Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92
Message-ID: <20130618074215.GA13677@dhcp22.suse.cz>
References: <20130617141822.GF5018@dhcp22.suse.cz>
	<20130617151403.GA25172@localhost.localdomain>
	<20130617153302.GI5018@dhcp22.suse.cz>
	<20130617165409.GA10764@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130617165409.GA10764@localhost.localdomain>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 17-06-13 20:54:10, Glauber Costa wrote:
> On Mon, Jun 17, 2013 at 05:33:02PM +0200, Michal Hocko wrote:
[...]
> > I have seen some other traces as well (mentioning ext3 dput paths) but I
> > cannot reproduce them anymore.
> >
>
> Do you have those traces? If there is a bug in the ext3 dput, then it is
> most likely the culprit. dput() is when we insert things into the LRU. So
> if we are not fully inserting an element that we should have - and later
> on try to remove it, we'll go negative.
>
> Can we see those traces?

Unfortunately I don't, because the machine where I saw those didn't have
a serial console and the traces were scrolling like crazy. Anyway, I am
working on reproducing this. Linux next is hard to debug due to
unrelated crashes, so I am still using my -mm git tree.

Anyway, I was able to reproduce one of those hangs, which smells like the
same/similar issue:

 4659 pts/0    S+     0:00 /bin/sh ./run_batch.sh mmotm
 4661 pts/0    S+     0:00 /bin/bash ./start.sh
 4666 pts/0    S+     5:08 /bin/bash ./start.sh
18294 pts/0    S+     0:00 sleep 1s
 4682 pts/0    S+     0:00 /bin/bash ./run_test.sh /dev/cgroup B 2
 4683 pts/0    S+     5:16 /bin/bash ./run_test.sh /dev/cgroup B 2
18293 pts/0    S+     0:00 sleep 1s
 8509 pts/0    S+     0:00 /usr/bin/time -v make -j4 vmlinux
 8510 pts/0    S+     0:00 make -j4 vmlinux
11730 pts/0    S+     0:00 make -f scripts/Makefile.build obj=drivers
13135 pts/0    S+     0:00 make -f scripts/Makefile.build obj=drivers/net
13415 pts/0    S+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless
13657 pts/0    S+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtl818x
13665 pts/0    D+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtl818x/rtl8180
13737 pts/0    S+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi
13754 pts/0    D+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi/rtl8192de
13917 pts/0    D+     0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi/rtl8192se

demon:/home/mhocko # cat /proc/13917/stack
[] path_lookupat+0x792/0x830
[] filename_lookup+0x33/0xd0
[] user_path_at_empty+0x7b/0xb0
[] user_path_at+0xc/0x10
[] vfs_fstatat+0x51/0xb0
[] vfs_stat+0x16/0x20
[] sys_newstat+0x1f/0x50
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff

demon:/home/mhocko # cat /proc/13754/stack
[] __wait_on_freeing_inode+0x9e/0xc0
[] find_inode_fast+0xa1/0xc0
[] iget_locked+0x4f/0x180
[] ext4_iget+0x33/0x9f0
[] ext4_lookup+0xbc/0x160
[] lookup_real+0x20/0x60
[] __lookup_hash+0x34/0x40
[] path_lookupat+0x7a2/0x830
[] filename_lookup+0x33/0xd0
[] user_path_at_empty+0x7b/0xb0
[] user_path_at+0xc/0x10
[] vfs_fstatat+0x51/0xb0
[] vfs_stat+0x16/0x20
[] sys_newstat+0x1f/0x50
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff

demon:/home/mhocko # cat /proc/13665/stack
[] __wait_on_freeing_inode+0x9e/0xc0
[] find_inode_fast+0xa1/0xc0
[] iget_locked+0x4f/0x180
[] ext4_iget+0x33/0x9f0
[] ext4_lookup+0xbc/0x160
[] lookup_real+0x20/0x60
[] lookup_open+0x175/0x1d0
[] do_last+0x2de/0x780
[] path_openat+0xda/0x400
[] do_filp_open+0x43/0xa0
[] do_sys_open+0x160/0x1e0
[] sys_open+0x1c/0x20
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff

Sysrq+l doesn't show only idle CPUs.

Ext4 is showing in the traces because of CONFIG_EXT4_USE_FOR_EXT23=y.
-- 
Michal Hocko
SUSE Labs
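The manual steps in the message (spot the tasks in D state in the ps listing, then cat /proc/PID/stack for each) can be scripted when a hang has to be captured repeatedly. The following is only a minimal sketch and not something from the thread: the helper names d_state_pids and read_kernel_stack are hypothetical, the parser assumes the five-column "PID TTY STAT TIME COMMAND" layout shown above, and reading /proc/PID/stack normally requires root and a kernel built with stack trace support.

```python
from pathlib import Path


def d_state_pids(ps_text):
    """Return PIDs whose STAT column starts with 'D' (uninterruptible sleep).

    Assumes each line looks like: PID TTY STAT TIME COMMAND...
    """
    pids = []
    for line in ps_text.splitlines():
        fields = line.split(None, 4)
        # Skip header lines, blank lines, and anything without a numeric PID.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if fields[2].startswith("D"):
            pids.append(int(fields[0]))
    return pids


def read_kernel_stack(pid, proc_root="/proc"):
    """Return the text of <proc_root>/<pid>/stack, or a marker if unreadable."""
    path = Path(proc_root) / str(pid) / "stack"
    try:
        return path.read_text()
    except OSError as err:
        return f"<unreadable: {err}>"


# A few lines copied from the ps listing in the message.
SAMPLE = """\
13657 pts/0 S+ 0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtl818x
13665 pts/0 D+ 0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtl818x/rtl8180
13737 pts/0 S+ 0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi
13754 pts/0 D+ 0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi/rtl8192de
13917 pts/0 D+ 0:00 make -f scripts/Makefile.build obj=drivers/net/wireless/rtlwifi/rtl8192se
"""

if __name__ == "__main__":
    # On a live system, feed real ps output here and call read_kernel_stack()
    # for each returned PID instead of using the canned sample.
    print(d_state_pids(SAMPLE))
```

On the sample lines this selects the same three tasks whose stacks are shown in the message (13665, 13754 and 13917).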