Date: Mon, 17 Jun 2013 14:35:08 -0700
From: Andrew Morton
To: Glauber Costa
Cc: Michal Hocko, Dave Chinner, linux-mm@kvack.org, LKML
Subject: Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92
Message-Id: <20130617143508.7417f1ac9ecd15d8b2877f76@linux-foundation.org>
In-Reply-To: <20130617151403.GA25172@localhost.localdomain>
References: <20130617141822.GF5018@dhcp22.suse.cz> <20130617151403.GA25172@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 17 Jun 2013 19:14:12 +0400 Glauber Costa wrote:

> > I managed to trigger:
> > [ 1015.776029] kernel BUG at mm/list_lru.c:92!
> > [ 1015.776029] invalid opcode: 0000 [#1] SMP
> > with Linux next (next-20130607) with https://lkml.org/lkml/2013/6/17/203
> > on top.
> >
> > This is obviously BUG_ON(nlru->nr_items < 0) and
> > ffffffff81122d0b:  48 85 c0              test   %rax,%rax
> > ffffffff81122d0e:  49 89 44 24 18        mov    %rax,0x18(%r12)
> > ffffffff81122d13:  0f 84 87 00 00 00     je     ffffffff81122da0
> > ffffffff81122d19:  49 83 7c 24 18 00     cmpq   $0x0,0x18(%r12)
> > ffffffff81122d1f:  78 7b                 js     ffffffff81122d9c
> > [...]
> > ffffffff81122d9c:  0f 0b                 ud2
> >
> > RAX is -1UL.
> Yes, fearing that kind of imbalance, we decided to leave the counter as a signed quantity
> and BUG, instead of an unsigned quantity.
>
> >
> > I assume that the current backtrace is of no use and it would most
> > probably be some shrinker which doesn't behave.
> >
> There are currently 3 users of list_lru in tree: dentries, inodes and xfs.
> Assuming you are not using xfs, we are left with dentries and inodes.
>
> The first thing to do is to find which one of them is misbehaving. You can try finding
> this out by the address of the list_lru, and where it lies in the superblock.
>
> Once we know which of them is misbehaving, then we'll have to figure out why.

The trace says shrink_slab_node->super_cache_scan->prune_icache_sb. So
it's inodes?
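
Two points from the thread can be sketched in plain userspace C: why a signed counter makes an unbalanced delete detectable (an unsigned one silently wraps), and how an address can be matched against the position of the two LRUs inside a superblock. This is only an illustration under stated assumptions, not the kernel's code: every struct, field layout, and function name below (mock_list_lru, mock_super_block, mock_lru_del_underflows, mock_unsigned_del, mock_classify_lru) is a hypothetical stand-in, and where the kernel would BUG the sketch simply returns a flag.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LRU with a signed item count. */
struct mock_list_lru {
    long nr_items;
};

/* Hypothetical stand-in for a superblock embedding two LRUs. */
struct mock_super_block {
    int s_flags;
    struct mock_list_lru s_dentry_lru;
    struct mock_list_lru s_inode_lru;
};

enum mock_lru_owner { MOCK_LRU_UNKNOWN, MOCK_LRU_DENTRY, MOCK_LRU_INODE };

/*
 * Decrement the signed count and report whether it went negative.
 * The check in the thread would BUG at this point; here we only
 * return true so the caller can observe the imbalance.
 */
bool mock_lru_del_underflows(struct mock_list_lru *lru)
{
    assert(lru != NULL);
    lru->nr_items--;
    return lru->nr_items < 0;
}

/*
 * The same decrement on an unsigned count: 0 - 1 wraps to ULONG_MAX,
 * so a "< 0" test could never catch the imbalance.
 */
unsigned long mock_unsigned_del(unsigned long nr_items)
{
    return nr_items - 1;
}

/*
 * Decide which embedded LRU an address belongs to by comparing its
 * offset from the start of the superblock with each member's range.
 */
enum mock_lru_owner mock_classify_lru(const struct mock_super_block *sb,
                                      const void *addr)
{
    uintptr_t off = (uintptr_t)addr - (uintptr_t)sb;
    size_t d = offsetof(struct mock_super_block, s_dentry_lru);
    size_t i = offsetof(struct mock_super_block, s_inode_lru);

    assert(sb != NULL);
    if (off >= d && off < d + sizeof(sb->s_dentry_lru))
        return MOCK_LRU_DENTRY;
    if (off >= i && off < i + sizeof(sb->s_inode_lru))
        return MOCK_LRU_INODE;
    return MOCK_LRU_UNKNOWN;
}
```

In this sketch, deleting from an empty mock LRU leaves nr_items at -1, which the signed comparison catches, while the unsigned variant just yields a very large count. The range check (rather than an exact offset match) is a deliberate choice so that a pointer into the middle of an LRU, such as a per-node sub-structure, would still be attributed to the right member.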