From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1757399Ab3AIIYk (ORCPT ); Wed, 9 Jan 2013 03:24:40 -0500
Received: from relay.parallels.com ([195.214.232.42]:33766 "EHLO
 relay.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1757273Ab3AIIYi (ORCPT ); Wed, 9 Jan 2013 03:24:38 -0500
Message-ID: <50ED293D.9050605@parallels.com>
Date: Wed, 9 Jan 2013 12:24:29 +0400
From: Stanislav Kinsbursky
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: Sasha Levin
CC: Andrew Morton , , , , , , , , , , , , , , , , , , , , Wu Fengguang
Subject: Re: [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements
References: <20121024151555.5642.79086.stgit@localhost.localdomain>
 <20121218123601.113a29c0.akpm@linux-foundation.org>
 <50D28EC8.7000708@parallels.com>
 <20121220124751.d7ccbd8e.akpm@linux-foundation.org>
 <50D4CA90.60205@parallels.com>
 <50D4DB5D.9020309@oracle.com>
 <50D5D50B.8090309@oracle.com>
In-Reply-To: <50D5D50B.8090309@oracle.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 8bit
X-Originating-IP: [10.30.18.163]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

22.12.2012 19:43, Sasha Levin wrote:
> On 12/21/2012 04:57 PM, Sasha Levin wrote:
>> On 12/21/2012 03:46 PM, Stanislav Kinsbursky wrote:
>>> 21.12.2012 00:47, Andrew Morton wrote:
>>>> On Thu, 20 Dec 2012 08:06:32 +0400
>>>> Stanislav Kinsbursky wrote:
>>>>
>>>>> 19.12.2012 00:36, Andrew Morton wrote:
>>>>>> On Wed, 24 Oct 2012 19:34:51 +0400
>>>>>> Stanislav Kinsbursky wrote:
>>>>>>
>>>>>>> This respin of the patch set was significantly reworked. Most of the new API
>>>>>>> was replaced by sysctls (one each for messages, semaphores and shared memory),
>>>>>>> allowing the desired id to be preset for the next new IPC object.
>>>>>>>
>>>>>>> This patch set aims to provide additional functionality for all IPC
>>>>>>> objects, which is required for migration of these objects by user-space
>>>>>>> checkpoint/restore utils (CRIU).
>>>>>>>
>>>>>>> The main problem here was the impossibility of setting the object id. This
>>>>>>> patch set solves the problem by adding new sysctls to preset the desired id
>>>>>>> for a new IPC object.
>>>>>>>
>>>>>>> Another problem was peeking at messages in queues without deleting them.
>>>>>>> This was achieved by introducing a new MSG_COPY flag for sys_msgrcv(). If
>>>>>>> the MSG_COPY flag is set, then msgtyp is interpreted as the message number.
>>>>>> According to my extensive records, Sasha hit a bug in
>>>>>> ipc-message-queue-copy-feature-introduced.patch and Fengguang found a
>>>>>> bug in
>>>>>> ipc-message-queue-copy-feature-introduced-cleanup-do_msgrcv-aroung-msg_copy-feature.patch
>>>>>>
>>>>>> It's not obvious (to me) that these things have been identified and
>>>>>> fixed. What's the status, please?
>>>>> Hello, Andrew.
>>>>> Fengguang's issue was solved by "ipc: simplify message copying" I sent you.
>>>>> But I can't find Sasha's issue. As I remember, there was some problem in
>>>>> an early version of the patch set. But I believe it's fixed now.
>>>> http://lkml.indiana.edu/hypermail/linux/kernel/1210.3/01710.html
>>>>
>>>> Subject: "ipc, msgqueue: NULL ptr deref in msgrcv"
>>>
>>> Ah, yes. Thanks.
>>> He found it in the initial version of the code, which was significantly changed
>>> (or cleaned and simplified) by the further patch series.
>>> And I can't figure out how this can happen, because the patch he bisected to
>>> does not modify the queue itself, while he found the problem in testmsg.
>>
>> I actually can't reproduce it on the latest -next.
>>
>> I was reverting the IPC changes in the past couple of weeks so that I could test the
>> rest of the IPC code with the fuzzer, and when I added them back in again I can't
>> reproduce the issue I've reported earlier.
>>
>> We can probably figure out where it got fixed by bisecting between -next trees if anyone
>> is interested in that.
>
> Ignore that. It just took more fuzzing to stumble on it again:
>

Hello, Sasha!
Thanks!
But I still can't understand how this can happen, and I can't reproduce it.
Could you describe your load, i.e. how do you trigger this panic?
It looks like you are not using the new "copy" feature.

> [ 103.164594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [ 103.168159] IP: [] do_msgrcv+0x205/0x540
> [ 103.170031] PGD c7cd067 PUD d274067 PMD 0
> [ 103.170031] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [ 103.170031] Dumping ftrace buffer:
> [ 103.170031] (ftrace buffer empty)
> [ 103.170031] CPU 4
> [ 103.170031] Pid: 7056, comm: trinity Tainted: G W 3.7.0-next-20121221-sasha-00014-g339890c #229
> [ 103.170031] RIP: 0010:[] [] do_msgrcv+0x205/0x540
> [ 103.170031] RSP: 0018:ffff88000c7cfe88 EFLAGS: 00010246
> [ 103.170031] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 103.170031] RDX: ffff880013681f00 RSI: 0000000000000124 RDI: ffff8800075a7810
> [ 103.170031] RBP: ffff88000c7cff68 R08: 0000000000000000 R09: 0000000000000000
> [ 103.170031] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000002
> [ 103.170031] R13: ffff8800075a78c0 R14: 7fffffff00000000 R15: ffff8800075a7810
> [ 103.170031] FS: 00007ffa529ae700(0000) GS:ffff880013c00000(0000) knlGS:0000000000000000
> [ 103.170031] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 103.170031] CR2: 0000000000000010 CR3: 000000000c7cc000 CR4: 00000000000406e0
> [ 103.170031] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 103.170031] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 103.170031] Process trinity (pid: 7056, threadinfo ffff88000c7ce000, task ffff88000c020000)
> [ 103.170031] Stack:
> [ 103.170031] ffff88000c7cfea8 ffff88000c020000 ffff88000c020000 ffff88000c020000
> [ 103.170031] 0000000000000000 ffffffff81935e50 0000000000000008 0000000000000000
> [ 103.170031] ffffffff858e91e0 0000000000000000 0000000000001001 ffff88000c020000
> [ 103.170031] Call Trace:
> [ 103.170031] [] ? load_msg+0x170/0x170
> [ 103.170031] [] ? syscall_trace_enter+0x24/0x2e0
> [ 103.170031] [] ? trace_hardirqs_on_caller+0x118/0x140
> [ 103.170031] [] sys_msgrcv+0x10/0x20
> [ 103.170031] [] tracesys+0xe1/0xe6
> [ 103.170031] Code: 80 f5 ff ff ff 90 41 83 fc 03 74 32 41 83 fc 04 74 0c 41 83 fc 02 75 2c eb 11 0f 1f 40 00 4c 3b 73 10 7d 20
> 66 90 e9 94 00 00 00 <4c> 39 73 10 0f 85 8a 00 00 00 90 eb 0c 66 0f 1f 44 00 00 4c 39
> [ 103.170031] RIP [] do_msgrcv+0x205/0x540
> [ 103.170031] RSP
> [ 103.170031] CR2: 0000000000000010
> [ 103.228270] ---[ end trace ddc37199fdad82b0 ]---
>
>
> Thanks,
> Sasha
>

-- 
Best regards,
Stanislav Kinsbursky