From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f179.google.com ([209.85.214.179]:36171 "EHLO mail-ob0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752129AbcBOWcz (ORCPT ); Mon, 15 Feb 2016 17:32:55 -0500 Received: by mail-ob0-f179.google.com with SMTP id gc3so130993373obb.3 for ; Mon, 15 Feb 2016 14:32:55 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <20160215184554.GY17997@ZenIV.linux.org.uk> References: <20160212042757.GP17997@ZenIV.linux.org.uk> <20160213174738.GR17997@ZenIV.linux.org.uk> <20160214025615.GU17997@ZenIV.linux.org.uk> <20160214234312.GX17997@ZenIV.linux.org.uk> <20160215184554.GY17997@ZenIV.linux.org.uk> Date: Mon, 15 Feb 2016 17:32:54 -0500 Message-ID: Subject: Re: Orangefs ABI documentation From: Martin Brandenburg To: Al Viro Cc: Mike Marshall , Linus Torvalds , linux-fsdevel , Stephen Rothwell Content-Type: text/plain; charset=UTF-8 Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On 2/15/16, Al Viro wrote: > On Mon, Feb 15, 2016 at 12:46:51PM -0500, Mike Marshall wrote: >> I pushed the list_del up to the kernel.org for-next branch... >> >> And I've been running tests with the CRUDE bandaid... weird >> results... >> >> No oopses, no WARN_ONs... I was running dbench and ls -R >> or find and kill-minus-nining different ones of them with no >> perceived resulting problems, so I moved on to signalling >> the client-core to abort... it restarted numerous times, >> and then stuff wedged up differently than I've seen before. > > There are other problems with that thing (starting with the fact that > retrying readdir/wait_for_direct_io can try to grab a slot despite the > bufmap winding down). OK, at that point I think we should try to see > if bufmap rewrite works - I've rebased on top of your branch and pushed > (head at 8c3bc9a). Bufmap rewrite is really completely untested - > it's done pretty much blindly and I'd be surprised as hell if it has no > brainos at the first try. > There's at least one major issue aside from a small typo. Something that used a slot, such as reader, would call service_operation while holding a bufmap. Then the client-core would crash, and the kernel would get run_down waiting on the slots to be given up. But the slots are not given up until someone wakes all the processes waiting in service_operation up, which happens after all the slots are given up. Then client-core hangs until someone sends a deadly signal to all the processes waiting in service_operation or presumably the timeout expires. This splits finalize and run_down so that orangefs_devreq_release can mark the slot map as killed, then purge waiting ops, then wait for all the slots to be released. Meanwhile, processes which were waiting will get into orangefs_bufmap_get which will see that the slot map is shutting down and wait for the client-core to come back. This is all at https://www.github.com/martinbrandenburg/linux.git branch slots. -- Martin diff --git a/fs/orangefs/devorangefs-req.c b/fs/orangefs/devorangefs-req.c index d96bcf10..b27ed1c 100644 --- a/fs/orangefs/devorangefs-req.c +++ b/fs/orangefs/devorangefs-req.c @@ -513,6 +513,9 @@ static int orangefs_devreq_release(struct inode *inode, struct file *file) * them as purged and wake them up */ purge_inprogress_ops(); + + orangefs_bufmap_run_down(); + gossip_debug(GOSSIP_DEV_DEBUG, "pvfs2-client-core: device close complete\n"); open_access_count = 0; diff --git a/fs/orangefs/orangefs-bufmap.c b/fs/orangefs/orangefs-bufmap.c index 3c6e07c..c544710 100644 --- a/fs/orangefs/orangefs-bufmap.c +++ b/fs/orangefs/orangefs-bufmap.c @@ -20,7 +20,7 @@ static struct slot_map rw_map = { }; static struct slot_map readdir_map = { .c = -1, - .q = __WAIT_QUEUE_HEAD_INITIALIZER(rw_map.q) + .q = __WAIT_QUEUE_HEAD_INITIALIZER(readdir_map.q) }; @@ -430,6 +430,15 @@ void orangefs_bufmap_finalize(void) gossip_debug(GOSSIP_BUFMAP_DEBUG, "orangefs_bufmap_finalize: called\n"); mark_killed(&rw_map); mark_killed(&readdir_map); + gossip_debug(GOSSIP_BUFMAP_DEBUG, + "orangefs_bufmap_finalize: exiting normally\n"); +} + +void orangefs_bufmap_run_down(void) +{ + struct orangefs_bufmap *bufmap = __orangefs_bufmap; + if (!bufmap) + return; run_down(&rw_map); run_down(&readdir_map); spin_lock(&orangefs_bufmap_lock); @@ -437,8 +446,6 @@ void orangefs_bufmap_finalize(void) spin_unlock(&orangefs_bufmap_lock); orangefs_bufmap_unmap(bufmap); orangefs_bufmap_free(bufmap); - gossip_debug(GOSSIP_BUFMAP_DEBUG, - "orangefs_bufmap_finalize: exiting normally\n"); } /* diff --git a/fs/orangefs/orangefs-bufmap.h b/fs/orangefs/orangefs-bufmap.h index ad8d82a..0be62be 100644 --- a/fs/orangefs/orangefs-bufmap.h +++ b/fs/orangefs/orangefs-bufmap.h @@ -17,6 +17,8 @@ int orangefs_bufmap_initialize(struct ORANGEFS_dev_map_desc *user_desc); void orangefs_bufmap_finalize(void); +void orangefs_bufmap_run_down(void); + int orangefs_bufmap_get(struct orangefs_bufmap **mapp, int *buffer_index); void orangefs_bufmap_put(int buffer_index);