From: yumkam@gmail.com (Yuriy M. Kaminskiy)
To: util-linux@vger.kernel.org
Subject: [RFC][PATCH alt v2] fsck -l: block concurrent fsck for any stacked devices (Re: [RFC][WIP][PATCH] fsck -l and lvm/dmcrypt/md/etc)
Date: Tue, 19 Apr 2016 18:11:46 +0300	[thread overview]
Message-ID: <m3inzdk4zx.fsf_-_@gmail.com> (raw)
In-Reply-To: m3twixk6c0.fsf@gmail.com


yumkam@gmail.com (Yuriy M. Kaminskiy) writes:

> Karel Zak <kzak@redhat.com> writes:
>
>> On Sat, Apr 09, 2016 at 12:40:18AM +0300, Yuriy M. Kaminskiy wrote:
>>> I think this algorithm should work:
>>> 
>>> 1) walk /dev/block/${dev}/slaves recursively (for a deep stack such
>>> as lvm-over-luks-over-lvm-over-md, looking only at first-level slaves
>>> won't work), and collect all whole-disk devices;
>>> 2) sort this list by device numbers (to avoid AB/BA deadlocks) and
>>> remove duplicates;
>>> 3) acquire all the locks in that order (a rough sketch follows).
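
For reference, a rough user-space sketch of steps 1-3 (this is not the
attached patch: the /run/fsck/<maj>:<min>.lock naming here is made up,
real fsck locks /run/fsck/<diskname>.lock, and bottom-level partitions
are not mapped back to their whole disk, which libblkid's
blkid_devno_to_wholedisk() could handle):

#include <sys/file.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static dev_t disks[64];
static size_t ndisks;

static void add_disk(dev_t dev)
{
	size_t i;

	for (i = 0; i < ndisks; i++)
		if (disks[i] == dev)
			return;			/* remove duplicates */
	if (ndisks < sizeof(disks) / sizeof(*disks))
		disks[ndisks++] = dev;
}

/* step 1: recurse through /sys/dev/block/<maj>:<min>/slaves;
 * a device with no slaves is the physical bottom of the stack */
static void collect(dev_t dev)
{
	char path[PATH_MAX];
	struct dirent *d;
	DIR *dir;
	int have_slaves = 0;

	snprintf(path, sizeof(path), "/sys/dev/block/%u:%u/slaves",
		 major(dev), minor(dev));
	dir = opendir(path);
	if (dir) {
		while ((d = readdir(dir)) != NULL) {
			struct stat st;

			if (d->d_name[0] == '.')
				continue;
			snprintf(path, sizeof(path), "/dev/%s", d->d_name);
			if (stat(path, &st) == 0 && S_ISBLK(st.st_mode)) {
				collect(st.st_rdev);
				have_slaves = 1;
			}
		}
		closedir(dir);
	}
	if (!have_slaves)
		add_disk(dev);
}

/* step 2: a single global order avoids AB/BA deadlocks */
static int cmp_dev(const void *a, const void *b)
{
	dev_t x = *(const dev_t *)a, y = *(const dev_t *)b;

	return x < y ? -1 : x > y ? 1 : 0;
}

/* step 3: take the locks, lowest device number first */
static void lock_all(dev_t dev)
{
	size_t i;

	collect(dev);
	qsort(disks, ndisks, sizeof(*disks), cmp_dev);

	for (i = 0; i < ndisks; i++) {
		char lockpath[PATH_MAX];
		int fd;

		snprintf(lockpath, sizeof(lockpath), "/run/fsck/%u:%u.lock",
			 major(disks[i]), minor(disks[i]));
		fd = open(lockpath, O_RDONLY | O_CREAT | O_CLOEXEC, 0644);
		if (fd < 0)
			continue;		/* best effort, as in fsck */
		if (flock(fd, LOCK_EX) != 0)
			close(fd);
		/* else keep fd open: the lock is held until exit */
	}
}

int main(int argc, char **argv)
{
	struct stat st;

	if (argc < 2 || stat(argv[1], &st) != 0 || !S_ISBLK(st.st_mode))
		return 1;
	lock_all(st.st_rdev);
	pause();	/* hold the locks so blocking can be observed */
	return 0;
}
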
>>> 
>>> There are some unavoidable flaws (e.g. if someone alters the device
>>> stack while fsck is running), and some things could be improved, but
>>> it should work in most cases (and if it fails, it is no worse than the
>>> current behavior).
>>> 
>>> I've tested with LVM and LUKS-over-LVM on (Debian's) 3.16 kernel, and
>>> it seems to work as expected.
>>> 
>>> What is not covered: /dev/loop* (and I have no plans for it).
>>> 
>>> Comments? NACKs?
>>
>> The question is whether we really need it :-)
>
> Yes?
>
>> All the care fsck takes about whole disks is about performance: we
>> assume (on classic rotating disks) that executing fsck on several
>> partitions at the same time is a bad idea.
>>
>> The problem with stacked devices is that we have no clue about the
>> sector mapping. Your patch locks all the disks, but maybe the kernel
>> will read from only a subset of them, etc.
>
> Is this a likely scenario? (Or even possible? I can imagine that, with
> lvm-over-raid1+, the kernel could theoretically schedule one fsck
> instance to read only from slave A and another to read only from slave
> B, thus avoiding seeks; but I suspect that in practice it will just
> round-robin all requests over both slaves, producing the same seek
> storm.)
>
>> From my point of view the best way is not to care about it in
>> userspace at all and depend on kernel I/O scheduling only. The
>
> No problem. Please set an in-kernel fsck-friendly I/O scheduler for
> fsck instead. (Oh, none exists?)
>
>> optimization probably still makes sense for the classic layout
>> "HDD->partition", but I don't think we need to extend this
>> functionality to other use-cases.
>
> `fsck -A` does not run fsck on any stacked devices in parallel;
> `fsck -l` does. This is inconsistent and unfair. An alternative
> (simpler) patch is attached (compile-tested only).

v2 (removed the path variable and snprintf; added a comment about unlock/unlink).
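
The locking scheme, reduced to a standalone illustration (only the file
name matches the patch; everything else is demo code, not fsck, and it
assumes /run/fsck exists and is writable): fsck of a plain whole disk
takes stacked.lock *shared*, so any number of them run concurrently,
while fsck of a stacked device takes it *exclusive* and so serializes
against all other instances.

/* demo of the stacked.lock protocol; run several copies to observe:
 * shared holders proceed together, an exclusive locker waits for all */
#include <sys/file.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	/* "x" argument = stacked device (LOCK_EX), else plain disk (LOCK_SH) */
	int op = (argc > 1 && argv[1][0] == 'x') ? LOCK_EX : LOCK_SH;
	int fd = open("/run/fsck/stacked.lock",
		      O_RDONLY | O_CREAT | O_CLOEXEC, 0644);

	if (fd < 0 || flock(fd, op) != 0) {	/* blocks until compatible */
		perror("stacked.lock");
		return 1;
	}
	printf("%s lock acquired, \"running fsck\"...\n",
	       op == LOCK_EX ? "exclusive" : "shared");
	sleep(10);
	close(fd);	/* drops the lock; the file itself must stay */
	return 0;
}

Running one shared and one exclusive instance side by side shows the
exclusive locker blocking until every shared holder exits, which is
exactly how a stacked-device fsck would wait for all whole-disk fscks
(and vice versa).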


[-- Attachment #2: fsck-l-disallow-parallel-fsck-for-stacked-devices.v2.patch --]

diff --git a/disk-utils/fsck.c b/disk-utils/fsck.c
index 84d2dcc..bd9114a 100644
--- a/disk-utils/fsck.c
+++ b/disk-utils/fsck.c
@@ -109,6 +109,7 @@ struct fsck_instance {
 
 	int	lock;		/* flock()ed lockpath file descriptor or -1 */
 	char	*lockpath;	/* /run/fsck/<diskname>.lock or NULL */
+	int	lock_stacked;	/* flock()ed stacked.lock file descriptor or -1 */
 
 	int	exit_status;
 	struct timeval start_time;
@@ -365,6 +366,21 @@ static void lock_disk(struct fsck_instance *inst)
 	if (!diskname)
 		diskname = diskpath;
 
+	/*
+	 * Note: stacked.lock may be share-locked by others and MUST NOT be
+	 * unlinked on unlock (a re-created file is a different inode).
+	 */
+	inst->lock_stacked = open(FSCK_RUNTIME_DIRNAME "/stacked.lock",
+				    O_RDONLY|O_CREAT|O_CLOEXEC,
+				    S_IWUSR|S_IRUSR|S_IRGRP|S_IROTH);
+	if (inst->lock_stacked >= 0) {
+		if (flock(inst->lock_stacked,
+			fs_is_stacked(inst->fs) ? LOCK_EX : LOCK_SH) != 0) {
+			close(inst->lock_stacked);	/* failed */
+			inst->lock_stacked = -1;
+		}
+	}
+
 	xasprintf(&inst->lockpath, FSCK_RUNTIME_DIRNAME "/%s.lock", diskname);
 
 	if (verbose)
@@ -410,10 +426,12 @@ static void unlock_disk(struct fsck_instance *inst)
 		printf(_("Unlocking %s.\n"), inst->lockpath);
 
 	close(inst->lock);			/* unlock */
+	close(inst->lock_stacked);		/* unlock */
 
 	free(inst->lockpath);
 
 	inst->lock = -1;
+	inst->lock_stacked = -1;
 	inst->lockpath = NULL;
 }
 
