From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dan Merillat
Subject: Re: bcache fails after reboot if discard is enabled
Date: Sun, 12 Apr 2015 01:56:56 -0400
Message-ID: 
References: <54A66945.6030403@profihost.ag> <54A66C44.6070505@profihost.ag>
 <54A819A0.9010501@rolffokkens.nl> <54A843BC.608@profihost.ag>
 <55257303.8020008@profihost.ag> <3ldgvb-het.ln1@hurikhan77.spdns.de>
 <5a1mvb-6k.ln1@hurikhan77.spdns.de> <3k5mvb-c14.ln1@hurikhan77.spdns.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Return-path: 
Received: from mail-ie0-f178.google.com ([209.85.223.178]:32916 "EHLO
 mail-ie0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
 ESMTP id S1750720AbbDLF45 (ORCPT ); Sun, 12 Apr 2015 01:56:57 -0400
Received: by iebmp1 with SMTP id mp1so44321207ieb.0 for ; Sat, 11 Apr 2015
 22:56:56 -0700 (PDT)
In-Reply-To: 
Sender: linux-bcache-owner@vger.kernel.org
List-Id: linux-bcache@vger.kernel.org
To: Kai Krakow
Cc: linux-bcache@vger.kernel.org

On Sat, Apr 11, 2015 at 4:09 PM, Kai Krakow wrote:

> With this knowledge, I guess that bcache could probably detect its backing
> device signature twice - once through the underlying raw device and once
> through the md device. From your logs I'm not sure if they were complete

It doesn't; the system is smarter than you think it is.

> enough to see that case. But to be sure I'd modify the udev rules to
> exclude the md parent devices from being run through probe-bcache.
> Otherwise all sorts of strange things may happen (like one process
> accessing the backing device through md while bcache accesses it through
> the parent device - probably even on different mirror stripes).

This didn't occur; I copied all the lines pertaining to bcache and skipped
only the superfluous ones.

> It's your setup, but personally I'd avoid MD for that reason and go with
> lvm. MD is just not modern, nor appropriate for modern system setups. It
> should really just be there for legacy setups and migration paths.

That's not related to bcache at all; perhaps complain about MD on the
appropriate list? I'm not seeing any evidence that MD had anything to do
with this, especially since the bcache issues are entirely confined to the
direct SATA access to /dev/sda4.

In that vein, I'm reading through the on-disk format of bcache to see
exactly what's still valid on my system. It looks like I've got 65,000
good buckets before the first bad one. My idea is to go through the
buckets looking for valid data, then use a copy-on-write (COW) overlay in
User-Mode Linux to write that data back to a COW copy of the backing
device: anything that passes its checksum and is still flagged dirty gets
force-written out. Then I can see what state the backing store is in, and
if it works, repeat the process outside UML against the real backing
store.

Are there any diagnostic tools outside the bcache-tools repo? There's not
much there beyond showing the superblock info. Otherwise I'll just finish
writing it myself; rough sketches of what I have in mind are below.
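Here's where I'd start: read the superblock at byte offset 4096, then walk
buckets until one fails validation. The cache_sb layout below is trimmed
from bcache-tools' bcache.h (double-check it against your tree), and
bucket_looks_valid() is a deliberate stub - real validation means finding
and checksumming the btree node / journal entries in a bucket, not hashing
its raw contents.

/* bucket-scan.c - first pass at a bcache bucket scanner.  Struct layout
 * trimmed from bcache-tools' bcache.h; verify before trusting it. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define SB_OFFSET 4096                  /* SB_SECTOR (8) * 512 */

static const uint8_t bcache_magic[16] = {
        0xc6, 0x85, 0x73, 0xf6, 0x4e, 0x1a, 0x45, 0xca,
        0x82, 0x65, 0xf5, 0x7f, 0x48, 0xba, 0x6d, 0x81,
};

struct cache_sb {                       /* trimmed to the fields used here */
        uint64_t        csum;
        uint64_t        offset;
        uint64_t        version;
        uint8_t         magic[16];
        uint8_t         uuid[16];
        uint8_t         set_uuid[16];
        uint8_t         label[32];
        uint64_t        flags;
        uint64_t        seq;
        uint64_t        pad[8];
        uint64_t        nbuckets;
        uint16_t        block_size;     /* in 512-byte sectors */
        uint16_t        bucket_size;    /* in 512-byte sectors */
        uint16_t        nr_in_set;
        uint16_t        nr_this_dev;
        uint32_t        last_mount;
        uint16_t        first_bucket;
        uint16_t        keys;
};

/* Stub: real validation means parsing btree/journal checksums. */
static int bucket_looks_valid(const unsigned char *buf, size_t len)
{
        (void)buf; (void)len;
        return 1;
}

int main(int argc, char **argv)
{
        struct cache_sb sb;
        int fd;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <cache device>\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (pread(fd, &sb, sizeof(sb), SB_OFFSET) != sizeof(sb) ||
            memcmp(sb.magic, bcache_magic, sizeof(bcache_magic))) {
                fprintf(stderr, "no bcache superblock on %s\n", argv[1]);
                return 1;
        }

        size_t bucket_bytes = (size_t)sb.bucket_size * 512;
        unsigned char *buf = malloc(bucket_bytes);

        if (!buf)
                return 1;
        printf("sb version %llu, %llu buckets of %zu bytes, first bucket %u\n",
               (unsigned long long)sb.version,
               (unsigned long long)sb.nbuckets, bucket_bytes,
               (unsigned)sb.first_bucket);

        for (uint64_t b = sb.first_bucket; b < sb.nbuckets; b++) {
                if (pread(fd, buf, bucket_bytes, (off_t)(b * bucket_bytes))
                        != (ssize_t)bucket_bytes ||
                    !bucket_looks_valid(buf, bucket_bytes)) {
                        printf("first bad bucket: %llu\n",
                               (unsigned long long)b);
                        break;
                }
        }
        free(buf);
        close(fd);
        return 0;
}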
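The write-out half would look something like the following. The extent
list here is hypothetical - the real one has to come from walking the
btree and keeping only the keys that pass checksum and are marked dirty,
which is the actual work and isn't shown. Under UML the COW is handled by
the ubd driver (ubd0=backing.cow,backing.img), so the tool just writes to
what it sees as an ordinary block device; the device paths in main() are
made up for illustration.

/* flush-dirty.c - copy known-good dirty extents from the cache device
 * to the (COW'd) backing device. */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

struct dirty_extent {
        off_t   cache_off;      /* where the data sits on the cache dev */
        off_t   backing_off;    /* where it belongs on the backing dev */
        size_t  len;
};

static int flush_extent(int cache_fd, int backing_fd,
                        const struct dirty_extent *e)
{
        char buf[1 << 16];
        size_t done = 0;

        while (done < e->len) {
                size_t n = e->len - done;

                if (n > sizeof(buf))
                        n = sizeof(buf);
                if (pread(cache_fd, buf, n, e->cache_off + done)
                        != (ssize_t)n)
                        return -1;
                if (pwrite(backing_fd, buf, n, e->backing_off + done)
                        != (ssize_t)n)
                        return -1;
                done += n;
        }
        return 0;
}

int main(void)
{
        /* made-up example extent: 4k of dirty data */
        struct dirty_extent extents[] = {
                { (off_t)65000 * 512 * 1024, (off_t)12345 * 4096, 4096 },
        };
        int cache_fd   = open("/dev/sda4", O_RDONLY);
        int backing_fd = open("/dev/ubdb", O_WRONLY); /* COW'd dev in UML */

        if (cache_fd < 0 || backing_fd < 0) {
                perror("open");
                return 1;
        }
        for (size_t i = 0; i < sizeof(extents) / sizeof(extents[0]); i++)
                if (flush_extent(cache_fd, backing_fd, &extents[i]))
                        perror("flush_extent");
        return 0;
}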