5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for

* 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
@ 2020-03-21 20:23 Marc MERLIN
  2020-03-21 21:25 ` Nikolay Borisov
  2020-04-21  7:21 ` [PATCH] btrfs: boilerplate: devlist and fsinfo Anand Jain
  0 siblings, 2 replies; 15+ messages in thread
From: Marc MERLIN @ 2020-03-21 20:23 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

/dev/sde blipped off the bus (hardware issue?) and came
back as /dev/sdq.
Except btrfs won't let me scan or mount it.

I was able to btrfs check it though and that came back clean.

gargamel:~# ls -l /dev/sde
ls: cannot access '/dev/sde': No such file or directory

gargamel:~# mount /dev/sdq1 /mnt/mnt
mount: /mnt/mnt: mount(2) system call failed: File exists.
gargamel:~# dmesg |tail -1
[2560371.195249] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1

gargamel:~# btrfs device scan
Scanning for Btrfs filesystems
ERROR: device scan failed on '/dev/sdq1': File exists
ERROR: there are 1 errors while registering devices
gargamel:~# dmesg |tail -1
[2560416.434529] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1

gargamel:~# grep sde /proc/mounts 
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
gargamel:~# 

gargamel:~# lsblk -f |grep 727c7ba3-f6f9-462a-8472-453dd7d46d8a
└─sdq1                            btrfs             btrfs_space                 727c7ba3-f6f9-462a-8472-453dd7d46d8a   
gargamel:~# 

So, that FS isn't a duplicate anymore and I seem to have no way out except a reboot,
which I'll do now.

Was there another way around it? Obviously this is not desirable
behaviour; in the past, I was able to remount the device when it came
back.
Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-21 20:23 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for Marc MERLIN
@ 2020-03-21 21:25 ` Nikolay Borisov
  2020-03-25 20:14   ` Marc MERLIN
  2020-04-21  7:21 ` [PATCH] btrfs: boilerplate: devlist and fsinfo Anand Jain
  1 sibling, 1 reply; 15+ messages in thread
From: Nikolay Borisov @ 2020-03-21 21:25 UTC (permalink / raw)
  To: Marc MERLIN, linux-btrfs, kernel-team



On 21.03.20 г. 22:23 ч., Marc MERLIN wrote:
> /dev/sde blipped off the bus (hardware issue?) and came
> back as /dev/sdq.
> Except btrfs won't let me scan or mount it.
> 
> I was able to btrfs check it though and that came back clean.
> 
> gargamel:~# ls -l /dev/sde
> ls: cannot access '/dev/sde': No such file or directory
> 
> 
> gargamel:~# mount /dev/sdq1 /mnt/mnt
> mount: /mnt/mnt: mount(2) system call failed: File exists.
> gargamel:~# dmesg |tail -1
> [2560371.195249] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1
> 
> gargamel:~# btrfs device scan
> Scanning for Btrfs filesystems
> ERROR: device scan failed on '/dev/sdq1': File exists
> ERROR: there are 1 errors while registering devices
> gargamel:~# dmesg |tail -1
> [2560416.434529] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1
> 
> gargamel:~# grep sde /proc/mounts 
> cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
> gargamel:~# 
> 
> gargamel:~# lsblk -f |grep 727c7ba3-f6f9-462a-8472-453dd7d46d8a
> └─sdq1                            btrfs             btrfs_space                 727c7ba3-f6f9-462a-8472-453dd7d46d8a   
> gargamel:~# 
> 
> So, that FS isn't a duplicate anymore and I seem to have no way out except a reboot,
> which I'll do now.
> 
> Was there another way around it? Obviously this is not desirable
> behaviour, in the past, I was able to remount the device when it came
> back.
> 

Presumably you could have used the device forget functionality that was
introduced in 5.1, i.e. the BTRFS_IOC_FORGET_DEV ioctl. For more info,
see commit 228a73abde5c04428678e917b271f8526cfd90ed
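
For illustration, a minimal shell sketch of driving that functionality from
user space through btrfs-progs, which exposes it as
"btrfs device scan --forget". The wrapper name forget_btrfs_dev and the
DRY_RUN switch are hypothetical, added only so the command can be previewed
before it is run as root.

```shell
#!/bin/sh
# Sketch only: forget_btrfs_dev and DRY_RUN are hypothetical names.
# DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

forget_btrfs_dev() {
    # With a device argument, ask the kernel to forget just that path;
    # with none, forget every registered device that is not mounted.
    if [ "$DRY_RUN" = "1" ]; then
        echo "btrfs device scan --forget${1:+ $1}"
    else
        btrfs device scan --forget ${1:+"$1"}
    fi
}
```

Calling "forget_btrfs_dev /dev/sde1" previews the command for one stale path;
setting DRY_RUN=0 actually issues it.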

> Thanks,
> Marc
> 


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-21 21:25 ` Nikolay Borisov
@ 2020-03-25 20:14   ` Marc MERLIN
  2020-03-25 23:56     ` Anand Jain
  0 siblings, 1 reply; 15+ messages in thread
From: Marc MERLIN @ 2020-03-25 20:14 UTC (permalink / raw)
  To: Nikolay Borisov, anand.jain, dsterba; +Cc: linux-btrfs, kernel-team

Thanks for the suggestion, Nikolay.

Dear Anand, David,

I see that 
https://gitlab.freedesktop.org/seanpaul/dpu-staging/commit/228a73abde5c04428678e917b271f8526cfd90ed
may have helped, but is this really something a user should know/do?

Why does a device that disappeared from the bus need to be manually
unregistered?
Are users really supposed to know this?
Why does btrfs device scan not invalidate the cache of devices, instead of
remembering a device that's gone (not visible in a new scan)?

Thanks,
Marc



-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-25 20:14   ` Marc MERLIN
@ 2020-03-25 23:56     ` Anand Jain
  2020-03-26  1:30       ` Marc MERLIN
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2020-03-25 23:56 UTC (permalink / raw)
  To: Marc MERLIN, Nikolay Borisov, dsterba; +Cc: linux-btrfs, kernel-team


Hi Marc,


On 26/3/20 4:14 AM, Marc MERLIN wrote:
> Thanks for the suggestion Nikolay
> 
> Dear Anand, David,
> 
> I see that
> https://gitlab.freedesktop.org/seanpaul/dpu-staging/commit/228a73abde5c04428678e917b271f8526cfd90ed
> may have helped, but is this really something a user should know/do?
> 
> Why does a device that disappeared from the bus, need to be manually
> unregistered?

> Are users really supposed to know this?
> Why does btrfs device scan not invalidate the cache of devices and keep
> remembering a device that's gone (not visible in new scan)?

  btrfs device scan --forget is only useful for cleaning up unmounted
  devices; per the logs below, the device was mounted when it disappeared.
  More below.


> Thanks,
> Marc
> 
> 
> On Sat, Mar 21, 2020 at 11:25:04PM +0200, Nikolay Borisov wrote:
>>
>>
>> On 21.03.20 г. 22:23 ч., Marc MERLIN wrote:
>>> /dev/sde blipped off the bus (hardware issue?) and came
>>> back as /dev/sdq.
>>> Except btrfs won't let me scan or mount it.
>>>
>>> I was able to btrfs check it though and that came back clean.
>>>
>>> gargamel:~# ls -l /dev/sde
>>> ls: cannot access '/dev/sde': No such file or directory
>>>
>>>
>>> gargamel:~# mount /dev/sdq1 /mnt/mnt
>>> mount: /mnt/mnt: mount(2) system call failed: File exists.
>>> gargamel:~# dmesg |tail -1
>>> [2560371.195249] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1

   This indicates the device was mounted when it disappeared. It then
   re-appears under a new path, but because its fsid+uuid+devid matches
   the old, still-mounted device, we rightly treat it as an
   alien device and fail the mount.
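
   The warning line itself names the colliding fsid, devid and paths. A
   minimal sketch for pulling those fields out of the kernel log follows;
   parse_dup_warning is a hypothetical helper, and the pattern assumes the
   message format shown in this thread.

```shell
#!/bin/sh
# Sketch only: parse_dup_warning is a hypothetical helper name.
# Reads kernel log lines on stdin and prints "fsid devid old new"
# for every btrfs "duplicate device" warning it finds.
parse_dup_warning() {
    sed -n 's/.*duplicate device fsid:devid for \([0-9a-f-]*\):\([0-9]*\) old:\([^ ]*\) new:\([^ ]*\).*/\1 \2 \3 \4/p'
}
```

   Typical use would be "dmesg | parse_dup_warning"; lines that are not
   duplicate-device warnings produce no output.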

   To avoid assigning a new path to the reappearing device, we need to
   close/pause the device path when it disappears. I need to figure
   out whether there is any KPI from the block layer to help with that.

   Anyone: any idea whether there is anything in the block layer that can
   call back into the filesystem when a device disappears?

Thanks, Anand



* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-25 23:56     ` Anand Jain
@ 2020-03-26  1:30       ` Marc MERLIN
  2020-03-26  3:33         ` Anand Jain
  0 siblings, 1 reply; 15+ messages in thread
From: Marc MERLIN @ 2020-03-26  1:30 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

On Thu, Mar 26, 2020 at 07:56:10AM +0800, Anand Jain wrote:
> > Are users really supposed to know this?
> > Why does btrfs device scan not invalidate the cache of devices and keep
> > remembering a device that's gone (not visible in new scan)?
> 
>  btrfs device scan --forget is only useful to cleanup the unmounted
>  devices, per the logs below the device was mounted when it disappeared.
>  More below.
 
I'm confused: why is --forget even needed? Why would it remember devices
that were unmounted and not part of a new scan?

And yes, the device was not unmounted. The SATA layer failed, and the device
disappeared while mounted and then re-appeared.
I was able to force umount the mountpoints, so maybe --forget would have
helped, but I'm confused as to why it even exists.
 
>   This indicates the device was mounted when it disappeared. So it
>   re-appears with the new path, but as its fsid+uuid+devid matches
>   with the old still mounted device we rightly consider it as an
>   alien device and fail the mount.
 
It was unmounted after disappearing, see the 'grep sde /proc/mounts'
showing that it wasn't mounted anymore, so it seems that even that part
didn't work as intended?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/  


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-26  1:30       ` Marc MERLIN
@ 2020-03-26  3:33         ` Anand Jain
  2020-03-26  4:26           ` Marc MERLIN
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2020-03-26  3:33 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team



On 3/26/20 9:30 AM, Marc MERLIN wrote:
> On Thu, Mar 26, 2020 at 07:56:10AM +0800, Anand Jain wrote:
>>> Are users really supposed to know this?
>>> Why does btrfs device scan not invalidate the cache of devices and keep
>>> remembering a device that's gone (not visible in new scan)?
>>
>>   btrfs device scan --forget is only useful to cleanup the unmounted
>>   devices, per the logs below the device was mounted when it disappeared.
>>   More below.
>   
> I'm confused: why is --forget even needed? Why would it remember devices
> that were unmounted and not part of a new scan?
> 
> And yes, the device was not unmounted. The sata layer failed, device
> disappeared while mounted and then re-appeared


> I was able to force umount the mountpoints, so maybe --forget would have
> helped, but I'm confused as to why it even exists.
>

  We would log the message below only if the old device sde were still
  mounted. Unfortunately, we don't have an unmount event log yet (patches
  are on the ML), so we don't know whether the unmount was successful.

[2560416.434529] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1

  If the device were unmounted, the scan would have replaced sde
  with sdq, unless the (stale) sde generation is > the generation on sdq
  (lost commit). In that case --forget is useful to remove the
  stale device entry (provided the device is unmounted).
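
  As a rough model of the rule just described (not the kernel's actual
  code), the outcome of a scan can be sketched as a function of whether the
  old path is still mounted and of the two generations. scan_decision and
  its result strings are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: a simplified model of the behaviour described above, not
# the kernel's code. scan_decision and its result strings are hypothetical.
# Arguments: <old_is_mounted: 0|1> <old_generation> <new_generation>
scan_decision() {
    old_mounted=$1 old_gen=$2 new_gen=$3
    if [ "$old_mounted" -eq 1 ]; then
        # Old path still belongs to a mounted fs: the new path is refused
        # and the "duplicate device fsid:devid" warning is logged.
        echo "reject-duplicate"
    elif [ "$old_gen" -gt "$new_gen" ]; then
        # Stale entry claims a newer generation (lost commit): the scan
        # keeps it, and --forget is needed to drop it.
        echo "keep-stale-needs-forget"
    else
        # Unmounted and not newer: the scan swaps in the new path.
        echo "replace-path"
    fi
}
```

  For example, "scan_decision 0 11 10" models the lost-commit case in which
  only --forget clears the old entry.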

>>    This indicates the device was mounted when it disappeared. So it
>>    re-appears with the new path, but as its fsid+uuid+devid matches
>>    with the old still mounted device we rightly consider it as an
>>    alien device and fail the mount.
>   
> It was unmounted after disappearing, see the 'grep sde /proc/mounts'
> showing that it wasn't mounted anymore, so it seems that even that part
> didn't work as intended?

  It's strange that /proc/mounts doesn't list sde. Could you please send me
  the complete kernel logs? Let's see if there is any clue.


  I tried to reproduce it, but in my case the unmount was successful.


$ mkfs.btrfs -fq /dev/sdc && mount /dev/sdc /btrfs
$ devmgt show | grep sdc
host2 sdc
$ devmgt detach /dev/sdc
::
detach /dev/sdc successful
$ devmgt attach host2
::
	sd 2:0:0:0: [sdb] Attached SCSI disk
::
	BTRFS warning (device sdc): duplicate device fsid:devid for dcbc5603-e1cf-4d8d-9ec2-832bc3ac4e36:1 old:/dev/sdc new:/dev/sdb
------------------
attach host2 successful

/proc/mounts shows that sdc is mounted ro.

$ cat /proc/mounts | grep sdc
/dev/sdc /btrfs btrfs ro,relatime,noacl,space_cache,subvolid=5,subvol=/ 0 0

$ dmesg -k | tail

[ 1427.268767] BTRFS warning (device sdc): duplicate device fsid:devid for dcbc5603-e1cf-4d8d-9ec2-832bc3ac4e36:1 old:/dev/sdc new:/dev/sdb

$ umount /dev/sdc

Unfortunately there is no log about the unmount :-(.

And the following device scan replaces sdc with sdb.

$ btrfs dev scan
Scanning for Btrfs filesystems

$ cat /proc/fs/btrfs/devlist | grep sdc
$ cat /proc/fs/btrfs/devlist | grep sdb
		device:		/dev/sdb
$

And mount is successful.

$ mount /dev/sdb /btrfs


Thanks, Anand

> Marc


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-26  3:33         ` Anand Jain
@ 2020-03-26  4:26           ` Marc MERLIN
  2020-04-14  0:38             ` Marc MERLIN
  0 siblings, 1 reply; 15+ messages in thread
From: Marc MERLIN @ 2020-03-26  4:26 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

On Thu, Mar 26, 2020 at 11:33:23AM +0800, Anand Jain wrote:
>  We would log the below only if the old device sde is still in mounted
>  state.  Unfortunately we don't have the unmount event log yet (patches
>  are in the ML) so we don't know if unmount was successful.
> 
> [2560416.434529] BTRFS warning (device sde1): duplicate device fsid:devid
> for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdq1
> 
>  If the device is unmounted, the scan would have replaced the sde
>  to sdi, unless the sde (stale) generation is > generation in sdi
>  (lost commit). In which case the --forget is useful to remove the
>  state device entry (provided device is unmounted).
 
Well, the device did disappear, wouldn't that cause the in memory
version to be more recent than the disk version?

>  Its strange /proc/mounts doesn't list sde. Could you please send me
>  complete kernel logs. Lets try if there is any clue.
 
Sure https://pastebin.com/SWAfYxV8
 
>  I tried to reproduce.. but in my case the unmount was successful.
> 
> 
> $ mkfs.btrfs -fq /dev/sdc && mount /dev/sdc /btrfs
> $ devmgt show | grep sdc
> host2 sdc
> $ devmgt detach /dev/sdc
> ::
> detach /dev/sdc successful
> $ devmgt attach host2

That's probably too clean.
Can you 
1) write to device in a loop
2) pull power from SATA device (in this case it was an ssd)
3) plug device back in
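
For step 1, a minimal sketch of such a write loop, so that I/O is in flight
when the power is pulled. write_loop and the blip-N file names are
hypothetical; the directory would be a mount point on the filesystem under
test.

```shell
#!/bin/sh
# Sketch only: write_loop is a hypothetical helper. It writes COUNT small
# files under DIR, calling sync after each one, so that I/O is in flight
# when the device is pulled. It stops at the first failed write.
write_loop() {
    dir=$1 count=$2 i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/blip-$i" bs=4k count=16 2>/dev/null || return 1
        sync
        i=$((i + 1))
    done
}
```

Something like "write_loop /btrfs 100000" would keep the device busy long
enough to pull the cable mid-write.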

> $ umount /dev/sdc
> 
> Unfortunately there is no log about the unmount :-(.

Maybe worth adding to help debug later?

I looked in my bash history, and it shows this:
37280  mount | grep sde
37281  umount /mnt/btrfs_space
37282  umount /var/local/space
37283  umount /var/cache/zoneminder
37284  fuser -vm /var/cache/zoneminder
37285  fuser -vkm /var/cache/zoneminder
37286  umount /var/cache/zoneminder
37287  umount /var/lib/mysql
37288  mount | grep sde1

The first time, the mount command produced output.
The second time, it did not.

other commands I typed:
37289  mount /dev/sdq1 /mnt/btrfs_space
37296  btrfs device scan
37301  grep sde /etc/* 2>/dev/null
37303  mount /dev/sdq1 /mnt/btrfs_space
37308  grep -r /mnt/btrfs_space /etc 2>/dev/null
37311  btrfs device scan 
37312  l /sys/block/sde/
37313  btrfs check /dev/sdq1
37314  btrfs device scan 
37320  mount /dev/sdq1 /mnt/mnt
37323  btrfs device scan
37324  dmesg |tail -1
37326  lsblk -v
37327  lsblk 
37328  grep sde /proc/mounts 

Hope this helps.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/  


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-03-26  4:26           ` Marc MERLIN
@ 2020-04-14  0:38             ` Marc MERLIN
  2020-04-16 10:43               ` Anand Jain
  2020-04-20 11:10               ` Anand Jain
  0 siblings, 2 replies; 15+ messages in thread
From: Marc MERLIN @ 2020-04-14  0:38 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

Anand, I had this happen again with 5.5.11, and it was impossible to do
anything to fix it; I had to reboot again.
btrfs device scan --forget 
did nothing.

See details:
BTRFS: device label btrfs_space devid 1 transid 35178413 /dev/sde1
BTRFS info (device sde1): use lzo compression, level 0
BTRFS info (device sde1): disk space caching is enabled
BTRFS info (device sde1): has skinny extents
BTRFS info (device sde1): enabling ssd optimizations
sd 6:1:3:0: [sde] tag#642 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s  
sd 6:1:3:0: [sde] tag#640 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
sd 6:1:3:0: [sde] tag#702 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
sd 6:1:3:0: [sde] tag#702 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 3a 68 00 00 01 f0 00 00  
blk_update_request: I/O error, dev sde, sector 4054268520 op 0x1:(WRITE) flags 0x100000 phys_seg 62 prio class 0
sd 6:1:3:0: [sde] tag#701 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
sd 6:1:3:0: [sde] tag#701 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 38 68 00 00 02 00 00 00  
blk_update_request: I/O error, dev sde, sector 4054268008 op 0x1:(WRITE) flags 0x104000 phys_seg 64 prio class 0
sd 6:1:3:0: [sde] tag#700 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
sd 6:1:3:0: [sde] tag#700 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 36 68 00 00 02 00 00 00  
blk_update_request: I/O error, dev sde, sector 4054267496 op 0x1:(WRITE) flags 0x104000 phys_seg 64 prio class 0
BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
sd 6:1:3:0: [sde] tag#641 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=10s
sd 6:1:3:0: [sde] tag#641 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00  
BTRFS info (device sde1): forced readonly
BTRFS warning (device sde1): Skipping commit of aborted transaction.  
BTRFS: error (device sde1) in cleanup_transaction:1894: errno=-5 IO failure
BTRFS info (device sde1): delayed_refs has NO entry
btrfs_dev_stat_print_on_error: 244 callbacks suppressed


gargamel:~# dmtail 3
[1887142.765448] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4529, flush 0, corrupt 0, gen 0
[1887142.795820] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4530, flush 0, corrupt 0, gen 0
[1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
gargamel:~# cat /proc/partitions  |grep sd[ep]
   8      240 3750738264 sdp
   8      241 3750737223 sdp1
gargamel:~# mount | grep sde
/dev/sde1 on /mnt/btrfs_space type btrfs (ro,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=5,subvol=/)
/dev/sde1 on /var/local/space type btrfs (ro,noexec,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace)
/dev/sde1 on /var/cache/zoneminder type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace/zoneminder)
/dev/sde1 on /var/lib/mysql type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=3648,subvol=/mysql)
gargamel:~# umount /mnt/btrfs_space; umount /var/local/space; umount /var/cache/zoneminder; umount /var/lib/mysql
gargamel:~# mount | grep sde

gargamel:~# mount /dev/sdp1 /mnt/mnt
mount: /mnt/mnt: mount(2) system call failed: File exists.
gargamel:~# dmtail 2
[1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
[1887453.610947] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdp1

gargamel:/usr/local/bin# btrfs device scan --forget 
gargamel:/usr/local/bin# mount /dev/sdp1 /mnt/mnt
mount: /mnt/mnt: mount(2) system call failed: File exists.


After reboot, I made sure sde is not used by anything weird, just simple mounts:
gargamel:~# lsblk  | grep sde
sde                                 8:64   1 931.5G  0 disk  
├─sde1                              8:65   1 488.3M  0 part  
├─sde2                              8:66   1  14.9G  0 part  
├─sde3                              8:67   1    80G  0 part  
└─sde4                              8:68   1 836.1G  0 part 

Any ideas?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-14  0:38             ` Marc MERLIN
@ 2020-04-16 10:43               ` Anand Jain
  2020-04-19 19:13                 ` Marc MERLIN
  2020-04-20 11:10               ` Anand Jain
  1 sibling, 1 reply; 15+ messages in thread
From: Anand Jain @ 2020-04-16 10:43 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team



On 4/14/20 8:38 AM, Marc MERLIN wrote:
> Anand, I had this happen again with 5.5.11, and it was impossible to do
> anything to fix it; I had to reboot again.
> btrfs device scan --forget
> did nothing.
> 
> See details:
> BTRFS: device label btrfs_space devid 1 transid 35178413 /dev/sde1
> BTRFS info (device sde1): use lzo compression, level 0
> BTRFS info (device sde1): disk space caching is enabled
> BTRFS info (device sde1): has skinny extents
> BTRFS info (device sde1): enabling ssd optimizations
> sd 6:1:3:0: [sde] tag#642 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
> sd 6:1:3:0: [sde] tag#640 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
> sd 6:1:3:0: [sde] tag#702 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
> sd 6:1:3:0: [sde] tag#702 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 3a 68 00 00 01 f0 00 00
> blk_update_request: I/O error, dev sde, sector 4054268520 op 0x1:(WRITE) flags 0x100000 phys_seg 62 prio class 0
> sd 6:1:3:0: [sde] tag#701 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
> sd 6:1:3:0: [sde] tag#701 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 38 68 00 00 02 00 00 00
> blk_update_request: I/O error, dev sde, sector 4054268008 op 0x1:(WRITE) flags 0x104000 phys_seg 64 prio class 0
> sd 6:1:3:0: [sde] tag#700 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=2s
> sd 6:1:3:0: [sde] tag#700 CDB: Write(16) 8a 00 00 00 00 00 f1 a7 36 68 00 00 02 00 00 00
> blk_update_request: I/O error, dev sde, sector 4054267496 op 0x1:(WRITE) flags 0x104000 phys_seg 64 prio class 0
> BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> sd 6:1:3:0: [sde] tag#641 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=10s
> sd 6:1:3:0: [sde] tag#641 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00


> BTRFS info (device sde1): forced readonly

Unfortunately that's the only thing we do as of now.

> BTRFS warning (device sde1): Skipping commit of aborted transaction.
> BTRFS: error (device sde1) in cleanup_transaction:1894: errno=-5 IO failure
> BTRFS info (device sde1): delayed_refs has NO entry
> btrfs_dev_stat_print_on_error: 244 callbacks suppressed
> 
> gargamel:~# dmtail 3
> [1887142.765448] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4529, flush 0, corrupt 0, gen 0
> [1887142.795820] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4530, flush 0, corrupt 0, gen 0
> [1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0

> gargamel:~# cat /proc/partitions  |grep sd[ep]
>     8      240 3750738264 sdp
>     8      241 3750737223 sdp1

So the same device reappears as sdp. But btrfs does not yet close a failed
device (patches are on the mailing list), so the old path sde
is still held open in the block layer. I guess /proc/partitions
doesn't show the non-working sde.

> gargamel:~# mount | grep sde 
> /dev/sde1 on /mnt/btrfs_space type btrfs (ro,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=5,subvol=/)
> /dev/sde1 on /var/local/space type btrfs (ro,noexec,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace)
> /dev/sde1 on /var/cache/zoneminder type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace/zoneminder)
> /dev/sde1 on /var/lib/mysql type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=3648,subvol=/mysql)


> gargamel:~# umount /mnt/btrfs_space; umount /var/local/space; umount /var/cache/zoneminder; umount /var/lib/mysql


> gargamel:~# mount | grep sde
It would have been better to grep for sdp here as well.
Also, /proc/self/mounts will be more accurate, as it probes the fs module.
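
A minimal sketch of such a check, matching on the mount source field rather
than on the whole line. mounts_for_devs is a hypothetical helper. Matching
on the first field also avoids false hits: a plain "grep sde" matches the
"nsdelegate" option on the cgroup2 line, which is why the earlier grep
printed an unrelated mount.

```shell
#!/bin/sh
# Sketch only: mounts_for_devs is a hypothetical helper.
# Usage: mounts_for_devs MOUNTS_FILE DEV...
# Prints "device mountpoint" for each mount whose source is one of DEV,
# or a partition of it (sde also matches sde1).
mounts_for_devs() {
    file=$1; shift
    for dev in "$@"; do
        awk -v d="/dev/$dev" '$1 == d || $1 ~ ("^" d "[0-9]+$") { print $1, $2 }' "$file"
    done
}
```

For the case here, that would be "mounts_for_devs /proc/self/mounts sde sdp";
empty output means neither path is the source of any mount.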

> gargamel:~# mount /dev/sdp1 /mnt/mnt
> mount: /mnt/mnt: mount(2) system call failed: File exists.

> gargamel:~# dmtail 2
> [1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
> [1887453.610947] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdp1

The unmount above wasn't successful. Or was it remounted by automount? Just
guessing.

> 
> gargamel:/usr/local/bin# btrfs device scan --forget
> gargamel:/usr/local/bin# mount /dev/sdp1 /mnt/mnt
> mount: /mnt/mnt: mount(2) system call failed: File exists.
> 

  Can you please send the complete kernel logs?

> After reboot, I made sure sde is not used by anything weird, just simple mounts:
> gargamel:~# lsblk  | grep sde
> sde                                 8:64   1 931.5G  0 disk
> ├─sde1                              8:65   1 488.3M  0 part
> ├─sde2                              8:66   1  14.9G  0 part
> ├─sde3                              8:67   1    80G  0 part
> └─sde4                              8:68   1 836.1G  0 part
> 


So, in summary, the chronological order of events is:

  sde disappears.
  btrfs does not close the device.
  The block layer creates sdp when the disappeared device reappears.
  An unmount of sde was attempted, but it might not have completed
  successfully; we don't have sufficient logs to prove it either way.
  The mount of sdp fails, and the log indicates that sde is still mounted.

So the thing(s) to fix is/are:
  The root of the issue: when sde fails, we need to close the device
  so that the block layer can reuse sde when it reappears (not sdp).
  Once btrfs has closed the failed device, btrfs dev scan --forget
  can work to clean up the stale entries left behind during unmount.

  We can do something better here:
  When two different devices have the same fsid, uuid and devid, and one
  of them is mounted, we have to fail the scan/mount of the newer device
  for obvious reasons. That's when we get the log 'duplicate device fsid'.
  But here the case is a bit skewed: both are the same device, with the
  same major number but different minor numbers (sde, sdp). I need to figure
  out a way so that we don't treat these two device paths as different
  devices. We should probably check the guid/wwid assigned by the block
  layer, which should be the same for both of these devices, or as a
  last resort check the SCSI inquiry VPD page and get the serial number,
  but that goes well beyond what a FS should do. Let me check with
  the block layer experts what they suggest.

  We might need a workaround tool to force-clean a given FSID to avoid
  a reboot.

Still unknown:
  Was the unmount successful? The mount logs show that device sde still
  exists in btrfs.
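
To make the guid/wwid idea concrete, here is a user-space sketch only; it is
not what btrfs does today. It assumes the SCSI wwid attribute is readable
under sysfs for both disks, which may not hold once the old path is gone.
same_disk and its return codes are hypothetical.

```shell
#!/bin/sh
# Sketch only: same_disk is a hypothetical helper, and reading the wwid
# attribute from sysfs is an assumption about where such an ID could be
# read in user space; it is not what btrfs does today.
# Usage: same_disk SYSFS_ROOT DISK_A DISK_B   (e.g. same_disk /sys sde sdp)
# Returns 0 if both report the same wwid, 1 if they differ, 2 if unreadable.
same_disk() {
    a=$(cat "$1/block/$2/device/wwid" 2>/dev/null) || return 2
    b=$(cat "$1/block/$3/device/wwid" 2>/dev/null) || return 2
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A check such as "same_disk /sys sde sdp" would then tell two paths to one
physical disk apart from two genuinely different disks carrying the same
fsid.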

Sorry, I was diverted to other things when you reported this last time; let
me take a fresh look.

Thanks, Anand


> Any ideas?
> 
> Marc
> 


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-16 10:43               ` Anand Jain
@ 2020-04-19 19:13                 ` Marc MERLIN
  0 siblings, 0 replies; 15+ messages in thread
From: Marc MERLIN @ 2020-04-19 19:13 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

On Thu, Apr 16, 2020 at 06:43:39PM +0800, Anand Jain wrote:
> > BTRFS info (device sde1): forced readonly
> 
> Unfortunately that's the only thing we do as of now.

Of course, and that's fine, but I don't understand why after unmounting
the filesystem cleanly, the references aren't freed.
That part really seems like a bug to me.

> So the same device reappears as sdp. But btrfs does not close a failed
> device yet (patches are in the mailing list) the old path sde
> is still in the block layer and opened. I guess /proc/partitions
> doesn't show non working sde.
> 
Correct on all points

> > gargamel:~# mount | grep sde
> better to have grep-ed sdp also, here.

it was not mounted yet, I checked that.

> And /proc/self/mounts will be more accurate as it probes the fs module.

Noted.
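
For the next occurrence, a check along these lines would cover both the
old and the new name in one pass. It is a sketch only: the function name
and the MOUNTS_FILE override are assumptions made for the example. It
compares the device field exactly, so a substring hit on an unrelated
line (which a plain grep for "sde" can produce) is not reported.

```shell
# List "device mountpoint" pairs from the mounts table for the given
# partition names. MOUNTS_FILE is an assumption so that a saved copy of
# the table can be inspected instead of the live one.
mounts_of() {
    awk -v want="$*" '
        BEGIN {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++)
                dev["/dev/" names[i]] = 1
        }
        ($1 in dev) { print $1, $2 }
    ' "${MOUNTS_FILE:-/proc/self/mounts}"
}
```

`mounts_of sde1 sdp1` prints nothing when neither partition is mounted.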

> > [1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
> > [1887453.610947] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdp1
> 
> Unmount wasn't successful above. Or it was remounted by automount? just
> guessing.

umount was successful and automount does not handle this device.
/dev/sde was not mounted for sure and /dev/sdp could not be mounted.

> > gargamel:/usr/local/bin# btrfs device scan --forget
> > gargamel:/usr/local/bin# mount /dev/sdp1 /mnt/mnt
> > mount: /mnt/mnt: mount(2) system call failed: File exists.
> 
>  Can you please send a complete kernel logs.

They contain a lot of crap that wouldn't fit on the list, but I pasted
everything relevant.

>  sde disappears.
>  btrfs does not close the device.

it remounts the mountpoints read only, which is fine (they can't be
unmounted because they are in use).

>  block layer creates sdp when the disappeared device reappears.
>  unmount of sde was tried but it might not have completely successful we
> don't have sufficient logs to prove it.

umount looked complete on my side, there is nothing in the logs that
shows otherwise, but as you said, unmount does not log anything.

>  mount of sdp fails per log indicates that sde is still mounted.

correct.

> So thing(s) to fix is/are:
>  The root of the issue - When sde fails we need to close the device
>  so that block layer can reuse sde when it reappears (not sdp).
>  In btrfs as we have closed the failed device btrfs dev scan --forget
>  can work to cleanup the stale entries left behind during unmount.

ideally "btrfs dev scan --forget" should be automatic. It feels like
a weird command for an admin to know or have to use. Other filesystems
do not need it.

>  We can do something better here:
>  When two different device with same fsid uuid and devid and one of it
>  is mounted we have to fail the scan/mount of the newer device for
>  obvious reasons. That's when we get the log - 'duplicate device fsid'.
>  But here the case it bit skewed that both are same device with same
>  major number but different minor number (sde sdp). I need to figure
>  out a way so that we don't treat these two device paths as different
>  device. Probably should check the guid/wwid assigned by the block
>  layer which should be same for both of these devices, or in the
>  last resort check scsi inquiry_VPD page and get the serial number
>  but its going too much beyond what FS should do. Let me check with
>  block layer experts what they suggest.

defense in depth sounds great here, if any of those can work too, that'd
be great.

> Still unknown:
>  unmount is successful? And mount logs shows that device sde still exists in
> btrfs.

It failed while mountpoints were still in use, and after the correct
fuser -kvm /path
umount worked great and the device disappeared from /proc/mounts.
As you said, there are no kernel logs on unmount, so it's hard to say
more.
If you want me to apply a patch that puts more logging on unmount
(against 5.5 or 5.6), please let me know, but of course, it could be
weeks or months before I get that blip again.
I think this could be reproduced by simply having a drive mounted, and
unplugging it while the machine is live, and plugging it back in at
runtime. I could technically do it with my hardware, but it happens on
a database I don't really want to lose or corrupt.
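
On a scratch disk, the unplug/replug could perhaps be approximated in
software through the SCSI sysfs files. This is a sketch under
assumptions: the function names and the SYSFS_ROOT override are invented
for the example, a software delete may not behave like a real bus blip,
and whether it triggers the same duplicate-device warning is untested.

```shell
# Hypothetical helpers; SYSFS_ROOT is an assumption that lets the sketch be
# dry-run against a fake directory tree instead of the real /sys.

# Ask the SCSI layer to drop a disk, e.g. "sdx". Use a scratch disk only.
scsi_detach() {
    echo 1 > "${SYSFS_ROOT:-/sys}/block/$1/device/delete"
}

# Ask every SCSI host to rescan so the disk comes back (possibly renamed).
scsi_rescan_all() {
    for scan_file in "${SYSFS_ROOT:-/sys}"/class/scsi_host/host*/scan; do
        [ -w "$scan_file" ] && echo "- - -" > "$scan_file"
    done
    return 0
}
```

Both need root on a real system: mount the scratch filesystem, run
`scsi_detach sdx`, then `scsi_rescan_all`, and try to mount the device
under its new name.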

> Sorry I was diverted into other stuffs when you reported last time, let me
> take a fresh look.

No worries, we've all been there :)
Also, it's not like I can get a refund on my support contract I don't have ;)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-14  0:38             ` Marc MERLIN
  2020-04-16 10:43               ` Anand Jain
@ 2020-04-20 11:10               ` Anand Jain
  2020-04-20 14:56                 ` Marc MERLIN
  1 sibling, 1 reply; 15+ messages in thread
From: Anand Jain @ 2020-04-20 11:10 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team




Are the steps below in chronological order?

> gargamel:~# dmtail 3
> [1887142.765448] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4529, flush 0, corrupt 0, gen 0
> [1887142.795820] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4530, flush 0, corrupt 0, gen 0
> [1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
> gargamel:~# cat /proc/partitions  |grep sd[ep]
>     8      240 3750738264 sdp
>     8      241 3750737223 sdp1
> gargamel:~# mount | grep sde
> /dev/sde1 on /mnt/btrfs_space type btrfs (ro,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=5,subvol=/)
> /dev/sde1 on /var/local/space type btrfs (ro,noexec,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace)
> /dev/sde1 on /var/cache/zoneminder type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=257,subvol=/varlocalspace/zoneminder)
> /dev/sde1 on /var/lib/mysql type btrfs (ro,nosuid,nodev,noatime,compress=lzo,ssd,discard,space_cache,skip_balance,subvolid=3648,subvol=/mysql)
> gargamel:~# umount /mnt/btrfs_space; umount /var/local/space; umount /var/cache/zoneminder; umount /var/lib/mysql
> gargamel:~# mount | grep sde
> 
> gargamel:~# mount /dev/sdp1 /mnt/mnt
> mount: /mnt/mnt: mount(2) system call failed: File exists.
> gargamel:~# dmtail 2
> [1887142.826176] BTRFS error (device sde1): bdev /dev/sde1 errs: wr 1038, rd 4531, flush 0, corrupt 0, gen 0
> [1887453.610947] BTRFS warning (device sde1): duplicate device fsid:devid for 727c7ba3-f6f9-462a-8472-453dd7d46d8a:1 old:/dev/sde1 new:/dev/sdp1


  Before and after --forget command
     btrfs fi show -m
  could have told us what devices are still mounted.
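
A sketch of how that before/after comparison might be captured next
time. The function name and the BTRFS override variable are assumptions
made for the example; the two btrfs commands are the ones already used
in this thread.

```shell
# Save what btrfs considers mounted, run the forget, save again, and show
# the difference. Needs root; $1 is a scratch directory for the snapshots.
# BTRFS is an assumption so a different binary (or a stub) can be used.
show_forget_delta() {
    out_dir="$1"
    btrfs_bin="${BTRFS:-btrfs}"
    "$btrfs_bin" filesystem show -m > "$out_dir/before.txt" 2>&1
    "$btrfs_bin" device scan --forget
    "$btrfs_bin" filesystem show -m > "$out_dir/after.txt" 2>&1
    # diff exits 1 when the files differ; that is expected, not an error.
    diff -u "$out_dir/before.txt" "$out_dir/after.txt" || true
}
```

An empty diff would mean --forget changed nothing in the list of mounted
filesystems that btrfs reports.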

I will send boilerplate code to dump the device list from the kernel; it
will help to debug. As of now this boilerplate code, which I have been
using, is too localized and needs a lot of cleanup; it will take some time.


> gargamel:/usr/local/bin# btrfs device scan --forget
> gargamel:/usr/local/bin# mount /dev/sdp1 /mnt/mnt
> mount: /mnt/mnt: mount(2) system call failed: File exists.


Thanks, Anand

> 
> After reboot, I made sure sde is not used by anything weird, just simple mounts:
> gargamel:~# lsblk  | grep sde
> sde                                 8:64   1 931.5G  0 disk
> ├─sde1                              8:65   1 488.3M  0 part
> ├─sde2                              8:66   1  14.9G  0 part
> ├─sde3                              8:67   1    80G  0 part
> └─sde4                              8:68   1 836.1G  0 part
> 
> Any ideas?
> 
> Marc
> 


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-20 11:10               ` Anand Jain
@ 2020-04-20 14:56                 ` Marc MERLIN
  2020-04-21  7:33                   ` Anand Jain
  0 siblings, 1 reply; 15+ messages in thread
From: Marc MERLIN @ 2020-04-20 14:56 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

On Mon, Apr 20, 2020 at 07:10:24PM +0800, Anand Jain wrote:
> The steps below are they in the chronological order?
 
That is my recollection, yes.

>  Before and after --forget command
>     btrfs fi show -m
>  could have told us what devices are still mounted.
 
Oh, I didn't know about this. If/when it happens next, I'll 
run this to show btrfs' understanding of what's mounted instead of
the kernel's understanding (/proc/self/mounts)

> I will send a boilerplate code to dump device list from the kernel it will
> help to debug. As of now this boilderplate code which I have been using is
> too localized needs a lot of cleanups, will take sometime.

Sounds good.
 
Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/  


* [PATCH] btrfs: boilerplate: devlist and fsinfo
  2020-03-21 20:23 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for Marc MERLIN
  2020-03-21 21:25 ` Nikolay Borisov
@ 2020-04-21  7:21 ` Anand Jain
  1 sibling, 0 replies; 15+ messages in thread
From: Anand Jain @ 2020-04-21  7:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: marc

From: Anand Jain <Anand.Jain@oracle.com>

** This patch is not for integration but to debug/visualize the
btrfs device tree and fs_info. **

usage:
	cat /proc/fs/btrfs/devlist
	cat /proc/fs/btrfs/fsinfo

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---

This patch can also be pulled from the branch boilerplate-v5.6
  git@github.com:asj/btrfs-boilerplate.git boilerplate-v5.6

 fs/btrfs/Makefile |   2 +-
 fs/btrfs/procfs.c | 600 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/procfs.h |   2 +
 fs/btrfs/super.c  |   6 +-
 4 files changed, 608 insertions(+), 2 deletions(-)
 create mode 100644 fs/btrfs/procfs.c
 create mode 100644 fs/btrfs/procfs.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 9a0ff3384381..8a894ee1d27b 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -11,7 +11,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
-	   block-rsv.o delalloc-space.o block-group.o discard.o
+	   block-rsv.o delalloc-space.o block-group.o discard.o procfs.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/procfs.c b/fs/btrfs/procfs.c
new file mode 100644
index 000000000000..6dc39bb31d9d
--- /dev/null
+++ b/fs/btrfs/procfs.c
@@ -0,0 +1,600 @@
+#include <linux/seq_file.h>
+#include <linux/vmalloc.h>
+#include <linux/proc_fs.h>
+#include "ctree.h"
+#include "volumes.h"
+#include "rcu-string.h"
+#include "procfs.h"
+
+#define BPSL	256
+
+#define BTRFS_PROC_PATH		"fs/btrfs"
+#define BTRFS_PROC_DEVLIST	"devlist"
+#define BTRFS_PROC_FSINFO	"fsinfo"
+
+//#define USE_ALLOC_LIST
+//#define VOL_FLAGS
+//#define FS_OPEN_RW
+#define BALANCE_RUNNING
+//#define OLD_PROC
+
+struct proc_dir_entry	*btrfs_proc_root;
+
+static void fs_state_to_str(struct btrfs_fs_info *fs_info, char *str)
+{
+	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+		strcat(str, "|ERROR");
+	if (test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state))
+		strcat(str, "|REMOUNTING");
+	if (test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state))
+		strcat(str, "|TRANS_ABORTED");
+	if (test_bit(BTRFS_FS_STATE_DEV_REPLACING, &fs_info->fs_state))
+		strcat(str, "|REPLACING");
+	if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state))
+		strcat(str, "|DUMMY");
+}
+
+static void fs_flags_to_str(struct btrfs_fs_info *fs_info, char *str)
+{
+	if (test_bit(BTRFS_FS_BARRIER, &fs_info->flags))
+		strcat(str, "|BARRIER");
+	if (test_bit(BTRFS_FS_CLOSING_START, &fs_info->flags))
+		strcat(str, "|CLOSING_START");
+	if (test_bit(BTRFS_FS_CLOSING_DONE, &fs_info->flags))
+		strcat(str, "|CLOSING_DONE");
+	if (test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags))
+		strcat(str, "|RECOVERING");
+	if (test_bit(BTRFS_FS_OPEN, &fs_info->flags))
+		strcat(str, "|OPEN");
+	if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
+		strcat(str, "|QUOTA_ENABLED");
+	if (test_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, &fs_info->flags))
+		strcat(str, "|UPDATE_UUID_TREE_GEN");
+	if (test_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags))
+		strcat(str, "|FREE_SPACE_TREE");
+	if (test_bit(BTRFS_FS_BTREE_ERR, &fs_info->flags))
+		strcat(str, "|BTREE_ERR");
+	if (test_bit(BTRFS_FS_LOG1_ERR, &fs_info->flags))
+		strcat(str, "|LOG1_ERR");
+	if (test_bit(BTRFS_FS_LOG2_ERR, &fs_info->flags))
+		strcat(str, "|LOG2_ERR");
+	if (test_bit(BTRFS_FS_QUOTA_OVERRIDE, &fs_info->flags))
+		strcat(str, "|QUOTA_OVERRIDE");
+	if (test_bit(BTRFS_FS_FROZEN, &fs_info->flags))
+		strcat(str, "|FROZEN");
+	if (test_bit(BTRFS_FS_EXCL_OP, &fs_info->flags))
+		strcat(str, "|EXCL_OP");
+#ifdef BALANCE_RUNNING
+	if (test_bit(BTRFS_FS_BALANCE_RUNNING, &fs_info->flags))
+		strcat(str, "|BALANCE_RUNNING");
+#else
+	if (atomic_read(&fs_info->balance_running))
+		strcat(str, "|BALANCE_RUNNING");
+#endif
+	if (atomic_read(&fs_info->balance_pause_req))
+		strcat(str, "|BALANCE_PAUSEREQ");
+	if (atomic_read(&fs_info->balance_cancel_req))
+		strcat(str, "|BALANCE_CANCELREQ");
+}
+
+static void balance_ctl_flags_to_str(struct btrfs_balance_control *bctl,
+				     char *str)
+{
+	if (BTRFS_BALANCE_DATA & bctl->flags)
+		strcat(str, "|DATA");
+	if (BTRFS_BALANCE_SYSTEM & bctl->flags)
+		strcat(str, "|SYSTEM");
+	if (BTRFS_BALANCE_METADATA & bctl->flags)
+		strcat(str, "|METADATA");
+	if (BTRFS_BALANCE_FORCE & bctl->flags)
+		strcat(str, "|FORCE");
+	if (BTRFS_BALANCE_RESUME & bctl->flags)
+		strcat(str, "|RESUME");
+}
+
+static char *bg_flags_to_str(u64 chunk_type, char *str)
+{
+	if (chunk_type & BTRFS_BLOCK_GROUP_RAID0)
+		strcat(str, "|RAID0");
+	if (chunk_type & BTRFS_BLOCK_GROUP_RAID1)
+		strcat(str, "|RAID1");
+	if (chunk_type & BTRFS_BLOCK_GROUP_RAID5)
+		strcat(str, "|RAID5");
+	if (chunk_type & BTRFS_BLOCK_GROUP_RAID6)
+		strcat(str, "|RAID6");
+	if (chunk_type & BTRFS_BLOCK_GROUP_DUP)
+		strcat(str, "|DUP");
+	if (chunk_type & BTRFS_BLOCK_GROUP_RAID10)
+		strcat(str, "|RAID10");
+	if (chunk_type & BTRFS_AVAIL_ALLOC_BIT_SINGLE)
+		strcat(str, "|ALLOC_SINGLE");
+
+	return str;
+}
+
+static void balance_args_to_str(struct btrfs_balance_args *bargs, char *str,
+				char *prefix)
+{
+	int ret = 0;
+
+	ret = sprintf(str, "\tbalance_args.%s\t", prefix);
+	str=str+ret;
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_SOFT)
+		str += sprintf(str, "|SOFT");
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_PROFILES) {
+		strcat(str, "|profiles=");
+		str = bg_flags_to_str(bargs->profiles, str);
+		str += strlen(str);	/* skip the appended text; ret is stale here */
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE) {
+		ret = sprintf(str, "|usage=%llu ", bargs->usage);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) {
+		ret = sprintf(str, "|usage_min=%u usage_max=%u",
+			      bargs->usage_min, bargs->usage_max);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_DEVID) {
+		ret = sprintf(str, "|devid=%llu ", bargs->devid);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_DRANGE) {
+		ret = sprintf(str, "|DRANGE pstart=%llu pend=%llu ",
+			      bargs->pstart, bargs->pend);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_VRANGE) {
+		ret = sprintf(str, "|VRANGE vstart=%llu vend %llu",
+			      bargs->vstart, bargs->vend);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_LIMIT) {
+		ret = sprintf(str, "|limit=%llu ", bargs->limit);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_LIMIT_RANGE) {
+		ret = sprintf(str, "|limit_min=%u limit_max=%u",
+			      bargs->limit_min, bargs->limit_max);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_STRIPES_RANGE) {
+		ret = sprintf(str, "|stripes_min=%u stripes_max=%u ",
+			bargs->stripes_min, bargs->stripes_max);
+		str=str+ret;
+	}
+
+	if (bargs->flags & BTRFS_BALANCE_ARGS_CONVERT) {
+		strcat(str, "|convert=");
+		str = bg_flags_to_str(bargs->target, str);
+	}
+}
+
+static void print_balance_args(struct btrfs_balance_args *bargs, char *prefix,
+				struct seq_file *seq)
+{
+#define BTRFS_SEQ_PRINT3(plist, arg)\
+		snprintf(__str, BPSL, plist, arg);\
+		seq_printf(seq, __str)
+	char __str[BPSL];
+
+	char tmp_str[BPSL];
+
+	memset(tmp_str, '\0', 256);
+	balance_args_to_str(bargs, tmp_str, prefix);
+	BTRFS_SEQ_PRINT3("%s\n", tmp_str);
+}
+
+static void balance_progress_to_str(struct btrfs_balance_progress *bstat, char *str)
+{
+	int ret=0;
+	ret = sprintf(str, "expected=%llu ", bstat->expected);
+	str=str+ret;
+	ret = sprintf(str, "considered=%llu ", bstat->considered);
+	str=str+ret;
+	ret = sprintf(str, "completed=%llu ", bstat->completed);
+}
+
+void btrfs_print_fsinfo(struct seq_file *seq)
+{
+	/* Btrfs Procfs String Len */
+#define BTRFS_SEQ_PRINT2(plist, arg)\
+		snprintf(str, BPSL, plist, arg);\
+		seq_printf(seq, str)
+
+	char str[BPSL];
+	char b[BDEVNAME_SIZE];
+	struct list_head *cur_uuid;
+	struct btrfs_fs_info *fs_info;
+	struct btrfs_fs_devices *fs_devices;
+	struct list_head *fs_uuids = btrfs_get_fs_uuids();
+
+	seq_printf(seq, "\n#Its for debugging and experimental only, parameters may change without notice.\n\n");
+
+	list_for_each(cur_uuid, fs_uuids) {
+		char fs_str[256] = {0};
+		fs_devices  = list_entry(cur_uuid, struct btrfs_fs_devices, fs_list);
+		fs_info = fs_devices->fs_info;
+		if (!fs_info)
+			continue;
+
+		BTRFS_SEQ_PRINT2("[fsid: %pU]\n", fs_devices->fsid);
+		BTRFS_SEQ_PRINT2("\tsb->s_bdev:\t\t%s\n",
+				fs_info->sb->s_bdev ?
+				bdevname(fs_info->sb->s_bdev, b):
+				"null");
+		BTRFS_SEQ_PRINT2("\tlatest_bdev:\t\t%s\n",
+				fs_devices->latest_bdev ?
+				bdevname(fs_devices->latest_bdev, b):
+				"null");
+
+		fs_state_to_str(fs_info, fs_str);
+		BTRFS_SEQ_PRINT2("\tfs_state:\t\t%s\n", fs_str);
+
+		memset(fs_str, '\0', 256);
+		fs_flags_to_str(fs_info, fs_str);
+		BTRFS_SEQ_PRINT2("\tfs_flags:\t\t%s\n", fs_str);
+
+		BTRFS_SEQ_PRINT2("\tsuper_copy->flags\t0x%llx\n",
+				fs_info->super_copy->flags);
+		BTRFS_SEQ_PRINT2("\tsuper_for_commit->flags\t0x%llx\n",
+				fs_info->super_for_commit->flags);
+		BTRFS_SEQ_PRINT2("\tnodesize\t\t%u\n", fs_info->nodesize);
+		BTRFS_SEQ_PRINT2("\tsectorsize\t\t%u\n", fs_info->sectorsize);
+
+		if (fs_info->balance_ctl) {
+			memset(fs_str, '\0', 256);
+			balance_ctl_flags_to_str(fs_info->balance_ctl, fs_str);
+			BTRFS_SEQ_PRINT2("\tbalance_control\t\t%s\n", fs_str);
+
+			if (fs_info->balance_ctl->flags & BTRFS_BALANCE_DATA)
+				print_balance_args(&fs_info->balance_ctl->data, "data", seq);
+			if (fs_info->balance_ctl->flags & BTRFS_BALANCE_METADATA)
+				print_balance_args(&fs_info->balance_ctl->meta, "meta", seq);
+			if (fs_info->balance_ctl->flags & BTRFS_BALANCE_SYSTEM)
+				print_balance_args(&fs_info->balance_ctl->sys, "sys", seq);
+
+			memset(fs_str, '\0', 256);
+			balance_progress_to_str(&fs_info->balance_ctl->stat, fs_str);
+			BTRFS_SEQ_PRINT2("\tbalance_progress\t%s\n", fs_str);
+
+		} else {
+			BTRFS_SEQ_PRINT2("\tbalance_control\t\t%s\n", "null");
+		}
+
+		BTRFS_SEQ_PRINT2("\tdev_replace.replace_state\t\t%llu\n",
+				 fs_info->dev_replace.replace_state);
+		BTRFS_SEQ_PRINT2("\tdev_replace.time start\t\t%lld\n",
+				 fs_info->dev_replace.time_started);
+		BTRFS_SEQ_PRINT2("\tdev_replace.time stopped\t%lld\n",
+				 fs_info->dev_replace.time_stopped);
+		BTRFS_SEQ_PRINT2("\tdev_replace.cursor_left\t\t%llu\n",
+				 fs_info->dev_replace.cursor_left);
+		BTRFS_SEQ_PRINT2("\tdev_replace.committed_cursor_left\t%llu\n",
+				 fs_info->dev_replace.committed_cursor_left);
+		BTRFS_SEQ_PRINT2("\tdev_replace.cursor_left_last_write_of_item\t%llu\n",
+				 fs_info->dev_replace.cursor_left_last_write_of_item);
+		BTRFS_SEQ_PRINT2("\tdev_replace.cursor_right\t\t%llu\n",
+				 fs_info->dev_replace.cursor_right);
+		BTRFS_SEQ_PRINT2("\tdev_replace.cont_reading_from_srcdev_mode\t%llu\n",
+				 fs_info->dev_replace.cont_reading_from_srcdev_mode);
+		BTRFS_SEQ_PRINT2("\tdev_replace.is_valid\t\t\t%d\n",
+				 fs_info->dev_replace.is_valid);
+		BTRFS_SEQ_PRINT2("\tdev_replace.item_needs_writeback\t%d\n",
+				 fs_info->dev_replace.item_needs_writeback);
+		BTRFS_SEQ_PRINT2("\tdev_replace.srcdev\t\t\t%p\n",
+				 fs_info->dev_replace.srcdev);
+		BTRFS_SEQ_PRINT2("\tdev_replace.tgtdev\t\t\t%p\n",
+				 fs_info->dev_replace.tgtdev);
+		BTRFS_SEQ_PRINT2("\tdev_replace.bio_counter.count\t\t%llu\n",
+				 fs_info->dev_replace.bio_counter.count);
+	}
+}
+
+static void dev_state_to_str(struct btrfs_device *device, char *dev_state_str)
+{
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
+		strcat(dev_state_str, "|WRITEABLE");
+	if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state))
+		strcat(dev_state_str, "|IN_FS_METADATA");
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
+		strcat(dev_state_str, "|MISSING");
+	if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
+		strcat(dev_state_str, "|REPLACE_TGT");
+	if (test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state))
+		strcat(dev_state_str, "|FLUSH_SENT");
+#ifdef READPOLICY
+	if (test_bit(BTRFS_DEV_STATE_RD_PREFERRED, &device->dev_state))
+		strcat(dev_state_str, "|RD_PREFERRED");
+#endif
+	if (device->dev_stats_valid)
+		strcat(dev_state_str, "|dev_stats_valid");
+}
+
+#ifdef VOL_FLAGS
+static void vol_flags_to_str(struct btrfs_fs_devices *fs_devices, char *vol_flags)
+{
+	if (test_bit(BTRFS_VOL_FLAG_ROTATING, &fs_devices->vol_flags))
+		strcat(vol_flags, "|ROTATING");
+	if (test_bit(BTRFS_VOL_FLAG_SEEDING, &fs_devices->vol_flags))
+		strcat(vol_flags, "|SEEDING");
+	if (test_bit(BTRFS_VOL_FLAG_EXCL_OPS, &fs_devices->vol_flags))
+		strcat(vol_flags, "|EXCL_OPS");
+}
+#endif
+
+void btrfs_print_devlist(struct seq_file *seq, struct btrfs_fs_devices *the_fs_devices)
+{
+/* Btrfs Procfs String Len */
+#define BTRFS_SEQ_PRINT(plist, arg)\
+		snprintf(str, BPSL, plist, arg);\
+		if (sprt) {\
+			if (seq) {\
+				seq_printf(seq, "\t");\
+			}\
+		}\
+		if (seq) {\
+			seq_printf(seq, str);\
+		} else {\
+			printk("boilerplate: %s", str);\
+		}
+
+	char str[BPSL];
+	struct btrfs_device *device;
+	struct btrfs_fs_devices *fs_devices;
+	struct btrfs_fs_devices *cur_fs_devices;
+	struct btrfs_fs_devices *sprt; //sprout fs devices
+	struct list_head *fs_uuids = btrfs_get_fs_uuids();
+	struct list_head *cur_uuid;
+
+	if (seq)
+		seq_printf(seq, "\n#Its for debugging and experimental only, parameters may change without notice.\n\n");
+
+	/* Todo: there must be better way than nested locks */
+	list_for_each(cur_uuid, fs_uuids) {
+#ifdef VOL_FLAGS
+		char vol_flags[256] = {0};
+#endif
+		cur_fs_devices  = list_entry(cur_uuid, struct btrfs_fs_devices, fs_list);
+
+		mutex_lock(&cur_fs_devices->device_list_mutex);
+
+		fs_devices = cur_fs_devices;
+		sprt = NULL;
+
+again_fs_devs:
+		if (the_fs_devices && the_fs_devices != cur_fs_devices)
+			goto skip;
+
+		if (sprt) {
+			BTRFS_SEQ_PRINT("[[seed_fsid: %pU]]\n", fs_devices->fsid);
+			BTRFS_SEQ_PRINT("\tsprout_fsid:\t\t%pU\n", sprt->fsid);
+		} else {
+			BTRFS_SEQ_PRINT("[fsid: %pU]\n", fs_devices->fsid);
+		}
+		if (fs_devices->seed) {
+			BTRFS_SEQ_PRINT("\tseed_fsid:\t\t%pU\n", fs_devices->seed->fsid);
+		}
+		BTRFS_SEQ_PRINT("\tmetadata_uuid:\t\t%pU\n", fs_devices->metadata_uuid);
+		BTRFS_SEQ_PRINT("\tfs_devs_addr:\t\t%p\n", fs_devices);
+		BTRFS_SEQ_PRINT("\tnum_devices:\t\t%llu\n", fs_devices->num_devices);
+		BTRFS_SEQ_PRINT("\topen_devices:\t\t%llu\n", fs_devices->open_devices);
+		BTRFS_SEQ_PRINT("\trw_devices:\t\t%llu\n", fs_devices->rw_devices);
+		BTRFS_SEQ_PRINT("\tmissing_devices:\t%llu\n", fs_devices->missing_devices);
+		BTRFS_SEQ_PRINT("\ttotal_rw_bytes:\t\t%llu\n", fs_devices->total_rw_bytes);
+		BTRFS_SEQ_PRINT("\ttotal_devices:\t\t%llu\n", fs_devices->total_devices);
+		BTRFS_SEQ_PRINT("\topened:\t\t\t%d\n", fs_devices->opened);
+#ifdef VOL_FLAGS
+		vol_flags_to_str(fs_devices, vol_flags);
+		BTRFS_SEQ_PRINT("\tvol_flags:\t\t%s\n", vol_flags);
+#else
+		BTRFS_SEQ_PRINT("\tseeding:\t\t%d\n", fs_devices->seeding);
+		BTRFS_SEQ_PRINT("\trotating:\t\t%d\n", fs_devices->rotating);
+#endif
+		BTRFS_SEQ_PRINT("\tfsid_kobj_state:\t%d\n", fs_devices->fsid_kobj.state_initialized);
+		BTRFS_SEQ_PRINT("\tfsid_kobj_insysfs:\t%d\n", fs_devices->fsid_kobj.state_in_sysfs);
+
+		if (fs_devices->devices_kobj) {
+		BTRFS_SEQ_PRINT("\tkobj_state:\t\t%d\n", fs_devices->devices_kobj->state_initialized);
+		BTRFS_SEQ_PRINT("\tkobj_insysfs:\t\t%d\n", fs_devices->devices_kobj->state_in_sysfs);
+		} else {
+		BTRFS_SEQ_PRINT("\tkobj_state:\t\t%s\n", "null");
+		BTRFS_SEQ_PRINT("\tkobj_insysfs:\t\t%s\n", "null");
+		}
+
+#ifdef READPOLICY
+		switch (fs_devices->read_policy) {
+		case BTRFS_READ_POLICY_PID:
+			BTRFS_SEQ_PRINT2("\tread_policy\t\t%s\n", "BTRFS_READ_POLICY_PID:");
+			break;
+		case BTRFS_READ_POLICY_DEVICE:
+			list_for_each_entry(device, &fs_devices->devices, dev_list) {
+				if (test_bit(BTRFS_DEV_STATE_RD_PREFERRED, &device->dev_state)) {
+					BTRFS_SEQ_PRINT("%llu ", device->devid);
+				}
+			}
+			BTRFS_SEQ_PRINT("%s\n", " ");
+			break;
+		default:
+			BTRFS_SEQ_PRINT2("\tread_policy\t%s\n", "unknown\n");
+		}
+#endif
+		list_for_each_entry(device, &fs_devices->devices, dev_list) {
+			char dev_state_str[256] = {0};
+
+			BTRFS_SEQ_PRINT("\t[[UUID: %pU]]\n", device->uuid);
+			BTRFS_SEQ_PRINT("\t\tdev_addr:\t%p\n", device);
+			rcu_read_lock();
+			BTRFS_SEQ_PRINT("\t\tdevice:\t\t%s\n",
+				device->name ? rcu_str_deref(device->name): "(null)");
+			rcu_read_unlock();
+			BTRFS_SEQ_PRINT("\t\tdevid:\t\t%llu\n", device->devid);
+			BTRFS_SEQ_PRINT("\t\tgeneration:\t%llu\n", device->generation);
+			BTRFS_SEQ_PRINT("\t\ttotal_bytes:\t%llu\n", device->total_bytes);
+			BTRFS_SEQ_PRINT("\t\tdev_totalbytes:\t%llu\n", device->disk_total_bytes);
+			BTRFS_SEQ_PRINT("\t\tbytes_used:\t%llu\n", device->bytes_used);
+			BTRFS_SEQ_PRINT("\t\ttype:\t\t%llu\n", device->type);
+			BTRFS_SEQ_PRINT("\t\tio_align:\t%u\n", device->io_align);
+			BTRFS_SEQ_PRINT("\t\tio_width:\t%u\n", device->io_width);
+			BTRFS_SEQ_PRINT("\t\tsector_size:\t%u\n", device->sector_size);
+			BTRFS_SEQ_PRINT("\t\tmode:\t\t0x%llx\n", (u64)device->mode);
+			dev_state_to_str(device, dev_state_str);
+			if (strlen(dev_state_str) == 0) {
+			BTRFS_SEQ_PRINT("\t\tdev_state:\t0x%lx\n", device->dev_state);
+			} else {
+			BTRFS_SEQ_PRINT("\t\tdev_state:\t%s\n", dev_state_str);
+			}
+			BTRFS_SEQ_PRINT("\t\tbdev:\t\t%s\n", device->bdev ? "not_null":"null");
+			if (device->bdev) {
+			struct backing_dev_info *bdi = device->bdev->bd_bdi;
+			BTRFS_SEQ_PRINT("\t\tbdi:\t\t%s\n", bdi ? "not_null": "null");
+			if (bdi) {
+			struct bdi_writeback *wb = &bdi->wb;
+			BTRFS_SEQ_PRINT("\t\twb:\t\t%s\n", wb ? "not_null": "null");
+			if (wb) {
+			BTRFS_SEQ_PRINT("\t\twb congested state:\t%lx\n", wb->congested->state);
+			}
+			}
+			}
+		}
+
+#ifdef USE_ALLOC_LIST
+		/* print device from the alloc_list */
+		list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
+			char dev_state_str[256] = {0};
+
+			BTRFS_SEQ_PRINT("\t[[uuid: %pU]]\n", device->uuid);
+			BTRFS_SEQ_PRINT("\t\tdev_addr:\t%p\n", device);
+			rcu_read_lock();
+			BTRFS_SEQ_PRINT("\t\tdevice:\t\t%s\n",
+				device->name ? rcu_str_deref(device->name): "(null)");
+			rcu_read_unlock();
+			BTRFS_SEQ_PRINT("\t\tdevid:\t\t%llu\n", device->devid);
+			BTRFS_SEQ_PRINT("\t\tgeneration:\t%llu\n", device->generation);
+			BTRFS_SEQ_PRINT("\t\ttotal_bytes:\t%llu\n", device->total_bytes);
+			BTRFS_SEQ_PRINT("\t\tdev_totalbytes:\t%llu\n", device->disk_total_bytes);
+			BTRFS_SEQ_PRINT("\t\tbytes_used:\t%llu\n", device->bytes_used);
+			BTRFS_SEQ_PRINT("\t\ttype:\t\t%llu\n", device->type);
+			BTRFS_SEQ_PRINT("\t\tio_align:\t%u\n", device->io_align);
+			BTRFS_SEQ_PRINT("\t\tio_width:\t%u\n", device->io_width);
+			BTRFS_SEQ_PRINT("\t\tsector_size:\t%u\n", device->sector_size);
+			BTRFS_SEQ_PRINT("\t\tmode:\t\t0x%llx\n", (u64)device->mode);
+			dev_state_to_str(device, dev_state_str);
+			if (strlen(dev_state_str) == 0) {
+			BTRFS_SEQ_PRINT("\t\tdev_state:\t0x%lx\n", device->dev_state);
+			} else {
+			BTRFS_SEQ_PRINT("\t\tdev_state:\t%s\n", dev_state_str);
+			}
+			BTRFS_SEQ_PRINT("\t\tbdev:\t\t%s\n", device->bdev ? "not_null":"null");
+			if (device->bdev) {
+			struct backing_dev_info *bdi = device->bdev->bd_bdi;
+			BTRFS_SEQ_PRINT("\t\tbdi:\t\t%s\n", bdi ? "not_null": "null");
+			if (bdi) {
+			struct bdi_writeback *wb = &bdi->wb;
+			BTRFS_SEQ_PRINT("\t\twb:\t\t%s\n", wb ? "not_null": "null");
+			if (wb) {
+			BTRFS_SEQ_PRINT("\t\twb congested state:\t%lx\n", wb->congested->state);
+			}
+			}
+			}
+		}
+#endif
+skip:
+		if (fs_devices->seed) {
+			sprt = fs_devices;
+			fs_devices = fs_devices->seed;
+			goto again_fs_devs;
+		}
+		if (seq)
+			seq_printf(seq, "\n");
+
+		mutex_unlock(&cur_fs_devices->device_list_mutex);
+	}
+}
+
+static int btrfs_fsinfo_show(struct seq_file *seq, void *offset)
+{
+	btrfs_print_fsinfo(seq);
+	return 0;
+}
+
+static int btrfs_devlist_show(struct seq_file *seq, void *offset)
+{
+	btrfs_print_devlist(seq, NULL);
+	return 0;
+}
+
+static int btrfs_seq_fsinfo_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, btrfs_fsinfo_show, PDE_DATA(inode));
+}
+
+static int btrfs_seq_devlist_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, btrfs_devlist_show, PDE_DATA(inode));
+}
+
+#ifdef OLD_PROC
+static const struct file_operations btrfs_seq_devlist_fops = {
+	.owner   = THIS_MODULE,
+	.open    = btrfs_seq_devlist_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = single_release,
+};
+#else
+static const struct proc_ops btrfs_seq_devlist_fops = {
+	.proc_open    = btrfs_seq_devlist_open,
+	.proc_read    = seq_read,
+	.proc_lseek  = seq_lseek,
+	.proc_release = single_release,
+};
+#endif
+
+#ifdef OLD_PROC
+static const struct file_operations btrfs_seq_fsinfo_fops = {
+	.owner   = THIS_MODULE,
+	.open    = btrfs_seq_fsinfo_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = single_release,
+};
+#else
+static const struct proc_ops btrfs_seq_fsinfo_fops = {
+	.proc_open    = btrfs_seq_fsinfo_open,
+	.proc_read    = seq_read,
+	.proc_lseek  = seq_lseek,
+	.proc_release = single_release,
+};
+#endif
+
+void btrfs_init_procfs(void)
+{
+	btrfs_proc_root = proc_mkdir(BTRFS_PROC_PATH, NULL);
+	if (btrfs_proc_root) {
+		proc_create_data(BTRFS_PROC_DEVLIST, S_IRUGO, btrfs_proc_root,
+					&btrfs_seq_devlist_fops, NULL);
+		proc_create_data(BTRFS_PROC_FSINFO, S_IRUGO, btrfs_proc_root,
+					&btrfs_seq_fsinfo_fops, NULL);
+	}
+	return;
+}
+
+void btrfs_exit_procfs(void)
+{
+	if (btrfs_proc_root) {
+		remove_proc_entry(BTRFS_PROC_DEVLIST, btrfs_proc_root);
+		remove_proc_entry(BTRFS_PROC_FSINFO, btrfs_proc_root);
+	}
+	remove_proc_entry(BTRFS_PROC_PATH, NULL);
+}
diff --git a/fs/btrfs/procfs.h b/fs/btrfs/procfs.h
new file mode 100644
index 000000000000..f7b712b58a5d
--- /dev/null
+++ b/fs/btrfs/procfs.h
@@ -0,0 +1,2 @@
+void btrfs_exit_procfs(void);
+void btrfs_init_procfs(void);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 67c63858812a..c9dfceff6aa4 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -47,7 +47,7 @@
 #include "tests/btrfs-tests.h"
 #include "block-group.h"
 #include "discard.h"
-
+#include "procfs.h"
 #include "qgroup.h"
 #define CREATE_TRACE_POINTS
 #include <trace/events/btrfs.h>
@@ -2393,6 +2393,8 @@ static int __init init_btrfs_fs(void)
 	if (err)
 		return err;
 
+	btrfs_init_procfs();
+
 	btrfs_init_compress();
 
 	err = btrfs_init_cachep();
@@ -2477,6 +2479,7 @@ static int __init init_btrfs_fs(void)
 	btrfs_destroy_cachep();
 free_compress:
 	btrfs_exit_compress();
+	btrfs_exit_procfs();
 	btrfs_exit_sysfs();
 
 	return err;
@@ -2496,6 +2499,7 @@ static void __exit exit_btrfs_fs(void)
 	btrfs_interface_exit();
 	btrfs_end_io_wq_exit();
 	unregister_filesystem(&btrfs_fs_type);
+	btrfs_exit_procfs();
 	btrfs_exit_sysfs();
 	btrfs_cleanup_fs_uuids();
 	btrfs_exit_compress();
-- 
2.23.0



* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-20 14:56                 ` Marc MERLIN
@ 2020-04-21  7:33                   ` Anand Jain
  2020-04-22  5:54                     ` Marc MERLIN
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2020-04-21  7:33 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

Marc,

  Could you please use the kernel patch (sent to the list or at
  git@github.com:asj/btrfs-boilerplate.git boilerplate-v5.6); it can dump
  the btrfs kernel device_list into user space using procfs. (This
  patch is only for debugging.)

  I tested (steps below) whether there is any availability issue
  (that is, one requiring a reboot), and I am unable to reproduce it.
  When it happens again at your end, this insight into the kernel
  might shed some more light on the issue.

--------------------------------
$ fillfs /btrfs 10000
$ devmgt detach /dev/sda

[65985.636630] BTRFS: error (device sda) in btrfs_commit_transaction:2345: errno=-5 IO failure (Error while writing out transaction)
[65985.636631] BTRFS info (device sda): forced readonly
[65985.636633] BTRFS warning (device sda): Skipping commit of aborted transaction.
[65985.636634] BTRFS: error (device sda) in cleanup_transaction:1894: errno=-5 IO failure
[65985.636636] BTRFS info (device sda): delayed_refs has NO entry

$ devmgt attach host0

[66501.910237] BTRFS warning (device sda): duplicate device fsid:devid for 8cc98c45-1a11-4a30-bca8-9760c246ccb4:1 old:/dev/sda new:/dev/sdb

$ btrfs fi show -m
Label: none  uuid: 8cc98c45-1a11-4a30-bca8-9760c246ccb4
	Total devices 1 FS bytes used 16.06MiB
	*** Some devices missing

The -m option above reads the device path from the kernel, which does
report /dev/sda. User space then checks whether that path is accessible,
and since it is not, the device is reported as missing.

$ cat /proc/fs/btrfs/devlist
::
		device:		/dev/sda
::
		generation:	10
::
		dev_state:	|WRITEABLE|IN_FS_METADATA|dev_stats_valid
		bdev:		not_null

$ mount /dev/sdb /btrfs1
mount: /btrfs1: mount(2) system call failed: File exists.

The above mount fails because we find the same fs signature on both
/dev/sda (stale) and /dev/sdb and, further, the generation number on
both of these devices is the same.

$ btrfs in dump-super /dev/sdb | grep ^generation
generation		10

$ btrfs dev scan --forget
$ cat /proc/fs/btrfs/devlist
::
		device:		/dev/sda

The --forget option can't clean up the device because it's still mounted.

$ umount /btrfs
$ cat /proc/fs/btrfs/devlist | egrep 'device:|bdev'
		device:		/dev/sda
		bdev:		null

The unmount is successful and bdev is null. Now --forget should work.

$ btrfs dev scan --forget
$ cat /proc/fs/btrfs/devlist | egrep 'device:|bdev'
$

Now that there isn't any stale device in the kernel, the mount will be
successful.

$ mount /dev/sdb /btrfs
$ cat /proc/fs/btrfs/devlist | egrep 'device:|bdev'
		device:		/dev/sdb
		bdev:		not_null

So a reboot was not required.
---------------------
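
The "is the stale device forgettable yet?" check from the run above can be
sketched as a small shell helper. This is only an illustration and not part
of the patch: the function name stale_btrfs_devices is made up here, and the
parsing assumes the devlist layout visible in the excerpts above (a "device:"
line followed later by a "bdev:" line for each device). The devlist file
itself exists only with the debugging patch applied.

```shell
#!/bin/sh
# Hypothetical helper: read devlist-style text on stdin and print the
# path of every device whose "bdev:" field is "null".
# Assumption: each device record has a "device:" line before its "bdev:" line.
stale_btrfs_devices() {
    awk '
        $1 == "device:" { dev = $2 }
        $1 == "bdev:" {
            if ($2 == "null" && dev != "")
                print dev
            dev = ""    # record finished, wait for the next "device:" line
        }
    '
}
```

With the patch applied, `stale_btrfs_devices < /proc/fs/btrfs/devlist` would
be expected to print /dev/sda after the umount step above (bdev is null
there), at which point `btrfs dev scan --forget` should be able to drop it.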

Thanks, Anand

On 4/20/20 10:56 PM, Marc MERLIN wrote:
> On Mon, Apr 20, 2020 at 07:10:24PM +0800, Anand Jain wrote:
>> Are the steps below in chronological order?
>   
> That is my recollection, yes.
> 
>>   Before and after --forget command
>>      btrfs fi show -m
>>   could have told us what devices are still mounted.
>   
> Oh, I didn't know about this. If/when it happens next, I'll
> run this to show btrfs' understanding of what's mounted instead of
> the kernel's understanding (/proc/self/mounts)
> 
>> I will send boilerplate code to dump the device list from the kernel; it will
>> help to debug. As of now the boilerplate code which I have been using is
>> too localized and needs a lot of cleanup, so it will take some time.
> 
> Sounds good.
>   
> Thanks,
> Marc
> 


* Re: 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for
  2020-04-21  7:33                   ` Anand Jain
@ 2020-04-22  5:54                     ` Marc MERLIN
  0 siblings, 0 replies; 15+ messages in thread
From: Marc MERLIN @ 2020-04-22  5:54 UTC (permalink / raw)
  To: Anand Jain; +Cc: Nikolay Borisov, dsterba, linux-btrfs, kernel-team

On Tue, Apr 21, 2020 at 03:33:24PM +0800, Anand Jain wrote:
> 
> Marc,
> 
>  Could you please use the kernel patch (sent to the list or at
>  git@github.com:asj/btrfs-boilerplate.git boilerplate-v5.6) it can dump

Thanks.
Since it might not be super obvious to someone else how to get it, 
the actual patch is:
https://github.com/asj/btrfs-boilerplate/commit/f6874237fbe66c281b5bf84ead41983fb43ed9b9

or wget https://github.com/asj/btrfs-boilerplate/commit/f6874237fbe66c281b5bf84ead41983fb43ed9b9.patch

> $ cat /proc/fs/btrfs/devlist

I patched this in my 5.6 kernel, installed, rebooted, and I can confirm
it's working.

I'll report back in due time :)
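
A sketch of what could be captured when the problem recurs, based on what
was asked for earlier in the thread (btrfs fi show -m, the devlist file, and
the mount table). The function name collect_btrfs_state and its optional
path argument are assumptions for illustration only, and
/proc/fs/btrfs/devlist is present only with the debugging patch.

```shell
#!/bin/sh
# Hypothetical helper: print btrfs' view of mounted devices, the debug
# devlist (if present), and the btrfs lines from the mount table.
# $1 (optional): path of the devlist file, default /proc/fs/btrfs/devlist.
collect_btrfs_state() {
    cbs_devlist=${1:-/proc/fs/btrfs/devlist}

    echo "== btrfs filesystem show -m =="
    if command -v btrfs >/dev/null 2>&1; then
        btrfs filesystem show -m 2>&1 || true
    else
        echo "(btrfs command not found)"
    fi

    echo "== $cbs_devlist =="
    if [ -r "$cbs_devlist" ]; then
        cat "$cbs_devlist"
    else
        echo "(not readable; is the debugging patch applied?)"
    fi

    echo "== btrfs entries in /proc/self/mounts =="
    grep ' btrfs ' /proc/self/mounts 2>/dev/null || echo "(none)"
}
```

Running `collect_btrfs_state > btrfs-state.txt` once before and once after
`btrfs dev scan --forget` would give the two snapshots to compare.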

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

Thread overview: 15+ messages
2020-03-21 20:23 5.4.20: cannot mount device that blipped off the bus: duplicate device fsid:devid for Marc MERLIN
2020-03-21 21:25 ` Nikolay Borisov
2020-03-25 20:14   ` Marc MERLIN
2020-03-25 23:56     ` Anand Jain
2020-03-26  1:30       ` Marc MERLIN
2020-03-26  3:33         ` Anand Jain
2020-03-26  4:26           ` Marc MERLIN
2020-04-14  0:38             ` Marc MERLIN
2020-04-16 10:43               ` Anand Jain
2020-04-19 19:13                 ` Marc MERLIN
2020-04-20 11:10               ` Anand Jain
2020-04-20 14:56                 ` Marc MERLIN
2020-04-21  7:33                   ` Anand Jain
2020-04-22  5:54                     ` Marc MERLIN
2020-04-21  7:21 ` [PATCH] btrfs: boilerplate: devlist and fsinfo Anand Jain
