* Custom driver FS brokenness at 4GB?
From: Rob Harris @ 2015-05-27 13:56 UTC (permalink / raw)
To: linux-ext4
Greetings. I have an odd issue and need some ideas of where to go next
-- I'm out of hair to rip out.
I'm writing a custom block device driver for some custom RAID
hardware (>32TB) using DMA scatter-gather. The device has no partitions,
and I'm using make_request() to service all the BIO requests to simplify
debugging. The driver works to the point where running dd against
the block device seems fine (I'm setting iflag=direct / oflag=direct to
ensure it's really hitting the disk). I also have the blk_queue set to
allow only a single 4k I/O per BIO (again, to simplify debugging for now).
Also for debugging, I have a mutex wrapping the entire make_request
call to ensure that only a single request is being serviced at a time.
So this should be as "simple" an environment as I can make to debug
this problem.
Once the driver is loaded, when I try to create a file system (ext4, but
the same thing happens with xfs), some corruption seems to occur, but
only when I set the size of the block device to 4GB or more. For
instance, when I set the size to 4G, I can mkfs.ext4, but after
2 or 3 mount/umounts the FS refuses to mount anymore and the kernel log
complains that the journal is missing. I discovered this by running this
loop:
#!/bin/sh
# Grow the device by 64 MB per pass and cycle mkfs/mount/umount on it.
COUNT=4032
while true ; do
    figlet "${COUNT}"
    # Tear down the previous pass; errors on the first pass are ignored.
    ( umount /mnt ; rmmod smc ) || true
    modprobe smc capacity_in_mb="${COUNT}" debug=1
    mkfs.ext4 -m 0 /dev/smcd
    mount /dev/smcd /mnt
    cp count_512m.dat /mnt/test
    umount /mnt
    mount /dev/smcd /mnt
    umount /mnt
    mount /dev/smcd /mnt
    # Check that the copied file survived the remounts.
    cmp count_512m.dat /mnt/test
    umount /mnt
    mount /dev/smcd /mnt # ***
    sync
    umount /mnt
    mount /dev/smcd /mnt
    sleep 1
    umount /mnt
    COUNT=$(( COUNT + 64 ))
    sleep 1
done
Sometimes I'll get in the kernel log:
May 27 09:39:01 febtober kernel: [64547.304695] EXT4-fs (smcd): ext4_check_descriptors: Checksum for group 0 failed (7009!=0)
May 27 09:39:01 febtober kernel: [64547.305744] EXT4-fs (smcd): group descriptors corrupted!
Other times I'll get:
May 27 09:46:49 ryftone-smcdrv kernel: [65014.342850] EXT4-fs (smcd): no journal found
I've seen this loop fail as early as COUNT=4096, but as late as
COUNT=4220; removing the sync changes the behavior.
When it fails, it usually does so on the 3rd mount (***).
FYI, I effectively call: set_capacity( disk, capacity_in_mb * 2048 );
( 2048 sectors * 512 bytes per kernel sector = 1M )
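
The unit arithmetic can be checked with a short sketch. set_capacity() counts 512-byte sectors regardless of the logical or physical block size set on the queue, so one megabyte is 2048 sectors. The helper names mb_to_sectors and mb_to_bytes are hypothetical and not part of the smc driver, and the sketch assumes the shell does 64-bit arithmetic.

```sh
#!/bin/sh
# Hypothetical helpers, not taken from the smc driver.
# set_capacity() takes a count of 512-byte sectors, so 1 MB = 2048 sectors.
# Assumes 64-bit shell arithmetic.

mb_to_sectors() {
    echo $(( $1 * 2048 ))
}

mb_to_bytes() {
    echo $(( $1 * 2048 * 512 ))
}

# The loop's starting size, the 4G case and the 16G case from this message.
for mb in 4032 4096 16384; do
    echo "capacity_in_mb=$mb sectors=$(mb_to_sectors "$mb") bytes=$(mb_to_bytes "$mb")"
done
```

At capacity_in_mb=4096 the byte count is 4294967296 (2^32), the first value that no longer fits in an unsigned 32-bit integer, while the sector count (8388608) is still far below that limit.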
Another example: if I set the capacity of the disk to 16G, I can run
mkfs.ext4 but the first mount fails and I see:
May 27 09:07:27 febtober kernel: [62653.269387] EXT4-fs (smcd): ext4_check_descriptors: Block bitmap for group 0 not in group (block 4294967295)!
But, again, if I set the size below 4G, everything seems fine. I can
currently read and write across that 4G boundary with dd without issue --
it's ONLY the filesystem accesses. My gut is screaming that there's a
32/64-bit overflow condition somewhere, but for the life of me I can't
find it. Is there something I need to set to tell the block layer I have
a 64-bit addressable device? set_capacity always takes the number of
LINUX KERNEL sectors (not what I set
blk_queue_logical|physical_block_size to), correct?
I'm currently on 3.16.0 (Ubuntu 14.04.2 LTS) if it matters.
Any help/pointers would be greatly appreciated.
--Rob Harris
* Re: Custom driver FS brokenness at 4GB?
From: Jan Kara @ 2015-05-28 10:59 UTC (permalink / raw)
To: Rob Harris; +Cc: linux-ext4
On Wed 27-05-15 09:56:29, Rob Harris wrote:
> Once the driver is loaded, when I try to create a file system (ext4,
> but the same thing happens with xfs), some corruption seems to occur,
> but only when I set the size of the block device to 4GB or more. For
> instance, when I set the size to 4G, I can mkfs.ext4, but after 2 or
> 3 mount/umounts the FS refuses to mount anymore and the kernel log
> complains that the journal is missing. I discovered this by running
> this loop:
Hard to tell exactly, but with 4GB being the 32-bit limit, I would first
look for some int / unsigned int overflow. You could probably debug this
better by using dd to write a pattern that is different for each block and
then verifying that each block indeed lands in the expected location...
Honza
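
A minimal sketch of that per-block pattern test follows. The helper names write_pattern and verify_pattern are hypothetical, the 4 KiB block size is taken from the single-4k-BIO setup described in the original message, and iflag=fullblock assumes GNU dd. Each block is stamped with its own block number, so a block that lands at the wrong address shows up as a stamp that does not match its position.

```sh
#!/bin/sh
# Hypothetical helpers: stamp each 4 KiB block with its own block number,
# then read the blocks back and compare. TARGET may be a block device or,
# for a dry run, a regular file.
BS=4096

# write_pattern TARGET FIRST COUNT
write_pattern() {
    wp_blk=$2
    wp_end=$(( $2 + $3 ))
    while [ "$wp_blk" -lt "$wp_end" ]; do
        # The stamp is left-justified and space-padded to exactly one
        # 4096-byte block (the width must match BS).
        printf '%-4096s' "block=$wp_blk" |
            dd of="$1" bs="$BS" seek="$wp_blk" count=1 \
               iflag=fullblock conv=notrunc 2>/dev/null || return 1
        wp_blk=$(( wp_blk + 1 ))
    done
}

# verify_pattern TARGET FIRST COUNT
# Prints each block whose stamp is wrong; returns non-zero if any is.
verify_pattern() {
    vp_blk=$2
    vp_end=$(( $2 + $3 ))
    vp_bad=0
    while [ "$vp_blk" -lt "$vp_end" ]; do
        # Strip the space padding (and any NUL bytes) to recover the stamp.
        vp_got=$(dd if="$1" bs="$BS" skip="$vp_blk" count=1 2>/dev/null |
                 tr -d ' \000')
        if [ "$vp_got" != "block=$vp_blk" ]; then
            echo "block $vp_blk holds '$vp_got'"
            vp_bad=1
        fi
        vp_blk=$(( vp_blk + 1 ))
    done
    return "$vp_bad"
}
```

For example, "write_pattern /dev/smcd 1048568 16" followed by "verify_pattern /dev/smcd 0 16" and "verify_pattern /dev/smcd 1048568 16" would cover the start of the device and a range straddling block 1048576 (4 GiB / 4 KiB). On the real device one would presumably also add oflag=direct to the write and iflag=direct to the read, as in the original dd tests.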
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: Custom driver FS brokenness at 4GB?
From: Andreas Dilger @ 2015-05-28 17:43 UTC (permalink / raw)
To: Jan Kara; +Cc: Rob Harris, linux-ext4
On May 28, 2015, at 4:59 AM, Jan Kara <jack@suse.cz> wrote:
> Hard to tell exactly, but with 4GB being the 32-bit limit, I would first
> look for some int / unsigned int overflow. You could probably debug this
> better by using dd to write a pattern that is different for each block and
> then verifying that each block indeed lands in the expected location...
We have a tool "llverdev" which does exactly this - write a pattern
to each block in the block device (or in sparse regions covering the
device) with a timestamp and block number to track down sources of
block addressing errors:
http://git.hpdd.intel.com/fs/lustre-release.git/blob/HEAD:/lustre/utils/llverdev.c
Cheers, Andreas
* Re: Custom driver FS brokenness at 4GB?
From: Rob Harris @ 2015-05-28 18:30 UTC (permalink / raw)
To: Andreas Dilger, Jan Kara; +Cc: linux-ext4
Thanks for the pointers, everyone. After further testing and code review,
I found that I was boneheadedly truncating a u64 to a u32 for the sector
address as part of a function signature with an obscured typedef.
*facepalm*
All seems well now. Thanks for the help!
-R
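
The effect of such a truncation can be illustrated with a 32-bit mask. This is only a sketch: truncate_u32 is a hypothetical helper, the shell is assumed to do 64-bit arithmetic, and the value is assumed to be a byte offset, since that is what crosses 2^32 at 4G. The message above says only "sector address" and does not give the unit.

```sh
#!/bin/sh
# Illustration only: mimic a u64 -> u32 truncation with a 32-bit mask.
# ASSUMPTION: the truncated value is treated as a byte offset; the thread
# does not state the unit. Assumes 64-bit shell arithmetic.

truncate_u32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

GIB4=$(( 4096 * 1024 * 1024 ))   # 2^32 bytes

# One 4k block below the boundary, the boundary itself, one block above.
for off in $(( GIB4 - 4096 )) "$GIB4" $(( GIB4 + 4096 )); do
    echo "offset=$off truncated=$(truncate_u32 "$off")"
done
```

Under that assumption, offsets at or past 4G wrap around to the start of the device, which is where ext4 keeps its superblock and group descriptors. That would be consistent with the reported errors. A dd write followed by a dd read of the same offset goes through the same truncation in both directions and reads back what it wrote, which may be why the dd tests passed.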