linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL] ext4 updates for 3.9
@ 2013-02-26 20:39 Theodore Ts'o
  2013-02-27 12:47 ` Markus Trippelsdorf
  0 siblings, 1 reply; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-26 20:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-ext4, linux-kernel


The following changes since commit 9931faca02c604c22335f5a935a501bb2ace6e20:

  Linux 3.8-rc3 (2013-01-09 18:59:55 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus

for you to fetch changes up to 304e220f0879198b1f5309ad6f0be862b4009491:

  ext4: fix free clusters calculation in bigalloc filesystem (2013-02-22 15:27:52 -0500)

----------------------------------------------------------------
The one new feature added in this patch series is the ability to use
the "punch hole" functionality for inodes that are not using extent
maps.

In the bug fix category, we fixed some races in the AIO and fstrim
code, and some potential NULL pointer dereferences and memory leaks in
error handling code paths.

In the optimization category, we fixed a performance regression in the
jbd2 layer introduced by commit d9b0193 (introduced in v3.0) which
shows up in the AIM7 benchmark.  We also further optimized jbd2 by
minimize the amount of time that transaction handles are held active.

This patch series also features some additional enhancement of the
extent status tree, which is now used to cache extent information in a
more efficient/compact form than what we use on-disk.

----------------------------------------------------------------
Akria Fujita (1):
      ext4: fix smatch warning in move_extent.c's mext_replace_branches()

Chen Gang (1):
      ext4: fix memory leak when quota options are specified multiple times

Cong Ding (1):
      ext4: reduce one "if" comparison in ext4_dirhash()

Eric Sandeen (1):
      jbd2: don't wake kjournald unnecessarily

Eryu Guan (2):
      ext4: check bh in ext4_read_block_bitmap()
      ext4: no need to remove extent if len is 0 in ext4_es_remove_extent()

Guo Chao (4):
      ext4: release buffer when checksum failed
      ext4: remove unused variable in add_dirent_to_buf()
      ext4: remove useless assignment in dx_probe()
      ext4: remove unnecessary NULL pointer check

Jan Kara (13):
      ext4: Always use ext4_bio_write_page() for writeout
      ext4: use redirty_page_for_writepage() in ext4_bio_write_page()
      ext4: remove __ext4_journalled_writepage() from mpage_da_submit_io()
      ext4: move work from io_end to inode
      ext4: simplify list handling in ext4_do_flush_completed_IO()
      ext4: dirty page has always buffers attached
      ext4: simplify mpage_add_bh_to_extent()
      ext4: Make ext4_bio_writepage() handle unprepared buffers
      ext4: fix ext4_writepage() to achieve data=ordered guarantees
      ext4: fix possible use-after-free with AIO
      ext4: move several mount options to standard handling loop
      ext4: make mount option parsing loop more logical
      ext4: print error when argument of inode_readahead_blk is invalid

Julia Lawall (1):
      ext4: use WARN in ext4_alloc_blocks

Lukas Czerner (4):
      ext4: remove unused variable flags
      ext4: remove explicit WARN_ON when ext4_map_blocks() fails
      ext4: fix xattr block allocation/release with bigalloc
      ext4: fix free clusters calculation in bigalloc filesystem

Niu Yawei (1):
      ext4: fix race in ext4_mb_add_n_trim()

Theodore Ts'o (24):
      ext4: return ENOMEM if sb_getblk() fails
      ext4: trigger the lazy inode table initialization after resize
      ext4: release sysfs kobject when failing to enable quotas on mount
      quota: autoload the quota_v2 module for QFMT_VFS_V1 quota format
      ext4: check incompatible mount options while mounting ext2/3
      ext4: optimize mballoc for large allocations
      jbd2: track request delay statistics
      jbd2: revert "jbd2: add COW fields to struct jbd2_journal_handle"
      jbd2: add tracepoints which provide per-handle statistics
      ext4: move the jbd2 wrapper functions out of super.c
      ext4: pass context information to jbd2__journal_start()
      ext4: grab page before starting transaction handle in write_begin()
      ext4: start handle at the last possible moment in ext4_unlink()
      ext4: start handle at the last possible moment in ext4_rmdir()
      ext4: fix the number of credits needed for ext4_ext_migrate()
      ext4: fix the number of credits needed for ext4_unlink() and ext4_rmdir()
      ext4: fix the number of credits needed for acl ops with inline data
      ext4: start handle at the last possible moment when creating inodes
      ext4: use module parameters instead of debugfs for mballoc_debug
      jbd2: use module parameters instead of debugfs for jbd_debug
      ext4: use KERN_WARNING for warning messages
      ext4: add debugging context for warning in ext4_da_update_reserve_space()
      ext4: refactor code to read directory blocks into ext4_read_dirblock()
      ext4: use ERR_PTR() abstraction for ext4_append()

Wang Shilong (1):
      ext4: use unlikely to improve the efficiency of the kernel

Zheng Liu (11):
      ext4: add tracepoint in punching hole
      ext4: add punching hole support for non-extent-mapped files
      ext4: refine extent status tree
      ext4: add physical block and status member into extent status tree
      ext4: rename and improbe ext4_es_find_extent()
      ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag
      ext4: track all extent status in extent status tree
      ext4: lookup block mapping in extent status tree
      ext4: remove single extent cache
      ext4: adjust some functions for reclaiming extents from extent status tree
      ext4: reclaim extents from extent status tree

 fs/ext4/acl.c               |   7 +-
 fs/ext4/balloc.c            |  13 +-
 fs/ext4/dir.c               |   1 +
 fs/ext4/ext4.h              | 123 ++++++--
 fs/ext4/ext4_extents.h      |   6 -
 fs/ext4/ext4_jbd2.c         | 102 +++++++
 fs/ext4/ext4_jbd2.h         |  51 +++-
 fs/ext4/extents.c           | 312 ++++++++-------------
 fs/ext4/extents_status.c    | 631 ++++++++++++++++++++++++++++++-----------
 fs/ext4/extents_status.h    |  83 +++++-
 fs/ext4/file.c              |  16 +-
 fs/ext4/hash.c              |   6 +-
 fs/ext4/ialloc.c            |  29 +-
 fs/ext4/indirect.c          | 259 ++++++++++++++++-
 fs/ext4/inline.c            |  12 +-
 fs/ext4/inode.c             | 664 ++++++++++++++++++++------------------------
 fs/ext4/ioctl.c             |  13 +-
 fs/ext4/mballoc.c           |  69 ++---
 fs/ext4/mballoc.h           |   4 +-
 fs/ext4/migrate.c           |  15 +-
 fs/ext4/mmp.c               |   4 +-
 fs/ext4/move_extent.c       |  10 +-
 fs/ext4/namei.c             | 497 +++++++++++++++------------------
 fs/ext4/page-io.c           |  85 ++----
 fs/ext4/resize.c            |  36 +--
 fs/ext4/super.c             | 485 ++++++++++++++------------------
 fs/ext4/xattr.c             |  23 +-
 fs/ext4/xattr.h             |  68 -----
 fs/jbd2/commit.c            |   8 +
 fs/jbd2/journal.c           |  66 ++---
 fs/jbd2/transaction.c       |  29 +-
 include/linux/jbd2.h        |  44 ++-
 include/linux/quota.h       |   1 +
 include/trace/events/ext4.h | 214 +++++++++++---
 include/trace/events/jbd2.h | 106 ++++++-
 35 files changed, 2408 insertions(+), 1684 deletions(-)


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-26 20:39 [GIT PULL] ext4 updates for 3.9 Theodore Ts'o
@ 2013-02-27 12:47 ` Markus Trippelsdorf
  2013-02-27 15:34   ` Theodore Ts'o
  0 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 12:47 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Linus Torvalds, linux-ext4, linux-kernel

On 2013.02.26 at 15:39 -0500, Theodore Ts'o wrote:
> 
> The following changes since commit 9931faca02c604c22335f5a935a501bb2ace6e20:
> 
>   Linux 3.8-rc3 (2013-01-09 18:59:55 -0800)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus

Just booted todays Linux tree and got the following errors:

...
Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
...
Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120

/dev/sda       ext4      1.4T  655G  651G  51% /var
/dev/sda on /var type ext4 (rw,noatime,data=ordered)

Running "fsck.ext4 -f /dev/sda" shows no problems.

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 12:47 ` Markus Trippelsdorf
@ 2013-02-27 15:34   ` Theodore Ts'o
  2013-02-27 15:44     ` Markus Trippelsdorf
  0 siblings, 1 reply; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 15:34 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Linus Torvalds, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
> Just booted todays Linux tree and got the following errors:
> 
> ...
> Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
> ...
> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120

Is this reproducible?  This looks like the in-memory copy of the
directory got corrupted.  This could be caused by a hardware error, or
a wild pointer, or a bug in the buffer cache code, etc.  Since there
are so many different possible causes of this kind of complaint, we
really need some kind of reproducible test case to do anything with this.

I did do a test compile of the ext4 tree with the latest linus.git
tree merged in, and ran a full set of repgression tests before I sent
my pull request.  Now, the regression tests take over 14 hours to run,
and there is a delay between when a maintainer sends the pull request
to when Linus acts on it --- so Linus almost certainly pulled in some
other trees betewen when I did my final regression testing and when I
sent the pull request and he pulled it into my tree.

I'll see if I can reproduce this on my end, on Linus's tree after the
ext4 tree was merged in, but at least in the past, this is the sort of
thing that is almost certainly caused by a hardware failure or bug
somewhere in the device driver, mm, or buffer cache, given that the
directory looks completely insane and a subsequent e2fsck -f didn't
discover any problem.

Is there anything special about your system?  How much memory do you
have?  What kind of device is /dev/sda?  What sort of workload did you
have running on your system before the failure?

Also, can you send us the output of "debugfs -R "stat <70647809>"
/dev/sda" so I can confirm that block 14164000 really is assigned to
inode 70647809?  The one potential cause of this error I can think of
that might be related to recent changes in ext4 is if the extent
status tree had the wrong logical-to-physical mapping cached for the
directory inode.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 15:34   ` Theodore Ts'o
@ 2013-02-27 15:44     ` Markus Trippelsdorf
  2013-02-27 17:01       ` Markus Trippelsdorf
  0 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 15:44 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, linux-ext4, linux-kernel

On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
> On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
> > Just booted todays Linux tree and got the following errors:
> > 
> > ...
> > Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
> > ...
> > Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
> > Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
> 
> Is this reproducible? 

Haven't checked yet, because I got scared.

> Is there anything special about your system?  How much memory do you
> have?  What kind of device is /dev/sda?  What sort of workload did you
> have running on your system before the failure?

There is nothing special about my system. /dev/sda is a standard SATA
drive: 
 Model Family:     Seagate Barracuda Green (AF)
 Device Model:     ST1500DL003-9VT16L
The error happens during boot.
I use ECC memory.

> Also, can you send us the output of "debugfs -R "stat <70647809>"
> /dev/sda" so I can confirm that block 14164000 really is assigned to
> inode 70647809?  The one potential cause of this error I can think of
> that might be related to recent changes in ext4 is if the extent
> status tree had the wrong logical-to-physical mapping cached for the
> directory inode.

Inode: 70647809   Type: directory    Mode:  0755   Flags: 0x80000
Generation: 4138624941    Version: 0x00000000:0000000d
User:     0   Group:     0   Size: 4096
File ACL: 0    Directory ACL: 0
Links: 13   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x50ab80de:2c77d5bc -- Tue Nov 20 14:08:46 2012
 atime: 0x50aa5519:c556886c -- Mon Nov 19 16:49:45 2012
 mtime: 0x50ab80de:2c77d5bc -- Tue Nov 20 14:08:46 2012
crtime: 0x50aa4a81:06bc6fd0 -- Mon Nov 19 16:04:33 2012
Size of extra inode fields: 28
EXTENTS:
(0):282599456

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 15:44     ` Markus Trippelsdorf
@ 2013-02-27 17:01       ` Markus Trippelsdorf
  2013-02-27 17:10         ` gnehzuil.liu
  0 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 17:01 UTC (permalink / raw)
  To: Theodore Ts'o, Zheng Liu, Linus Torvalds, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

On 2013.02.27 at 16:44 +0100, Markus Trippelsdorf wrote:
> On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
> > On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
> > > Just booted todays Linux tree and got the following errors:
> > > 
> > > ...
> > > Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
> > > ...
> > > Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
> > > Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
> > 
> > Is this reproducible? 

The issue is caused by:

commit d100eef2440fea13e4f09e88b1c8bcbca64beb9f
Author: Zheng Liu <wenqing.lz@taobao.com>
Date:   Mon Feb 18 00:29:59 2013 -0500

    ext4: lookup block mapping in extent status tree

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:01       ` Markus Trippelsdorf
@ 2013-02-27 17:10         ` gnehzuil.liu
  2013-02-27 17:22           ` Markus Trippelsdorf
  0 siblings, 1 reply; 34+ messages in thread
From: gnehzuil.liu @ 2013-02-27 17:10 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Theodore Ts'o, Zheng Liu, Linus Torvalds, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

在 2013-2-28,上午1:01,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:

> On 2013.02.27 at 16:44 +0100, Markus Trippelsdorf wrote:
>> On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
>>> On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
>>>> Just booted todays Linux tree and got the following errors:
>>>> 
>>>> ...
>>>> Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
>>>> ...
>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: ignore#70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
>>> 
>>> Is this reproducible? 
> 
> The issue is caused by:
> 
> commit d100eef2440fea13e4f09e88b1c8bcbca64beb9f
> Author: Zheng Liu <wenqing.lz@taobao.com>
> Date:   Mon Feb 18 00:29:59 2013 -0500
> 
>    ext4: lookup block mapping in extent status tree

Hi Markus,

Thanks for the report.  I am very sorry about that.  Now I am trying to fix it.

Thanks,
                                                - Zheng

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:10         ` gnehzuil.liu
@ 2013-02-27 17:22           ` Markus Trippelsdorf
  2013-02-27 17:38             ` gnehzuil.liu
  0 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 17:22 UTC (permalink / raw)
  To: gnehzuil.liu
  Cc: Theodore Ts'o, Zheng Liu, Linus Torvalds, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

On 2013.02.28 at 01:10 +0800, gnehzuil.liu wrote:
> 在 2013-2-28,上午1:01,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:
> 
> > On 2013.02.27 at 16:44 +0100, Markus Trippelsdorf wrote:
> >> On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
> >>> On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
> >>>> Just booted todays Linux tree and got the following errors:
> >>>> 
> >>>> ...
> >>>> Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
> >>>> ...
> >>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
> >>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: ignore#70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
> >>> 
> >>> Is this reproducible? 
> > 
> > The issue is caused by:
> > 
> > commit d100eef2440fea13e4f09e88b1c8bcbca64beb9f
> > Author: Zheng Liu <wenqing.lz@taobao.com>
> > Date:   Mon Feb 18 00:29:59 2013 -0500
> > 
> >    ext4: lookup block mapping in extent status tree
> 
> Thanks for the report.  I am very sorry about that.  Now I am trying to fix it.
> 
Thank you.

Please note that if I run a kernel build with: 
"git reset --hard f7fec032aa782d3fd7e51fbdf08aa3a296c01500" 
(the commit before d100eef24) I get a different but similar error:

  EXT4-fs (sda): error count: 4
  EXT4-fs (sda): initial error at 1361983458: ext4_find_dest_de:1657: inode 70911401: block 15213579 
  EXT4-fs (sda): last error at 1361983663: ext4_find_dest_de:1658: inode 70647809: block 14164000

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:22           ` Markus Trippelsdorf
@ 2013-02-27 17:38             ` gnehzuil.liu
  2013-02-27 17:45               ` Markus Trippelsdorf
  0 siblings, 1 reply; 34+ messages in thread
From: gnehzuil.liu @ 2013-02-27 17:38 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Theodore Ts'o, Zheng Liu, Linus Torvalds, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

在 2013-2-28,上午1:22,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:

> On 2013.02.28 at 01:10 +0800, gnehzuil.liu wrote:
>> 在 2013-2-28,上午1:01,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:
>> 
>>> On 2013.02.27 at 16:44 +0100, Markus Trippelsdorf wrote:
>>>> On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
>>>>> On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
>>>>>> Just booted todays Linux tree and got the following errors:
>>>>>> 
>>>>>> ...
>>>>>> Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
>>>>>> ...
>>>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
>>>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: ignore#70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
>>>>> 
>>>>> Is this reproducible? 
>>> 
>>> The issue is caused by:
>>> 
>>> commit d100eef2440fea13e4f09e88b1c8bcbca64beb9f
>>> Author: Zheng Liu <wenqing.lz@taobao.com>
>>> Date:   Mon Feb 18 00:29:59 2013 -0500
>>> 
>>>   ext4: lookup block mapping in extent status tree
>> 
>> Thanks for the report.  I am very sorry about that.  Now I am trying to fix it.
>> 
> Thank you.
> 
> Please note that if I run a kernel build with: 
> "git reset --hard f7fec032aa782d3fd7e51fbdf08aa3a296c01500" 
> (the commit before d100eef24) I get a different but similar error:
> 
>  EXT4-fs (sda): error count: 4
>  EXT4-fs (sda): initial error at 1361983458: ext4_find_dest_de:1657: inode 70911401: block 15213579 
>  EXT4-fs (sda): last error at 1361983663: ext4_find_dest_de:1658: inode 70647809: block 14164000

Yup, thanks for the note.  Could you please revert to this commit a25a4e1a5d5dc0f97dddbca44e695c532d8228c1 (the commit before f7fec032)?  I suspect that commit f7fec032 is the root cause.  Thanks for your help.

Regards,
                                                - Zheng

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:38             ` gnehzuil.liu
@ 2013-02-27 17:45               ` Markus Trippelsdorf
  2013-02-27 17:52                 ` Linus Torvalds
  0 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 17:45 UTC (permalink / raw)
  To: gnehzuil.liu
  Cc: Theodore Ts'o, Zheng Liu, Linus Torvalds, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

On 2013.02.28 at 01:38 +0800, gnehzuil.liu wrote:
> 在 2013-2-28,上午1:22,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:
> 
> > On 2013.02.28 at 01:10 +0800, gnehzuil.liu wrote:
> >> 在 2013-2-28,上午1:01,Markus Trippelsdorf <markus@trippelsdorf.de> 写道:
> >> 
> >>> On 2013.02.27 at 16:44 +0100, Markus Trippelsdorf wrote:
> >>>> On 2013.02.27 at 10:34 -0500, Theodore Ts'o wrote:
> >>>>> On Wed, Feb 27, 2013 at 01:47:27PM +0100, Markus Trippelsdorf wrote:
> >>>>>> Just booted todays Linux tree and got the following errors:
> >>>>>> 
> >>>>>> ...
> >>>>>> Feb 27 13:33:31 x4 kernel: EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
> >>>>>> ...
> >>>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: inode #70647809: block 14164000: comm cupsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2569822761, rec_len=3837, name_len=1
> >>>>>> Feb 27 13:33:32 x4 kernel: EXT4-fs error (device sda): ext4_find_dest_de:1657: ignore#70911401: block 15213579: comm pdnsd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2000846358, rec_len=36782, name_len=120
> >>>>> 
> >>>>> Is this reproducible? 
> >>> 
> >>> The issue is caused by:
> >>> 
> >>> commit d100eef2440fea13e4f09e88b1c8bcbca64beb9f
> >>> Author: Zheng Liu <wenqing.lz@taobao.com>
> >>> Date:   Mon Feb 18 00:29:59 2013 -0500
> >>> 
> >>>   ext4: lookup block mapping in extent status tree
> >> 
> >> Thanks for the report.  I am very sorry about that.  Now I am trying to fix it.
> >> 
> > Thank you.
> > 
> > Please note that if I run a kernel build with: 
> > "git reset --hard f7fec032aa782d3fd7e51fbdf08aa3a296c01500" 
> > (the commit before d100eef24) I get a different but similar error:
> > 
> >  EXT4-fs (sda): error count: 4
> >  EXT4-fs (sda): initial error at 1361983458: ext4_find_dest_de:1657: inode 70911401: block 15213579 
> >  EXT4-fs (sda): last error at 1361983663: ext4_find_dest_de:1658: inode 70647809: block 14164000
> 
> Yup, thanks for the note.  Could you please revert to this commit
> a25a4e1a5d5dc0f97dddbca44e695c532d8228c1 (the commit before f7fec032)?
> I suspect that commit f7fec032 is the root cause.  Thanks for your
> help. 

git revert 06b0c886214a223dde7b21cbfc3008fd20a8ce16..74cd15cd02708c7188581f279f33a98b2ae8d322
fixes all issues...

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:45               ` Markus Trippelsdorf
@ 2013-02-27 17:52                 ` Linus Torvalds
  2013-02-27 18:49                   ` Theodore Ts'o
  0 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2013-02-27 17:52 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: gnehzuil.liu, Theodore Ts'o, Zheng Liu, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 9:45 AM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
>
> git revert 06b0c886214a223dde7b21cbfc3008fd20a8ce16..74cd15cd02708c7188581f279f33a98b2ae8d322
> fixes all issues...

Hmm. I'm hoping this will have a quick resolution that doesn't mean
having to revert all that. But it's good to have the option.

Zheng, Ted, I'm holding up further merging (ugh, second time this
merge window) in the hope that we can get this sorted out quickly. I
hate continuing to merge stuff when there are known nasty issues
pending in my tree.

I can't hold for all that long, but it would be really nice if this
got sorted out quickly.

                   Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 17:52                 ` Linus Torvalds
@ 2013-02-27 18:49                   ` Theodore Ts'o
  2013-02-27 18:56                     ` Markus Trippelsdorf
                                       ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Markus Trippelsdorf, gnehzuil.liu, Zheng Liu, Dave Jones,
	Borislav Petkov, linux-ext4, linux-kernel

Markus, Dave, can you confirm that this fixes your problem?

Thanks!!

(Sigh, this is a real brown paper bug; I'm embarassed I missed this in
my code review.)

						- Ted

>From f47f0d11096ca5d9e1965d8a9f266aa13fe2b73b Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Wed, 27 Feb 2013 13:47:52 -0500
Subject: [PATCH] ext4: fix extent status tree regression for file systems >
 512GB

This fixes a regression introduced by commit f7fec032aa782.  The
problem was that the extents status flags caused us to mask out block
numbers smaller than 2**28 blocks.  Since we didn't test with file
systems smaller than 512GB, we didn't notice this during the
development cycle.

A typical failure looks like this:

EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block
152052301: comm ls: bad entry in directory: rec_len is smaller than minimal -
offset=0(0), inode=0, rec_len=0, name_len=0

... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the
inode has block number 688923213.  When viewed in hex, block number
152052301 (from the syslog) is 0x910224D, while block number 688923213
is 0x2910224D.  Note the missing "0x20000000" in the block number.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/ext4/extents_status.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index cf83e77..c795ff6 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -20,10 +20,10 @@
 #define es_debug(fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
 #endif
 
-#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
-#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
-#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
-#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
+#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
+#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
+#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
+#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)
 
 #define EXTENT_STATUS_FLAGS	(EXTENT_STATUS_WRITTEN | \
 				 EXTENT_STATUS_UNWRITTEN | \
@@ -58,22 +58,22 @@ extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 
 static inline int ext4_es_is_written(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_WRITTEN);
+	return (es->es_pblk & EXTENT_STATUS_WRITTEN) != 0;
 }
 
 static inline int ext4_es_is_unwritten(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN);
+	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN) != 0;
 }
 
 static inline int ext4_es_is_delayed(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_DELAYED);
+	return (es->es_pblk & EXTENT_STATUS_DELAYED) != 0;
 }
 
 static inline int ext4_es_is_hole(struct extent_status *es)
 {
-	return (es->es_pblk & EXTENT_STATUS_HOLE);
+	return (es->es_pblk & EXTENT_STATUS_HOLE) != 0;
 }
 
 static inline ext4_fsblk_t ext4_es_status(struct extent_status *es)
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:49                   ` Theodore Ts'o
@ 2013-02-27 18:56                     ` Markus Trippelsdorf
  2013-02-27 19:19                       ` Dave Jones
  2013-02-27 18:57                     ` Dave Jones
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Markus Trippelsdorf @ 2013-02-27 18:56 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, gnehzuil.liu, Zheng Liu,
	Dave Jones, Borislav Petkov, linux-ext4, linux-kernel

On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:

> Markus, Dave, can you confirm that this fixes your problem?

Yes, it fixes the issue.
Thanks Ted.

-- 
Markus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:49                   ` Theodore Ts'o
  2013-02-27 18:56                     ` Markus Trippelsdorf
@ 2013-02-27 18:57                     ` Dave Jones
  2013-02-27 19:04                       ` Theodore Ts'o
  2013-02-27 18:59                     ` Zheng Liu
  2013-02-27 19:06                     ` Borislav Petkov
  3 siblings, 1 reply; 34+ messages in thread
From: Dave Jones @ 2013-02-27 18:57 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Markus Trippelsdorf,
	gnehzuil.liu, Zheng Liu, Borislav Petkov, linux-ext4,
	linux-kernel

On Wed, Feb 27, 2013 at 01:49:12PM -0500, Theodore Ts'o wrote:
 > Markus, Dave, can you confirm that this fixes your problem?
 > 
 > Thanks!!
 > 
 > (Sigh, this is a real brown paper bug; I'm embarassed I missed this in
 > my code review.)

Building now.  Can you confirm that nothing on-disk should be awry ?
Or will I need a new fsck to detect what happened ?

	Dave


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:49                   ` Theodore Ts'o
  2013-02-27 18:56                     ` Markus Trippelsdorf
  2013-02-27 18:57                     ` Dave Jones
@ 2013-02-27 18:59                     ` Zheng Liu
  2013-02-27 19:06                     ` Borislav Petkov
  3 siblings, 0 replies; 34+ messages in thread
From: Zheng Liu @ 2013-02-27 18:59 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Markus Trippelsdorf,
	Zheng Liu, Dave Jones, Borislav Petkov, linux-ext4, linux-kernel

Hi Ted,

On 02/28/2013 02:49 AM, Theodore Ts'o wrote:
> Markus, Dave, can you confirm that this fixes your problem?
> 
> Thanks!!
> 
> (Sigh, this is a real brown paper bug; I'm embarassed I missed this in
> my code review.)

Sorry, I don't have a big disk in my hand now.  So I could reproduce it.
 But the patch looks good.  Thanks for fixing it.

Regards,
						- Zheng

> 
> 						- Ted
> 
> From f47f0d11096ca5d9e1965d8a9f266aa13fe2b73b Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Wed, 27 Feb 2013 13:47:52 -0500
> Subject: [PATCH] ext4: fix extent status tree regression for file systems >
>  512GB
> 
> This fixes a regression introduced by commit f7fec032aa782.  The
> problem was that the extents status flags caused us to mask out block
> numbers smaller than 2**28 blocks.  Since we didn't test with file
> systems smaller than 512GB, we didn't notice this during the
> development cycle.
> 
> A typical failure looks like this:
> 
> EXT4-fs error (device sdb1): htree_dirblock_to_tree:919: inode #172235804: block
> 152052301: comm ls: bad entry in directory: rec_len is smaller than minimal -
> offset=0(0), inode=0, rec_len=0, name_len=0
> 
> ... where 'debugfs -R "stat <172235804>" /dev/sdb1' reports that the
> inode has block number 688923213.  When viewed in hex, block number
> 152052301 (from the syslog) is 0x910224D, while block number 688923213
> is 0x2910224D.  Note the missing "0x20000000" in the block number.
> 
> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
> Reported-by: Dave Jones <davej@redhat.com>
> Cc: Zheng Liu <gnehzuil.liu@gmail.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> ---
>  fs/ext4/extents_status.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
> index cf83e77..c795ff6 100644
> --- a/fs/ext4/extents_status.h
> +++ b/fs/ext4/extents_status.h
> @@ -20,10 +20,10 @@
>  #define es_debug(fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
>  #endif
>  
> -#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
> -#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
> -#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
> -#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
> +#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
> +#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
> +#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
> +#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)
>  
>  #define EXTENT_STATUS_FLAGS	(EXTENT_STATUS_WRITTEN | \
>  				 EXTENT_STATUS_UNWRITTEN | \
> @@ -58,22 +58,22 @@ extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
>  
>  static inline int ext4_es_is_written(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_WRITTEN);
> +	return (es->es_pblk & EXTENT_STATUS_WRITTEN) != 0;
>  }
>  
>  static inline int ext4_es_is_unwritten(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN);
> +	return (es->es_pblk & EXTENT_STATUS_UNWRITTEN) != 0;
>  }
>  
>  static inline int ext4_es_is_delayed(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_DELAYED);
> +	return (es->es_pblk & EXTENT_STATUS_DELAYED) != 0;
>  }
>  
>  static inline int ext4_es_is_hole(struct extent_status *es)
>  {
> -	return (es->es_pblk & EXTENT_STATUS_HOLE);
> +	return (es->es_pblk & EXTENT_STATUS_HOLE) != 0;
>  }
>  
>  static inline ext4_fsblk_t ext4_es_status(struct extent_status *es)
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:57                     ` Dave Jones
@ 2013-02-27 19:04                       ` Theodore Ts'o
  2013-02-27 19:11                         ` Dave Jones
  0 siblings, 1 reply; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 19:04 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 01:57:55PM -0500, Dave Jones wrote:
> Building now.  Can you confirm that nothing on-disk should be awry ?
> Or will I need a new fsck to detect what happened ?
> 

Well, it's possible that a read from data file which had blocks
located above 512GB might have gotten bogus information, or a block
allocated above 512GB might result in a write to the wrong place on
disk.

So if you are very cautious, running fsck just to make sure things are
OK is not a bad idea.  But it's likely that the directory sanity
checks would have caught things quickly, or trying run an executable
would have caused a seg fault quickly enough.  If your system crashed
very quickly during the boot process, you'll probably be OK.

       	      	      	   	      	  - Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:49                   ` Theodore Ts'o
                                       ` (2 preceding siblings ...)
  2013-02-27 18:59                     ` Zheng Liu
@ 2013-02-27 19:06                     ` Borislav Petkov
  3 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2013-02-27 19:06 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Linus Torvalds, Markus Trippelsdorf, gnehzuil.liu, Zheng Liu,
	Dave Jones, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 01:49:12PM -0500, Theodore Ts'o wrote:
> -#define EXTENT_STATUS_WRITTEN	0x80000000	/* written extent */
> -#define EXTENT_STATUS_UNWRITTEN	0x40000000	/* unwritten extent */
> -#define EXTENT_STATUS_DELAYED	0x20000000	/* delayed extent */
> -#define EXTENT_STATUS_HOLE	0x10000000	/* hole */
> +#define EXTENT_STATUS_WRITTEN	(((unsigned long long) 1) << 63)
> +#define EXTENT_STATUS_UNWRITTEN	(((unsigned long long) 1) << 62)
> +#define EXTENT_STATUS_DELAYED	(((unsigned long long) 1) << 61)
> +#define EXTENT_STATUS_HOLE	(((unsigned long long) 1) << 60)

Just a nitpick:

1ULL << ...

is shorter and probably the standard way we do flags.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:04                       ` Theodore Ts'o
@ 2013-02-27 19:11                         ` Dave Jones
  2013-02-27 19:19                           ` Theodore Ts'o
  0 siblings, 1 reply; 34+ messages in thread
From: Dave Jones @ 2013-02-27 19:11 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Markus Trippelsdorf,
	gnehzuil.liu, Zheng Liu, Borislav Petkov, linux-ext4,
	linux-kernel

On Wed, Feb 27, 2013 at 02:04:19PM -0500, Theodore Ts'o wrote:
 > On Wed, Feb 27, 2013 at 01:57:55PM -0500, Dave Jones wrote:
 > > Building now.  Can you confirm that nothing on-disk should be awry ?
 > > Or will I need a new fsck to detect what happened ?
 > > 
 > 
 > Well, it's possible that a read from data file which had blocks
 > located above 512GB might have gotten bogus information, or a block
 > allocated above 512GB might result in a write to the wrong place on
 > disk.
 > 
 > So if you are very cautious, running fsck just to make sure things are
 > OK is not a bad idea.  But it's likely that the directory sanity
 > checks would have caught things quickly, or trying run an executable
 > would have caused a seg fault quickly enough.  If your system crashed
 > very quickly during the boot process, you'll probably be OK.

It had been up a few hours before I hit those checks.

fsck never found anything.  I do have another disk (XFS formatted) with
a backup from a week ago. I'll run a --dry-run rsync to see if it
picks up any changes that I don't expect.

	Dave


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 18:56                     ` Markus Trippelsdorf
@ 2013-02-27 19:19                       ` Dave Jones
  2013-02-27 19:27                         ` Zheng Liu
  2013-02-27 19:29                         ` Theodore Ts'o
  0 siblings, 2 replies; 34+ messages in thread
From: Dave Jones @ 2013-02-27 19:19 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Theodore Ts'o, Linus Torvalds, gnehzuil.liu, Zheng Liu,
	Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 07:56:25PM +0100, Markus Trippelsdorf wrote:
 > On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:
 > 
 > > Markus, Dave, can you confirm that this fixes your problem?
 > 
 > Yes, it fixes the issue.

Looks like it's fixed here too.

How did this make it through -next without anyone hitting it ?

I can't remember how many years ago I last bought a disk < 1TB,
and I can't be alone.  Or is everyone all about SSDs these days?

Is anyone running xfstests or similar on linux-next regularly ?

	Dave


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:11                         ` Dave Jones
@ 2013-02-27 19:19                           ` Theodore Ts'o
  0 siblings, 0 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 19:19 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 02:11:15PM -0500, Dave Jones wrote:
> 
> fsck never found anything.  I do have another disk (XFS formatted) with
> a backup from a week ago. I'll run a --dry-run rsync to see if it
> picks up any changes that I don't expect.

Probably a good idea.

Given that Markus reports that this fixes the problem for him, and
regression smoke test has passed, I'll get a cleaned up patch ready
for Linus to pull ASAP.

My deep apologies that we didn't catch this earlier.  i'm going to
think about how we can improve our testing regime to make sure
problems like this get caught earlier.

				- Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:19                       ` Dave Jones
@ 2013-02-27 19:27                         ` Zheng Liu
  2013-02-27 19:29                         ` Theodore Ts'o
  1 sibling, 0 replies; 34+ messages in thread
From: Zheng Liu @ 2013-02-27 19:27 UTC (permalink / raw)
  To: Dave Jones, Markus Trippelsdorf, Theodore Ts'o,
	Linus Torvalds, Zheng Liu, Borislav Petkov, linux-ext4,
	linux-kernel

On 02/28/2013 03:19 AM, Dave Jones wrote:
> On Wed, Feb 27, 2013 at 07:56:25PM +0100, Markus Trippelsdorf wrote:
>  > On 2013.02.27 at 13:49 -0500, Theodore Ts'o wrote:
>  > 
>  > > Markus, Dave, can you confirm that this fixes your problem?
>  > 
>  > Yes, it fixes the issue.
> 
> Looks like it's fixed here too.
> 
> How did this make it through -next without anyone hitting it ?

Sorry, my apology.

> 
> I can't remember how many years ago I last bought a disk < 1TB,
> and I can't be alone.  Or is everyone all about SSDs these days?

I admit that I run xfstests on a SSD that has 128G.

> 
> Is anyone running xfstests or similar on linux-next regularly ?

Ted has a dev branch for -next, and I work on this branch.

Regards,
						- Zheng

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:19                       ` Dave Jones
  2013-02-27 19:27                         ` Zheng Liu
@ 2013-02-27 19:29                         ` Theodore Ts'o
  2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
                                             ` (3 more replies)
  1 sibling, 4 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 19:29 UTC (permalink / raw)
  To: Dave Jones, Markus Trippelsdorf, Linus Torvalds, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> 
> Looks like it's fixed here too.
> 
> How did this make it through -next without anyone hitting it ?
> 
> I can't remember how many years ago I last bought a disk < 1TB,
> and I can't be alone.  Or is everyone all about SSDs these days?

I use LVM, so I have a number of volues which are smaler than 512GB,
but very few which are actually larger than 1TB.  And none on my test
boxes.  I was running the bleeding edge ext4 code on my laptop as for
dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
come any bigger, alas).

> Is anyone running xfstests or similar on linux-next regularly ?

I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
before I submitted a pull request.  The problem is that XFSTESTS is
S-L-O-W if you use large partitions, so typically I use a 5GB
partition sizes for my test runs.  Normally we're worried about race
condition bugs, not something as bone-headed as a bitmasking problem,
so it makes sense to use a smaller disk for most of your testing.
(Some folks do their xfstests run on SSD's or tmpfs image files, again
for speed reasons, and it's unlikely they would be big enough.)

So what we probably need to do is to have a separate set of tests
using a loopback mount, and perhaps an artificially created file
system which has a large percentage of the blocks in the middle of the
file system busied out, to make efficient testing of these sorts of
bugs more efficient.  As I said, I'm thinking about how's the best way
to improve our testing regime to catch such problems the next time around.

Cheers,

   	       	       	      	       	    - Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 19:29                         ` Theodore Ts'o
@ 2013-02-27 20:12                           ` Linus Torvalds
  2013-02-27 20:15                             ` Theodore Ts'o
  2013-03-01  3:30                             ` Dave Jones
  2013-02-27 20:14                           ` Theodore Ts'o
                                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 34+ messages in thread
From: Linus Torvalds @ 2013-02-27 20:12 UTC (permalink / raw)
  To: Dave Jones, Markus Trippelsdorf, Theodore Ts'o, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:

  ext4: fix free clusters calculation in bigalloc filesystem (2013-02-22 15:27:52 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus

for you to fetch changes up to 8e919d13048cd5acaadb2b15b48acbfb8832d3c2:

  ext4: fix extent status tree regression for file systems > 512GB (2013-02-27 14:54:37 -0500)

----------------------------------------------------------------
This fixes a real brown paper bag bug which causes ext4 to choke on
file systems larger than 512GB.

----------------------------------------------------------------
Theodore Ts'o (1):
      ext4: fix extent status tree regression for file systems > 512GB

 fs/ext4/extents_status.h | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 19:29                         ` Theodore Ts'o
  2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
@ 2013-02-27 20:14                           ` Theodore Ts'o
  2013-02-27 20:58                           ` [GIT PULL] ext4 updates " Dmitry Monakhov
  2013-02-28 13:18                           ` Dave Chinner
  3 siblings, 0 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 20:14 UTC (permalink / raw)
  To: Linus Torvalds, Dave Jones, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:

  ext4: fix free clusters calculation in bigalloc filesystem (2013-02-22 15:27:52 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus

for you to fetch changes up to 8e919d13048cd5acaadb2b15b48acbfb8832d3c2:

  ext4: fix extent status tree regression for file systems > 512GB (2013-02-27 14:54:37 -0500)

----------------------------------------------------------------
This fixes a real brown paper bag bug which causes ext4 to choke on
file systems larger than 512GB.

----------------------------------------------------------------
Theodore Ts'o (1):
      ext4: fix extent status tree regression for file systems > 512GB

 fs/ext4/extents_status.h | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
@ 2013-02-27 20:15                             ` Theodore Ts'o
  2013-02-27 20:23                               ` Linus Torvalds
  2013-03-01  3:30                             ` Dave Jones
  1 sibling, 1 reply; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 20:15 UTC (permalink / raw)
  To: Linus Torvalds, Dave Jones, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 03:12:17PM -0500, Linus Torvalds wrote:
> The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:

Obviously, this wasn't from Linus.  I screwed up on sending the pull
request.  Sigh...

						- Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 20:15                             ` Theodore Ts'o
@ 2013-02-27 20:23                               ` Linus Torvalds
  2013-02-27 20:41                                 ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2013-02-27 20:23 UTC (permalink / raw)
  To: Theodore Ts'o, Linus Torvalds, Dave Jones,
	Markus Trippelsdorf, gnehzuil.liu, Zheng Liu, Borislav Petkov,
	linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 12:15 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Wed, Feb 27, 2013 at 03:12:17PM -0500, Linus Torvalds wrote:
>> The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:
>
> Obviously, this wasn't from Linus.  I screwed up on sending the pull
> request.  Sigh...

That looked really odd in my inbox for a while ;)

          Linus "the *real* one" Torvalds

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 20:23                               ` Linus Torvalds
@ 2013-02-27 20:41                                 ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2013-02-27 20:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Dave Jones, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 12:23:18PM -0800, Linus Torvalds wrote:
> On Wed, Feb 27, 2013 at 12:15 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> > On Wed, Feb 27, 2013 at 03:12:17PM -0500, Linus Torvalds wrote:
> >> The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:
> >
> > Obviously, this wasn't from Linus.  I screwed up on sending the pull
> > request.  Sigh...
> 
> That looked really odd in my inbox for a while ;)
> 
>           Linus "the *real* one" Torvalds

He doesn't trust anyone - he sends pull requests to himself!

:-)

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:29                         ` Theodore Ts'o
  2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
  2013-02-27 20:14                           ` Theodore Ts'o
@ 2013-02-27 20:58                           ` Dmitry Monakhov
  2013-02-27 21:30                             ` Theodore Ts'o
  2013-03-01 15:41                             ` Eric Sandeen
  2013-02-28 13:18                           ` Dave Chinner
  3 siblings, 2 replies; 34+ messages in thread
From: Dmitry Monakhov @ 2013-02-27 20:58 UTC (permalink / raw)
  To: Theodore Ts'o, Dave Jones, Markus Trippelsdorf,
	Linus Torvalds, gnehzuil.liu, Zheng Liu, Borislav Petkov,
	linux-ext4, linux-kernel

On Wed, 27 Feb 2013 14:29:07 -0500, Theodore Ts'o <tytso@mit.edu> wrote:
> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> > 
> > Looks like it's fixed here too.
> > 
> > How did this make it through -next without anyone hitting it ?
> > 
> > I can't remember how many years ago I last bought a disk < 1TB,
> > and I can't be alone.  Or is everyone all about SSDs these days?
> 
> I use LVM, so I have a number of volues which are smaler than 512GB,
> but very few which are actually larger than 1TB.  And none on my test
> boxes.  I was running the bleeding edge ext4 code on my laptop as for
> dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
> my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
> come any bigger, alas).
> 
> > Is anyone running xfstests or similar on linux-next regularly ?
> 
> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
> before I submitted a pull request.  The problem is that XFSTESTS is
> S-L-O-W if you use large partitions, so typically I use a 5GB
Indeed. That's why i give-up rotated disks and run xfstest only on SSD
or brd module 
> partition sizes for my test runs.  Normally we're worried about race
> condition bugs, not something as bone-headed as a bitmasking problem,
> so it makes sense to use a smaller disk for most of your testing.
> (Some folks do their xfstests run on SSD's or tmpfs image files, again
> for speed reasons, and it's unlikely they would be big enough.)
> 
> So what we probably need to do is to have a separate set of tests
> using a loopback mount, and perhaps an artificially created file
> system which has a large percentage of the blocks in the middle of the
> file system busied out, to make efficient testing of these sorts of
> bugs more efficient.  As I said, I'm thinking about how's the best way
> to improve our testing regime to catch such problems the next time around.
Amazing idea. Something like:

#dd if=/dev/zero of=/tmp/fs.img bs=1M seek=2000000 count=1
#mkfs.ext4 -m0 -i4096000 /tmp/fs.img
#mount /tmp/fs.img /mnt/ -oloop
#for ((i=0; i < 2000; i++));do   fallocate -l $((1024*1024*1024)) /mnt/f$i ;done
#for ((i=0; i < 2000; i++));do   truncate -s $((1023*1024*1024)) /mnt/f$i ;done

As result file system image has 2gb of free space wich is fragmented to ~2000
chunks 1Mb each. But image itself is quite small
# df /mnt
Filesystem      1K-blocks       Used Available Use% Mounted on
/dev/loop0     2047678076 2045679228   1998848 100% /mnt
# du -sch /tmp/fs.img 
242M     /tmp/fs.img
242M     total

Later we can simply run xfstest/fio/fsx on this image.
I'll prepare new xfstest based on that idea. But the only disadvantage
is that loop dev has bottleneck, all requests will be serialized on i_mutex.
> 
> Cheers,
> 
>    	       	       	      	       	    - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 20:58                           ` [GIT PULL] ext4 updates " Dmitry Monakhov
@ 2013-02-27 21:30                             ` Theodore Ts'o
  2013-03-01 15:41                             ` Eric Sandeen
  1 sibling, 0 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-02-27 21:30 UTC (permalink / raw)
  To: Dmitry Monakhov
  Cc: Dave Jones, Markus Trippelsdorf, Linus Torvalds, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Thu, Feb 28, 2013 at 12:58:55AM +0400, Dmitry Monakhov wrote:
> Indeed. That's why i give-up rotated disks and run xfstest only on SSD
> or brd module 

Note that some problems only show up on slower devices (because then
the race window is wider), and sometimes they only show up on fast
devices.  This is why I've been doing runs on both HDD's and on tmpfs
kvm disk image files.

> Later we can simply run xfstest/fio/fsx on this image.
> I'll prepare new xfstest based on that idea. But the only disadvantage
> is that loop dev has bottleneck, all requests will be serialized on i_mutex.

If the image file is used as a disk file for KVM, that should avoid
the serialization bottleneck of using the loop device.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 19:29                         ` Theodore Ts'o
                                             ` (2 preceding siblings ...)
  2013-02-27 20:58                           ` [GIT PULL] ext4 updates " Dmitry Monakhov
@ 2013-02-28 13:18                           ` Dave Chinner
  3 siblings, 0 replies; 34+ messages in thread
From: Dave Chinner @ 2013-02-28 13:18 UTC (permalink / raw)
  To: Theodore Ts'o, Dave Jones, Markus Trippelsdorf,
	Linus Torvalds, gnehzuil.liu, Zheng Liu, Borislav Petkov,
	linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 02:29:07PM -0500, Theodore Ts'o wrote:
> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
> > 
> > Looks like it's fixed here too.
> > 
> > How did this make it through -next without anyone hitting it ?
> 
> > Is anyone running xfstests or similar on linux-next regularly ?
> 
> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
> before I submitted a pull request.  The problem is that XFSTESTS is
> S-L-O-W if you use large partitions, so typically I use a 5GB
> partition sizes for my test runs.

This isn't the case for XFS. I typically see 1TB scratch devices
only being ~10-20% slower than 10GB scratch devices, and 10TB only
being a little slower than 1TB scratch devices. I have to use sparse
devices and --large-fs for 100TB filesystem testing, so I can't
directly compare the speeds to those that I run on physical devices.
However I can say that it isn't significantly slower than using
small scratch devices...

> So what we probably need to do is to have a separate set of tests
> using a loopback mount, and perhaps an artificially created file
> system which has a large percentage of the blocks in the middle of the
> file system busied out, to make efficient testing of these sorts of

That's exactly what the --large-fs patch set I posted months ago does
for ext4 - it uses fallocate() to fill all but 50GB of the large
filesystem without actually writing any data and runs the standard
tests in the remaining unused space.

However, last time I tested ext4 with this patchset (when I posted
the patches months ago), multi-TB preallocation on ext4 was still too
slow to make it practical for testing on devices larger than 2-3TB.
Perhaps it would make testing 1-2TB ext4 filesystems fast enough
that you could do it regularly...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
  2013-02-27 20:15                             ` Theodore Ts'o
@ 2013-03-01  3:30                             ` Dave Jones
  2013-03-01  4:00                               ` Theodore Ts'o
  1 sibling, 1 reply; 34+ messages in thread
From: Dave Jones @ 2013-03-01  3:30 UTC (permalink / raw)
  To: Linus Torvalds, Markus Trippelsdorf, gnehzuil.liu, Zheng Liu,
	Borislav Petkov, linux-ext4, linux-kernel

On Wed, Feb 27, 2013 at 03:12:17PM -0500, Linus Torvalds wrote:
 > The following changes since commit 304e220f0879198b1f5309ad6f0be862b4009491:
 > 
 >   ext4: fix free clusters calculation in bigalloc filesystem (2013-02-22 15:27:52 -0500)
 > 
 > are available in the git repository at:
 > 
 >   git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus
 > 
 > for you to fetch changes up to 8e919d13048cd5acaadb2b15b48acbfb8832d3c2:
 > 
 >   ext4: fix extent status tree regression for file systems > 512GB (2013-02-27 14:54:37 -0500)
 > 
 > ----------------------------------------------------------------
 > This fixes a real brown paper bag bug which causes ext4 to choke on
 > file systems larger than 512GB.
 > 
 > ----------------------------------------------------------------
 > Theodore Ts'o (1):
 >       ext4: fix extent status tree regression for file systems > 512GB
 > 
 >  fs/ext4/extents_status.h | 19 +++++++++++--------
 >  1 file changed, 11 insertions(+), 8 deletions(-)

This has fixed the problem I reported, but I notice now that my desktop is really
sluggish. perf top shows it's almost constantly spinning in ext4_es_reclaim_extents_count

Any ideas ?

	Dave


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-03-01  3:30                             ` Dave Jones
@ 2013-03-01  4:00                               ` Theodore Ts'o
  2013-03-02 19:54                                 ` Linus Torvalds
  0 siblings, 1 reply; 34+ messages in thread
From: Theodore Ts'o @ 2013-03-01  4:00 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Markus Trippelsdorf, gnehzuil.liu,
	Zheng Liu, Borislav Petkov, linux-ext4, linux-kernel

On Thu, Feb 28, 2013 at 10:30:05PM -0500, Dave Jones wrote:
> 
> This has fixed the problem I reported, but I notice now that my
> desktop is really sluggish. perf top shows it's almost constantly
> spinning in ext4_es_reclaim_extents_count
> 
> Any ideas ?

ext4_es_reclaim_extents_count() is getting called out of the slab
shrinker.  It's getting called too often when there is significant
memory pressure.  We can optimize this so we're not calculating it all
the time.

						- Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL] ext4 updates for 3.9
  2013-02-27 20:58                           ` [GIT PULL] ext4 updates " Dmitry Monakhov
  2013-02-27 21:30                             ` Theodore Ts'o
@ 2013-03-01 15:41                             ` Eric Sandeen
  1 sibling, 0 replies; 34+ messages in thread
From: Eric Sandeen @ 2013-03-01 15:41 UTC (permalink / raw)
  To: Dmitry Monakhov
  Cc: Theodore Ts'o, Dave Jones, Markus Trippelsdorf,
	Linus Torvalds, gnehzuil.liu, Zheng Liu, Borislav Petkov,
	linux-ext4, linux-kernel

On 2/27/13 2:58 PM, Dmitry Monakhov wrote:
> On Wed, 27 Feb 2013 14:29:07 -0500, Theodore Ts'o <tytso@mit.edu> wrote:
>> On Wed, Feb 27, 2013 at 02:19:23PM -0500, Dave Jones wrote:
>>>
>>> Looks like it's fixed here too.
>>>
>>> How did this make it through -next without anyone hitting it ?
>>>
>>> I can't remember how many years ago I last bought a disk < 1TB,
>>> and I can't be alone.  Or is everyone all about SSDs these days?
>>
>> I use LVM, so I have a number of volues which are smaler than 512GB,
>> but very few which are actually larger than 1TB.  And none on my test
>> boxes.  I was running the bleeding edge ext4 code on my laptop as for
>> dogfooding purposes, but I have an 80GB mSATA SSD and a 500GB HDD on
>> my X230 laptop (it requires a thin laptop drive, and 7mm drives don't
>> come any bigger, alas).
>>
>>> Is anyone running xfstests or similar on linux-next regularly ?
>>
>> I run xfstests on the ext4 tree, and I ran it on ext4 plus Linus's tip
>> before I submitted a pull request.  The problem is that XFSTESTS is
>> S-L-O-W if you use large partitions, so typically I use a 5GB
> Indeed. That's why i give-up rotated disks and run xfstest only on SSD
> or brd module 
>> partition sizes for my test runs.  Normally we're worried about race
>> condition bugs, not something as bone-headed as a bitmasking problem,
>> so it makes sense to use a smaller disk for most of your testing.
>> (Some folks do their xfstests run on SSD's or tmpfs image files, again
>> for speed reasons, and it's unlikely they would be big enough.)
>>
>> So what we probably need to do is to have a separate set of tests
>> using a loopback mount, and perhaps an artificially created file
>> system which has a large percentage of the blocks in the middle of the
>> file system busied out, to make efficient testing of these sorts of
>> bugs more efficient.  As I said, I'm thinking about how's the best way
>> to improve our testing regime to catch such problems the next time around.
> Amazing idea. Something like:
> 
> #dd if=/dev/zero of=/tmp/fs.img bs=1M seek=2000000 count=1
> #mkfs.ext4 -m0 -i4096000 /tmp/fs.img
> #mount /tmp/fs.img /mnt/ -oloop
> #for ((i=0; i < 2000; i++));do   fallocate -l $((1024*1024*1024)) /mnt/f$i ;done
> #for ((i=0; i < 2000; i++));do   truncate -s $((1023*1024*1024)) /mnt/f$i ;done
> 
> As result file system image has 2gb of free space wich is fragmented to ~2000
> chunks 1Mb each. But image itself is quite small
> # df /mnt
> Filesystem      1K-blocks       Used Available Use% Mounted on
> /dev/loop0     2047678076 2045679228   1998848 100% /mnt
> # du -sch /tmp/fs.img 
> 242M     /tmp/fs.img
> 242M     total
> 
> Later we can simply run xfstest/fio/fsx on this image.
> I'll prepare new xfstest based on that idea. But the only disadvantage
> is that loop dev has bottleneck, all requests will be serialized on i_mutex.

Before anyone does too much work, it would be worth revisiting
dchinner's
	[PATCH 0/10] xfstests: rework large filesystem testing
series from July 2012 to see if it meets the needs already.

It almost got all reviews, with one sticking point left, AFAICT.

-Eric




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-03-01  4:00                               ` Theodore Ts'o
@ 2013-03-02 19:54                                 ` Linus Torvalds
  2013-03-02 23:15                                   ` Theodore Ts'o
  0 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2013-03-02 19:54 UTC (permalink / raw)
  To: Theodore Ts'o, Dave Jones, Linus Torvalds,
	Markus Trippelsdorf, gnehzuil.liu, Zheng Liu, Borislav Petkov,
	linux-ext4, linux-kernel

On Thu, Feb 28, 2013 at 8:00 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> ext4_es_reclaim_extents_count() is getting called out of the slab
> shrinker.  It's getting called too often when there is significant
> memory pressure.  We can optimize this so we're not calculating it all
> the time.

This needs to be fixed or reverted. I traced back some user
interaction problems to this same issue. It literally gets so bad that
the whole system is choppy, and a profile shows that a lot of time is
being spent in the spin-lock protecting these data structures. Called
from ext4_es_reclaim_extents_count, ext4_es_lru_add and
ext4_es_shrink.

We're talking very user-noticeable pauses, so it's not just the
spinlock, there's some real badness that comes in with it too.
Possibly related to other (sleeping) locks being held while the data
structures are then refilled or something.

                   Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [GIT PULL URGENT] ext4 regression fix for 3.9
  2013-03-02 19:54                                 ` Linus Torvalds
@ 2013-03-02 23:15                                   ` Theodore Ts'o
  0 siblings, 0 replies; 34+ messages in thread
From: Theodore Ts'o @ 2013-03-02 23:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Jones, Markus Trippelsdorf, gnehzuil.liu, Zheng Liu,
	Borislav Petkov, linux-ext4, linux-kernel

On Sat, Mar 02, 2013 at 11:54:37AM -0800, Linus Torvalds wrote:
> 
> This needs to be fixed or reverted. I traced back some user
> interaction problems to this same issue. It literally gets so bad that
> the whole system is choppy, and a profile shows that a lot of time is
> being spent in the spin-lock protecting these data structures. Called
> from ext4_es_reclaim_extents_count, ext4_es_lru_add and
> ext4_es_shrink.

I'm doing final testing on a pull request that will fix these
problems.  I'll send it to you later this evening.

	   	     	       	     - Ted

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2013-03-02 23:15 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-26 20:39 [GIT PULL] ext4 updates for 3.9 Theodore Ts'o
2013-02-27 12:47 ` Markus Trippelsdorf
2013-02-27 15:34   ` Theodore Ts'o
2013-02-27 15:44     ` Markus Trippelsdorf
2013-02-27 17:01       ` Markus Trippelsdorf
2013-02-27 17:10         ` gnehzuil.liu
2013-02-27 17:22           ` Markus Trippelsdorf
2013-02-27 17:38             ` gnehzuil.liu
2013-02-27 17:45               ` Markus Trippelsdorf
2013-02-27 17:52                 ` Linus Torvalds
2013-02-27 18:49                   ` Theodore Ts'o
2013-02-27 18:56                     ` Markus Trippelsdorf
2013-02-27 19:19                       ` Dave Jones
2013-02-27 19:27                         ` Zheng Liu
2013-02-27 19:29                         ` Theodore Ts'o
2013-02-27 20:12                           ` [GIT PULL URGENT] ext4 regression fix " Linus Torvalds
2013-02-27 20:15                             ` Theodore Ts'o
2013-02-27 20:23                               ` Linus Torvalds
2013-02-27 20:41                                 ` Borislav Petkov
2013-03-01  3:30                             ` Dave Jones
2013-03-01  4:00                               ` Theodore Ts'o
2013-03-02 19:54                                 ` Linus Torvalds
2013-03-02 23:15                                   ` Theodore Ts'o
2013-02-27 20:14                           ` Theodore Ts'o
2013-02-27 20:58                           ` [GIT PULL] ext4 updates " Dmitry Monakhov
2013-02-27 21:30                             ` Theodore Ts'o
2013-03-01 15:41                             ` Eric Sandeen
2013-02-28 13:18                           ` Dave Chinner
2013-02-27 18:57                     ` Dave Jones
2013-02-27 19:04                       ` Theodore Ts'o
2013-02-27 19:11                         ` Dave Jones
2013-02-27 19:19                           ` Theodore Ts'o
2013-02-27 18:59                     ` Zheng Liu
2013-02-27 19:06                     ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).