All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: implement ->dirty_inode callout
@ 2009-06-23 15:38 Felix Blyakher
  2009-06-23 21:30 ` Christoph Hellwig
  2009-06-29  8:14 ` Michael Weissenbacher
  0 siblings, 2 replies; 3+ messages in thread
From: Felix Blyakher @ 2009-06-23 15:38 UTC (permalink / raw)
  To: xfs mailing list

I'd like to (re)propose Dave's patch from the last October to
address the problem of atime never making to the disk. Many
people complained about it.
I may have slightly adjusted the patch to fit the latest
code, and verified it addresses the issue.

The reference to the discussion on this matter on xfs mailing
list is here:

http://oss.sgi.com/archives/xfs/2008-10/msg02102.html

---

 From Dave Chinner <david@fromorbit.com>

XFS: Implement ->dirty_inode callout

Hook up ->dirty_inode so that when the VFS dirties an inode we
can mark the XFS inode as "dirty with unlogged changes". This
allows events such as touch_atime() to propagate the dirty state
right through to XFS so it gets written back to disk.

Signed-off-by: Dave Chinner <david@fromorbit.com>

diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 7ec89fc..ee6863e 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -215,7 +215,6 @@ xfs_setfilesize(

         if (ip->i_d.di_size < isize) {
                 ip->i_d.di_size = isize;
-               ip->i_update_core = 1;
                 ip->i_update_size = 1;
                 xfs_mark_inode_dirty_sync(ip);
         }
diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 58973bb..7a51612 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -120,19 +120,11 @@ xfs_ichgtime(
         }

         /*
-        * We update the i_update_core field _after_ changing
-        * the timestamps in order to coordinate properly with
-        * xfs_iflush() so that we don't lose timestamp updates.
-        * This keeps us from having to hold the inode lock
-        * while doing this.  We use the SYNCHRONIZE macro to
-        * ensure that the compiler does not reorder the update
-        * of i_update_core above the timestamp updates above.
+        * Update complete - now make sure everyone knows that the inode
+        * is dirty.
          */
-       if (sync_it) {
-               SYNCHRONIZE();
-               ip->i_update_core = 1;
+       if (sync_it)
                 xfs_mark_inode_dirty_sync(ip);
-       }
  }

  /*
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 36fb8a6..3cc9ef2 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -973,6 +973,28 @@ xfs_fs_inode_init_once(
  }

  /*
+ * Dirty the XFS inode when mark_inode_dirty_sync() is called so that
+ * we catch unlogged VFS level updates to the inode. Care must be taken
+ * here - the transaction code calls mark_inode_dirty_sync() to mark  
the
+ * VFS inode dirty in a transaction and clears the i_update_core field;
+ * it must clear the field after calling mark_inode_dirty_sync() to
+ * correctly indicate that the dirty state has been propagated into the
+ * inode log item.
+ *
+ * We need the barrier() to maintain correct ordering between unlogged
+ * updates and the transaction commit code that clears the  
i_update_core
+ * field. This requires all updates to be completed before marking the
+ * inode dirty.
+ */
+STATIC void
+xfs_fs_dirty_inode(
+       struct inode    *inode)
+{
+       barrier();
+       XFS_I(inode)->i_update_core = 1;
+}
+
+/*
   * Attempt to flush the inode, this will actually fail
   * if the inode is pinned, but we dirty the inode again
   * at the point when it is unpinned after a log write,
@@ -1546,6 +1568,7 @@ xfs_fs_get_sb(
  static struct super_operations xfs_super_operations = {
         .alloc_inode            = xfs_fs_alloc_inode,
         .destroy_inode          = xfs_fs_destroy_inode,
+       .dirty_inode            = xfs_fs_dirty_inode,
         .write_inode            = xfs_fs_write_inode,
         .clear_inode            = xfs_fs_clear_inode,
         .put_super              = xfs_fs_put_super,
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 977c4ae..43c31c0 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -232,6 +232,15 @@ xfs_inode_item_format(
         nvecs        = 1;

         /*
+        * Make sure the linux inode is dirty. We do this before
+        * clearing i_update_core as the VFS will call back into
+        * XFS here and set i_update_core, so we need to dirty the
+        * inode first so that the ordering of i_update_core and
+        * unlogged modifications still works as described below.
+        */
+       xfs_mark_inode_dirty_sync(ip);
+
+       /*
          * Clear i_update_core if the timestamps (or any other
          * non-transactional modification) need flushing/logging
          * and we're about to log them with the rest of the core.
@@ -275,11 +284,6 @@ xfs_inode_item_format(
          */
         xfs_synchronize_atime(ip);

-       /*
-        * make sure the linux inode is dirty
-        */
-       xfs_mark_inode_dirty_sync(ip);
-
         vecp->i_addr = (xfs_caddr_t)&ip->i_d;
         vecp->i_len  = sizeof(struct xfs_icdinode);
         XLOG_VEC_SET_TYPE(vecp, XLOG_REG_TYPE_ICORE);


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] xfs: implement ->dirty_inode callout
  2009-06-23 15:38 [PATCH] xfs: implement ->dirty_inode callout Felix Blyakher
@ 2009-06-23 21:30 ` Christoph Hellwig
  2009-06-29  8:14 ` Michael Weissenbacher
  1 sibling, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2009-06-23 21:30 UTC (permalink / raw)
  To: Felix Blyakher; +Cc: xfs mailing list

On Tue, Jun 23, 2009 at 10:38:43AM -0500, Felix Blyakher wrote:
> I'd like to (re)propose Dave's patch from the last October to
> address the problem of atime never making to the disk. Many
> people complained about it.
> I may have slightly adjusted the patch to fit the latest
> code, and verified it addresses the issue.
>
> The reference to the discussion on this matter on xfs mailing
> list is here:
>
> http://oss.sgi.com/archives/xfs/2008-10/msg02102.html


Can you run some benchmarks to see what impact it has with the
new relatime default?

The only places where we actual look at i_update_core are fsync and
the decision wether to flush out the inode fully in reclaim and
I we should sync out the inode there.

The VFS ends up calling into ->dirty_inode from the following places:

 - set_page_dirty
 - mark_buffer_dirty
 - touch_atime
 - file_update_time
 - generic_file_direct_write

And it doesn't distinguish between setting I_DIRTY_SYNC and
I_DIRTY_DATASYNC which means we'll also get it for the first two
events which only affect the data and not the inode core.

I'd be much more comfortable if we'd pass down those flags to
->dirty_inode and optimize based on that.


On the bright side this patch will also allow us to hack around the VFS
layering violation which currently causes missed c/mtime updates on
mmaped writes.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] xfs: implement ->dirty_inode callout
  2009-06-23 15:38 [PATCH] xfs: implement ->dirty_inode callout Felix Blyakher
  2009-06-23 21:30 ` Christoph Hellwig
@ 2009-06-29  8:14 ` Michael Weissenbacher
  1 sibling, 0 replies; 3+ messages in thread
From: Michael Weissenbacher @ 2009-06-29  8:14 UTC (permalink / raw)
  To: Felix Blyakher; +Cc: xfs mailing list

Hi Felix!
> I'd like to (re)propose Dave's patch from the last October to
> address the problem of atime never making to the disk. Many
> people complained about it.
Tested here on 2.6.30 & works perfectly. I noticed that "strictatime" and "relatime" mounts options can't be used with XFS - had to change the default directly in fs/namespace.c.

> Can you run some benchmarks to see what impact it has with the
> new relatime default?
I did some benchmarks here and wasn't able to see any significant differences, even when atime is fully enabled. The tests were done with kernel 2.6.30 on a VMWare ESXi Server 3.5.0. Real Hardware is a dual Quad Xeon 2.66GHz, 6-disk 146GB 15k RAID5 on PERC6 with 128MB BBU cache.

xfs_info of test fs
--------------------
meta-data=/dev/sda3              isize=256    agcount=4, agsize=2365440 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=9461760, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

bonnie++ results
----------------
bonnie_relatime_nopatch.txt
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gentoo-x64-xfste 2G  1261  97 191145  34 95765  27  2951  97 234814  22  1169  17
Latency             12375us     271ms     927ms   11581us   17640us   54677us
Version 1.93c       ------Sequential Create------ --------Random Create--------
gentoo-x64-xfstest  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
    32:100000:10/64  3951  42  1842  26 15114  87  4240  44   170   2  7004  57
Latency               622ms   19046us     156ms     543ms     106ms    1080ms
1.93c,1.93c,gentoo-x64-xfstest,1,1246251173,2G,,1261,97,191145,34,95765,27,2951,97,234814,22,1169,17,32,100000,10,,64,3951,42,1842,26,15114,87,4240,44,170,2,7004,57,12375us,271ms,927ms,11581us,17640us,54677us,622ms,19046us,156ms,543ms,106ms,1080ms

bonnie_relatime_withpach.txt
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gentoo-x64-xfste 2G  1271  99 284571  57 120562  27  2925  98 234804  24  1227  19
Latency             12436us     429ms     145ms   13323us   11892us   48055us
Version 1.93c       ------Sequential Create------ --------Random Create--------
gentoo-x64-xfstest  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
    32:100000:10/64  3178  46  2308  32 17881  96  4451  54   171   2  7380  60
Latency               573ms   17378us   28025us   77271us     326ms     928ms
1.93c,1.93c,gentoo-x64-xfstest,1,1245948051,2G,,1271,99,284571,57,120562,27,2925,98,234804,24,1227,19,32,100000,10,,64,3178,46,2308,32,17881,96,4451,54,171,2,7380,60,12436us,429ms,145ms,13323us,11892us,48055us,573ms,17378us,28025us,77271us,326ms,928ms

bonnie_atime_withpatch.txt
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gentoo-x64-xfste 2G  1309  99 166258  21 117374  29  3015  98 223034  25  1182  22
Latency              8626us     496ms   23729us    9696us   28558us   43342us
Version 1.93c       ------Sequential Create------ --------Random Create--------
gentoo-x64-xfstest  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
    32:100000:10/64  3696  52  1806  27 16913  89  4072  60   167   2  6838  58
Latency               600ms   21008us   26798us   55639us    1068ms    1045ms
1.93c,1.93c,gentoo-x64-xfstest,1,1245923465,2G,,1309,99,166258,21,117374,29,3015,98,223034,25,1182,22,32,100000,10,,64,3696,52,1806,27,16913,89,4072,60,167,2,6838,58,8626us,496ms,23729us,9696us,28558us,43342us,600ms,21008us,26798us,55639us,1068ms,1045ms

bonnie_noatime.txt
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gentoo-x64-xfste 2G  1302  99 312421  52 115066  29  3121  99 243930  24  1282  18
Latency              7750us     562ms   24417us   10083us   12990us   50122us
Version 1.93c       ------Sequential Create------ --------Random Create--------
gentoo-x64-xfstest  -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
    32:100000:10/64  3917  55  1942  28 16849  91  4043  59   172   2  7705  63
Latency               312ms     105ms    3530us     130ms     159ms     690ms
1.93c,1.93c,gentoo-x64-xfstest,1,1245922347,2G,,1302,99,312421,52,115066,29,3121,99,243930,24,1282,18,32,100000,10,,64,3917,55,1942,28,16849,91,4043,59,172,2,7705,63,7750us,562ms,24417us,10083us,12990us,50122us,312ms,105ms,3530us,130ms,159ms,690ms


hth,
Michael

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-06-29  8:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-23 15:38 [PATCH] xfs: implement ->dirty_inode callout Felix Blyakher
2009-06-23 21:30 ` Christoph Hellwig
2009-06-29  8:14 ` Michael Weissenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.