* [RFC] jbd2: reduce the number of writes when committing a transaction
@ 2012-04-20 11:06 Zheng Liu
From: Zheng Liu @ 2012-04-20 11:06 UTC
  To: linux-ext4, linux-fsdevel

Hi list,

In this thread[1], I found a defect in jbd2: it needs two writes to
finish a transaction.  It first writes the journal header and data to
disk, and only after those writes are done does it write the commit
block.  AFAIK, jbd2 calls submit_bh at least twice to write this data
because the journal header, data and commit are stored in different
buffer_heads.  If we don't submit them separately, the writes might be
reordered, and obviously the journal header and data must be written
before the commit.  But this ordering brings a huge overhead in this
benchmark[2].  So, IMHO, if we could use a *bio* to hold this data
rather than buffer_heads, we could avoid the overhead because we could
call submit_bio only once to write everything: the journal header, data
and commit.  There is one issue that I am not sure about.  If we use
submit_bio to write the journal data, all of the data will be written
with the WRITE_FLUSH_FUA flag, whereas now only the commit block
carries this flag.  I am not sure whether this would cause other
unpredictable problems. :(
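
To make the comparison concrete, here is a rough userspace model of the
two commit strategies described above.  It is only an analogy, not jbd2
code: the CountingFile helper, the function names and the 4096-byte
block size are all made up for illustration, and os.fsync stands in for
"wait for the I/O and flush the cache".  It counts how many write and
flush rounds each strategy needs.

```python
import os
import tempfile

BLOCK = 4096  # assumed journal block size for this sketch


class CountingFile:
    """Hypothetical helper: wraps a file descriptor, counts writes and flushes."""

    def __init__(self, fd):
        self.fd = fd
        self.writes = 0
        self.flushes = 0

    def write_at(self, data, offset):
        os.pwrite(self.fd, data, offset)
        self.writes += 1

    def flush(self):
        os.fsync(self.fd)  # stand-in for waiting on I/O plus a cache flush
        self.flushes += 1


def commit_two_phase(f, header, data, commit):
    """Model of the current scheme: the commit block goes out in a second round."""
    f.write_at(header + data, 0)
    f.flush()  # header and data must be stable before the commit is written
    f.write_at(commit, len(header) + len(data))
    f.flush()  # then make the commit block stable


def commit_single_submission(f, header, data, commit):
    """Model of the proposal: one submission carrying header, data and commit."""
    f.write_at(header + data + commit, 0)
    f.flush()


def run(commit_fn):
    """Run one commit against a temporary file; return (writes, flushes, contents)."""
    header, data, commit = b"H" * BLOCK, b"D" * BLOCK, b"C" * BLOCK
    with tempfile.TemporaryFile() as tmp:
        f = CountingFile(tmp.fileno())
        commit_fn(f, header, data, commit)
        on_disk = os.pread(f.fd, 3 * BLOCK, 0)
    return f.writes, f.flushes, on_disk


if __name__ == "__main__":
    print("two-phase:        ", run(commit_two_phase)[:2])
    print("single submission:", run(commit_single_submission)[:2])
```

Both variants leave the same bytes in the file; the difference is one
write/flush round instead of two.  The model says nothing about how a
device may order blocks inside a single submission, or about the cost
of flagging every block with FLUSH/FUA, which is exactly the part I am
unsure about.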

Please feel free to comment on this RFC.  Thank you.

1. http://www.spinics.net/lists/linux-ext4/msg31637.html
2. benchmark: time for((i=0;i<2000;i++)); do \
		dd if=/dev/zero of=/mnt/sda1/testfile conv=notrunc bs=4k \
		count=1 seek=`expr $i \* 16` oflag=sync,direct 2>/dev/null; \
		done
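
For reference, the access pattern of benchmark[2] can be worked out
with a few lines of arithmetic.  This sketch assumes that bs=4k means
4096 bytes and that the test file does not already extend past the last
write; the constant and function names are only illustrative.

```python
BLOCK_SIZE = 4 * 1024   # bs=4k, assumed to be 4096 bytes
STRIDE_BLOCKS = 16      # seek=`expr $i \* 16`, counted in output blocks
ITERATIONS = 2000       # loop bound in the benchmark


def write_offsets(iterations=ITERATIONS):
    """Byte offset of the single 4k block written by each dd invocation."""
    return [i * STRIDE_BLOCKS * BLOCK_SIZE for i in range(iterations)]


offsets = write_offsets()
total_written = ITERATIONS * BLOCK_SIZE   # bytes of payload actually written
file_size = offsets[-1] + BLOCK_SIZE      # end of the last block written

if __name__ == "__main__":
    print("stride between writes:", offsets[1] - offsets[0], "bytes")
    print("total payload written:", total_written, "bytes")
    print("resulting file size:  ", file_size, "bytes")
```

So each iteration is one small synchronous write, 64 KiB after the
previous one, into a mostly sparse file.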

Regards,
Zheng

