From: Jan Kara <jack@suse.cz>
To: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: Jouni Siren <jouni.siren@iki.fi>, linux-ext4@vger.kernel.org
Subject: Re: Bug: Large writes can fail on ext4 if the write buffer is not empty
Date: Thu, 12 Apr 2012 22:20:29 +0200
Message-ID: <20120412202029.GB19808@quack.suse.cz>
In-Reply-To: <20120412160658.GA9697@gmail.com>

On Fri 13-04-12 00:06:59, Zheng Liu wrote:
> On Thu, Apr 12, 2012 at 05:47:41PM +0300, Jouni Siren wrote:
> > Hi,
> > 
> > I recently ran into problems when writing large blocks of data (more than about 2 GB) with a single call when there is already some data in the write buffer. The problem seems to be specific to ext4, or at least it does not happen when writing to NFS on the same system. Also, the problem does not happen if the write buffer is flushed before the large write.
> > 
> > The following C++ program should write a total of 4294967304 bytes, but I end up with a file of size 2147483664.
> > 
> > #include <fstream>
> > 
> > int
> > main(int argc, char** argv)
> > {
> >   std::streamsize data_size = (std::streamsize)1 << 31;
> >   char* data = new char[data_size];
> > 
> >   std::ofstream output("test.dat", std::ios_base::binary);
> >   output.write(data, 8);
> >   output.write(data, data_size);
> >   output.write(data, data_size);
> >   output.close();
> > 
> >   delete[] data;
> >   return 0;
> > }
> > 
> > 
> > The relevant part of strace is the following:
> > 
> > open("test.dat", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
> > writev(3, [{"\0\0\0\0\0\0\0\0", 8}, {"", 2147483648}], 2) = -2147483640
> > writev(3, [{0xffffffff80c6d258, 2147483648}, {"", 2147483648}], 2) = -1 EFAULT (Bad address)
> > write(3, "\0\0\0\0\0\0\0\0", 8)         = 8
> > close(3)                                = 0
> > 
> > 
> > The first two writes are combined into a single writev call that reports having written -2147483640 bytes. This is the same as 8 + 2147483648, when interpreted as a signed 32-bit integer. After the first call, everything more or less fails. This happens on a Linux system, where uname -a returns
> > 
> > Linux alm01 2.6.32-220.7.1.el6.x86_64 #1 SMP Tue Mar 6 15:45:33 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
> > 
> > 
> > I believe that the bug can be found in file.c, function ext4_file_write, where variable ret has type int. Function generic_file_aio_write returns the number of bytes written as a ssize_t, and the returned value is stored in ret and eventually returned by ext4_file_write. If the number of bytes written is more than INT_MAX, the value returned by ext4_file_write will be incorrect.
> > 
> > If you need more information on the problem, I will be happy to provide it.
> 
> Hi Jouni,
> 
> Indeed, I think that it is a bug.  So the solution is straightforward.
> Could you please try this patch?  Thank you.
> 
> Regards,
> Zheng
> 
> From: Zheng Liu <wenqing.lz@taobao.com>
> Subject: [PATCH] ext4: change return value from int to ssize_t in ext4_file_write
> 
> On 32-bit platforms, when we do a write operation with a huge amount of data,
> the ret variable overflows.  So it is replaced with ssize_t.
  Actually, the problem happens on 64-bit platforms. On 32-bit platforms,
ssize_t is 32-bit, so we would only do a short write that fits into 2^31.
But on 64-bit, ssize_t is 64-bit, so there we can end up writing more than
2^31 bytes. So please change the changelog. The patch itself is good, so you
can add:
  Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
> 
> Reported-by: Jouni Siren <jouni.siren@iki.fi>
> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> ---
>  fs/ext4/file.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index cb70f18..8c7642a 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -95,7 +95,7 @@ ext4_file_write(struct kiocb *iocb, const struct iovec *iov,
>  {
>  	struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode;
>  	int unaligned_aio = 0;
> -	int ret;
> +	ssize_t ret;
>  
>  	/*
>  	 * If we have encountered a bitmap-format file, the size limit
> -- 
> 1.7.4.1
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
