* Re: High CPU Utilization When Copying to Ext4
[not found] <341DAA96EE3A8444B6E4657BE8A846EA4B3DA126FE@NDJSSCC06.ndc.nasa.gov>
@ 2011-06-27 3:05 ` Ted Ts'o
2011-06-27 9:24 ` Lukas Czerner
2011-06-28 18:37 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
0 siblings, 2 replies; 14+ messages in thread
From: Ted Ts'o @ 2011-06-27 3:05 UTC (permalink / raw)
To: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]; +Cc: linux-ext4
On Sun, Jun 26, 2011 at 12:33:16PM -0500, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
> Sorry if this is not the correct mailing list for ext4 questions.
-ext3-users, +linux-ext4
> I'm copying terabytes of data from an ext3 file system to a new ext4
> file system. I'm seeing high CPU usage from the processes
> flush-253:2, kworker-3:0, kworker-2:2, kworker-1:1, and kworker-0:0.
> Does anyone on the list have any idea what these processes do, why
> they are consuming so much cpu time and if there is something that
> can be done about it? This is using Fedora 15.
You're using Fedora 15, so you're using a 2.6.38 kernel, right?
How are you copying the files? Are you using cp? rsync? NFS? CIFS?
What sort of files are you copying? Are they large files, or many
small files? Are there lots of hard links? etc.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-27 3:05 ` High CPU Utilization When Copying to Ext4 Ted Ts'o
@ 2011-06-27 9:24 ` Lukas Czerner
2011-06-28 18:37 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
1 sibling, 0 replies; 14+ messages in thread
From: Lukas Czerner @ 2011-06-27 9:24 UTC (permalink / raw)
To: Ted Ts'o
Cc: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS], linux-ext4
On Sun, 26 Jun 2011, Ted Ts'o wrote:
> On Sun, Jun 26, 2011 at 12:33:16PM -0500, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
> > Sorry if this is not the correct mailing list for ext4 questions.
>
> -ext3-users, +linux-ext4
>
> > I'm copying terabytes of data from an ext3 file system to a new ext4
> > file system. I'm seeing high CPU usage from the processes
> > flush-253:2, kworker-3:0, kworker-2:2, kworker-1:1, and kworker-0:0.
> > Does anyone on the list have any idea what these processes do, why
> > they are consuming so much cpu time and if there is something that
> > can be done about it? This is using Fedora 15.
>
> You're using Fedora 15, so you're using a 2.6.38 kernel, right?
>
> How are you copying the files? Are you using cp? rsync? NFS? CIFS?
>
> what sort of files are you copying? Are they large files, many of
> small files? Are there lots of hard links? etc.
>
> - Ted
Also, how high is high CPU usage? Is the ext4lazyinit thread running (ps
aux | grep ext4lazyinit)? If it is, could you try mounting with the '-o
noinit_itable' mount option to see if it helps?
Thanks!
-Lukas
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: High CPU Utilization When Copying to Ext4
2011-06-27 3:05 ` High CPU Utilization When Copying to Ext4 Ted Ts'o
2011-06-27 9:24 ` Lukas Czerner
@ 2011-06-28 18:37 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
2011-06-28 20:14 ` Theodore Tso
2011-06-28 20:17 ` Andreas Dilger
1 sibling, 2 replies; 14+ messages in thread
From: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] @ 2011-06-28 18:37 UTC (permalink / raw)
To: Ted Ts'o; +Cc: linux-ext4
uname -a
Linux sasr200-2.arc.nasa.gov 2.6.38.7-30.fc15.x86_64 #1 SMP Fri May 27 05:15:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
There are about 10M files. Many are small. About 2M of them are sparse files. It's when the copy program gets to these files that the CPU usage gets very high. There are no links of any kind.
The copy program is written in Java, but uses the fiemap ioctl to get the logical address ranges that have actually been allocated. It merges any contiguous logical address ranges when it reads and writes to the new file.
The copy has completed. This is a snippet from top I had saved. This machine has 4 cores and 8G of RAM. There are 32 threads doing copies. At any time each has a directory to itself.
  PID USER PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 0573 root 20  0 7574m 1.9g 1356 S 204.3 24.9   3054:22 java
27702 root 20  0     0    0    0 R  70.5  0.0 689:01.73 flush-253:2
22467 root 20  0     0    0    0 S  22.6  0.0   7:55.98 kworker/3:1
22351 root 20  0     0    0    0 S  21.6  0.0   9:42.58 kworker/1:3
22686 root 20  0     0    0    0 S  21.3  0.0   0:26.19 kworker/2:0
22679 root 20  0     0    0    0 S  13.8  0.0   0:29.14 kworker/0:1
   38 root 20  0     0    0    0 S   9.2  0.0  91:21.19 kswapd0
22700 root 20  0     0    0    0 S   7.9  0.0   0:04.64 kworker/0:0
10566 root 20  0     0    0    0 S   3.6  0.0  17:14.77 jbd2/dm-2-8
If I remember correctly, top reported that 97% of the time was sys time. So even the time used by Java was still almost all kernel time. Only a few megabytes were actually swapped.
Thanks,
Sean
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-28 18:37 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
@ 2011-06-28 20:14 ` Theodore Tso
2011-06-28 20:20 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
2011-06-28 20:17 ` Andreas Dilger
1 sibling, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2011-06-28 20:14 UTC (permalink / raw)
To: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]; +Cc: linux-ext4
On Jun 28, 2011, at 2:37 PM, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
> uname -a
> Linux sasr200-2.arc.nasa.gov 2.6.38.7-30.fc15.x86_64 #1 SMP Fri May 27 05:15:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
>
> There are about 10M files. Many are small. There are about 2M files that are sparse files. It's when the copy program gets to these files that the cpu usage gets very high. There are no links of any kind.
>
> The copy program is written in Java, but uses the fiemap to get the logical address ranges that have actually been allocated. It merges any contiguous logical address ranges when it reads and writes to the new file.
Fiemap?!? What kind of copy algorithm are you using?
Why aren't you just doing a "read 10 megs from ext3",
"write 10 megs to ext4"? How does fiemap figure into this?
-- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-28 18:37 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
2011-06-28 20:14 ` Theodore Tso
@ 2011-06-28 20:17 ` Andreas Dilger
2011-06-29 23:16 ` Sean McCauliff
1 sibling, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2011-06-28 20:17 UTC (permalink / raw)
To: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
Cc: Ted Ts'o, linux-ext4
On 2011-06-28, at 12:37 PM, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
> uname -a
> Linux sasr200-2.arc.nasa.gov 2.6.38.7-30.fc15.x86_64 #1 SMP Fri May 27 05:15:53 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
>
> There are about 10M files. Many are small. There are about 2M files that are sparse files. It's when the copy program gets to these files that the cpu usage gets very high. There are no links of any kind.
>
> The copy program is written in Java, but uses the fiemap to get the logical address ranges that have actually been allocated. It merges any contiguous logical address ranges when it reads and writes to the new file.
Note that you need to be careful with FIEMAP for copying files... There were
some problems reported to this list with this, if the file was newly written.
It is safest to always pass FIEMAP_FLAG_SYNC before copying the file to ensure
the blocks are mapped to disk.
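As an illustration of the advice above (this is not code from the thread), a whole-file request that sets FIEMAP_FLAG_SYNC might be built roughly as in the minimal C++ sketch below. The helper names initSyncedRequest and countExtentsSynced are hypothetical; FS_IOC_FIEMAP, struct fiemap and FIEMAP_FLAG_SYNC come from the Linux headers. It assumes a file system that supports the fiemap ioctl.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <system_error>
#include <sys/ioctl.h>
#include <linux/fs.h>      // FS_IOC_FIEMAP
#include <linux/fiemap.h>  // struct fiemap, FIEMAP_FLAG_SYNC

// Hypothetical helper: fill a request that covers the whole file and asks
// the kernel to sync the file before mapping its extents.
void initSyncedRequest(struct fiemap* req, __u32 nExtents) {
    assert(req != nullptr);
    std::memset(req, 0, sizeof(*req));
    req->fm_start = 0;                 // start at logical offset 0
    req->fm_length = ~0ULL;            // map through the end of the file
    req->fm_flags = FIEMAP_FLAG_SYNC;  // flush dirty data before mapping
    req->fm_extent_count = nExtents;   // 0 asks only for the extent count
}

// Hypothetical helper: number of extents the kernel reports for fd.
__u32 countExtentsSynced(int fd) {
    struct fiemap req;
    initSyncedRequest(&req, 0);
    if (ioctl(fd, FS_IOC_FIEMAP, &req) < 0) {
        throw std::system_error(errno, std::generic_category(), "FS_IOC_FIEMAP");
    }
    return req.fm_mapped_extents;
}
```

The sync adds some cost per call, so it is presumably only worth setting when the source files may have been written recently.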
> The copy has completed. This is a snippet from top I had saved. This machine has 4 cores and 8G of ram. There are 32 threads doing copies. At any time each has a directory to itself.
>
> % cpu
> 0573 root 20 0 7574m 1.9g 1356 S 204.3 24.9 3054:22 java
> 27702 root 20 0 0 0 0 R 70.5 0.0 689:01.73 flush-253:2
> 22467 root 20 0 0 0 0 S 22.6 0.0 7:55.98 kworker/3:1
> 22351 root 20 0 0 0 0 S 21.6 0.0 9:42.58 kworker/1:3
> 22686 root 20 0 0 0 0 S 21.3 0.0 0:26.19 kworker/2:0
> 22679 root 20 0 0 0 0 S 13.8 0.0 0:29.14 kworker/0:1
> 38 root 20 0 0 0 0 S 9.2 0.0 91:21.19 kswapd0
> 22700 root 20 0 0 0 0 S 7.9 0.0 0:04.64 kworker/0:0
> 10566 root 20 0 0 0 0 S 3.6 0.0 17:14.77 jbd2/dm-2-8
>
> If I remember correctly top said that: 97% of time was sys time. So even the time used by Java was still almost all kernel time. Only a few megabytes was actually swapped.
Looking at the above, "java" is using by far the most memory/CPU, unless this
program is not just doing the copy?
You could run oprofile to see where the CPU cycles are being used.
Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: High CPU Utilization When Copying to Ext4
2011-06-28 20:14 ` Theodore Tso
@ 2011-06-28 20:20 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
2011-06-29 13:08 ` Theodore Tso
0 siblings, 1 reply; 14+ messages in thread
From: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] @ 2011-06-28 20:20 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux-ext4
Last time I benchmarked cp and tar with their respective sparse file options, they were extremely slow, as they (claim to) identify sparseness by scanning for contiguous regions of zeros. This was quite some time ago, so perhaps cp and tar have changed.
Sean
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-28 20:20 ` Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]
@ 2011-06-29 13:08 ` Theodore Tso
2011-06-30 0:01 ` Sean McCauliff
0 siblings, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2011-06-29 13:08 UTC (permalink / raw)
To: Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS]; +Cc: linux-ext4
On Jun 28, 2011, at 4:20 PM, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
> Last time I benchmarked cp and tar with their respective sparse file options, they were extremely slow, as they (claim to) identify sparseness by scanning for contiguous regions of zeros. This was quite some time ago, so perhaps cp and tar have changed.
How many of your files are sparse? If the source file is not sparse (which you can check by comparing st_blocks against the st_size value), you can skip calling fiemap in that case.
Also, how many times are you calling fiemap per file? Are you calling once per block, or something silly like that?
(These are all details that should have been in your initial question, by the way.... we're not mind readers, you know. Can you just send a copy of the key parts of your Java code?)
-- Ted
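The st_blocks versus st_size check suggested above could look roughly like the following C++ sketch. It is illustrative only: the function names are hypothetical, and it assumes Linux, where st_blocks is reported in 512-byte units.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <sys/stat.h>
#include <sys/types.h>

// A file looks sparse when fewer bytes are allocated than its length.
// Assumption: st_blocks counts 512-byte units (true on Linux).
bool looksSparse(off_t size, blkcnt_t blocks) {
    assert(size >= 0 && blocks >= 0);
    return static_cast<long long>(blocks) * 512 < static_cast<long long>(size);
}

// Hypothetical wrapper: stat the path and apply the check above.
bool fileLooksSparse(const char* path) {
    struct stat st;
    if (stat(path, &st) != 0) {
        throw std::system_error(errno, std::generic_category(), path);
    }
    return looksSparse(st.st_size, st.st_blocks);
}
```

This is only a heuristic. A file whose blocks have not been allocated yet can also look sparse, but a false positive merely costs one unnecessary fiemap call.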
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-28 20:17 ` Andreas Dilger
@ 2011-06-29 23:16 ` Sean McCauliff
0 siblings, 0 replies; 14+ messages in thread
From: Sean McCauliff @ 2011-06-29 23:16 UTC (permalink / raw)
To: Andreas Dilger; +Cc: Ted Ts'o, linux-ext4
On 06/28/2011 01:17 PM, Andreas Dilger wrote:
> Note that you need to be careful with FIEMAP for copying files... There were
> some problems reported to this list with this, if the file was newly written.
> It is safest to always pass FIEMAP_FLAG_SYNC before copying the file to ensure
> the blocks are mapped to disk.
Thanks!
>
>> The copy has completed. This is a snippet from top I had saved. This machine has 4 cores and 8G of ram. There are 32 threads doing copies. At any time each has a directory to itself.
>>
>> % cpu
>> 0573 root 20 0 7574m 1.9g 1356 S 204.3 24.9 3054:22 java
>> 27702 root 20 0 0 0 0 R 70.5 0.0 689:01.73 flush-253:2
>> 22467 root 20 0 0 0 0 S 22.6 0.0 7:55.98 kworker/3:1
>> 22351 root 20 0 0 0 0 S 21.6 0.0 9:42.58 kworker/1:3
>> 22686 root 20 0 0 0 0 S 21.3 0.0 0:26.19 kworker/2:0
>> 22679 root 20 0 0 0 0 S 13.8 0.0 0:29.14 kworker/0:1
>> 38 root 20 0 0 0 0 S 9.2 0.0 91:21.19 kswapd0
>> 22700 root 20 0 0 0 0 S 7.9 0.0 0:04.64 kworker/0:0
>> 10566 root 20 0 0 0 0 S 3.6 0.0 17:14.77 jbd2/dm-2-8
>>
>> If I remember correctly top said that: 97% of time was sys time. So even the time used by Java was still almost all kernel time. Only a few megabytes was actually swapped.
>
> Looking at the above, "java" is using by far the most memory/CPU, unless this
> program is not just doing the copy?
It does walk down the directory tree. When it finds a directory it
creates a new object in the thread work queue for the directory it
found. Threads read off the queue, creating directories in the
destination and copying files to the destination.
>
> You could run oprofile to see where the CPU cycles are being used.
I will do this next time I'm running the copy.
Thanks,
Sean
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-29 13:08 ` Theodore Tso
@ 2011-06-30 0:01 ` Sean McCauliff
2011-06-30 2:33 ` Ted Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: Sean McCauliff @ 2011-06-30 0:01 UTC (permalink / raw)
To: Theodore Tso; +Cc: linux-ext4
On 06/29/2011 06:08 AM, Theodore Tso wrote:
>
> On Jun 28, 2011, at 4:20 PM, Mccauliff, Sean D. (ARC-PX)[Lockheed Martin Space OPNS] wrote:
>
>> Last time I benchmarked cp and tar with their respective sparse file options, they were extremely slow, as they (claim to) identify sparseness by scanning for contiguous regions of zeros. This was quite some time ago, so perhaps cp and tar have changed.
>
> How many of your files are sparse? If the source file is not sparse (which you can check by comparing st_blocks against the st_size value), you can skip calling fiemap in that case.
I already know most of the files are not sparse and can identify them by
the directory name they reside in. So the copy program just does a
straight copy for the non-sparse files without the fiemap trickery.
I've already mentioned that I have about 2M sparse files.
>
> Also, how many times are you calling fiemap per file? Are you calling once per block, or something silly like this?
Twice. Once to get the number of struct fiemap_extent and another time
with the correct number of struct fiemap_extent.
>
> (This is all of the details that should have been in your initial question, by the way.... we're not mind readers, you know. Can you just send a copy of the key parts of your Java code?)
Sorry, I didn't mean to bother you. I did try and email ext3-users so
as to not take up any developer time with my question. Portions of the
source are below. It might also be useful to know the source and
destination file systems live on a 3par SAN, RAID 1+0 striped across
240 7200 rpm disks. The source file system uses LVM to combine several
3par volumes into a single volume. The destination file system does not
use LVM. There are two FC HBAs, they are load balanced using
multipathd. My original question:
> I'm copying terabytes of data from an ext3 file system to a new ext4
> file system. I'm seeing high CPU usage from the processes
> flush-253:2, kworker-3:0, kworker-2:2, kworker-1:1, and kworker-0:0.
> Does anyone on the list have any idea what these processes do, why
> they are consuming so much cpu time and if there is something that
> can be done about it? This is using Fedora 15.
Thanks,
Sean
// This is a snippet from extentmap.cpp; I thought I would spare you
// the madness of looking at the JNI portion.
static void initFiemap(struct fiemap* fiemap, __u32 nExtents) {
    if (fiemap == 0) {
        throw FiemapException("Bad fiemap pointer.");
    }
    memset(fiemap, 0, sizeof(struct fiemap));
    // Start mapping at logical byte offset 0 of the file.
    fiemap->fm_start = 0;
    // Map through the last possible byte offset.
    fiemap->fm_length = ~0ULL;
    // In the current code this is now FIEMAP_FLAG_SYNC
    fiemap->fm_flags = 0;
    fiemap->fm_extent_count = nExtents;
    fiemap->fm_mapped_extents = 0;
    memset(fiemap->fm_extents, 0, sizeof(struct fiemap_extent) * nExtents);
}
static struct fiemap *readFiemap(int fd) throw (FiemapException) {
    struct fiemap* extentMap =
        reinterpret_cast<struct fiemap*>(malloc(sizeof(struct fiemap)));
    if (extentMap == 0) {
        throw FiemapException("Failed to allocate fiemap struct.");
    }
    FiemapDeallocator fiemapDeallocator(extentMap);
    initFiemap(extentMap, 0);
    // Find out how many extents there are
    if (ioctl(fd, FS_IOC_FIEMAP, extentMap) < 0) {
        char errbuf[128];
        // Assuming the GNU strerror_r (the g++/glibc default): it returns the
        // message pointer and may leave errbuf untouched, so use the result.
        const char* msg = strerror_r(errno, errbuf, sizeof(errbuf));
        throw FiemapException(msg);
    }
    __u32 nExtents = extentMap->fm_mapped_extents;
    __u32 extents_size = sizeof(struct fiemap_extent) * nExtents;
    fiemapDeallocator.noDeallocate();
    // Resize fiemap to allow us to read in the extents.
    extentMap = reinterpret_cast<struct fiemap*>(
        realloc(extentMap, sizeof(struct fiemap) + extents_size));
    if (extentMap == 0) {
        throw FiemapException("Out of memory allocating fiemap.");
    }
    initFiemap(extentMap, nExtents);
    FiemapDeallocator reallocDeallocator(extentMap);
    if (ioctl(fd, FS_IOC_FIEMAP, extentMap) < 0) {
        char errbuf[128];
        // Same GNU strerror_r assumption as above.
        const char* msg = strerror_r(errno, errbuf, sizeof(errbuf));
        throw FiemapException(msg);
    }
    reallocDeallocator.noDeallocate();
    return extentMap;
}
// This is from the Java code, SparseFileUtil.java
public List<SimpleInterval> extents(File file) throws IOException {
    // A SimpleInterval is just a 64-bit start and end pair
    SimpleInterval[] extents = null;
    try {
        extents = extentsForFile(file.getAbsolutePath());
    } catch (IllegalArgumentException iae) {
        throw new IllegalArgumentException("For file \"" + file +
            "\".", iae);
    }
    if (extents.length == 0) {
        return Collections.emptyList();
    }
    Arrays.sort(extents, comp);
    List<SimpleInterval> mergedExtents = new ArrayList<SimpleInterval>();
    SimpleInterval current = extents[0];
    // merge adjacent extents
    for (int i = 1; i < extents.length; i++) {
        SimpleInterval sortedExtent = extents[i];
        if (current.end() < sortedExtent.start()) {
            mergedExtents.add(current);
            current = sortedExtent;
        } else {
            current = new SimpleInterval(
                Math.min(sortedExtent.start(), current.start()),
                Math.max(current.end(), sortedExtent.end()));
        }
    }
    mergedExtents.add(current);
    return mergedExtents;
}
public void copySparseFile(File src, File dest) throws IOException {
    if (!src.exists()) {
        throw new FileNotFoundException(src.getAbsolutePath());
    }
    if (src.isDirectory()) {
        throw new IllegalArgumentException("Src must be a file.");
    }
    List<SimpleInterval> extents = extents(src);
    if (extents.size() == 1 && extents.get(0).start() == 0) {
        FileUtils.copyFile(src, dest);
        return;
    }
    byte[] buf = new byte[1024*1024];
    RandomAccessFile srcRaf = new RandomAccessFile(src, "r");
    try {
        RandomAccessFile destRaf = new RandomAccessFile(dest, "rw");
        try {
            for (SimpleInterval extent : extents) {
                long extentSize = extent.end() - extent.start() + 1;
                srcRaf.seek(extent.start());
                destRaf.seek(extent.start());
                while (extentSize > 0) {
                    int readLen = (int) Math.min(buf.length, extentSize);
                    int nread = srcRaf.read(buf, 0, readLen);
                    if (nread == -1) {
                        break; // file ends before extent ends.
                    }
                    extentSize -= nread;
                    destRaf.write(buf, 0, nread);
                }
            }
        } finally {
            FileUtil.close(destRaf);
        }
    } finally {
        FileUtil.close(srcRaf);
    }
}

private native SimpleInterval[] extentsForFile(String fname)
    throws IOException;
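For comparison with the Java copy loop above, here is a minimal C++ sketch of the same extent-by-extent idea. It is not the poster's code: ByteRange and copyRanges are hypothetical names, the ranges are half-open [start, end) rather than the inclusive end the Java code appears to use, and EINTR handling is omitted. The final ftruncate is an addition that is not in the posted Java; it assumes the destination should keep the source length even when the source ends in a hole.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <system_error>
#include <utility>
#include <vector>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for SimpleInterval: [first, second) in bytes.
using ByteRange = std::pair<off_t, off_t>;

// Copy only the listed ranges from srcFd to dstFd, leaving holes elsewhere.
void copyRanges(int srcFd, int dstFd, const std::vector<ByteRange>& ranges) {
    struct stat st;
    if (fstat(srcFd, &st) != 0) {
        throw std::system_error(errno, std::generic_category(), "fstat");
    }
    std::vector<char> buf(1024 * 1024);
    for (const ByteRange& r : ranges) {
        assert(r.first >= 0 && r.first <= r.second);
        off_t pos = r.first;
        while (pos < r.second) {
            size_t want = static_cast<size_t>(
                std::min<off_t>(static_cast<off_t>(buf.size()), r.second - pos));
            ssize_t got = pread(srcFd, buf.data(), want, pos);
            if (got < 0) {
                throw std::system_error(errno, std::generic_category(), "pread");
            }
            if (got == 0) {
                break;  // file ends before the extent does
            }
            // pwrite may write fewer bytes than asked, so loop until done.
            for (ssize_t done = 0; done < got;) {
                ssize_t n = pwrite(dstFd, buf.data() + done,
                                   static_cast<size_t>(got - done), pos + done);
                if (n < 0) {
                    throw std::system_error(errno, std::generic_category(), "pwrite");
                }
                done += n;
            }
            pos += got;
        }
    }
    // Give the destination the source length so a trailing hole is kept.
    if (ftruncate(dstFd, st.st_size) != 0) {
        throw std::system_error(errno, std::generic_category(), "ftruncate");
    }
}
```

Using pread and pwrite with explicit offsets avoids a separate seek per extent on each descriptor.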
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-30 0:01 ` Sean McCauliff
@ 2011-06-30 2:33 ` Ted Ts'o
2011-07-08 17:08 ` Sean McCauliff
0 siblings, 1 reply; 14+ messages in thread
From: Ted Ts'o @ 2011-06-30 2:33 UTC (permalink / raw)
To: Sean McCauliff; +Cc: linux-ext4
On Wed, Jun 29, 2011 at 05:01:45PM -0700, Sean McCauliff wrote:
> Sorry, I didn't mean to bother you. I did try and email ext3-users
> so as to not take up any developer time with my question.
Yeah, but it's not likely anyone on that list would be able to help
you. Neither ext3 nor ext4 is expected to take a huge amount of CPU
under normal conditions when doing this type of copying, where you will
likely be disk bound.
Well, you're not using fallocate() (at least you haven't disclosed it
to date), and writing into fallocated space is the only thing that
would be using a workqueue at all (which is what the kworker threads
are using).
So I very much doubt it has anything to do with ext4. The Fibre
Channel drivers do use workqueues a fair amount, so yes, it is
useful to know that you are using a Fibre Channel SAN. At this point
I'd suggest that you use oprofile or perf to see where the CPU is
being consumed. Perf is probably better since it will allow you to
see the call chains.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-06-30 2:33 ` Ted Ts'o
@ 2011-07-08 17:08 ` Sean McCauliff
2011-07-08 17:17 ` Andi Kleen
2011-07-08 18:24 ` Ted Ts'o
0 siblings, 2 replies; 14+ messages in thread
From: Sean McCauliff @ 2011-07-08 17:08 UTC (permalink / raw)
To: Ted Ts'o; +Cc: linux-ext4
I tried running perf on the copy program on a subset of the sparse files.
It seems like ext4 is the source of high cpu utilization. At this
point this high cpu utilization is very annoying, but I can live with
this problem. If you know something simple I could do to alleviate this
problem I would be most appreciative. At the end of this email is a
consolidation of information about this problem.
Events: 6M cycles
-  76.80%  java  [kernel.kallsyms]  [k] ext4_mb_good_group
   - ext4_mb_good_group
      - 99.24% ext4_mb_regular_allocator
           ext4_mb_new_blocks
           ext4_ext_map_blocks
           ext4_map_blocks
         - mpage_da_map_and_submit
            - 96.25% write_cache_pages_da
                 ext4_da_writepages
                 do_writepages
                 writeback_single_inode
                 writeback_sb_inodes
                 writeback_inodes_wb
                 balance_dirty_pages_ratelimited_nr
                 generic_file_buffered_write
                 __generic_file_aio_write
                 generic_file_aio_write
                 ext4_file_write
                 do_sync_write
                 vfs_write
                 sys_write
                 system_call_fastpath
               - 0x338480df7d
                    100.00% writeBytes
            + 3.75% ext4_da_writepages
      + 0.76% ext4_mb_new_blocks
+   4.07%  java  [kernel.kallsyms]  [k] do_raw_spin_lock
+   2.19%  java  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
+   1.53%  java  [kernel.kallsyms]  [k] ext4_get_group_info
+   1.07%  java  [kernel.kallsyms]  [k] ext4_mb_regular_allocator
+   1.07%  java  [kernel.kallsyms]  [k] compaction_alloc
+   0.85%  java  [kernel.kallsyms]  [k] read_hpet
+   0.40%  java  [kernel.kallsyms]  [k] copy_user_generic_string
+   0.32%  java  [kernel.kallsyms]  [k] __bitmap_empty
+   0.31%  java  [kernel.kallsyms]  [k] ktime_get
Specifics:
The copy program is written in Java with some C code that calls the
fiemap ioctl. It uses this to maintain the sparseness of the
destination files and seems to be much faster than doing contiguous zero
detection like tar or cp in order to identify the holes in the files.
The copy program is using 64 threads.
During the copy, system CPU is over 90%; iowait is generally only 1 or 2%.
Source file system is 8T ext3, destination file system is 16T ext4.
Files are sparse, non-sparse size is 17M. They have a few hundred
extents on average as reported by filefrag. The destination files
generated by the copy program have fewer extents, but are otherwise
identical. I assume this is due to smarter allocation by ext4.
The source file system is built on top of LVM which is built on top of
four multipath devices which load balance for a pair of qlogic FC HBAs.
The destination file system is built on top of a single multipath
device which load balances the same pair of HBAs (no LVM).
The SAN is a 3par with 240 SATA drives. Each LUN exported to the server
is in a RAID1+0 configuration striped over all the drives. The server
is directly connected, without an FC switch.
Fedora 15.
Linux xxxx.arc.nasa.gov 2.6.38.8-32.fc15.x86_64 #1 SMP Mon Jun 13
19:49:05 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
The server has 8 cores and 64G of memory.
Nothing else is running or consuming substantial resources on this
server. top shows that java, flush and kworker processes are consuming cpu.
Thanks!
Sean
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-07-08 17:08 ` Sean McCauliff
@ 2011-07-08 17:17 ` Andi Kleen
2011-07-08 17:41 ` Sean McCauliff
2011-07-08 18:24 ` Ted Ts'o
1 sibling, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2011-07-08 17:17 UTC (permalink / raw)
To: Sean McCauliff; +Cc: Ted Ts'o, linux-ext4
Sean McCauliff <Sean.D.McCauliff@nasa.gov> writes:
> I tried running perf on the copy program on subset of the sparse
> files. It seems like ext4 is the source of high cpu utilization. At
> this point this high cpu utilization is very annoying, but I can live
> with this problem. If you know something simple I could do to
> alleviate this problem I would be most appreciative. At the end of
> this email is a consolidation of information about this problem.
>
> Events: 6M cycles
> -76.80% java [kernel.kallsyms] [k] ext4_mb_good_group
> - ext4_mb_good_group
Is your file system too full?
A lot of file systems get inefficient in allocating blocks when the
file system is nearly full.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-07-08 17:17 ` Andi Kleen
@ 2011-07-08 17:41 ` Sean McCauliff
0 siblings, 0 replies; 14+ messages in thread
From: Sean McCauliff @ 2011-07-08 17:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: Ted Ts'o, linux-ext4
43% of the destination file system is in use.
Sean
On 07/08/2011 10:17 AM, Andi Kleen wrote:
> Sean McCauliff<Sean.D.McCauliff@nasa.gov> writes:
>
>> I tried running perf on the copy program on subset of the sparse
>> files. It seems like ext4 is the source of high cpu utilization. At
>> this point this high cpu utilization is very annoying, but I can live
>> with this problem. If you know something simple I could do to
>> alleviate this problem I would be most appreciative. At the end of
>> this email is a consolidation of information about this problem.
>>
>> Events: 6M cycles
>> -76.80% java [kernel.kallsyms] [k] ext4_mb_good_group
>> - ext4_mb_good_group
>
> Is your file system too full?
>
> A lot of file systems get inefficient in allocating blocks when the
> file system is nearly full.
>
> -Andi
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: High CPU Utilization When Copying to Ext4
2011-07-08 17:08 ` Sean McCauliff
2011-07-08 17:17 ` Andi Kleen
@ 2011-07-08 18:24 ` Ted Ts'o
1 sibling, 0 replies; 14+ messages in thread
From: Ted Ts'o @ 2011-07-08 18:24 UTC (permalink / raw)
To: Sean McCauliff; +Cc: linux-ext4
OK, #1, can you try doing an experiment with "cp -r <src-path>
<dest-path>", and tell me whether the copy speed is faster or slower,
and whether the CPU utilization is higher or lower? Even if it's a
bit slower to use cp -r, it will be easier for us to try to reproduce
things and to understand what cp -r is doing as opposed to some
mystery Java program.
#2, if you know how to set up ftrace, can you enable the
ext4_mballoc_alloc tracepoint and send me a sample output (a few
hundred lines will be plenty)?
The short version is basically:
<start your java copy program>
echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_mballoc_alloc/enable
<wait 30 seconds>
cat /sys/kernel/debug/tracing/trace > /var/tmp/trace-save
echo 0 > /sys/kernel/debug/tracing/events/ext4/ext4_mballoc_alloc/enable
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread