Slow file copy in archive

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

Slow file copy in archive

mman-2
Hi all,

We have integrated Clover into our build in Hudson. When the tests are run we generate 3 reports and that's it.

Executing the tests takes typically less than 30 minutes, generating the reports roughly the same, but publishing the result an insane amount of time.

For example, this morning (we run the build at 6am) it took less than 25 minutes to generate the 3 reports, but more than 2 hours to publish the result. From the console output of the job that produces the report:
Started by upstream project "Clover" build number 94
[hudson] $ ant -file cloverbuild.xml clover.html
Buildfile: cloverbuild.xml

with.clover:
[clover-setup] Clover Version 2.4.2, built on December 01 2008 (build-752)
...
[clover-report] Writing HTML report to '/root/.hudson/jobs/Clover/workspace/clover_html'
[clover-report] Done. Processed 660 packages in 1249022ms (1892ms per package).
[clover-report] Writing historical report to '/root/.hudson/jobs/Clover/workspace/clover_history_html'
[clover-report] Read 54 history points.
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] Done.
[clover-report] Writing report to '/root/.hudson/jobs/Clover/workspace/clover.xml'

BUILD SUCCESSFUL
Total time: 24 minutes 50 seconds
Publishing Clover coverage report...
Publishing Clover coverage results...
Email was triggered for: Success
There are 1 triggered emails.
Sending email for trigger: Success
Sending e-mails to: ...
Finished: SUCCESS


So as you can see Clover is done after 24 minutes and 50 seconds, but the overall build time is 2 hours and 46 minutes. Basically, it takes 2 hours and 21 minutes to archive the reports. And I confirm that because I see the console hanging at publishing (don't remember if report or results) for more than 2 hours.

The reports that we are talking about are roughly a few thousand files (less than 10,000) for a total of ~6 GB. It's not peanuts, I understand that, but why does it take so long? Anything that we can do to speed up?


Thanks,
MM
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Michael Donohue
I would guess there's some sort of quadratic algorithm in there.   There's no bug filed right now, so you might as well file this.  If there's any sort of test case you can provide, that would help significantly.  

-Michael

On Thu, Apr 16, 2009 at 5:26 PM, Manlio Malaidini <[hidden email]> wrote:
Hi all,

We have integrated Clover into our build in Hudson. When the tests are run we generate 3 reports and that's it.

Executing the tests takes typically less than 30 minutes, generating the reports roughly the same, but publishing the result an insane amount of time.

For example, this morning (we run the build at 6am) it took less than 25 minutes to generate the 3 reports, but more than 2 hours to publish the result. From the console output of the job that produces the report:
Started by upstream project "Clover" build number 94
[hudson] $ ant -file cloverbuild.xml clover.html
Buildfile: cloverbuild.xml

with.clover:
[clover-setup] Clover Version 2.4.2, built on December 01 2008 (build-752)
...
[clover-report] Writing HTML report to '/root/.hudson/jobs/Clover/workspace/clover_html'
[clover-report] Done. Processed 660 packages in 1249022ms (1892ms per package).
[clover-report] Writing historical report to '/root/.hudson/jobs/Clover/workspace/clover_history_html'
[clover-report] Read 54 history points.
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] Done.
[clover-report] Writing report to '/root/.hudson/jobs/Clover/workspace/clover.xml'

BUILD SUCCESSFUL
Total time: 24 minutes 50 seconds
Publishing Clover coverage report...
Publishing Clover coverage results...
Email was triggered for: Success
There are 1 triggered emails.
Sending email for trigger: Success
Sending e-mails to: ...
Finished: SUCCESS


So as you can see Clover is done after 24 minutes and 50 seconds, but the overall build time is 2 hours and 46 minutes. Basically, it takes 2 hours and 21 minutes to archive the reports. And I confirm that because I see the console hanging at publishing (don't remember if report or results) for more than 2 hours.

The reports that we are talking about are roughly a few thousand files (less than 10,000) for a total of ~6 GB. It's not peanuts, I understand that, but why does it take so long? Anything that we can do to speed up?


Thanks,
MM

Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Ronald Klop
Does it have anything to do with this?
https://hudson.dev.java.net/issues/show_bug.cgi?id=3524

Ronald.

Op woensdag, 22 april 2009 02:47 schreef Michael Donohue :

I would guess there's some sort of quadratic algorithm in there.   There's no bug filed right now, so you might as well file this.  If there's any sort of test case you can provide, that would help significantly.  

-Michael

On Thu, Apr 16, 2009 at 5:26 PM, Manlio Malaidini <<a onclick="composeMailto('mmalaidini@gmail.com'); return false;" href="#">mmalaidini@...> wrote:
Hi all,

We have integrated Clover into our build in Hudson. When the tests are run we generate 3 reports and that's it.

Executing the tests takes typically less than 30 minutes, generating the reports roughly the same, but publishing the result an insane amount of time.

For example, this morning (we run the build at 6am) it took less than 25 minutes to generate the 3 reports, but more than 2 hours to publish the result. From the console output of the job that produces the report:
Started by upstream project "Clover" build number 94
[hudson] $ ant -file cloverbuild.xml clover.html
Buildfile: cloverbuild.xml

with.clover:
[clover-setup] Clover Version 2.4.2, built on December 01 2008 (build-752)
...
[clover-report] Writing HTML report to '/root/.hudson/jobs/Clover/workspace/clover_html'
[clover-report] Done. Processed 660 packages in 1249022ms (1892ms per package).
[clover-report] Writing historical report to '/root/.hudson/jobs/Clover/workspace/clover_history_html'
[clover-report] Read 54 history points.
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] using movers interval of 1 day
[clover-report] Done.
[clover-report] Writing report to '/root/.hudson/jobs/Clover/workspace/clover.xml'

BUILD SUCCESSFUL
Total time: 24 minutes 50 seconds
Publishing Clover coverage report...
Publishing Clover coverage results...
Email was triggered for: Success
There are 1 triggered emails.
Sending email for trigger: Success
Sending e-mails to: ...
Finished: SUCCESS
 


So as you can see Clover is done after 24 minutes and 50 seconds, but the overall build time is 2 hours and 46 minutes. Basically, it takes 2 hours and 21 minutes to archive the reports. And I confirm that because I see the console hanging at publishing (don't remember if report or results) for more than 2 hours.

The reports that we are talking about are roughly a few thousand files (less than 10,000) for a total of ~6 GB. It's not peanuts, I understand that, but why does it take so long? Anything that we can do to speed up?


Thanks,
MM

 

Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
In reply to this post by Michael Donohue
On Tue, Apr 21, 2009 at 8:47 PM, Michael Donohue <[hidden email]> wrote:
I would guess there's some sort of quadratic algorithm in there.  

It could be.

Actually, I was imprecise. 3 artifacts are archived:
*) clover.xml (negligible)
*) clover_html ==> a directory of 6.7 GB and 94135 files
*) clover_history_html ==> a directory of 644 KB and 29 files
So the real deal is clover_html.
 
There's no bug filed right now, so you might as well file this.

Will do that. In the meantime I think I'll do this:
*) as part of the build gzip everything above, move it to the archive directory and gunzip
*) get rid of the archive artifacts configuration from the current build

It is my understanding that anything that goes in the archive directory is available as "artifacts" => it doesn't really matter how it got there, right?

  If there's any sort of test case you can provide, that would help significantly.  

You mean something like:
*) configure a build that generates 10000 files of 10 KB each
*) archive the files
*) check that it takes less than N seconds/minutes?
 
Thanks
MM
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
In reply to this post by Ronald Klop
On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop <[hidden email]> wrote:
Does it have anything to do with this?
https://hudson.dev.java.net/issues/show_bug.cgi?id=3524

Ronald.

In theory not: I'm not using a distributed build, everything is local, one machine only.

So unless both behaviours are related to the way those files are copied, rather than how the network traffic is handled, I think that they are not related to each other.

Am I totally off?
Thanks,
MM

Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Ronald Klop
Op donderdag, 23 april 2009 13:43 schreef Manlio Malaidini :
On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop <<a onclick="composeMailto('ronald-mailinglist@base.nl'); return false;" href="#">ronald-mailinglist@...> wrote:
Does it have anything to do with this?
https://hudson.dev.java.net/issues/show_bug.cgi?id=3524

Ronald.

In theory not: I'm not using a distributed build, everything is local, one machine only.

So unless both behaviours are related to the way those files are copied, rather than how the network traffic is handled, I think that they are not related to each other.

Am I totally off?
Thanks,
MM

 
If I run my testcase (issue 3524) on my master archiving the artifacts is quick. So these might be different issue's.

Ronald.


Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
In reply to this post by mman-2
On Thu, Apr 23, 2009 at 7:38 AM, Manlio Malaidini <[hidden email]> wrote:
On Tue, Apr 21, 2009 at 8:47 PM, Michael Donohue <[hidden email]> wrote:
I would guess there's some sort of quadratic algorithm in there.  

It could be.

Actually, I was imprecise. 3 artifacts are archived:
*) clover.xml (negligible)
*) clover_html ==> a directory of 6.7 GB and 94135 files
*) clover_history_html ==> a directory of 644 KB and 29 files
So the real deal is clover_html.
 

More data:
*) 766 directories in clover_html folder
*) a brutal cp -R takes roughly 36' (still a lot less than 2h and 40' measured on last build)


Thanks
MM
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
In reply to this post by Michael Donohue
On Tue, Apr 21, 2009 at 8:47 PM, Michael Donohue <[hidden email]> wrote:
I would guess there's some sort of quadratic algorithm in there.   There's no bug filed right now, so you might as well file this.  If there's any sort of test case you can provide, that would help significantly.  

I've filed a bug with a rough testcase here: https://hudson.dev.java.net/issues/show_bug.cgi?id=3558

Thanks,
MM
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Kohsuke Kawaguchi
Administrator
In reply to this post by mman-2
Manlio Malaidini wrote:

> On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop <[hidden email]>wrote:
>
>> Does it have anything to do with this?
>> https://hudson.dev.java.net/issues/show_bug.cgi?id=3524
>>
>> Ronald.
>>
>
> In theory not: I'm not using a distributed build, everything is local, one
> machine only.
For local->local copy, Hudson uses the copy task from Ant.

Can you do a quick test for me? Write an Ant build script that does copy
   like

   <copy todir="/tmp">
     <fileset dir="path/to/workspace"
              includes="glob-that-you-specify-to-hudson" />

Does that have a similar performance characteristic?


HUDSON-3524 is about remote copying, which runs a rather different codepath.

> So unless both behaviours are related to the way those files are copied,
> rather than how the network traffic is handled, I think that they are not
> related to each other.
>
> Am I totally off?
> Thanks,
> MM
>


--
Kohsuke Kawaguchi
Sun Microsystems                   http://weblogs.java.net/blog/kohsuke/

smime.p7s (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Peter Reilly-2
ant's <copy/> is brutal slow for very large number of files.

It reads all the file names into memory and then does the
coping. This reading into memory can actually take a long
time and can cause OOMs.

At one stage I attempted to write a streaming version, but
this project did not get out of concept phase.

Peter

On Fri, Apr 24, 2009 at 6:16 AM, Kohsuke Kawaguchi
<[hidden email]> wrote:

> Manlio Malaidini wrote:
>>
>> On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop
>> <[hidden email]>wrote:
>>
>>> Does it have anything to do with this?
>>> https://hudson.dev.java.net/issues/show_bug.cgi?id=3524
>>>
>>> Ronald.
>>>
>>
>> In theory not: I'm not using a distributed build, everything is local, one
>> machine only.
>
> For local->local copy, Hudson uses the copy task from Ant.
>
> Can you do a quick test for me? Write an Ant build script that does copy
>  like
>
>  <copy todir="/tmp">
>    <fileset dir="path/to/workspace"
>             includes="glob-that-you-specify-to-hudson" />
>
> Does that have a similar performance characteristic?
>
>
> HUDSON-3524 is about remote copying, which runs a rather different codepath.
>
>> So unless both behaviours are related to the way those files are copied,
>> rather than how the network traffic is handled, I think that they are not
>> related to each other.
>>
>> Am I totally off?
>> Thanks,
>> MM
>>
>
>
> --
> Kohsuke Kawaguchi
> Sun Microsystems                   http://weblogs.java.net/blog/kohsuke/
>

<div><br></div>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
In reply to this post by Kohsuke Kawaguchi
On Fri, Apr 24, 2009 at 1:16 AM, Kohsuke Kawaguchi <[hidden email]> wrote:
 
Manlio Malaidini wrote:
 
On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop <[hidden email]>wrote:
 
 
Does it have anything to do with this?
 
Ronald.
 
 
In theory not: I'm not using a distributed build, everything is local, one
machine only.
 
 
For local->local copy, Hudson uses the copy task from Ant.
 
Can you do a quick test for me? Write an Ant build script that does copy  like
 
 <copy todir="/tmp">
   <fileset dir="path/to/workspace"
            includes="glob-that-you-specify-to-hudson" />
 
Does that have a similar performance characteristic?
 
 
I have done it against the testcase that I've specified at https://hudson.dev.java.net/issues/show_bug.cgi?id=3558
 
The result is better than Hudson (by a factor of 3), but worse than "cp -R" (again, roughly by a factor of 3):
 
[workspace]# time ant
Buildfile: build.xml
copy:
     [copy] Copying 10000 files to /tmp/x
BUILD SUCCESSFUL
Total time: 10 seconds
real    0m10.624s
user    0m5.095s
sys     0m2.226s
[workspace]#
 
Where:
 
[workspace]# cat build.xml
<project default="copy" basedir="./">
 <target name="copy">
  <copy todir="/tmp/x">
    <fileset dir="./"
             includes="file*.bin" />
  </copy>
 </target>
</project>
 
And the files were generated with:
 
time for i in {1..10000}; do dd if=/dev/urandom of="file$i.bin" bs=10K count=1; done
 
(which takes roughly 50 seconds)
 
Total time of last build of the test case was 1 min 21 sec, of which 45 seconds to generate the files.
 
I wonder if a possible workaround can be:
*) don't archive expensive artifacts, but only simple ones
*) run a cron job roughly 30' after that check last build and copies files in archive directory with "cp -R"
In alternative, can I hook into the build a different copy method than ant's copy?
 
Thanks,
MM
 
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Kohsuke Kawaguchi
Administrator
In reply to this post by Peter Reilly-2

I suppose the enhanced glob handling makes the copying more expensive.
Plus Ant doesn't use NIO, which would improve the copy performance also.

Now I suppose the question is what to do with this...

Thanks for the insight.

Peter Reilly wrote:

> ant's <copy/> is brutal slow for very large number of files.
>
> It reads all the file names into memory and then does the
> coping. This reading into memory can actually take a long
> time and can cause OOMs.
>
> At one stage I attempted to write a streaming version, but
> this project did not get out of concept phase.
>
> Peter
>
> On Fri, Apr 24, 2009 at 6:16 AM, Kohsuke Kawaguchi
> <[hidden email]> wrote:
>> Manlio Malaidini wrote:
>>>
>>> On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop
>>> <[hidden email]>wrote:
>>>
>>>> Does it have anything to do with this?
>>>> https://hudson.dev.java.net/issues/show_bug.cgi?id=3524
>>>>
>>>> Ronald.
>>>>
>>>
>>> In theory not: I'm not using a distributed build, everything is local, one
>>> machine only.
>>
>> For local->local copy, Hudson uses the copy task from Ant.
>>
>> Can you do a quick test for me? Write an Ant build script that does copy
>> ?like
>>
>> ?<copy todir="/tmp">
>> ? ?<fileset dir="path/to/workspace"
>> ? ? ? ? ? ? includes="glob-that-you-specify-to-hudson" />
>>
>> Does that have a similar performance characteristic?
>>
>>
>> HUDSON-3524 is about remote copying, which runs a rather different codepath.
>>
>>> So unless both behaviours are related to the way those files are copied,
>>> rather than how the network traffic is handled, I think that they are not
>>> related to each other.
>>>
>>> Am I totally off?
>>> Thanks,
>>> MM
>>>
>>
>>
>> --
>> Kohsuke Kawaguchi
>> Sun Microsystems ? ? ? ? ? ? ? ? ? http://weblogs.java.net/blog/kohsuke/
>>
>
> <div><br></div>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>

--
Kohsuke Kawaguchi
Sun Microsystems                   http://weblogs.java.net/blog/kohsuke/

smime.p7s (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Kohsuke Kawaguchi
Administrator
In reply to this post by mman-2
Manlio Malaidini wrote:

> On Fri, Apr 24, 2009 at 1:16 AM, Kohsuke Kawaguchi <
> [hidden email]> wrote:
>
>> Manlio Malaidini wrote:
>>
>>> On Wed, Apr 22, 2009 at 5:05 AM, Ronald Klop <[hidden email]>wrote:
>>>
>>>
>>>
>>>> Does it have anything to do with this?
>>>> https://hudson.dev.java.net/issues/show_bug.cgi?id=3524
>>>>
>>>> Ronald.
>>>>
>>>>
>>>>
>>> In theory not: I'm not using a distributed build, everything is local, one
>>> machine only.
>>>
>>>
>>
>> For local->local copy, Hudson uses the copy task from Ant.
>> Can you do a quick test for me? Write an Ant build script that does copy
>>  like
>>
>>  <copy todir="/tmp">
>>    <fileset dir="path/to/workspace"
>>             includes="glob-that-you-specify-to-hudson" />
>>
>> Does that have a similar performance characteristic?
>>
>>
>
> I have done it against the testcase that I've specified at
> https://hudson.dev.java.net/issues/show_bug.cgi?id=3558
>
> The result is better than Hudson (by a factor of 3), but worse than "cp -R"
> (again, roughly by a factor of 3):
>
> [workspace]# time ant
> Buildfile: build.xml
> copy:
>      [copy] Copying 10000 files to /tmp/x
> BUILD SUCCESSFUL
> Total time: 10 seconds
> real    0m10.624s
> user    0m5.095s
> sys     0m2.226s
> [workspace]#
>
> Where:
>
> [workspace]# cat build.xml
> <project default="copy" basedir="./">
>  <target name="copy">
>   <copy todir="/tmp/x">
>     <fileset dir="./"
>              includes="file*.bin" />
>   </copy>
>  </target>
> </project>
>
> And the files were generated with:
>
> time for i in {1..10000}; do dd if=/dev/urandom of="file$i.bin" bs=10K
> count=1; done
>
> (which takes roughly 50 seconds)
>
> Total time of last build of the test case was 1 min 21 sec, of which 45
> seconds to generate the files.
>
> I wonder if a possible workaround can be:
> *) don't archive expensive artifacts, but only simple ones
> *) run a cron job roughly 30' after that check last build and copies files
> in archive directory with "cp -R"
> In alternative, can I hook into the build a different copy method than ant's
> copy?
Certainly. If you can come up with a different implementation, I'm all
for integrating it.

>
> Thanks,
> MM
>


--
Kohsuke Kawaguchi
Sun Microsystems                   http://weblogs.java.net/blog/kohsuke/

smime.p7s (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
On Fri, Apr 24, 2009 at 12:49 PM, Kohsuke Kawaguchi <[hidden email]> wrote:
In alternative, can I hook into the build a different copy method than ant's
copy?

Certainly. If you can come up with a different implementation, I'm all for integrating it.
 
 
Would it be such a crazy idea to configure the server such that a shell command can be configured for copying purposes? On unix systems of course we'd use "cp -R", on windows we would maybe use xcopy, etc. You can't really beat whatever the OS has implemented for you. And if you are in real troubles, there should be an option to fall back on ant copy.
 
What do you think about it?
 
MM
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

Kohsuke Kawaguchi
Administrator
Manlio Malaidini wrote:

> On Fri, Apr 24, 2009 at 12:49 PM, Kohsuke Kawaguchi <
> [hidden email]> wrote:
>
>>   In alternative, can I hook into the build a different copy method than
>>> ant's
>>> copy?
>>>
>>
>> Certainly. If you can come up with a different implementation, I'm all for
>> integrating it.
>>
>
>
> Would it be such a crazy idea to configure the server such that a shell
> command can be configured for copying purposes? On unix systems of course
> we'd use "cp -R", on windows we would maybe use xcopy, etc. You can't really
> beat whatever the OS has implemented for you. And if you are in real
> troubles, there should be an option to fall back on ant copy.
>
> What do you think about it?
As a general principle, we don't want to add options/configurations
unless they are really needed, and this case doesn't pass the bar.

I think this is a kind of decision Hudson should be able to make by
itself. Look at the GLOB pattern, see if it can be satisfied with the cp
/xcopy command, and if so, fork cp. In that way, this can be hidden from
the user.

--
Kohsuke Kawaguchi
Sun Microsystems                   http://weblogs.java.net/blog/kohsuke/

smime.p7s (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Slow file copy in archive

mman-2
On Fri, Apr 24, 2009 at 2:02 PM, Kohsuke Kawaguchi <[hidden email]> wrote:
As a general principle, we don't want to add options/configurations unless they are really needed, and this case doesn't pass the bar.

I think this is a kind of decision Hudson should be able to make by itself. Look at the GLOB pattern, see if it can be satisfied with the cp /xcopy command, and if so, fork cp. In that way, this can be hidden from the user.
 
That would be even better.
 
MM