New remoting infrastructure is in place

Kohsuke Kawaguchi-2

I implemented a new module that I plan to use as a foundation for better
distributed builds. In the past we talked about reusing various existing
technologies, but I couldn't find anything that fits my needs without
being too complicated.

The Hudson remoting infrastructure is based on the following abstraction:

> public interface Callable<V,T extends Throwable> extends Serializable {
>     /**
>      * Performs computation and returns the result,
>      * or throws some exception.
>      */
>     V call() throws T;
> }

Hudson will maintain an "agent" JVM on each slave (I still haven't
thought about how to start this JVM). The agent and the master need to
have one InputStream/OutputStream pair between them.

Code in Hudson can implement this Callable interface, then "send" it to
an agent for execution (using Java serialization as the transport
mechanism). Once executed, the result is propagated back to the master.
The call can be made synchronously or asynchronously.

For example, today, to wipe out a directory, we enumerate and delete
each file over NFS. With the Callable abstraction, you send the code
once for execution and the enumeration happens locally on the slave.
This should greatly reduce the network overhead of such an operation.
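
A minimal sketch of this idea, assuming only the Callable interface shown
above. The class names (CallableSketch, WipeDirectory) and the sendAndCall
helper are hypothetical and are not part of hudson.remoting; the "wire" is
simulated with a byte array inside one JVM, where a real channel would
write the bytes to the stream shared with the agent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;

class CallableSketch {
    /** The abstraction quoted in the post above. */
    interface Callable<V, T extends Throwable> extends Serializable {
        V call() throws T;
    }

    /** Deletes a directory tree locally and returns the number of entries removed. */
    static class WipeDirectory implements Callable<Integer, IOException> {
        private static final long serialVersionUID = 1L;
        private final File dir;

        WipeDirectory(File dir) { this.dir = dir; }

        public Integer call() throws IOException {
            return delete(dir);
        }

        private static int delete(File f) throws IOException {
            int count = 0;
            File[] children = f.listFiles(); // null when f is a plain file
            if (children != null) {
                for (File c : children) count += delete(c);
            }
            Files.delete(f.toPath());
            return count + 1;
        }
    }

    /**
     * Hypothetical stand-in for "sending" a callable: serialize it, deserialize
     * the copy (as the agent would), and run the copy.
     */
    @SuppressWarnings("unchecked")
    static <V, T extends Throwable> V sendAndCall(Callable<V, T> c)
            throws T, IOException, ClassNotFoundException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(wire)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(wire.toByteArray()))) {
            Callable<V, T> copy = (Callable<V, T>) in.readObject();
            return copy.call();
        }
    }

    public static void main(String[] args) throws Exception {
        File root = Files.createTempDirectory("wipe").toFile();
        Files.createFile(new File(root, "a.txt").toPath());
        Files.createFile(new File(root, "b.txt").toPath());
        int removed = sendAndCall(new WipeDirectory(root));
        System.out.println("removed " + removed + " entries, still exists: " + root.exists());
    }
}
```

Only the small serialized object crosses the wire; the per-file work runs
wherever the copy is executed.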

Another useful primitive implemented in the current code is Pipe. Pipe
allows remote code and local code to talk to each other via
InputStream/OutputStream. This can be used, for example, to scan a
directory and compress files on a slave, then send the resulting zip
file into a Pipe so that it can be saved on the master.
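
The zip-over-a-pipe pattern can be sketched inside a single JVM. This is
an illustration only: it uses java.io.PipedInputStream and
PipedOutputStream as a local stand-in for the remoting Pipe (whose API is
not shown in this thread), and a thread plays the role of the slave. The
names PipeSketch, zipTo, readAll and transfer are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class PipeSketch {
    /** "Slave" side: zips the given files (name -> content) into the stream. */
    static void zipTo(OutputStream out, Map<String, String> files) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, String> e : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(e.getKey()));
                zip.write(e.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } // closing the zip stream also closes the pipe's write end
    }

    /** "Master" side: drains the stream, as if saving the zip file to disk. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream saved = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) saved.write(buf, 0, n);
        return saved.toByteArray();
    }

    /** Runs both sides in one JVM, with a thread playing the slave. */
    static byte[] transfer(Map<String, String> files)
            throws IOException, InterruptedException {
        PipedInputStream masterEnd = new PipedInputStream();
        PipedOutputStream slaveEnd = new PipedOutputStream(masterEnd);
        Thread slave = new Thread(() -> {
            try {
                zipTo(slaveEnd, files);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        slave.start();
        byte[] zip = readAll(masterEnd);
        slave.join();
        return zip;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("build.log", "BUILD SUCCESSFUL");
        files.put("report.txt", "42 tests passed");
        System.out.println("received " + transfer(files).length + " bytes of zip data");
    }
}
```

The writer never holds the whole archive in memory; it streams into the
pipe while the reader consumes it.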

The current code also works correctly in the face of multiple
classloaders (which we use to isolate plugins.) All the necessary class
files are sent along with the callable object, so on a slave you really
just need this remoting code and nothing else. Of course class file
images are reused once they are sent.

I plan to slowly migrate the current code to use this. Eventually we
should be able to eliminate the use of NFS altogether. NFS is a cause
of headaches for various reasons (hard to do on Windows, interference
with mount, need for the same UID everywhere). One thing I'm looking
forward to is running slaves with /tmp for blindingly fast builds!

The biggest missing link right now is how to launch an agent on each
slave. Initially I plan to use hudson.Launcher (mostly because it's
already available in the current configuration), yet supporting other
mechanisms seems to have certain benefits. For example, if we can let
the agent be launched via Java Web Start, the ssh requirement can also
be removed, making it even easier to set up Windows slaves. This choice
also affects the migration.

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]

Re: New remoting infrastructure is in place

Curt Cox
Kohsuke,

This remoting infrastructure sounds really interesting.  I would like
to learn more about it.  Could you post a link showing where I should
look to do so?  I had similar needs.  I looked at Jini and Cajo, but
ultimately ended up rolling my own RMI-based solution which sounds a
bit similar to yours.  I would like to review your code to see what I
can learn.

Thanks,
Curt

BTW -- any plans on making the remoting infrastructure independent of Hudson?

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: New remoting infrastructure is in place

Tom Ball
In reply to this post by Kohsuke Kawaguchi-2
I don't like this approach, having worked on client-server systems for
too many years.  It may work for mini-systems where you know each
slave's name and can assume the network won't fail (a poor assumption),
but why limit yourself unnecessarily?

The architecture I believe would work more robustly and scale better is
to have the master build server create and maintain a list of build
tasks, along with the necessary characteristics for running that task.
Slaves then poll the master occasionally for new tasks, giving their
system's characteristics.  If a task is available for those
characteristics, then the slave build checks out that task and notifies
the master when the task is complete.  A build finishes when there are
no remaining outstanding tasks.  The slaves can run whenever and
wherever they are, with their only configuration being the master's URL
for the task queue.  If the master isn't configured to process any
tasks, then it can scale to a very large set of build machines running
concurrent builds from different projects.  This sounds like a lot of
code, but most of it is already written:  Jini can easily handle this
sort of task.

As an example, a common requirement is to build platform-dependent code.
The necessary characteristics for running this sort of build are just
the platform characteristics (hardware/OS/etc.), plus what JDK is
installed.  You configure your build specifying that you need to run
your build on a list of platforms with a certain JDK.  The master starts
up by loading the task list with the set of build tasks ("build this on
platform <x>") and, if it has cycles, polls for a task on a separate
thread as a slave.  The build status window should be updated to show
the status of each task, showing which platform builds have finished or
are failing, for example.  Slave failure (crashes, network connection
loss, etc.) can be monitored easily by using Jini's lease-based
sessions.  You can also avoid slave configuration by using Jini within
the same subnet, but setting a single URL isn't too much work.
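
The polling scheme described above can be sketched as a plain in-memory
queue. This is not Jini code and not anything in Hudson; TaskQueueSketch,
Task, poll and complete are hypothetical names, and characteristics are
modeled as simple string tags such as "linux" or "jdk5".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class TaskQueueSketch {
    /** A build task plus the characteristics a slave must have to run it. */
    static final class Task {
        final String name;
        final Set<String> required;

        Task(String name, Set<String> required) {
            this.name = name;
            this.required = required;
        }
    }

    private final List<Task> pending = new ArrayList<>();
    private int checkedOut = 0;

    synchronized void add(Task t) {
        pending.add(t);
    }

    /** Called by a polling slave: hands out the first task whose requirements it meets. */
    synchronized Optional<Task> poll(Set<String> slaveCharacteristics) {
        for (Iterator<Task> it = pending.iterator(); it.hasNext(); ) {
            Task t = it.next();
            if (slaveCharacteristics.containsAll(t.required)) {
                it.remove();
                checkedOut++;
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }

    /** Called by a slave when the task it checked out is complete. */
    synchronized void complete(Task t) {
        checkedOut--;
    }

    /** The build is finished when nothing is pending or still checked out. */
    synchronized boolean buildFinished() {
        return pending.isEmpty() && checkedOut == 0;
    }

    public static void main(String[] args) {
        TaskQueueSketch queue = new TaskQueueSketch();
        queue.add(new Task("build on linux", Set.of("linux", "jdk5")));
        queue.add(new Task("build on mac", Set.of("mac", "jdk5")));

        // A Mac slave polls, announcing what it has installed.
        Optional<Task> task = queue.poll(Set.of("mac", "jdk5", "ant"));
        task.ifPresent(t -> System.out.println("mac slave runs: " + t.name));
        task.ifPresent(queue::complete);
        System.out.println("build finished: " + queue.buildFinished());
    }
}
```

A sketch like this leaves out leases and failure detection, which the
post attributes to Jini; a checked-out task whose slave disappears would
need to be returned to the pending list.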

The advantage of this approach is that you can add new hardware when
it's needed, or share it between projects when it is under-utilized.
For example, you might find that your Mac Mini (a wonderful build
machine, BTW!) is overloaded, so just buy and install another to fix the
problem without changing the build configuration in any way.  Or don't,
but instead run your workstation (if it's a Mac) as a build slave every
night. If the JDK and required libraries are specified as build
characteristics, then there won't be the issue of version mismatches
between casual build machines.  Full-time build machines, part-time
sharing, it doesn't matter.

Tom

Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
In reply to this post by Curt Cox
Curt Cox wrote:
> This remoting infrastructure sounds really interesting.  I would like
> to learn more about it.  Could you post a link showing where I should
> look to do so?

For now, code and javadoc. See
https://hudson.dev.java.net/source/browse/hudson/hudson/main/remoting/src/main/java/hudson/remoting/

You'd probably have a better time if you check out this module and then
browse it in your IDE. Test code should help you get some idea of how it
works.

Now that I know there's some interest, I should spend some more time
documenting this.

> BTW -- any plans on making the remoting infrastructure independent of Hudson?

It doesn't depend on anything else in Hudson, so it's quite feasible to
have its own life. It's just that I started prototyping this in Hudson
and kinda just ended up committing there.

If you are interested, I'm happy to give it its own name and its own
project. The added benefit is that I can get a Subversion repository for
this code. Yay!

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]

Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
In reply to this post by Tom Ball

Thanks for sharing your thoughts.

I think what you are describing is really more about what we do on top
of a remoting mechanism, so to me it sounds somewhat orthogonal to which
underlying mechanism we use for remoting (be it Jini, Cajo, or SunGrid!).
I think you are saying that the current Hudson approach, where the
master is configured to know each slave, is not very good.


I agree that being able to form a more dynamic grid is useful, although
I have some doubts.

In my experience maintaining a dozen slaves, you really do need a fair
amount of administration to make sure all slaves are properly
"configured". That includes making sure that the clocks are synced up
correctly. It includes making sure that proper versions of the tools are
installed (having the exact same version is sometimes important, for
some nasty problems.) It includes making sure that all the machines have
proper network configuration. It includes making sure all of them have
proper CVS/SCM access credentials to check out the source code.
Sometimes some jobs leave some daemons running, so occasionally you need
to go in and kill those things.

So in practice I doubt it could ever really be made as easy as "plug
your PC into a grid and that's done." As far as I can tell, every slave
needs to go through a certain amount of configuration. IOW, running a
cluster, any cluster, has never been easy, and I suspect it will not get
magically easier any time soon.

That said, I think remoting based on a single InputStream/OutputStream
pair is one step toward the "flash grid" direction. Such a connection
can be established in many different ways (in particular, it can be
tunneled over HTTP), compared to the current requirement of ssh/rsh +
NFS.


I should also point out that "dynamic provisioning" is already to some
extent possible today. I just added one more SunFire v210 slave
yesterday, but I never needed to change any job configuration. Hudson
allocates jobs to available machines automatically.

Some slaves are part-time in my system, too. Slave owners can mark the
node offline to prevent Hudson from scheduling further jobs on that
slave. This has been useful for enlisting workstation owners who
occasionally (but rarely) do need to use their workstations.

It's only true "to some extent", because the scheduling algorithm is
dumb and needs some obvious improvements. It clearly needs something of
what you are describing, for example so that I can schedule massively
parallel test jobs on a large multi-CPU machine, or tie OS-specific jobs
to a group of slaves, etc.

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]

Re: New remoting infrastructure is in place

Tom Ball
Kohsuke Kawaguchi wrote:

> In my experience maintaining a dozen slaves, you really do need a
> fair amount of administration to make sure all slaves are properly
> "configured". That includes making sure that clock is synced up
> correctly. It includes making sure that proper versions of the tools
> are installed (having the exact same version is sometimes important,
> for some nasty problems.) It includes making sure that all the
> machines have proper network configuration. It includes making sure
> all of them have proper CVS/SCM access credentials to check out the
> source code. Sometimes some jobs leave some daemons running, so
> occasionally you need to go in and kill those things.

These are certainly critical aspects to a build, but whenever possible
they should be checked by the build, rather than left to the release
engineer to handle.  The JDK, for example, first tests all critical
tools for their versions, as well as library paths and other essential
configuration variables.  The NetBeans build checks the JDK version used
for compilation.  Anything important to a build should be tested up
front.  I think Maven can address library dependencies, but have no
direct experience with it yet.
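
An up-front check of this kind might look like the following sketch. It is
not taken from the JDK or NetBeans build scripts; ToolCheckSketch,
majorVersion and requireJava are hypothetical names, and only the Java
version is checked, using the standard java.specification.version system
property.

```java
class ToolCheckSketch {
    /** Maps a specification version such as "1.5" or "17" to its major number. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        // Versions before Java 9 are reported as "1.x"; later ones as "x".
        return parts[0].equals("1") && parts.length > 1
                ? Integer.parseInt(parts[1])
                : Integer.parseInt(parts[0]);
    }

    /** Fails fast, with a readable message, if the running JVM is too old. */
    static void requireJava(int minimumMajor) {
        String spec = System.getProperty("java.specification.version");
        if (majorVersion(spec) < minimumMajor) {
            throw new IllegalStateException(
                "Build requires Java " + minimumMajor + " or newer, found " + spec);
        }
    }

    public static void main(String[] args) {
        requireJava(6); // run before any real build step
        System.out.println("Java version check passed");
    }
}
```

Failing at the start of the build puts the reason on the build's status
page instead of leaving it to surface as an obscure compile error later.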

CVS/SCM access shouldn't need to be checked, since if these are wrong
the build will (should) fail.  Clock syncing could be checked by sending
the current date/time with the slave request, which the master can
reject if it is skewed too much.
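
The clock check suggested above could be as small as the sketch below. The
names ClockSkewSketch and acceptable, and the 30-second tolerance in the
example, are assumptions for illustration and do not describe how Hudson
detects skew.

```java
import java.time.Duration;
import java.time.Instant;

class ClockSkewSketch {
    /** Returns true if the slave's reported time is within maxSkew of the master's clock. */
    static boolean acceptable(Instant slaveTime, Instant masterTime, Duration maxSkew) {
        Duration skew = Duration.between(slaveTime, masterTime).abs();
        return skew.compareTo(maxSkew) <= 0;
    }

    public static void main(String[] args) {
        Instant master = Instant.now();
        Instant slave = master.minusSeconds(90); // timestamp sent with the slave's request
        boolean ok = acceptable(slave, master, Duration.ofSeconds(30));
        System.out.println(ok ? "slave accepted" : "slave rejected: clock skew too large");
    }
}
```

The time the request spends on the network is counted as skew, so the
tolerance has to be larger than the expected latency.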

> So in practice I doubt if it could be really ever made as easy as
> "plug your PC into a grid and that's done." For all I know, every
> slave needs to go through a certain configuration. IOW, running a
> cluster, any cluster, has never been easy, and I suspect it will not
> get magically easier any time soon.

There's no magic:  just that the hard work goes into the build scripts
rather than some notes in a README file or release engineer task list.
Plug your PC into this grid and it either passes or fails; checking the
build's status page should show you the reason for any failures (wrong
JDK, missing libraries, etc.).  If you are getting bad builds from a
machine which passes the build tests, then those need to be fixed first.

Please don't take my comments as general criticism of your project.  I
think Hudson is great and hope everyone starts using it.  I appreciate
the work you have put into it.

Tom

Re: New remoting infrastructure is in place

Vladimir Sizikov
In reply to this post by Kohsuke Kawaguchi-2
Hi Kohsuke,

It seems that you're writing code much faster than I can read... :)
Anyways, a pretty interesting piece of code to study.

One question:

On Thu, Dec 14, 2006 at 10:56:10PM -0800, Kohsuke Kawaguchi wrote:
> Code in Hudson can implement this callable interface, then "send" it to
>  an agent for execution (by using Java serialization as the
> transportation mechanism.)

I've always thought that serialization is a very fragile mechanism and
that it might even be incompatible between different Java versions,
right?

Then, if the above statement is correct, it forces all the slaves to
use the same Java as the master, doesn't it? It might be a problem for
some.

I understand and agree that for project building purposes all the
build machines should be similarly configured and should probably have
access to the same java version, but Hudson allows not
only building, but, for example, running tests in different
environments. In fact, that's how we use it currently - building the main
project and then running the tests in different environments.

And a minor issue: I can't seem to compile the remoting module at the
moment:

On a first run of mvn install:

[INFO] Compilation failure
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Request.java:[61,12]
unreported exception java.lang.Throwable; must be caught or declared
to be thrown

On a second run:
[INFO] Compilation failure
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[166,8]
cannot find symbol
symbol  : class UserResponse
location: class hudson.remoting.Channel
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[166,28]
cannot access hudson.remoting.UserResponse
file hudson\remoting\UserResponse.class not found
UserResponse<V> r = new UserRequest<V,T>(this, callable).call(this);
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[186,21]
cannot find symbol
symbol  : class UserResponse
location: class hudson.remoting.Channel
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[187,35]
cannot find symbol
symbol  : class UserResponse
location: class hudson.remoting.Channel
D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[188,30]
cannot find symbol
symbol: class UserResponse
protected V adapt(UserResponse<V> r) throws ExecutionException {

I'm using JDK 1.5.0_08 under WinXP.

Thanks,
  --Vladimir

Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
Vladimir Sizikov wrote:
> It seems that you're writing code much faster then I can read... :)
> Anyways, a pretty interesting piece of code to study.

Thanks! But with many spelling mistakes, I'm sure :-)

> On Thu, Dec 14, 2006 at 10:56:10PM -0800, Kohsuke Kawaguchi wrote:
>> Code in Hudson can implement this callable interface, then "send" it to
>>  an agent for execution (by using Java serialization as the
>> transportation mechanism.)
>
> I've always thought that serialization is a very fragile mechanism and
> it might be even incompatible between the different Java versions,
> right?

The Java SE team spends a lot of effort making sure that serialized
forms of core classes are compatible across different versions. So in
practice it's very unlikely that Java version differences cause
serialization compatibility issues (at least I haven't heard of any).

I think one might be able to say Java serialization is fragile in the
sense that classes need to be evolved carefully if you want different
versions to talk to each other, although such caution is needed for
most technologies that serve a similar role, including XStream, which we
already use.

I should also add that the evolution problem is a non-issue for
remoting, because the slave and the master always use the same version
of the classes (the master sends class files to slaves over the remoting
module).

> Then, if the above statement is correct, it forces all the slaves to
> use the same java as the master, doesn't it? It might be a problem for
> some.

For the above reasons, I don't think it is likely that slaves need the
same Java as the master.

Both do need to have compatible versions of the remoting jar.

> I understand and agree that for project building purposes all the
> build machines should be similarly configured and should probably have
> access to the same java version, but Hudson allows not
> only building, but, for example, running tests in different
> environments. In fact, that's how we use it currently - building the main
> project and then running the tests in different environments.

Yes. Agreed.


> And minor issue, I can't seem to compile remoting module at the
> moment:
>
> On a first run of mvn install:
>
> [INFO] Compilation failure
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Request.java:[61,12]
> unreported exception java.lang.Throwable; must be caught or declared
> to be thrown

I saw this problem before. It seems that javac in some versions of JDK5
(or maybe all) has a bug in handling a generic "throws" clause. I
switched to JDK6 and that fixed it.

Maybe I should stop being so zealous about the use of generics so that
we can avoid this issue. The downside of dropping the generic "throws"
clause from Callable is that all code to be remoted has to throw the
same base exception type (like RemoteException), or none at all (which
means only RuntimeException is allowed), and therefore it forces
unnecessary try/catch blocks and/or unnecessary exception
wrapping/unwrapping.
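
To illustrate what the generic "throws" clause buys, here is a small
sketch. The static call helper is a hypothetical stand-in for a channel
and simply runs the callable in place; Add and Fail are made-up examples.
Only the Callable interface itself comes from this thread.

```java
import java.io.IOException;
import java.io.Serializable;

class GenericThrowsSketch {
    interface Callable<V, T extends Throwable> extends Serializable {
        V call() throws T;
    }

    /** Stand-in for a channel: runs the callable and rethrows its declared type. */
    static <V, T extends Throwable> V call(Callable<V, T> c) throws T {
        return c.call();
    }

    /** Declares no checked exception, so callers need no try/catch. */
    static class Add implements Callable<Integer, RuntimeException> {
        private static final long serialVersionUID = 1L;
        private final int a;
        private final int b;

        Add(int a, int b) {
            this.a = a;
            this.b = b;
        }

        public Integer call() {
            return a + b;
        }
    }

    /** Declares IOException; callers see exactly that type, unwrapped. */
    static class Fail implements Callable<String, IOException> {
        private static final long serialVersionUID = 1L;

        public String call() throws IOException {
            throw new IOException("disk full");
        }
    }

    public static void main(String[] args) {
        int sum = call(new Add(2, 3)); // compiles without try/catch
        try {
            call(new Fail());
        } catch (IOException e) {      // the precise type, no unwrapping needed
            System.out.println(sum + ", caught: " + e.getMessage());
        }
    }
}
```

Without the type parameter T, call would have to declare one fixed
exception type, and both call sites above would need to catch or unwrap
it.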

... or we could just require JDK6 for developing Hudson. Maybe we can
even have Maven enforce that, so that people will get a better error
message.

What do you think?


>
> On a second run:
> [INFO] Compilation failure
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[166,8]
> cannot find symbol
> symbol  : class UserResponse
> location: class hudson.remoting.Channel
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[166,28]
> cannot access hudson.remoting.UserResponse
> file hudson\remoting\UserResponse.class not found
> UserResponse<V> r = new UserRequest<V,T>(this, callable).call(this);
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[186,21]
> cannot find symbol
> symbol  : class UserResponse
> location: class hudson.remoting.Channel
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[187,35]
> cannot find symbol
> symbol  : class UserResponse
> location: class hudson.remoting.Channel
> D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Channel.java:[188,30]
> cannot find symbol
> symbol: class UserResponse
> protected V adapt(UserResponse<V> r) throws ExecutionException {
>
> I'm using JDK 1.5.0_08 under WinXP.

I'm not too sure about this, but maybe it's related to the first issue.
Continuous builds of Hudson are reporting an error at this moment
(although a different one), so there's probably a bug.

I'll take a look at this tonight or tomorrow.

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]

Re: New remoting infrastructure is in place

Jesse Glick
Kohsuke Kawaguchi wrote:
> It seems like javac in some versions of JDK5 (or maybe all) have a
> bug in handling generic "throws" clause. I switched to JDK6 and that
> fixed it.

Is there a bug reported for this against javac? Critical bug fixes like
this will get backported to 5.x updates, but only if they are known.

-J.

--
[hidden email]  x22801  netbeans.org  ant.apache.org
       http://google.com/search?q=e%5E%28pi*i%29%2B1

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
In reply to this post by Tom Ball
Tom Ball wrote:

> These are certainly critical aspects to a build, but whenever possible
> they should be checked by the build, rather than left to the release
> engineer to handle.  The JDK, for example, first tests all critical
> tools for their versions, as well as library paths and other essential
> configuration variables.  The NetBeans build checks the JDK version used
> for compilation.  Anything important to a build should be tested up
> front.  I think Maven can address library dependencies, but have no
> direct experience with it yet.
>
> CVS/SCM access shouldn't need to be checked, since if these are wrong
> the build will (should) fail.  Clock syncing could be checked by sending
> the current date/time with the slave request, which the master can
> reject if it is skewed too much.

Detecting the errors is one thing, but fixing them is another. I was just
saying that maintaining the slaves does take some work, whether or not
build scripts and Hudson are smart enough to detect problems.

For example, Hudson already detects clock skew, but it cannot fix it
automatically because doing so requires root permission.
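
To make the clock check concrete, here is a minimal sketch of the idea Tom describes: the slave reports its current time and the master compares it with its own clock. This is not Hudson's actual implementation; the class name ClockSkewCheck, its methods, and the 5-second threshold are all assumptions made for illustration, and the sketch ignores network latency.

```java
/** Hypothetical helper, not part of Hudson: compares a slave's reported time with the master's. */
final class ClockSkewCheck {
    private final long maxSkewMillis;

    ClockSkewCheck(long maxSkewMillis) {
        this.maxSkewMillis = maxSkewMillis;
    }

    /** Absolute difference between the two clocks, in milliseconds. */
    static long skewMillis(long masterMillis, long slaveMillis) {
        return Math.abs(masterMillis - slaveMillis);
    }

    /** True if the slave's clock is within the allowed skew of the master's clock. */
    boolean isAcceptable(long masterMillis, long slaveMillis) {
        return skewMillis(masterMillis, slaveMillis) <= maxSkewMillis;
    }

    public static void main(String[] args) {
        ClockSkewCheck check = new ClockSkewCheck(5000L); // assumed threshold: 5 seconds
        long masterNow = System.currentTimeMillis();
        long slaveReported = masterNow - 12000L;          // pretend the slave is 12 seconds behind
        System.out.println(check.isAcceptable(masterNow, slaveReported)
                ? "slave accepted"
                : "slave rejected: clock skew too large");
    }
}
```

A check like this can only report the skew; correcting the slave's clock still has to be done by someone with root access on that machine.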


>> So in practice I doubt if it could be really ever made as easy as
>> "plug your PC into a grid and that's done." For all I know, every
>> slave needs to go through a certain configuration. IOW, running a
>> cluster, any cluster, has never been easy, and I suspect it will not
>> get magically easier any time soon.
>
> There's no magic:  just that the hard work goes into the build scripts
> rather than some notes in a README file or release engineer task list.
> Plug your PC into this grid and it either passes or fails; checking the
> build's status page should show you the reason for any failures (wrong
> JDK, missing libraries, etc.).  If you are getting bad builds from a
> machine which passes the build tests, then those need to be fixed first.

I think this kind of thing is very much on our radar. It would
be really cool if one could visit the Hudson website and turn a machine
into a slave by launching a JNLP app. The launcher app would then run the
environment check to make sure everything is OK.
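
As a rough illustration of what such an environment check could look like, the sketch below verifies one item, the Java version, and collects a human-readable problem list. It is purely hypothetical: the class SlaveEnvironmentCheck and its methods are invented for this example, and a real launcher would check more (library paths, tools, and so on). The only real API relied on is the standard java.specification.version system property.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check a slave launcher might run; not part of Hudson. */
final class SlaveEnvironmentCheck {

    /** Turns "1.5", "1.6", "11" or "17" into 5, 6, 11 or 17. */
    static int majorVersion(String specVersion) {
        // Old releases report "1.x"; newer ones report the major number directly.
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    /** Returns a description of each problem found; an empty list means the check passed. */
    static List<String> findProblems(String specVersion, int requiredMajor) {
        List<String> problems = new ArrayList<String>();
        if (majorVersion(specVersion) < requiredMajor) {
            problems.add("Java " + requiredMajor + " or newer is required, found " + specVersion);
        }
        return problems;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        List<String> problems = findProblems(running, 6); // assumed requirement: Java 6
        System.out.println(problems.isEmpty() ? "Environment OK" : "Problems: " + problems);
    }
}
```

Returning a list of problems rather than failing on the first one would let a status page show every reason a machine was rejected.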

One thing I don't want, however, is to have the developers of the builds
worry about slave configurations. If someone running a CI job on Hudson
sees a build failure, it had better be because something wrong was
committed, not because someone added a half-configured slave.


> Please don't take my comments as general criticism of your project.  I
> think Hudson is great and hope everyone starts using it.  I appreciate
> the work you have put into it.

No, I really do appreciate your comments. I never thought of it as a
general criticism.


--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]


Re: New remoting infrastructure is in place

Vladimir Sizikov
In reply to this post by Kohsuke Kawaguchi-2
Hi Kohsuke,

On Sun, Dec 17, 2006 at 05:40:13PM -0800, Kohsuke Kawaguchi wrote:
> >I've always thought that serialization is a very fragile mechanism and
> >it might be even incompatible between the different Java versions,
> >right?
>
> The Java SE team spends a lot of efforts making sure that serialized
> forms of core classes are compatible across different versions. So in
> practice it's very unlikely that Java version differences cause
> serialization compatibility issues (at least I haven't heard of any.)

Since I come from the Java ME side, we've seen these problems in the
past. Java ME implementations are generally not
"serialization-compatible" with the JDK.

But maybe it's of lesser concern between different JDK versions.

> >And minor issue, I can't seem to compile remoting module at the
> >moment:
> >
> >On a first run of mvn install:
> >
> >[INFO] Compilation failure
> >D:\work\hudson\hudson\main\remoting\src\main\java\hudson\remoting\Request.java:[61,12]
> >unreported exception java.lang.Throwable; must be caught or declared
> >to be thrown
>
> I saw this problem before. It seems like javac in some versions of JDK5
> (or maybe all) have a bug in handling generic "throws" clause. I
> switched to JDK6 and that fixed it.
>
> ... or we could just require JDK6 for developing Hudson. Maybe we can
> even have Maven enforce that, so that people will get a better error
> message.
>
> What do you think?

Requiring JDK6 for developing Hudson is fine with me; I have JDK6
installed and have no problems using it.

But as Jesse said, it would be good to file a bug against JDK 5 to
make sure the bug is known and is going to be fixed eventually.

Thanks,
  --Vladimir




Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
In reply to this post by Jesse Glick
Jesse Glick wrote:
> Kohsuke Kawaguchi wrote:
>> It seems like javac in some versions of JDK5 (or maybe all) have a
>> bug in handling generic "throws" clause. I switched to JDK6 and that
>> fixed it.
>
> Is there a bug reported for this against javac? Critical bug fixes like
> this will get backported to 5.x updates, but only if they are known.

I tried Bugster but couldn't work out good search criteria. The full
text search is OR, not AND, which IMO is pretty useless.

I tried to at least reproduce the failure, so I did a clean build with
JDK1.5. I couldn't reproduce the problem, so I'm a bit puzzled.

Now I'm starting to wonder if this has something to do with the use of
retrotranslator, which used to alter class files in place, before I
hacked the POM the day before yesterday.

Perhaps it erased the type signature from the referenced Response class
when it was trying to compile Request class (but then, that wouldn't
explain why a declaration like Response<RSP,EXC> didn't report any error.)

I created a very small test case to see if I can reproduce the problem,
but it worked just fine.
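
For readers following along, a small test of this kind might look like the sketch below. It is a hypothetical reconstruction, not the test case actually used: ThrowingCallable mirrors the shape of the Callable interface from the start of the thread, and GenericThrowsDemo and its methods are invented names. The point is that the compiler has to infer T as IOException from the argument; a compiler that fell back to Throwable would report the "unreported exception java.lang.Throwable" error quoted earlier.

```java
import java.io.IOException;
import java.io.Serializable;

/** Same shape as the thread's Callable: the exception type is a type parameter. */
interface ThrowingCallable<V, T extends Throwable> extends Serializable {
    V call() throws T;
}

/** Hypothetical test harness for the generic "throws" clause. */
final class GenericThrowsDemo {

    /** Generic method that rethrows exactly the exception type of the callable. */
    static <V, T extends Throwable> V invoke(ThrowingCallable<V, T> task) throws T {
        return task.call();
    }

    /** Declares only IOException, so this compiles only if T is inferred as IOException. */
    static String readOrFail(final boolean fail) throws IOException {
        return invoke(new ThrowingCallable<String, IOException>() {
            private static final long serialVersionUID = 1L;

            public String call() throws IOException {
                if (fail) {
                    throw new IOException("simulated failure");
                }
                return "ok";
            }
        });
    }

    public static void main(String[] args) {
        try {
            System.out.println(readOrFail(false));
            readOrFail(true);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```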

I'm changing my Hudson to build Hudson on 1.5 now. Let's see how that
goes...

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]


Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
In reply to this post by Vladimir Sizikov
Vladimir Sizikov wrote:

> Hi Kohsuke,
>
> On Sun, Dec 17, 2006 at 05:40:13PM -0800, Kohsuke Kawaguchi wrote:
>> >I've always thought that serialization is a very fragile mechanism and
>> >it might be even incompatible between the different Java versions,
>> >right?
>>
>> The Java SE team spends a lot of efforts making sure that serialized
>> forms of core classes are compatible across different versions. So in
>> practice it's very unlikely that Java version differences cause
>> serialization compatibility issues (at least I haven't heard of any.)
>
> Since I come from Java ME side, we've seen these problems in the
> past. Java ME implementations are generally not
> "serialization-compatible" with JDK.
>
> But maybe it's of lesser concern between different JDK versions.

I see.

>> >And minor issue, I can't seem to compile remoting module at the
>> >moment:

BTW, the build should be back to normal now. Try running "mvn install" in
the main module.

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]


Re: New remoting infrastructure is in place

N Daley
In reply to this post by Kohsuke Kawaguchi-2
On Sun, Dec 17, 2006 at 05:40:13PM -0800, Kohsuke Kawaguchi wrote:

> > >I've always thought that serialization is a very fragile mechanism and
> > >it might be even incompatible between the different Java versions,
> > >right?
> >
> > The Java SE team spends a lot of efforts making sure that serialized
> > forms of core classes are compatible across different versions. So in
> > practice it's very unlikely that Java version differences cause
> > serialization compatibility issues (at least I haven't heard of any.)
>
> Since I come from Java ME side, we've seen these problems in the
> past. Java ME implementations are generally not
> "serialization-compatible" with JDK.
>
> But maybe it's of lesser concern between different JDK versions.

I spent a couple of years working for Sun testing Java RMI and Serialization, and we cared deeply about ensuring (and testing) the interoperability of these technologies between differing OSes and JVM versions.  Of course, changing the serialized form of a class in an incompatible way without changing the serialVersionUID would cause a problem...but that would be a user bug.
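
To illustrate the serialVersionUID point, here is a minimal sketch of a class with an explicit version and a round trip through Java serialization. The names BuildSummary and SerializationRoundTrip are made up for this example and are not Hudson classes; only the standard java.io serialization classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical payload class, for illustration only. */
final class BuildSummary implements Serializable {
    // Explicit version: change it whenever the serialized form changes incompatibly.
    private static final long serialVersionUID = 1L;

    final String jobName;
    final int buildNumber;

    BuildSummary(String jobName, int buildNumber) {
        this.jobName = jobName;
        this.buildNumber = buildNumber;
    }
}

final class SerializationRoundTrip {

    /** Serializes a value into a byte array, as it would travel over a stream. */
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    /** Reads back one object from the given bytes. */
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BuildSummary copy = (BuildSummary) fromBytes(toBytes(new BuildSummary("hudson", 42)));
        System.out.println(copy.jobName + " #" + copy.buildNumber);
    }
}
```

If the receiving side has a copy of the class with a different serialVersionUID, readObject fails with an InvalidClassException, which is how an incompatible change is meant to surface.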

Cheers,
Nige

Re: New remoting infrastructure is in place

Kohsuke Kawaguchi-2
N Daley wrote:
> As someone who spent a couple years working for Sun and testing Java RMI
> and Serialization, we cared deeply about ensuring (and testing) the
> interoperability of these technologies between differing OSes and JVM
> versions.  Of course, changing the serialized form of a class in an
> incompatible way without changing the serialVersionUID would cause a
> problem...but that would be a user bug.

Thanks. I thought so. If you worked on serialization, you might know Joe
Fialli, whom I learned these things from.

--
Kohsuke Kawaguchi
Sun Microsystems                   [hidden email]
