Velocity Reviews - Computer Hardware Reviews

How to set up a fast correct java build?

 
 
Joshua Maurice
      01-08-2010
I'm sorry if this is answered in a FAQ somewhere, but the first
comp.lang.java.programmer FAQ I found referenced Java version 1.1, so
I stopped reading there.

I'm working on a project now with over 20,000 java source files, in
addition to more than 4,000 C++ source files, some forms of custom
code generation, an eclipse build, and probably other things I don't
know offhand.

Due to various requirements, we cannot put all of the source files
into a single jar. Many jars are requirements. (Different jars for
plugins, for client API, server impl. Then multiply by several
different products, and we arrive at over 200 required jars.)

How do you build your java code?

I'm looking for a fast, correct build. Let me be very specific with
the term "correct". A build is correct iff its output is equivalent to
what you would get had you first completely cleaned the source tree of
previously built files.

To be fast with such a large number of files, you basically need to
have an incremental parallel build, possibly distributed (though this
is harder with Java than say C++, for example).

Let's ignore parallel (and distributed) for the moment. If I can find
a way to incrementally build java, then I could probably write the
parallel part myself with make. However, from my limited googling,
there is no such thing as a correct incremental java compile, except
maybe Jikes.

javac doesn't cut it on its own. If it had the "-M" option of gcc or
any other sane compiler, I could pretty easily hack it together
myself. I need the compiler's help to do it; the class file does not
contain sufficient information. Anything based on reading class files
for dependency information is fatally flawed as it might miss critical
dependencies. My QA department does not want to take an incremental
build if there's a 5-10% chance of it being incorrect. They'd rather
wait for a clean build, and rightfully so. They don't want to waste
days of work just to learn it's a build issue.

javamake aka JMake appears to be barely supported, and it seems from
what little I can gather that it uses information from classfiles, so
it is also incorrect and insufficient. Its GNU copyleft license might
also make my company's lawyers wince.

Ant doesn't cut it either. Its depend task does even less than
javamake. It doesn't check any dependencies except the classfile
timestamp on its java source file. I doubt it even checks classfile
and java timestamps for non-public top level classes.
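
For reference, the timestamp-only check described above amounts to something like the following Java sketch. The names `TimestampCheck` and `isStale` are made up for illustration and are not taken from Ant or any other tool. The check compares one source file with one class file and nothing else, so it cannot notice that a file needs rebuilding because something it depends on changed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: a timestamp-only staleness check for a single
// source/class file pair. It knows nothing about any other file.
class TimestampCheck {
    static boolean isStale(Path source, Path classFile) throws IOException {
        if (!Files.exists(classFile)) {
            return true; // never compiled
        }
        // Stale only if the source was modified after the class file was written.
        return Files.getLastModifiedTime(source)
                .compareTo(Files.getLastModifiedTime(classFile)) > 0;
    }
}
```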

As a Hail Mary, I could look up the Java grammar, or some open source
Java parser, and parse the Java files and get the information myself.
I just need to be able to get the list of package qualified classes
used by a Java source file.

However, perhaps Jikes could do what I need. Does anyone have any
notable experience with it? Will it give me a list of package
qualified classes used by each Java source file? Unfortunately, its
documentation appears non-existent. Could anyone point me
to it?

I was thinking that perhaps there's some way to invoke Eclipse from
the command line. Is there a way? And is Eclipse just "smart enough"
to correctly incrementally compile Java code when invoked from the
command line?

How do you compile your Java? 20,000 source files is not something to
laugh at and just "clean each time" nor "cross your fingers and hope
that Ant depends or javamake catches all of the dependencies" as each
missed dependency could result in lost developer man weeks. Moreover,
the problem is exacerbated with the other non-Java components of our
build. For example, some java source is generated by in-house custom
tools (in order to get serialization between C++ and Java), and
incremental becomes even more important as it's not just 20,000 java
source files anymore. There is even more to build, which takes even more
time, greatly increasing the need for a correct fast (and thus incremental and
parallel) build.

Frankly, the current state of affairs in the Java community is not
acceptable, and even laughable, given that solutions to these problems
(fast build, correct build) are known and have been known for many,
many years in the context of C and C++. (That javac cannot or will not
output dependency information à la gcc -M is amazing.) Having said
that, I would love to be proven quite wrong. I thank you in advance
for any advice or insight you are willing to give.
 
Joshua Maurice
      01-08-2010
On Jan 8, 3:08 am, Joshua Maurice <(E-Mail Removed)> wrote:
> However, perhaps Jikes could do what I need. Does anyone have any
> notable experience with it? Will it give me a list of package
> qualified classes used by each Java source file? Unfortunately, its
> documentation appears non-existent. Could potentially anyone point me
> to it perhaps?


Never mind. It appears from Wikipedia that this is also no longer
updated.

I just found gcj from the GNU Compiler Collection as well. Sadly, I
don't see an option in the manual for outputting class file
dependencies like gcc -M. It's also not working on my current Linux
install, so I can't really test further.

It appears the "sanest" approach is to obtain or write my own Java
parser. I don't need to do any syntax or semantic checking at all. I
just need an exhaustive list of all used package qualified class
names. Still sounds like a large project though. /sigh
 
Alessio Stalla
      01-08-2010
On Jan 8, 12:08 pm, Joshua Maurice <(E-Mail Removed)> wrote:
> javac doesn't cut it on its own. If it had the "-M" option of gcc or
> any other sane compiler, I could pretty easily hack it together
> myself. I need the compiler's help to do it; the class file does not
> contain sufficient information. Anything based on reading class files
> for dependency information is fatally flawed as it might miss critical
> dependencies. My QA department does not want to take an incremental
> build if there's a 5-10% chance of it being incorrect. They'd rather
> wait for a clean build, and rightfully so. They don't want to waste
> days of work just to learn it's a build issue.


I don't get why you believe that Java class files do not contain
enough dependency information. At least the class hierarchy and
classes referred to by fields and methods (signature and code) should
be found there. Reflection can be problematic, but it would be with a
Java parser too. However, since I'm not an expert in the matter, I may
be missing something.
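
To make "found there" concrete, here is a rough Java sketch that lists the class names recorded in a class file's constant pool. `ClassRefs` and `referencedClasses` are hypothetical names chosen for this example; the constant pool layout follows the JVM class file format. It reports only what the constant pool records, so anything the compiler resolved away at compile time will not show up.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: collects the names of all CONSTANT_Class entries
// in a .class file (internal form, e.g. "java/util/List").
class ClassRefs {
    static Set<String> referencedClasses(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count]; // 0 means "not a Class entry"
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                      // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break;  // Class
                case 8: case 16: case 19: case 20:                          // 2-byte payloads
                    in.readUnsignedShort(); break;
                case 15:                                                    // MethodHandle, 3 bytes
                    in.readUnsignedByte(); in.readUnsignedShort(); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt(); break;                                    // 4-byte payloads
                case 5: case 6:                                             // Long/Double
                    in.readLong(); i++; break;                              // they occupy two slots
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int idx : classNameIndex) {
            if (idx != 0) {
                names.add(utf8[idx]); // array types appear as descriptors, e.g. "[I"
            }
        }
        return names;
    }
}
```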

If you choose the Java parser way, instead, there are probably ways to
avoid writing your own. You mentioned you use Eclipse: Eclipse is able
to gather dependency information, at least at the source level (they
call it "Call hierarchy" iirc). You might be able to hook into that,
either with Eclipse's plugin facilities or by hacking Eclipse itself,
which is open source. Or, you might use an existing Java parser:
OpenJDK and BeanShell (Java interpreter) surely contain one.

hth,
Alessio
 
Andreas Leitgeb
      01-08-2010
Joshua Maurice <(E-Mail Removed)> wrote:
> It appears the "sanest" approach is to obtain or write my own Java
> parser. I don't need to do any syntax or semantic checking at all. I
> just need an exhaustive list of all used package qualified class
> names. Still sounds like a large project though. /sigh


This topic appears here every once in a while.

Java just doesn't lend itself to structured C/C++-style dependencies,
because of the 1:n relationship of source files to generated .class
files.

In C/C++ you have a couple of source-files (ideally one .c and a flock
of .h's) that determine the contents of a particular object file. And
you know exactly which object files you need for an executable.

In Java, however, one source file can result in any number of .class
files for all the inner, nested, anonymous, synthetic, or further
(non-public) toplevel classes. (I know, it's a partially redundant list.)

So the very concept of "which source-files are relevant for this .class"
is futile, when one doesn't know the set of .class files in advance.

The only generally *safe* Java build is the complete rebuild.
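
The 1:n relationship is easy to observe with the compiler API that ships with the JDK (`javax.tools`). The sketch below is only an illustration: `OneSourceManyClasses` and `compileAndList` are invented names, and it assumes it runs on a JDK rather than a bare JRE. It compiles one source file into a temporary directory and returns the names of the class files that appear.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo: compile a single source file and list its outputs.
class OneSourceManyClasses {
    static List<String> compileAndList(String fileName, String source) throws IOException {
        Path dir = Files.createTempDirectory("one-to-many");
        Path src = dir.resolve(fileName);
        Files.write(src, source.getBytes(StandardCharsets.UTF_8));

        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; a JDK is required");
        }
        int rc = javac.run(null, null, null, "-d", dir.toString(), src.toString());
        if (rc != 0) {
            throw new IllegalStateException("compilation failed");
        }
        // Every .class file in the output directory came from the one source file.
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                    .filter(n -> n.endsWith(".class"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Feeding it a file that declares a class with an inner class, a static nested class, an anonymous class, and a second top-level class yields five class files from one source file.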

 
Andreas Leitgeb
      01-08-2010
Alessio Stalla <(E-Mail Removed)> wrote:
> I don't get why you believe that Java class files do not contain
> enough dependency information. At least the class hierarchy and
> classes referred to by fields and methods (signature and code) should
> be found there. Reflection can be problematic, but it would be with a
> Java parser too. However, since I'm not an expert in the matter, I may
> be missing something.


Indeed: static final fields used from other classes. Only their
values are used, but their names do not show up in your .class file.

 
Andreas Leitgeb
      01-08-2010
Andreas Leitgeb <(E-Mail Removed)> wrote:
> Alessio Stalla <(E-Mail Removed)> wrote:
>> I don't get why you believe that Java class files do not contain
>> enough dependency information. At least the class hierarchy and
>> classes referred to by fields and methods (signature and code) should
>> be found there. Reflection can be problematic, but it would be with a
>> Java parser too. However, since I'm not an expert in the matter, I may
>> be missing something.
>
> Indeed: static final fields used from other classes. Only their
> values are used, but their names do not show up in your .class file.


Small correction: static final fields with a value known at compile time.

 
Alessio Stalla
      01-08-2010
On Jan 8, 1:46 pm, Andreas Leitgeb <(E-Mail Removed)> wrote:
> Andreas Leitgeb <(E-Mail Removed)> wrote:
> > Alessio Stalla <(E-Mail Removed)> wrote:
> >> I don't get why you believe that Java class files do not contain
> >> enough dependency information. At least the class hierarchy and
> >> classes referred to by fields and methods (signature and code) should
> >> be found there. Reflection can be problematic, but it would be with a
> >> Java parser too. However, since I'm not an expert in the matter, I may
> >> be missing something.
> >
> > Indeed: static final fields used from other classes. Only their
> > values are used, but their names do not show up in your .class file.
>
> Small correction: static final fields with a compiletime known value.


Ok, thanks, I didn't know it. However, if this is the only problem,
it's a very infrequent one: how often is a static final field initialized
with a compile time constant changed? Probably infrequently enough
that you can just perform a full clean and build in those rare cases.

Cheers,
Alessio
 
Andreas Leitgeb
      01-08-2010
Alessio Stalla <(E-Mail Removed)> wrote:
> ... However, if this is the only problem,
> it's a very infrequent: how often is a static final field initialized
> with a compile time constant changed?


I came across it all too often during development - but then again, some
of those compile-time "constants" would have been better stored in a
properties file for runtime.

> Probably infrequently enough
> that you can just perform a full clean and build in those rare cases.


My latest conclusion after such a discussion here was that one could
write a tool to extract only the *API* of every (non-private) .class file
after a compile, and next time, compile only the modified sources, and
then rerun the API extraction, and if anything changed: recompile all.

Some tools might already do that, and I never even started to
implement such an API capturer myself - at least it looks like
a theoretically safe approach that might still beat always recompiling
all.
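
One possible shape for the API-extraction step, as a sketch only: reduce each loaded class to a sorted list of its non-private member signatures and hash that list, then compare the hash with the one saved from the previous build. `ApiFingerprint`, `describe` and `fingerprint` are invented names, and using reflection (rather than reading the class file directly) is an assumption made to keep the example short. Values of static final primitive and String fields are included because a changed value can affect other classes even when no signature changes; reading them runs the class's static initializer.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of an "API capturer" built on reflection.
class ApiFingerprint {
    // One text line per non-private member, sorted so the order is stable.
    static List<String> describe(Class<?> c) {
        List<String> lines = new ArrayList<>();
        lines.add("type " + c.toGenericString() + " extends " + c.getGenericSuperclass());
        for (Class<?> i : c.getInterfaces()) {
            lines.add("implements " + i.getName());
        }
        for (Field f : c.getDeclaredFields()) {
            if (Modifier.isPrivate(f.getModifiers()) || f.isSynthetic()) continue;
            lines.add("field " + f.toGenericString() + constantValue(f));
        }
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (!Modifier.isPrivate(k.getModifiers())) lines.add("ctor " + k.toGenericString());
        }
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isPrivate(m.getModifiers()) || m.isSynthetic()) continue;
            lines.add("method " + m.toGenericString());
        }
        Collections.sort(lines);
        return lines;
    }

    // Appends the value of static final primitive/String fields, since such
    // values may have been copied into other classes at compile time.
    private static String constantValue(Field f) {
        int mods = f.getModifiers();
        boolean candidate = Modifier.isStatic(mods) && Modifier.isFinal(mods)
                && (f.getType().isPrimitive() || f.getType() == String.class);
        if (!candidate) return "";
        try {
            f.setAccessible(true);
            return " = " + f.get(null);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return " = <unreadable>";
        }
    }

    // SHA-256 over the description, as lowercase hex.
    static String fingerprint(Class<?> c) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (String line : describe(c)) {
            md.update((line + "\n").getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

A build script would store `fingerprint(...)` for every class after a compile and fall back to a full rebuild whenever any stored value differs after recompiling the modified sources.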

 
Tom Anderson
      01-08-2010
On Fri, 8 Jan 2010, Christian K?tbach wrote:

> Try maven.
>
> It is like Ant but better
>
> In fact you can produce clean builds of your software, as easy as typing "mvn
> clean compile" to a commandline.
>
> Maven can manage all your dependencies and release software modules.
>
> But you need some kind of infrastructure (Artifactory, Nexus).


Did you even read the original post?

tom

--
There are lousy reviews, and then there's empirical shitness. -- pikelet
 
Tom Anderson
      01-08-2010
On Fri, 8 Jan 2010, Alessio Stalla wrote:

> On Jan 8, 1:46 pm, Andreas Leitgeb <(E-Mail Removed)>
> wrote:
>> Andreas Leitgeb <(E-Mail Removed)> wrote:
>>> Alessio Stalla <(E-Mail Removed)> wrote:
>>>> I don't get why you believe that Java class files do not contain
>>>> enough dependency information. At least the class hierarchy and
>>>> classes referred to by fields and methods (signature and code) should
>>>> be found there. Reflection can be problematic, but it would be with a
>>>> Java parser too. However, since I'm not an expert in the matter, I may
>>>> be missing something.
>>>
>>> Indeed: static final fields used from other classes. Only their
>>> values are used, but their names do not show up in your .class file.
>>
>> Small correction: static final fields with a compiletime known value.
>
> Ok, thanks, I didn't know it. However, if this is the only problem,
> it's a very infrequent: how often is a static final field initialized
> with a compile time constant changed? Probably infrequently enough
> that you can just perform a full clean and build in those rare cases.


No - because you don't even have enough information to *detect* these
cases. Consider:

// A.java
class A {
    public static final String FOO = "foo";
}

// B.java
class B {
    public static final String FOO = A.FOO;
}

You change A.java. How do you know you have to recompile B?
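
The effect can be reproduced with the JDK's own compiler API (`javax.tools`). The following is a sketch under assumptions: `StaleConstantDemo` and its methods are invented for this illustration, and it needs a JDK at runtime. It compiles A and B, edits and recompiles only A, then reads both FOO fields from the class files on disk.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo of a stale inlined constant after a partial rebuild.
class StaleConstantDemo {
    // Writes <name>.java into dir and compiles just that one file.
    static void compile(Path dir, String name, String source) throws IOException {
        Path src = dir.resolve(name + ".java");
        Files.write(src, source.getBytes(StandardCharsets.UTF_8));
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; a JDK is required");
        }
        int rc = javac.run(null, null, null,
                "-d", dir.toString(), "-cp", dir.toString(), src.toString());
        if (rc != 0) {
            throw new IllegalStateException("compilation failed: " + name);
        }
    }

    // Loads the class fresh from dir and reads a static String field.
    static String readStaticString(Path dir, String cls, String field) throws Exception {
        try (URLClassLoader loader =
                     new URLClassLoader(new URL[] {dir.toUri().toURL()}, null)) {
            Field f = loader.loadClass(cls).getDeclaredField(field);
            f.setAccessible(true); // A and B are not public classes
            return (String) f.get(null);
        }
    }

    // Returns { A.FOO, B.FOO } as seen after recompiling only A.
    static String[] run() throws Exception {
        Path dir = Files.createTempDirectory("stale-constant");
        compile(dir, "A", "class A { public static final String FOO = \"foo\"; }");
        compile(dir, "B", "class B { public static final String FOO = A.FOO; }");
        // Edit A and recompile A only, as a timestamp-driven build would.
        compile(dir, "A", "class A { public static final String FOO = \"bar\"; }");
        return new String[] {
            readStaticString(dir, "A", "FOO"),
            readStaticString(dir, "B", "FOO"),
        };
    }
}
```

After `run()`, A.FOO reads "bar" while B.FOO still reads "foo": B.class holds its own copy of the old value until B is recompiled.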

tom

--
There are lousy reviews, and then there's empirical shitness. -- pikelet
 