Velocity Reviews - Computer Hardware Reviews


Java (bytecode) execution speed

 
 
Chris Smith
 
      04-30-2007
Lee <(E-Mail Removed)> wrote:
> Perhaps someone can "debug" me on this:
>
> I had thought that bytecode was, so to speak, the "machine code" of the
> Java Virtual Machine. If that were true, I can't see how there would be
> any room (or any need) for further compilation of the bytecode. The
> bytecode itself would "drive" the VM, taking the VM from internal state
> to internal state until the computation was done.


Bytecode is just a file format to represent the actions of a piece of
Java code. Nothing more, and nothing less. It can be used directly, or
it can be converted to a different format.

Early implementations of Java interpreted it; that is, they used it
directly. Actually, implementations of Java on cellular phones and
other embedded devices often still do this because it's more efficient
in terms of memory usage and generally no one does high-performance
computation on a cell phone.

Newer Java implementations (as of 1999 or so) for desktop and server
platforms rarely interpret the bytecode. They translate it into the
native machine language, and let the processor run that native machine
language directly. This is, obviously, much faster.

> But if the bytecode were just a portable abstraction, something "above"
> the JVM's machine language but "below" the java source language, that
> would create the need to compile the byte code "the rest of the way
> down" to the actual jvm machine language, but NOT to native hardware
> machine language.


There is no JVM machine language. Perhaps what's confusing you is that
there is no such thing as "the" JVM. There is a JVM for x86, another for
x86-64 platforms, another for SPARC, and so on... There are different
JVMs for different operating systems as well, though they often share
most of the JIT implementation. Each of these implementations of a JVM
contains its own different JIT compiler that generates code appropriate
for that processor.

So in the end, the JVM for a particular platform does the transformation
to the native machine language for that CPU, and from that point on it
just runs the code and sits back and waits for the code to call it.
After the JIT step, the JVM is essentially just a library that is called
by the application code.

--
Chris Smith
 
 
 
 
 
Chris Smith
 
      04-30-2007
Christian <(E-Mail Removed)> wrote:
> Is there any interpretation going on today in the jvm or is simply
> everything compiled to machinecode just in time before execution?


Modern JVMs do both; the performance-critical stuff is JIT'ed, but a lot
of one-off initialization code will be interpreted. At one time, JIT
compilers would frequently run before any code was executed; this was
changed because mixed mode (some interpreting, some compiling) reduces
the perceived start-up time of applications.
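
One way to watch mixed mode in action is a small program with one run-once method and one hot method. This is only a sketch: the class and method names are made up for illustration, and the behaviour described assumes a HotSpot-based JVM, where `-Xint` forces interpretation only and `-XX:+PrintCompilation` logs JIT activity. Other JVMs may spell these options differently or not offer them.

```java
// Hypothetical demo class; names are illustrative, not from the thread.
class MixedModeDemo {
    // Called once: a JVM in mixed mode will most likely just interpret this.
    static int[] setUp(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i % 7;
        }
        return data;
    }

    // Called many times: a likely candidate for JIT compilation.
    static long sum(int[] data) {
        long total = 0;
        for (int value : data) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = setUp(1_000);
        long checksum = 0;
        long start = System.nanoTime();
        for (int round = 0; round < 100_000; round++) {
            checksum += sum(data);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum=" + checksum + " elapsedMs=" + elapsedMs);
    }
}
```

Running it once as `java MixedModeDemo` and once as `java -Xint MixedModeDemo` should show the interpreter-only run taking noticeably longer, though the exact figures depend on the machine and the JVM version.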

--
Chris Smith
 
 
 
 
 
Wojtek
 
      04-30-2007
Lee wrote :
> Lee wrote:
>> All other things being equal, we expect an interpreted language to run a
>> bit slower than native machine code.
>>

> <SNIP>
>
> At least two people were kind enough to point out that Java uses a JIT
> compilation system and that in any case the difference in execution time
> between a Java program and a hypothetical compiled version of the same
> algorithm (or as similar as the two languages allow), would probably be due
> more to the differences in heap management and/or garbage collection than in
> raw compilation speed.
>
> In that context it becomes plausible that in some circumstances Java might
> actually run faster than an equivalent C/C++ implementation.
>
> I must be missing an important nuance about Java and JIT.
>
> Perhaps someone can "debug" me on this:
>
> I had thought that bytecode was, so to speak, the "machine code" of the Java
> Virtual Machine. If that were true, I can't see how there would be any room
> (or any need) for further compilation of the bytecode. The bytecode itself
> would "drive" the VM, taking the VM from internal state to internal state
> until the computation was done.
>
> But if the bytecode were just a portable abstraction, something "above"
> the JVM's machine language but "below" the java source language, that would
> create the need to compile the byte code "the rest of the way down" to the
> actual jvm machine language, but NOT to native hardware machine language.
>
> So even in that case, the "compilation" would be down to the VM's machine
> language, not the actual hardware's machine language.
>
> But all the descriptions I see on the net about JIT talk in terms of
> compilation to native machine language. I can see how that would work with
> something like, say, the Pascal p-System, where Pascal source would be compiled
> into "p-code", and then the p-code would either be interpreted or
> "just-in-time" compiled to native hardware machine language.
>
> My problem is that in my conception, when it is a question of running a
> virtual machine, the "compilation" would be to that VM's "machine" language
> and that's as "low" as you could go.


In a typical native environment you have:

source - what the programmer wants done
object code - what the programmer wants done, but in a form the
computer can understand
library - how to do stuff for a particular operating system
executable - what the programmer wants done along with how to do it for
that OS

So the sequence is:
source -> object code (compiler with optimization switches which the
programmer "guesses" will make the code run faster/better)
object code + library -> executable (linker)
** the executable is distributed
the user runs the executable

In Java you have:

source - what the programmer wants done
bytecode - what the programmer wants done, but in a form the Java
Virtual Machine can understand
** the bytecode is distributed

On the client machine, the JVM reads the byte code and since the JVM is
native to that OS it knows how to do stuff

The sequence is:
source -> byte code (compiler)
** the byte code is distributed
the user runs the JVM pointing to the byte code (The JVM does
on-the-fly optimizations depending on how THAT user uses the
application)
byte code + JVM

In a pure interpreted environment (such as Perl and PHP)
source - what the programmer wants done
** the source is distributed
the user runs the Perl (or PHP) interpreter pointing to the source code

This makes more sense with pretty diagrams

Note: Yes I know you can now get Perl and PHP linkers to produce
executables.

--
Wojtek


 
 
RedGrittyBrick
 
      04-30-2007
Wojtek wrote:
> In a pure interpreted environment (such as Perl and PHP)


These days, things are rarely that simple.
http://www.perl.com/doc/FMTEYEWTK/comp-vs-interp.html






 
 
Lee
 
      05-01-2007
JT wrote:
> On Apr 30, 3:48 pm, Lee <(E-Mail Removed)> wrote:
>
>>I had thought that bytecode was, so to speak,
>>the "machine code" of the Java Virtual Machine.

>
>
> It is.
>

So I'm not completely in cloud cuckoo land. Phew!

>
>>If that were true, I can't see how there would be any room
>>

>
>
> The Java Virtual Machine Specification is a precise document
> on the meaning of bytecodes. So JIT compilers simply
> attempt to produce native binaries that have the same behavior
> as the bytecode.
>


> I don't see how the JVMS prevents that.
>


See below

<snip>
>
> No. The byte code is the native machine code of the JVM.
>
>
>>My problem is that in my conception, when it is a question of running a
>>virtual machine, the "compilation" would be to that vm's "machine"
>>language

>
>
> The JIT does not compile to the VM's machine language.
> In fact, the JIT always compiles to the CPU's machine language.
>
>
>>and thats as "low" as you could go.
>>

>
>
> Why?


Why I don't understand how you can go "lower" than the VM's machine code:

I'm handicapped by not knowing the design/architecture of the actual
JVM, so even though the best way to explain my difficulty would be to do
so in terms of the actual JVM instructions, I will do the next best
thing, and try to do it in terms of a simpler and entirely mythical
virtual machine.

Let's suppose I have a string-handling virtual machine. It's got a string
store and it has two native operations (among others, but we're just
interested in showing why I think it's not possible to get "below"
the virtual machine's own virtual machine language). Your mission
impossible, should you choose to accept it, is to expose the flaw
in how I'm thinking.

The two operations are "Head", which returns the first (zeroth) character
of the string, and "Tail", which returns the substring that remains after
removing the first (zeroth) character. Gee, sounds like Lisp car and cdr,
but "never mind".

So the "byte" code for "Head" is x01 and the byte code for "Tail" is
x02. The implementation of the string virtual machine on a particular
hardware platform consists of the native hardware machine instructions
that make up the internal structures of the VM (implemented of course as
"real" structures built out of real memory and real registers and all that).

I suppose in one sense, you can say that the "compilation" of the byte
code x01 and/or x02 is the set of machine instructions used to implement
that part of the string virtual machine in the real hardware.

Compilation of the byte code would be nothing more or less than
re-implementing a portion of the string virtual machine. So that makes
no sense to me, as presumably you've done it right the first time when
you implemented the string virtual machine for that hardware in the
first place.

Are you saying that the VM is dynamically re-implemented at run time?

Another way to see my difficulty is to imagine that a virtual machine
instruction changes the internal state of the virtual machine in some
"Atomic" way. No single native machine instruction can do that, because
the native instructions change the state of the real machine, not the
state of the virtual machine. A small change in state of the virtual
machine involves lots of "non atomic" changes in the state of the
underlying real machine. The implementation of the virtual machine runs
lots of native machine instructions to achieve that effect, but those
instructions are determined when you implement the virtual machine, not
dynamically at run time. Unless of course I'm all wet and what you're
really doing is in fact dynamically re-writing the JVM implementation
which seems a bit mind boggling to me.






>The JVM itself has full access to the Operating System
> that the JVM is running on. Whenever you have native methods
> (eg. some of the GUI methods and the IO methods...), the JVM
> will have to invoke the corresponding services from the Operating
> System.

Er, yes. A fixed set of instructions determined at implementation time,
for each JVM machine instruction. Or is that not so?

>
> So, a JVM could invoke a JIT to translate frequently-executed code
> into a suitable binary format that the OS can execute.

Which means that the implementation of any given Java machine language
primitive is dynamically altered at run time. Eek! Can that be true?
>
> I see no problem there.


You don't? The native hardware instructions that find the head of a
string are re-invented every time somebody does the "head" operation?
Can that be right?

> (And as you noted, there are many
> powerful Java JITs out there)
>
> - JT
>
>

 
 
Lew
 
      05-01-2007
Lee wrote:
> Why I don't understand how you can go "lower" than the VM's machine code:


> Are you saying that the VM is dynamically re-implemented at run time?


Yes.

> dynamically at run time. Unless of course I'm all wet and what you're
> really doing is in fact dynamically re-writing the JVM implementation
> which seems a bit mind boggling to me.


Yes, that's what's happening.

> Er, yes. A fixed set of instructions determined at implementation time,
> for each JVM machine instruction. Or is that not so?


That is not so.

> Which means that the implementation of any given Java machine language
> primitive is dynamically altered at run time. Eek! Can that be true?


Yes.

> You don't? The native hardware instructions that find the head of a
> string are re-invented every time somebody does the "head" operation?
> Can that be right?


No. Just when necessary to optimize the program.

--
Lew
 
 
Eric Sosman
 
      05-01-2007
Lee wrote:
> [...]
>
> Compilation of the byte code would be nothing more or less than
> re-implementing a portion of the string virtual machine. So that makes
> no sense to me, as presumably you've done it right the first time when
> you implemented the string virtual machine for that hardware in the
> first place.


For one thing, the virtual-to-native compilation can
eliminate all the decoding of the virtual instructions. A
straightforward interpreter will fetch a virtual instruction,
fiddle with it for a while, and dispatch to an appropriate
sequence of actual instructions that accomplish the virtual
instruction's mission. It may amount to only a few masks, a
few tests, and a big switch construct, but the interpreter
goes through it on every virtual instruction. Once the code
is compiled to native instructions, all the decoding and
dispatching simply vanishes: it was done once, by the compiler,
and need never be done again.
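
To make the decode-and-dispatch cost concrete, here is a rough sketch built on Lee's two-instruction string machine (Head = x01, Tail = x02). Everything else is an assumption for illustration: the class and method names are invented, and the "compiler" only chains Java lambdas together. A real JIT emits native machine code instead, but the point is the same: the opcodes are decoded once, up front, and never again.

```java
import java.util.function.UnaryOperator;

// A toy version of Lee's string machine: opcode 0x01 = Head, 0x02 = Tail.
// Names and the closure-based "compiler" are hypothetical.
class StringVm {
    static final byte HEAD = 0x01;
    static final byte TAIL = 0x02;

    // Interpreter: decodes and dispatches every instruction on every run.
    static String interpret(byte[] program, String input) {
        String value = input;
        for (byte opcode : program) {
            switch (opcode) {
                case HEAD: value = value.substring(0, 1); break;
                case TAIL: value = value.substring(1); break;
                default: throw new IllegalArgumentException("bad opcode " + opcode);
            }
        }
        return value;
    }

    // "Compiler": decodes the program once and returns a ready-made chain
    // of operations; running the result involves no opcode decoding at all.
    static UnaryOperator<String> compile(byte[] program) {
        UnaryOperator<String> compiled = s -> s;
        for (byte opcode : program) {
            UnaryOperator<String> previous = compiled;
            switch (opcode) {
                case HEAD: compiled = s -> previous.apply(s).substring(0, 1); break;
                case TAIL: compiled = s -> previous.apply(s).substring(1); break;
                default: throw new IllegalArgumentException("bad opcode " + opcode);
            }
        }
        return compiled;
    }

    public static void main(String[] args) {
        byte[] program = {TAIL, TAIL, HEAD};   // selects the third character
        UnaryOperator<String> compiled = compile(program);
        System.out.println(interpret(program, "bytecode"));
        System.out.println(compiled.apply("bytecode"));
    }
}
```

Both calls produce the same result; the difference is that `interpret` walks the switch on every run, while the object returned by `compile` can be applied over and over without looking at an opcode again.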

Another effect is that the virtual instructions are quite
often more general than they need to be for particular uses.
Stepping away from your two-instruction string machine for a
moment, let's suppose you've got a virtual instruction that adds
two integers to form their sum. The interpreter probably fetches
operand A, fetches operand B, adds them, and stores the sum in
target C. Well, the virtual-to-native compiler might "notice"
that A,B,C are the same variable, which the program adds to itself
in order to double it. The generated native machine code is then
quite unlikely to do two fetches: one will suffice, followed by
a register-to-register add or a left shift or some such. Not only
that, but the compiler may further notice that C is immediately
incremented after doubling, so instead of storing C and fetching
it back again for incrementation, the native machine code says
"Hey, I've already got it in this here register" and eliminates
both the store and the subsequent fetch.
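
The doubling-then-incrementing case can be modelled in a few lines. This is a toy under stated assumptions: the memory array, the access counter and all names are hypothetical and exist only to make the saved loads and stores countable. It does not show how any particular JIT works internally.

```java
// Toy model of the add-then-increment example. The counter and the names
// are hypothetical; they only make the saved memory traffic visible.
class CombineDemo {
    static int memoryAccesses = 0;
    static final int[] memory = new int[4];

    static int load(int slot) { memoryAccesses++; return memory[slot]; }
    static void store(int slot, int value) { memoryAccesses++; memory[slot] = value; }

    // Generic virtual instructions, as an interpreter would execute them.
    static void add(int a, int b, int c) { store(c, load(a) + load(b)); }
    static void increment(int c) { store(c, load(c) + 1); }

    // What a compiler might emit for "add c,c,c; increment c" once it sees
    // both instructions together: one load, register work, one store.
    static void doubleThenIncrement(int c) {
        int register = load(c);
        register = (register << 1) + 1;
        store(c, register);
    }

    public static void main(String[] args) {
        memory[0] = 21;
        add(0, 0, 0);
        increment(0);
        System.out.println("generic:  value=" + memory[0] + " accesses=" + memoryAccesses);

        memory[0] = 21;
        memoryAccesses = 0;
        doubleThenIncrement(0);
        System.out.println("combined: value=" + memory[0] + " accesses=" + memoryAccesses);
    }
}
```

The generic pair touches memory five times (three loads, two stores); the combined version touches it twice and leaves the same value behind.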

>> [...]
>> So, a JVM could invoke a JIT to translate frequently-executed code
>> into a suitable binary format that the OS can execute.

> Which means that the implementation of any given Java machine language
> primitive is dynamically altered at run time. Eek! Can that be true?
>>
>> I see no problem there.

>
> You don't? The native hardware instructions that find the head of a
> string are re-invented every time somebody does the "head" operation?
> Can that be right?


Could be. The virtual-to-native compiler has the advantage
of being able to see the context in which a virtual instruction
is used, and may be able to take shortcuts, as in the instruction-
combining example above. As an example of a JVM-ish application
of this sort of thing, consider compiling `x[i] += x[i];', our
familiar doubling example but this time with arrays. Formally
speaking, each array reference requires a range check -- but the
JIT may notice that if the left-hand side passes the range check,
there is no need to do it a second time on the right-hand side.
Even better, the JIT may notice common patterns like

    for (int i = 0; i < x.length; ++i)
        x[i] += x[i];

... and skip the range checking entirely.
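
Written out by hand, the range-check argument looks roughly like this. The helper names are invented, and the explicit check in `checkedGet` only models what the JVM does implicitly on every array access; in real Java source the check cannot be switched off, and whether a given JIT removes it in a given loop is up to that JIT.

```java
import java.util.Arrays;

// Hand-written picture of what bounds checks cost and how they can be
// hoisted. Real JVMs do this inside the JIT; the helper names are made up.
class RangeCheckDemo {
    // Roughly what each x[i] means formally: check first, then access.
    static int checkedGet(int[] x, int i) {
        if (i < 0 || i >= x.length) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return x[i];
    }

    // Naive translation: every element pays for two explicit checks.
    static void doubleAllChecked(int[] x) {
        for (int i = 0; i < x.length; ++i) {
            x[i] = checkedGet(x, i) + checkedGet(x, i);
        }
    }

    // The loop condition already proves 0 <= i < x.length, so the explicit
    // checks add nothing and can be dropped.
    static void doubleAllUnchecked(int[] x) {
        for (int i = 0; i < x.length; ++i) {
            x[i] += x[i];
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};
        doubleAllChecked(a);
        doubleAllUnchecked(b);
        System.out.println(Arrays.toString(a) + " same=" + Arrays.equals(a, b));
    }
}
```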

A viewpoint you may find helpful, if a little wrenching at
first, is to think of the virtual instruction set as the elements
of a low-level programming language. You could, with sufficient
patience, write Java bytecode by hand, but it might be easier to
write Java and use javac to generate bytecode from it. Either
way, the bytecode is just an expression of a program, written in
a formal language, and there's no reason a translator couldn't
accept that formal language as its "source" for compilation.
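
For a feel of bytecode as a small programming language, it helps to compile a trivial method and disassemble it. The class below is a made-up example; `javap -c` is the disassembler shipped with the JDK, and the listing in the comment is what it typically shows for this method, though details can vary with the compiler version.

```java
// Hypothetical class used only to look at the bytecode javac produces.
class Doubler {
    static int twice(int a) {
        return a + a;
    }

    public static void main(String[] args) {
        System.out.println(twice(21));
    }
}
// After `javac Doubler.java`, `javap -c Doubler` typically lists the body
// of twice(int) as four instructions:
//   iload_0    push local variable 0 (a) onto the operand stack
//   iload_0    push it again
//   iadd       pop two ints, push their sum
//   ireturn    return the int on top of the stack
```

Those four instructions are the "source" a JIT compiler starts from, in the same way that `return a + a;` is the source javac starts from.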

--
Eric Sosman
 
 
John W. Kennedy
 
      05-01-2007
Eric Sosman wrote:
> Could be. The virtual-to-native compiler has the advantage
> of being able to see the context in which a virtual instruction
> is used, and may be able to take shortcuts, as in the instruction-
> combining example above.


It also knows /exactly/ what processor it's running on, and can take
advantage of detailed timing information and new opcodes.

--
John W. Kennedy
"But now is a new thing which is very old--
that the rich make themselves richer and not poorer,
which is the true Gospel, for the poor's sake."
-- Charles Williams. "Judgement at Chelmsford"
* TagZilla 0.066 * http://tagzilla.mozdev.org
 
 
Lee
 
      05-01-2007
Lew wrote:
> Lee wrote:
>
>> Why I don't understand how you can go "lower" than the VM's machine code:

>
>
>> Are you saying that the VM is dynamically re-implemented at run time?

>
>
> Yes.
>


Awesome.

I kept thinking of the virtual machine as a fixed "simulation"
application, written once and set in stone; but I can see how it's
possible to optimize whole blocks of code in ways that are not likely
when considering just one primitive operation.

Wow! I'm still blown away by the concept.
 
 
Joshua Cranmer
 
      05-01-2007
Lee wrote:
> What's the current state of the art? Would we expect a Java program to
> run at 0.5 * the speed of C, or 0.7 or 0.9 or what?


In one programming contest I participate in, the Java factor is 1.5x, BUT
this is considering that all code is expected to run in 1 second or less
and that great emphasis is placed on optimized code.

I would expect that most applications would run at approximately native
speeds.


As a side-note, said competition used to use a 5x factor (but it used
1.3)....
 
 
 
 