Java vs C portability – intermediary code perspective

Note: don’t get too excited, I’m not back to blogging. I’m writing some articles for a basic programming class discussion board and I may share some here.

When a team embarks on a new project, one of the first decisions it makes is the choice of programming language, which is typically based on the type of application being built. From the time Java became popular until the middle of the last decade, Java was almost always the pick whenever the target application needed to be portable. I will not discuss what other languages and technologies teams use now; rather, I will discuss how Java has always been seen as the first pick for portability and why it shouldn’t be. To do so, I will investigate the portability of the C language and compare it to that of Java.

Let’s first discuss Java. When a developer writes a Java application and compiles it, the Java code gets translated into bytecode. Java bytecode is a standardized instruction set, specified by Oracle (formerly Sun Microsystems) in the Java Virtual Machine Specification, that serves as intermediary code. Here’s an example of Java bytecode:

0: iconst_2
1: istore_1
2: iload_1
3: sipush 1000
6: if_icmpge 44
9: iconst_2
10: istore_2

This code is portable and independent of any machine architecture by design. It achieves this by requiring another piece of software to be available at execution time, known as the JVM (Java Virtual Machine). When a compiled Java application is run, it is loaded into the JVM, which either interprets the bytecode or compiles it just in time into native machine code for the host processor. Since a JVM is required on any device that runs the Java application (bytecode), developers deem Java bytecode independent of system architecture and thus portable. I would like you to focus on this part and how the Java cycle is as follows:

  1. Code: Java code
  2. Intermediary code: Bytecode
  3. Translator: JVM
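
To make the “Translator” stage a little more concrete, here is a minimal sketch in C of an interpreter loop that understands only the handful of opcodes in the listing above. It is a toy under stated assumptions, not how a real JVM is built: the function name run_toy_vm, the fixed stack size and the bare array of locals are all made up for illustration. The numeric opcode values are the ones given in the Java Virtual Machine Specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as listed in the Java Virtual Machine Specification. */
enum {
    OP_ICONST_2  = 0x05,
    OP_SIPUSH    = 0x11,
    OP_ILOAD_1   = 0x1b,
    OP_ISTORE_1  = 0x3c,
    OP_ISTORE_2  = 0x3d,
    OP_IF_ICMPGE = 0xa2,
    OP_RETURN    = 0xb1
};

#define TOY_STACK_SIZE 8 /* assumption: plenty for this snippet */

/* Reads a big-endian signed 16-bit operand starting at code[at]. */
static int16_t read_s16(const uint8_t *code, size_t at)
{
    return (int16_t)(uint16_t)((code[at] << 8) | code[at + 1]);
}

/*
 * Hypothetical toy interpreter: executes `code` one instruction at a time.
 * `locals` must have room for at least 3 slots.
 * Returns 0 when a return opcode is reached, -1 on anything unexpected.
 */
int run_toy_vm(const uint8_t *code, size_t len, int32_t *locals)
{
    int32_t stack[TOY_STACK_SIZE];
    int sp = 0;    /* number of values currently on the operand stack */
    size_t pc = 0; /* offset of the current instruction */

    while (pc < len) {
        switch (code[pc]) {
        case OP_ICONST_2:
            if (sp >= TOY_STACK_SIZE) return -1;
            stack[sp++] = 2;
            pc += 1;
            break;
        case OP_SIPUSH:
            if (sp >= TOY_STACK_SIZE || pc + 2 >= len) return -1;
            stack[sp++] = read_s16(code, pc + 1);
            pc += 3;
            break;
        case OP_ILOAD_1:
            if (sp >= TOY_STACK_SIZE) return -1;
            stack[sp++] = locals[1];
            pc += 1;
            break;
        case OP_ISTORE_1:
            if (sp < 1) return -1;
            locals[1] = stack[--sp];
            pc += 1;
            break;
        case OP_ISTORE_2:
            if (sp < 1) return -1;
            locals[2] = stack[--sp];
            pc += 1;
            break;
        case OP_IF_ICMPGE: {
            if (sp < 2 || pc + 2 >= len) return -1;
            int32_t rhs = stack[--sp];
            int32_t lhs = stack[--sp];
            if (lhs >= rhs) {
                /* Branch offsets are relative to the start of this instruction. */
                pc = (size_t)((ptrdiff_t)pc + read_s16(code, pc + 1));
            } else {
                pc += 3;
            }
            break;
        }
        case OP_RETURN:
            return 0;
        default:
            return -1; /* opcode this toy does not know */
        }
    }
    return -1; /* ran off the end without a return */
}
```

Encoded as bytes, the seven instructions of the listing are 05 3c 1b 11 03 e8 a2 00 26 05 3d. If you append b1 (return) so the toy has somewhere to stop and pass the array to run_toy_vm, local slots 1 and 2 both end up holding 2. The same byte array gives the same result on any machine that can compile this C file, which is the whole point of the translator stage.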

Now, let’s take a break from Java and take a look at the C programming language. If you ask any C developer whether the language is as portable as Java in the same sense, they will shout no. However, I aim to show that this may not be true. Yes, the C we learn in school and in textbooks is said to go from C code straight to assembly/machine code; what they don’t tell you is that this is not always the case. The practice that the Java team used, separating out the architectural dependency by means of an independent translator for intermediary code, is clearly a good one, as it reduces the complexity of the compiler. The same practice is used by the C compiler in GCC (GNU Compiler Collection). It was a deliberate design decision to go with this method, as it lessens the effort of debugging and project management (the divide-and-conquer philosophy). So here’s how GCC does it:

  1. Code: C code
  2. Intermediary code: GAS code
  3. Translator: GAS

GAS, the GNU assembler, is cross-platform. … Do you realize that the cycle stages of the two languages indicate that the portability of the two languages is conceptually the same? Mind-blown?

Before anyone attempts to criticize what I am attempting to convey, let me point out the following:

  • Both the Java and C compilers link to libraries within the intermediary code.
  • Both Java and C intermediary codes require a translator application installed on the system prior to actual compilation and execution (JVM and GAS).
  • Both Java and C require all linked libraries to be installed on the system prior to actual compilation and execution.

So, in case you’re wondering what the actual difference is, it is the following:

  • The Java compiler stops at bytecode, whereas GCC stops only after the entire program is assembled into a binary executable file.
  • The JVM is a standard requirement of the language, whereas GAS is not.

It cannot be denied that they go through the same three stages; however, it could be argued that the expected result is different, as that is specified by the language standard. C never asked for intermediary translation software; in fact, Dennis Ritchie, the creator of the language, aimed for a fully compiled executable for performance purposes. In contrast, the Java team wanted this intermediary stage to facilitate portability. However, the fact that C did not standardize it does not mean it does not exist! It does, GAS exists, and it looks like this:

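.global _start

.text
_start:
movl $len, %edx      # arg 3: number of bytes to write
movl $msg, %ecx      # arg 2: address of the message
movl $1, %ebx        # arg 1: file descriptor 1 (stdout)
movl $4, %eax        # system call 4: write
int $0x80            # trap into the kernel
movl $0, %ebx        # arg 1: exit status 0
movl $1, %eax        # system call 1: exit
int $0x80

.data
msg:
.ascii "Hello, world!\n"   # any message will do
len = . - msg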

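This is the text that GCC hands to GAS; it is not yet the binary machine code that the processor executes. To be precise, the mnemonics and registers in this particular listing (movl, %eax, int $0x80) are those of 32-bit x86 on Linux, so the listing itself would differ on another architecture; what is shared across platforms is the GAS tool and its directive syntax. If you are to argue that it is in fact assembly, we will have to play along, but in the cycle described above it fills the same intermediary slot that bytecode fills for Java. GAS code can be generated using the GCC command as follows: gcc input.c -S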
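
If you would like to try the command yourself, here is a small input.c to feed it. This is only a sketch: the file contents and the function name count_primes_below are my own invention, chosen because the nested loops starting at 2 and running up to a limit resemble the kind of source that bytecode like the earlier listing comes from. There is deliberately no main(), since gcc -S only needs something to translate, not something to link.

```c
/* input.c: a small translation unit to feed through `gcc -S`. */

/* Counts the primes in [2, limit) by trial division. */
int count_primes_below(int limit)
{
    int count = 0;

    for (int i = 2; i < limit; i++) {
        int is_prime = 1;

        /* A divisor j with j * j <= i means i is composite. */
        for (int j = 2; j * j <= i; j++) {
            if (i % j == 0) {
                is_prime = 0;
                break;
            }
        }
        count += is_prime;
    }
    return count;
}
```

Running gcc -S input.c writes the GAS text to input.s, and gcc -c input.s then runs the assembler on it to produce input.o. Comparing the input.s produced on two different machines is a quick way to check for yourself how far the comparison with bytecode holds.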

So now, let’s ask these rhetorical questions: is Java more portable than C? Should we use Java just because we need a portable solution? Or should C be the pick when efficiency is required by a portable system?

I will conclude this article in a way that is not as scientific or technical as you might expect: most of our perception of matters reflects not how things actually are but how they are presented to us. Always look beyond, and you will be amazed at how often we are not making the best decisions.

Questions and criticisms are welcome.

~ by AnxiousNut on February 2, 2015.
