C vs. Java

When thinking about C, one must keep in mind that everything maps onto a sequence of memory locations in the computer. This is quite different from how Java is thought about: in Java, almost everything is an object of one sort or another. This difference comes from why and when the two languages were designed.

Java started its life as a programming language called Oak, which was aimed at networked devices and was later repositioned for the web. Because programs could be passed from machine to machine, this posed several protection issues. To provide a certain level of protection, Oak was designed to be interpreted (today, compiled bytecode runs inside the Java Virtual Machine), which also allowed the language to respond to unstable programs in a consistent way. The disadvantage is a reduction in general processing speed, since the code runs through an interpreter with many runtime checks built in.

C was originally designed as a system programming language for the UNIX environment, so the language needed to be close to the machine but functional enough to make programming easy. The resulting language replaced assembly language for most system programming: it is very flexible, yet still allows good access to the underlying machine. But some of C's greatest strengths are also its biggest weaknesses. Because C is very close to the underlying machine, there is very little stopping a program from doing exactly what you tell it to do, which sometimes is not what you want the machine to do. C also has the advantage that it can be combined directly with assembly language to allow complete access to the machine.

The result is that Java is a very powerful language and programming environment that gives developers a great amount of flexibility and reliability at the price of speed, whereas C can be less predictable if the programmer is not careful, but gives the ultimate access to the machine's underlying structure and its full processing speed.

Basics

The flow of this document gives a high-level view of C. It starts with C's pre-processor, a macro language that processes the source code before the compiler does its work. Next, the basics of types, operators, and expressions are covered; this is a first pass on the subject and will be expanded upon later. Control flow is then discussed, but much of it is similar to Java syntax and you will have had previous exposure to it. After that, the basics of functions are discussed, including main. The final section discusses the basic I/O functions provided by C and its associated libraries.

Comments in C are specified by bracketing text between the characters /* and */. Examples of valid comments are given below.

/* comment */

/* comment line 1
* comment line 2
* comment line 3
*/

It is important to note that the C compiler works in a single pass, that is, it reads the source code from top to bottom. This has the advantage that it is quick and simple, but the disadvantage that everything must be declared before it can be used. Another point to note: C provides global variables and scoping, but does very little to restrict access to data. Thus, there is no strong concept of public or private the way there is in Object Oriented Programming (OOP), which was far from mainstream at the time C was created.

Lastly, C does not use methods; it uses functions. The difference is semantic, but some people consider it an important distinction.

C Pre-processor (CPP)

As indicated previously, the CPP provides a macro facility for C and allows commands to be expanded into chunks of C code based on some simple rules. The CPP is critical to the development environment provided by C. This is mainly due to the concept of header files, which provide variable and function declarations so that individual source files can be compiled separately. This allows large libraries of common pre-compiled code to be provided to developers and then linked into the application that the developer is currently working on. The Java programmer might see this as similar to the class library provided by the Java SDK.

Header files are added with an include directive, which looks like the following.

#include <stdio.h>
#include "main.h"

Note that there are two forms, one with <> and one with "", which determine where the pre-processor searches for the indicated file. An include directive that uses <> indicates that the system directories, such as /usr/include, should be searched. An include directive that uses "" indicates that the search should start in the directory of the file containing the directive (typically the local directory where the compile is being performed), falling back to the system directories if the file is not found there.

The CPP can be used to create constants using the #define declaration, as shown below.

#define MAX_LIMIT 25

This declaration will create a constant called MAX_LIMIT and replace all occurrences of it in the source file with the value 25. In actuality, the #define declaration can be used for more than just the creation of constants; it also allows for the creation of macros (illustrated below).

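#define ERROR(msg) printf("ERROR - %s\n", (msg))

The above will replace all instances of the macro ERROR with the printf statement (which will generate output) and replace (msg) with the associated message. The macro body has no trailing semicolon, so a use such as ERROR("bad input"); is followed by a semicolon like any other statement.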

Macros can be undefined with the #undef declaration.

The CPP also provides conditionals (#if, #ifdef, #ifndef, #else, and #endif) so different actions can be taken under certain conditions. One example is a developer who is writing code for different hardware platforms (or operating systems) and needs to add platform-specific code. A second example, which is more commonly used, is to make sure that the contents of an include file are added only once. This can be a problem for large software systems with many header files that in turn include common header files. Using the following form in your header file allows you to avoid these types of problems.

#ifndef MATH_LIB
#define MATH_LIB

....... code, code, code ............

#endif

This will allow the code within this structure to be compiled only once: the first time the block is entered MATH_LIB is defined, and from then on the #ifndef will not allow the region to be entered again.

Predefined Macros

There are a number of predefined macros, given below, that can be used in your code to add some extra functionality.

__LINE__ Line where the macro exists.
__FILE__ File where the macro exists.
__DATE__ Date that the source code was compiled.
__TIME__ Time that the source code was compiled.

Basic Types, Operators, Expressions

In C there are three basic types, given below, with many variations. In reality there are only two types, but this will be discussed later when more detail is given.

char The ASCII character set.
int Whole numbers.
float Real valued numbers.

Since C is so closely tied to the machine, the size of the values that a variable of each type can hold is machine specific, so be aware that there is a limit to the amount of information that can be stored in each. Once again, more details will be given at a later point.

Arrays can also be declared in C using the [] operators, which specify size when the array is declared and the index when the array is being referenced.

int x = 110;
int y[8] = {0, 1, 2, 3, 4, 5, 6, 7};


y[0] = x;

In the above example, an integer variable x is created along with an array of integers y. Both x and y are initialized as part of their declaration. The third line then assigns the value of x to the first position of the array y, replacing the value 0 with 110. Any variable can be referenced by its location in memory, and a value that holds such a location is called a pointer. Pointers and arrays are closely related in the C programming language, and this relationship is what gives C much of its power and potential for error.

C does not have the concept of strings the way that Java does; instead it uses arrays of characters with a null terminator at the end (often called "C-strings"). The null terminator is a special constant that is written as '\0'. Note that single quotes were used here; this has special significance in C and indicates that a single character is being represented. If double quotes are given, as in "hi", this indicates a multi-character string with a null terminator appended to the end. It is very important to understand this concept, because many library functions accept these arrays of characters and process them until the null terminator is detected. If no null terminator is present, processing will continue past the end of the array into whatever memory follows it.

char string1[10];
char string2[] = "hi there";
int x = 0;

while (string2[x]) {
    string1[x] = string2[x];
    x++;
}

string1[x] = '\0';

The above code creates two character arrays, one is 10 characters and the other is the length of the initializing string plus 1 (for the null terminator.) The while will continue looping until the null terminator contained in string2 is reached. Then the last line adds a null terminator to the end of the string copied into string1.

Operators

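C uses most of the same operators as Java, with similar functionality. C adds operators for working directly with memory (such as &, *, and sizeof) and lacks a few Java-only operators such as >>> and instanceof.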

Expressions

In Java, the result of a conditional expression is a boolean value, which is either true or false. C did not originally have a boolean type, instead using 0 to represent false and any non-zero value to represent true. The C99 standard introduced a boolean type (_Bool, or bool through the header stdbool.h), but in reality this reduces to an integer value of 0 or 1. For this class do not use the boolean type.

Since there is not a true boolean value in C, all relational and logical expressions (those that compare or test values) evaluate to the integer 1 or 0. When used in the context of an if statement (or other control flow statement), any non-zero value is considered true and zero is considered false.

Control Flow

C offers many of the control flow statements offered by Java, although exception handling (try, throw, and catch) is not included. Like Java, C uses braces to indicate blocks of code within the control flow. Below is a quick list of the different control flow options available.

If-Else / Else-If:

if (expression) {
    code, code, code
} else {
    code, code, code
}

or

if (expression) {
    code, code, code
} else if (expression) {
    code, code, code
} else if (expression) {
    code, code, code
} else {
    code, code, code
}

Switch:

switch (expression) {
case const-expr :
    code, code, code
case const-expr :
    code, code, code
default :
    code, code, code
}

Note, don't forget that each case requires a break statement at its end, else execution will continue into the next case.

While Loop:

while (expression) {
    code, code, code
}

For Loop:

for (expr1; expr2; expr3) {
    code, code, code
}

Do-While Loop:

do {
    code, code, code
} while (expression);

Functions

Functions follow the common format given below, which is similar to Java method declaration.

int stuff (int x, int y);

int stuff (int x, int y) {
return x + y;
}

Above, the function stuff is given in two forms: the first line is the function declaration and the lines that follow are the function definition. Declarations allow functions to be used even though they have not yet been defined. Declarations are often placed in a header file, so that common functions can be stored in a single source file and used by functions in other source files. Declarations are not required, so long as the function is defined before it is called.

int main (int argc, char **argv) {
code, code, code

return 0;
}

There is one special function called main, which is the function that is called when program execution begins. Main receives two parameters, which contain any command line information that was used when the program was started. These parameters are provided by the operating system, and more detail will be given on these later.

It is important to realize that C function parameters differ in one important way from Java method parameters. C passes every argument by value, that is, the function receives its own copy. Arrays cannot be passed this way at all (the function receives a pointer to the first element), and although structures can be copied, larger data is normally handed to a function as a pointer instead. Thus pointers will become very prominent in our future usage of C.

Basic I/O

C provides three basic I/O streams to interact with the outside world. These streams are treated as files that were opened by the system and are listed below. To access any of the I/O functions the header file stdio.h needs to be included.

stdin Standard input typically pulls information from the keyboard, but can also be redirected from another location.
stdout Standard output will write out to the display, but also can be redirected to different locations.
stderr Standard error is a specialized output stream that will write to the display, but is typically unbuffered so that information is immediately written out. This allows the developer to separate the normal output stream from output that was caused by an error.

As indicated, the above streams are treated as files for reading and writing purposes. To open, close, and manipulate files the following functions can be used. For more details on these functions, use the man pages to see their full description. Many of the functions refer to a character pointer (or char *), which is a pointer to the beginning of a character array (see above).

fopen Opens a file and returns a pointer of type FILE * (or NULL on failure).
fclose Closes a file.
feof Returns a value indicating if an end-of-file was reached.
fputc Writes a character to a file.
fgetc Reads a character from a file.
fputs Writes a string to a file.
fgets Reads a line from a file.

Print and Scan:

The above functions perform basic I/O, but often it is useful to do something more complex. That is where the printf and scanf families of functions are useful: printf produces output and scanf reads input. They allow for the formatting of data and the reading of structured strings. All variations of these functions accept a formatting string that consists of ordinary characters and flags (formally called conversion specifications). A flag marks data that should be printed from, or stored into, a variable, and has the form of a % followed by a character indicating the type of data. For example, the string "%s %i %f" prints a string followed by an integer followed by a float. The formatting string is then followed by the associated variables in the given order. There is a great variety of flags, and some of the more useful ones are listed below. Note, these flags can also accept formatting information such as a field width or precision.

%f Format a floating point value.
%i Format an integer value.
%s Format a string value.
%p Format a pointer value.
%c Format a single char.
%o Format integer as octal.
%u Format integer as unsigned decimal.
%x Format integer as hexadecimal.

Strings passed to printf can also contain several escape sequences, which the compiler translates into special characters from the ASCII character set.

\n new line
\" double quote
\a alert
\f form feed
\r carriage return
\t horizontal tab
\v vertical tab

The printf/scanf members of this family are the most basic, interacting with stdout and stdin. For interacting with files fprintf/fscanf should be used, and since the standard I/O streams are themselves FILE pointers, they can be passed directly to fprintf/fscanf. Another variation is sprintf/sscanf, which work on a character string that is passed to them. The sprintf function is useful for building complex character strings, but sscanf is even more useful because it allows character strings to be converted into other variable types (such as floating point values).

Error Checking:

All well-formed programs should check the return code from all functions. Return codes will vary depending on what the function is doing, but typically if the function returns a pointer then you should check for a NULL pointer. Since a null pointer compares equal to 0, you can wrap your function call in an if statement that detects the null pointer. An example of this programming idiom is given below.

#include <stdio.h>
#include <stdlib.h>

int main (void) {
    FILE *inputHandle = NULL;

    /* fopen returns NULL on failure and records the reason in errno */
    if ((inputHandle = fopen("somefile.txt", "r")) == NULL) {
        /* prints "OPEN ERROR: " plus the system's error text to stderr */
        perror("OPEN ERROR");
        exit(EXIT_FAILURE);
    }

    fclose(inputHandle);
    return 0;
}

Note the use of the function perror, which prints the given message followed by a description of the error recorded by the most recent failed library call. The last line in the if statement will exit the program with a specified return code, here a non-zero value that signals failure. This same format can be used for other types of return codes.

Another type of error checking, called consistency checking or sanity checking, is to make sure a value is within a certain range. The library macro assert (from the header assert.h) performs this function by accepting a conditional value. If the conditional resolves to false, then an error message will be printed and the program aborted.

One goal that many programmers have is to minimize the typing of error checking code, while still doing it. Often a programmer will create a set of macros that print a common error message with a minimum amount of typing. Also, some programmers will create a variety of macros for different situations.