anywherenotes

Friday, September 29, 2006

How-to program.

As my first attempt at creating a blog, I will give some programming tutorials, hints and tips.

The first one is how to write a 'hello world' program in C. C is the programming language that I learned first in college, and I believe that it's a good language to start with.

Here's the C program:
#include <stdio.h>
int
main(int argc, char **argv)
{
    printf("Hello World!\n");
    return (0);
}

Even with a small program such as that there is a lot to discuss.
The line '#include <stdio.h>' will find the file 'stdio.h' somewhere on your machine, and that file's contents will replace that line.

The #include is a pre-compiler directive (the pre-compiler is more commonly called the preprocessor). Before a file such as the one above can be executed, it has to be pre-compiled, compiled and linked. With such a small file, a single command normally does all of it: 'cc' or 'gcc', depending on the compiler you are using. If Visual Studio is used, then there is a button you can click to pre-compile, compile and link all in one step.
Although the #include directive looks for the file on your machine, it does not act as a 'search on computer'. It looks only in dedicated places: the include directories that your compiler is aware of, and any other include directories that were specified during compilation. The 'gcc' compiler looks in the /usr/include directory (on UNIX), among other system directories, and in any directories that have been specified by the -I argument, such as: gcc -I ~/development/include <-- that will instruct the pre-compiler to look into your home directory/development/include.

So much for the first line. It's important to note that pre-compiler directives have to be on separate lines, whereas normal C code can all be on one line. A pre-compiler directive always starts with '#', so as you can see, there are no other pre-compiler directives here, and everything else might as well be on one line:
int main(int argc, char **argv) { printf("Hello World!\n"); return (0); }
Terrible to read though.

The next important part is to understand that every C program runs inside a C runtime. This may make sense to people familiar with Java, where you have to install a Java VM (Java runtime) prior to running any Java program. The C runtime is not something you have to install separately on a client PC; its startup code is linked into your program. After doing some initialization, the C runtime calls the function 'main', which has to be in your program.

We defined the function 'main' as: int main(int argc, char **argv). The 'int' in front of 'main' refers to the type of data we will be returning from the function, which is an integer. Integers in C are signed and typically 32 bits. The lowest value is then -(2^31), which is -2147483648, and the highest value is (2^31)-1, which is 2147483647. In the days of 16-bit DOS compilers, an integer used to be 16 bits. The contents inside the parentheses are called arguments. Our main accepts two arguments, an integer and a char **.
A 'char' stands for character, and '**' is a pointer to a pointer, which we won't go into for now.

The word 'argc' following the type 'int' is the name of the first argument, and 'argv' is the name of the second argument. As you noticed, the return value does not have a name. Arguments are named because they will be used inside the function, so the function needs to have names for them, but a named return value would not give any added benefit.

The '{' and '}' characters enclose the function's body. printf is a function that comes from the standard C library (libc on Unix; on Windows the name depends on compilation options). printf accepts a string as its first argument, followed by any number of additional arguments.
The additional arguments are normally variables or other values, and are referred to by placeholders in the first argument (the string). When something is referred to as a 'string' in C, that means it is a character array, which can hold more than one character.

The simplest character array is text enclosed in double quotes. For example, "hello" is a string of length 5. The size of the array is actually 6, though the length is 5.
The reason for the size being one greater is that at the end of every C string there is a null byte.
A null byte is a binary zero (written '\0' in C, and not to be confused with the NULL pointer). In other words, the string "hello" contains the ascii character 'h', followed by the ascii character 'e', etc., and after the ascii character 'o' there is a binary 0. You don't have to type in a binary 0; as long as the string is in double quotes, it's there.

The line printf("Hello World!\n"); calls the function 'printf' and passes the string "Hello World!\n". The '\n' is a new-line character. A back-slash '\' followed by a character forms a special character, meaning don't use the value 'n', but use the special meaning of '\n'.
If the program is executed, this should be the output:
Hello World!

The '\n' makes sure that whatever is printed next starts on the next line. The 'return (0);' returns zero to the C runtime, which called the main function. By convention, 0 indicates that the program ran successfully, and any other value indicates an error. Whatever the main function returns gets returned by the program to the calling shell.
In the world of Unix, it means that sh/ksh/csh, whichever one you're using, will get back an exit code from the program, which you can look at using the $? variable (in csh, $status). In the world of Windows, it probably doesn't mean much to you yet, until the program is called from another program and you check its return code.

It's best to always return 0 in case of success and anything else in case of failure. That anything else is usually 1 or -1. Although the function 'main' returns an integer, on Unix the calling shell sees only the low 8 bits of it, which is a value between 0 and 255 (inclusive). Returning -1 will actually show up as 255 (because -1 has all bits set, and the low 8 bits read as an unsigned value are 255).

This is the end of the 'hello world' program dissection.