OBJECTIVE:
* Understanding program execution
* Causes of performance bottlenecks in sequential and parallel programs
* Improving performance
Reference Books:
* High Performance Computing - O'Reilly series
PROGRAM EXECUTION:
Let us look at how a C program gets from source code to execution.
                C Program
                    |
                    |
                Compiler
                    |  -> assembly lang.
                    |
                Assembler
                    |  -> Object file *
                    |
Lib modules   --> Linker
Other Objects       |  -> Object file
                    |
                 Loader
                    |
                    |
              Executable file
* The object file produced by the assembler may use functions or variables
that are referenced in this file but defined elsewhere.
GENERIC FORMAT OF OBJECT FILE:
Typical format of an object file will be as follows.
|-----------------------|
| Header Info |
|-----------------------|
| Machine code |
|-----------------------|
| Initialized Data |
|-----------------------|
| Symbol Table |
|-----------------------|
| Relocation Info. |
|-----------------------|
For example, consider the following C code.
#include <stdio.h>
#include <math.h>
float arr[100];
int size = 100;
int main(void)
{
    int i;
    float sum;
    /* arr is global and uninitialized, so it starts out zero-filled */
    for (i = 0, sum = 0.0; i < size; i++)
    {
        arr[i] = sqrt(arr[i]);
        sum += arr[i];
    }
    printf("sum is %f\n", sum);
    return 0;
}
Compile the above program with gcc -c to get the object file.
(The assembly output can be seen with the -S option.)
At compile time, the functions `sqrt' and `printf' are used but not defined
in the program. These are called undefined symbols.
Similarly, there are defined symbols such as arr, size and main. These can
be used by functions in a different module (a separate .c file),
so the object file must store some information about them.
-> Here there is a difference between the global variables (arr, size) and the
local variables (i, sum).
* Global variables are stored in the data segment, whereas local
variables are stored on the stack.
* Global variables can be accessed by other functions (in this or other
modules), whereas local variables are local to the function in which they
are defined.
[Static variables (defined with the "static" keyword in C) are also stored in
the data segment.]
Among the global variables, some are initialized and some are
uninitialized.
                  /-- Initialized variables
Data Segment   --|
                  \-- Uninitialized variables
The object file generated from the above C code will look something like this.
*(This may vary depending on the processor architecture.)
Section        Offset  Contents  Comments
Header Info    0       94        No. of bytes of machine code
               4       4         No. of bytes of initialized data
               8       400       No. of bytes of uninitialized data
               12      60        No. of bytes of symbol table
               16      ??        No. of bytes of relocation table
Machine code   20      ??        main: first executable statement in main
               ...     ...       ...
               66      ??        call sqrt
               ...     ...       ...
               102     ??        call printf
********************************************************
Continuing from the previous lecture, the remaining three sections (initialized
data, symbol table and relocation information) of one possible generic
format of the .obj file look like the following.
(Ref: the C program discussed in the 1st lecture)
------------------------------------------------------------------------------
Offset  Contents  Symbol     Comments
------------------------------------------------------------------------------
(Initialized Data)
114     100       (N. A.)    value of "size"; this is the ONLY entry
                             for the above C program, as the array arr
                             is not initialized, even though it is
                             also global
(Symbol Table)
118     ??        size       name of symbol "size" and its location
                             (within the initialized data section)
130     ??        arr        name of symbol "arr" and the starting
                             address of arr
142     ??        main()     name of symbol "main" and its location
                             (offset in the code segment), so that the
                             OS can start with main
154     ??        sqrt()     name of symbol "sqrt", no address info.;
                             function used here, declared in math.h and
                             defined in the math library
166     ??        printf()   name of symbol "printf", no address info.;
                             function used here, declared in stdio.h and
                             defined in the C library
(Relocation Information)
178     ??                   stores the offsets at which external
                             symbols are called or referred to
Initialized global variables are allocated in the initialized data section,
whereas uninitialized global variables are represented only indirectly, by
recording their total size in the header section.
We know that the sequence for getting executable code is as follows.
               source code
                    |
          compiler and assembler
                    |
               object file
                    |
  library ------> linker
                    |
               executable
The generic format of the .out file (on disk) is as follows.
______________________________
| HEADER |
|____________________________|
| MACHINE CODE SEGMENT |
|____________________________|
| INITIALIZED DATA |
|____________________________|
A process may be described as a program under execution.
During execution, the generic layout of the (logical) address
space of the program is as follows.
|----------------------------------|
|           MACHINE CODE           |  read-only data
|                                  |
|----------------------------------|
|     INITIALIZED DATA SECTION     |
|                                  |
|----------------------------------|  read-write data
|    UNINITIALIZED DATA SECTION    |
|                                  |
|----------------------------------|
|     HEAP (for dyn mem alloc)     |  grows downwards
|----------------------------------|
|     STACK (local vars and fn.    |  grows upwards
|     call/parameter passing)      |
|----------------------------------|
Dynamically allocated memory (obtained with malloc in C or new in C++)
comes from the heap.
Different object modules are combined in the linking stage, which
primarily involves two steps: relocation and linking.
The first step is relocation: the object modules are combined by creating
a new load module and writing the machine code of each object module into
it, one after the other.
Only the first module placed in the load module keeps its offsets unchanged.
The machine code sections of all the remaining modules have their offsets
shifted by the amount of code placed before them. This is called relocation.
As a consequence of relocation, the location information (addresses) of the
symbols changes.
The next step is to fix up the addresses of external symbols at the
locations where they are called or referred to. While the modules were
separate, these entries were initialized to zero, since the compiler
had no information about the whereabouts of these external symbols. External
references may be to functions as well as variables. Linking is thus a kind
of patch-up work done after relocation.
Note that the generic format obtained after linking and loading is the same
as that discussed at the start.
Now let us look at how the stack is used to store local variables during
function calls.
| |
| |
| |
| |
|-------------------------------|<--- sp
| local vars of func2 |
|-------------------------------|
| prev. FP contents |<--- fp
|-------------------------------|
| ret address in func1 |
|-------------------------------|
| parameters to func2 |
|-------------------------------|<--- sp
| local vars of func1 |
|-------------------------------|
| prev. FP contents |<--- fp
|-------------------------------|
| ret addr in main |
|-------------------------------|
| parameters to func1 |
|-------------------------------|
| local vars of main |
|-------------------------------|<--- fp
As shown above, the first call is from main to func1. The local variables of
main are on the stack, and the parameters to be passed to func1 are pushed on
top of them. The return address in main is pushed above that. The first
instruction in func1 typically pushes the previous fp (frame pointer) value
onto the stack and sets fp to point to this same (pushed) location. Local
variables for func1 are then allocated on top of the stack and the stack
pointer is moved (up). Local variables within the function are accessed
using fp and an appropriate offset. When func1 calls func2, the parameters
and return address are pushed onto the stack as shown in the figure. When
returning from func2, the last instructions typically restore sp (stack
pointer) to a location below the current fp, and restore fp to the contents
of the memory location it points to (this is the fp value of the caller,
func1). On returning from func2 into func1, sp has to be incremented
(brought down in the figure) to account for popping the pushed parameters.
Thus sp does the work of pushing and popping the stack, while fp marks the
current frame and guards against sp going out of context due to excessive
popping.
Monday, 19 August 2013
Program Execution