Course: CMPSC 311 - Introduction to Systems Programming
Professor: Sencun Zhu
Slides by: Professor Patrick McDaniel and Professor Abutalib Aghayev
Files in C: foo.c, foo.h, bar.c
C workflow: Editing, compiling, linking, executing, debugging, profiling
Types of libraries: Statically linked libraries (libZ.a), Shared libraries (libc.so)
Object files: bar.o, foo.o
Defining a function in C
Example: sumTo function
Parameters: int max
Variables: int i, int sum
Loop: for (i=1; i<=max; i++)
Statement: sum += i
Return: return sum
Conversion from C to machine code
Example: dosum function
C source file: dosum.c
C compiler: gcc -S
Assembly source file: dosum.s
Assembler: as
Machine code: dosum.o
In practice, C compilers generate object ".o" files directly (e.g., gcc -c dosum.c)
Object code is relocatable machine code
Object code generally cannot be executed without manipulation (e.g., via a linker)
Anatomy of a C program
Includes: #include <stdio.h>
Functions: main() and myfunc()
Example program
Running a C program
Compiling and linking: gcc -g -Wall main.c -o main
Executing the program: ./main
Running a program in UNIX
Search path: PATH environment variable
Adding to search path: export PATH=$PATH:/new/path
Multi-file C programs
Example: dosum function in dosum.c
Example: main function in sumnum.c
Prototyping functions
Multi-file C programs
Including standard libraries: #include <stdio.h>
Compiling multi-file programs
Linking multiple object files to produce an executable
Linking standard libraries
Object files in C
Object files contain machine code produced by the compiler
Object files might contain references to external symbols
Linking resolves external symbols
Similarities between C and Java
Syntax, types, type-casting, expressions, operators, scope, comments
Primitive types in C
Integer types: char, int, short, long
Floating point types: float, double
Type modifiers: signed, unsigned
C99 extended integer types
Solve the conundrum of "how big is a long int?"
Example using stdint.h library
Variables in C are similar to Java
Variables must be declared at the start of a function or block (not required since C99)
Variables need not be initialized before use (gcc -Wall will warn)
It is always recommended to initialize variables before use
Example code:
#include <stdio.h>
int main(void) {
    int x, y = 5; // note x is uninitialized!
    long z = x+y;
    printf("z is '%ld'\n", z); // what’s printed?
    {
        int y = 10;
        printf("y is '%d'\n", y);
    }
    int w = 20; // ok in c99
    printf("y is '%d' , w is '%d'\n", y, w);
    return 0;
}
The const qualifier in C indicates that a variable's value cannot change
The compiler will issue an error if you try to violate this
Example code:
#include <stdio.h>
int main(void) {
    const double MAX_GPA = 4.0;
    printf("MAX_GPA: %g\n", MAX_GPA);
    MAX_GPA = 5.0; // illegal!
    return 0;
}
Loops in C are similar to Java
For loops cannot declare variables in the loop header (changed in C99)
If/else, while, and do while loops are available
C does not have a boolean type (changed in C99 with #include <stdbool.h>)
Any type can be used in conditionals, where 0 means false and everything else means true
Example code:
int i;
for (i=0; i < 100; i++) {
    if (i % 10 == 0) {
        printf("i: %d\n", i);
    }
}
Pointers in C allow you to store memory addresses of variables
Key concepts:
Taking the address of a variable: &
Dereferencing a pointer: *
Aliasing: *ip is an alias for i
Example code:
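#include <stdio.h>
int main(void) {
    int i = 5;
    int *ip = &i;                // ip holds the address of i
    printf("%d\n", i);
    printf("%p\n", (void *)ip);  // %p expects a void *
    *ip = 42;                    // writes through the pointer, so i changes
    printf("%d\n", i);
    printf("%d\n", *ip);
    return 0;
}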
C always passes arguments by value
Pointers allow you to pass arguments by reference
Example code:
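#include <stdio.h>
void add_pbv(int c) {
    c += 10; // changes only the local copy
    printf("pbv c: %d\n", c);
}
void add_pbr(int *c) {
    *c += 10; // changes the caller's variable through the pointer
    printf("pbr *c: %d\n", *c);
}
int main(void) {
    int x = 1;
    printf("x: %d\n", x); // x: 1
    add_pbv(x);
    printf("x: %d\n", x); // still x: 1
    add_pbr(&x);
    printf("x: %d\n", x); // x: 11
    return 0;
}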
C passes arguments by value
If the callee modifies an argument, the caller's copy isn't modified
Example code:
#include <stdio.h>
void swap(int a, int b) {
    int tmp = a; // a and b are copies local to swap
    a = b;
    b = tmp;
}
int main(void) {
    int a = 42, b = -7;
    swap(a, b);
    printf("a: %d, b: %d\n", a, b); // prints a: 42, b: -7 (unchanged)
    return 0;
}
Pointers can be used to pass arguments by reference in C
The callee still receives a copy of the argument, but it is a pointer to the variable in the scope of the caller
Example code:
#include <stdio.h>
void swap(int *a, int *b) {
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
int main(void) {
    int a = 42, b = -7;
    swap(&a, &b);
    printf("a: %d, b: %d\n", a, b); // prints a: -7, b: 42
    return 0;
}
The key to C (and languages like it) is getting good at using pointers.
Pass-by-reference: stack walkthrough of the swap example above
Memory layout shown: OS kernel [protected], then the stack
Before the call: main's frame holds a = 42, b = -7
On entry to swap(&a, &b): a new frame holds swap's a, b, and tmp, none of them set yet
After argument passing: swap's a and b point to main's a and b; tmp is still unset
After tmp = *a: tmp is 42
After *a = *b: main's a is -7, main's b is -7
After *b = tmp: main's a is -7, main's b is 42
After swap returns: its frame is gone and main prints a: -7, b: 42
Very different than Java
arrays
just a bare, contiguous block of memory of the correct size
an array of 6 integers requires 6 x 4 bytes = 24 bytes of memory (assuming a 4-byte int)
arrays have no methods, do not know their own length (no bounds checking)
C doesn’t stop you from overstepping the end of an array!
many, many security bugs come from this (buffer overflow)
X[7] = 45; // Legal C, but can cause a memory fault!
strings
array of char
terminated by the null character '\0' (NUL, not the NULL pointer)
are not objects, have no methods; string.h has helpful utilities (see strings lecture coming soon!)
char *x = "hello";
x points to: h e l l o \0
errors and exceptions
C has no exceptions (no try / catch)
errors are returned as integer error codes from functions
sometimes makes error handling ugly and inelegant
some support from OS using signals (end of semester)
crashes
if you do something bad, you’ll end up spraying bytes around memory
hopefully causing a “segmentation fault” and crash
objects
there aren’t any; struct is closest feature (set of fields)
memory management
there is no garbage collector
anything you allocate you have to free (memory leaks)
local variables are allocated off of the stack
freed when you return from the function
global and static variables are allocated in a data segment
are freed when your program exits
you can allocate memory in the heap segment using malloc()
you must free malloc’ed memory with free()
failing to free is a leak, double-freeing is an error (hopefully crash)
console I/O
C library (libc) has portable routines for reading/writing, e.g., scanf() , printf()
file I/O
C library has portable routines for reading/writing
fopen() , fread() , fwrite() , fclose() , etc.
does buffering by default, is blocking by default
OS provides system calls
we’ll be using these: more control over buffering, blocking
Low level binary reads and writes, e.g., read() , write() , open() , close()
network I/O
C standard library has no notion of network I/O
OS provides (somewhat portable) routines
lots of complexity lies here
errors: network can fail
performance: network can be slow
concurrency: servers speak to thousands of clients simultaneously
Note: most of these topics will be covered in detail over the semester.
Libraries you can count on
C has very few compared to most other languages
no built-in trees, hash tables, linked lists, etc. (the standard library does provide qsort() for sorting)
you have to write many things on your own
particularly data structures
error prone, tedious, hard to build efficiently and portably
less productive language than Java, C++, Python, or others
Problem: ordering
Don’t call a function that hasn’t been declared yet:
#include <stdio.h>
int main(void) {
    printf("sumTo(5) is: %d\n", sumTo(5));
    return 0;
}
// sum integers from 1 to max
int sumTo(int max) {
    int i, sum = 0;
    for (i=1; i<=max; i++) {
        sum += i;
    }
    return sum;
}
Solution 1: reverse order of definition
#include <stdio.h>
Solution 2: provide function declaration
Teaches the compiler the argument and return types of the function that will appear later
The body-less function declaration is called a function prototype.
Example code:
#include <stdio.h>
// this function prototype is a declaration of sumTo
int sumTo(int);
int main(void) {
    printf("sumTo(5) is: %d\n", sumTo(5));
    return 0;
}
// sum integers from 1 to max
int sumTo(int max) {
    int i, sum = 0;
    for (i=1; i<=max; i++) {
        sum += i;
    }
    return sum;
}
UNIX Std*
There are three predefined streams provided to all UNIX programs
Standard input (stdin)