
C programming guide

Sanic fast intro to the C language

NOTE: this page is mostly finished in terms of content, but will go through a cleanup pass later.

If you're following this guide properly, you should already have compiled a program. I expect that you can change your code and compile it yourself.

This intro does not explain every little thing like "what is a semicolon". I expect that you can look at the code examples and tell what's going on with a simple explanation.

I recommend reading the full version of this page later.


Basics

int hello = 45;

This is a "statement": a single part of the code that does something. Statements must always end in a semicolon ;. Statements starting with a type (e.g. int) will declare (create) a new variable with that type. This statement declares a new "int" variable, and sets the value to 45.

Here's some more statements:

// This is a comment!
	
int hello = 40; // First variable.
hello = hello + 40; // Sets 'hello' to its own value plus 40.  (hello is now 80)
hello += 40; // A short version of the previous statement.  (hello is now 120)
hello ++; // Increment: a short version of  hello += 1;  (hello is now 121)
hello --; // Decrement: a short version of  hello -= 1;  (hello is now 120)
int world = 2; // Second variable.
hello = hello * world; // Multiply 'hello' with 'world' and set the result to hello.  (hello is now 240)
// hello *= 100;  // This statement is "commented out".

/*
	This is a block comment.
	it will continue until the end tag.
*/

Comments do nothing, but can be used to remind yourself (or other people who read your code) of things. You can also "comment out" parts of your code to disable them, without having to delete the code.

 

Basic types

There's 3 different number types: a signed integer, an unsigned integer, and a floating point number.

Normal integers are secretly "signed" (typing int is the same as typing signed int). The left-most bit (referred to as the "sign bit") determines whether the value is negative or positive. 10000110 = negative, 00000110 = positive. Integers can only store whole numbers like 1, 2, 3, but cannot store fractions like 3.14.

Unsigned integers are like signed integers except they do not use a sign bit. This allows the number to be twice as high, but prevents it from being negative. For example signed char can store values between -128 to 127, while unsigned char can store values between 0 to 255.

Floating point values are completely different from integers. They can be positive or negative, and they can hold decimal numbers like 0.32.

Here's some common data types:

Note: the exact size of these values may be different depending on how/where you compile your program, notably long may be 32 or 64 bits. The sizes described above are accurate for basically all normal computers though.
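
If you want to check the sizes on your own machine, something like the sketch below should work. The function name print_type_sizes is made up for this example; sizeof (explained later in this guide) and the constants from limits.h are standard C.

```c
#include <limits.h>
#include <stdio.h>

// Prints the size of some common types in bytes, plus the char ranges.
// Returns the size of an int in bytes.
// (print_type_sizes is a made-up name for this sketch.)
int print_type_sizes (void) {
	printf("char:   %i\n", (int)sizeof(char));
	printf("short:  %i\n", (int)sizeof(short));
	printf("int:    %i\n", (int)sizeof(int));
	printf("long:   %i\n", (int)sizeof(long)); // Usually 4 or 8.
	printf("float:  %i\n", (int)sizeof(float));
	printf("double: %i\n", (int)sizeof(double));

	printf("signed char:   %i to %i\n", SCHAR_MIN, SCHAR_MAX);
	printf("unsigned char: 0 to %i\n", UCHAR_MAX);
	return (int)sizeof(int);
}
```

Call print_type_sizes() from your main function to see the numbers for your compiler.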


Functions

A function (also called "procedure") is a bundle of code that you can jump into from anywhere else in the program, the program will do what's in there and then return back.

int something = gibs_me_dat(500, 25);

This statement calls a function, and sends the values 500 and 25 into it. The function returns back an integer which we store into something.

int gibs_me_dat (int x, int y) {
	return x * 2 + y;
}

Declaring a new function is similar to declaring a variable. Note that you don't need a semicolon after the function.

The type at the start is what the function returns back to whoever called it. You can use the void type here if you don't want the function to return anything.

Inside the parentheses are variables that the function takes in.

Inside the curly brackets are the actual contents of the function. You can put any statements you want in there.

return is a special keyword in C that stops the function and returns back to whoever called it. It also sends back the value that you put after it. You can return at any point in the function, not just at the end.
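
Here's a small sketch of an early return and a void function. The names limit_value and print_limited are made up for this example, and it uses if, which is explained later in this guide.

```c
#include <stdio.h>

// Returns 'value', but never more than 'limit'.
int limit_value (int value, int limit) {
	if (value > limit) return limit; // Early return: the rest of the function is skipped.
	return value;
}

// A void function: it does something, but returns nothing.
void print_limited (int value, int limit) {
	printf("%i\n", limit_value(value, limit));
}
```

For example limit_value(500, 100) returns 100, while limit_value(25, 100) returns 25.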


Libraries (#include)

We're skipping ahead a bit so we can understand the program that we compiled and start printing text:

#include <stdio.h>

int main () {
	printf("Bag of biscuits\n");
	return 0;
}

The printf function is technically not an inherent part of C, it's within a library that's bundled with your compiler, and then imported at the top with #include.

Lines starting with # are special instructions for the compiler, they won't actually be part of your program. #include is an instruction that says "place the contents of this file here". Libraries are basically just code that someone else wrote so you won't have to. There's 2 ways to import them:

#include "stdio.h"

#include <stdio.h>

"stdio.h" searches for the file from the same folder that your code file is in, while <stdio.h> searches for it in folders that the compiler has been informed about. The compiler already knows where to find its own files so you don't need to tell it that yourself. We'll look at this stuff in more detail in a later part of the guide.

Printing text with printf()

printf("His power is %i, perhaps more! %f \n", 9000, 3.14);

printf simply prints text into the command line console. Text must be surrounded by quotes. You can use a backslash \ to insert special symbols into text, \n will add a 'new line' character.

You can print variables by sending them after the text and adding a tag in the text. Tags start with %, for example %i will read the next variable as an int. %f prints a float, %s prints text. There's many more tags.

Note: Functions can't normally take in arbitrary amounts of variables like this, printf can do it because it uses "variadic arguments". There's various problems with variadic arguments in C so I don't recommend using it unless you want to make your own version of printf.
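
As a quick sketch of using several tags at once, here's a function that prints text, an int and a float on one line. print_score is a made-up name; the only library call is printf itself, which returns the number of characters it printed.

```c
#include <stdio.h>

// Prints one line using three different tags: %s (text), %i (int) and %f (float).
// Returns whatever printf returned, which is the number of characters printed.
int print_score (char* name, int score, float ratio) {
	return printf("%s has %i points (ratio %f)\n", name, score, ratio);
}
```

Calling print_score("Bob", 7, 0.5f) prints "Bob has 7 points (ratio 0.500000)", since %f shows 6 decimals unless you tell it otherwise.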


Data: text, arrays, pointers

Think of your computer's memory as a very long line, with stuff placed onto it. All of your variables and other data are slotted somewhere onto that line.

An array is just many values placed back-to-back on the line, and text (string) is just an array of chars.

A pointer is just an integer whose value is a distance along that line where you'll find something. For example if the pointer's value is 1000, then it points to the 1001st byte in memory.

The pointer is an offset from the beginning. If the pointer's value is 1, then it's moving 1 byte away from the beginning, and thus points to the second byte. This is why an offset of 1000 would point to the 1001st byte instead of the 1000th. This is called "0 indexing".

When you create an array or a string, the array data is placed somewhere onto the line, and then you get a pointer that points to the location where the array begins. Every time you use an array or a string in C, you're actually just using a pointer.

Image of things in memory.

With the memory above, if you had a char pointer with its value set to 6, you could use it as a string that says "Hello".

 

Basic pointers

int* pointy = NULL;

This is a pointer variable. It looks like we're declaring an integer, but there actually isn't an integer anywhere, only a pointer as denoted by the asterisk *. pointy is just an integer that represents a position in memory. NULL is just a special word for 0, when a pointer's value is 0 it means the pointer doesn't point to anything. Always initialize your pointers to either NULL or some other target, leaving pointers uninitialized can create bugs that are very hard to track down.

int thingy = 60;
int* pointy = &thingy;

& before a variable allows you to get the location of that variable's data. thingy's data is somewhere in memory, and pointy points into it. You can now use pointy to modify thingy's value:

*pointy = 5000;
pointy[0] = 5000; // Same as above.

To dereference (get or modify) the data at a pointer's destination, put an asterisk before it or [0] after it. Now the value of thingy is 5000 even though we didn't use thingy at all. Since pointy is an int pointer, this puts a 32bit integer to the destination.
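
The most common use for this is letting a function modify variables that belong to whoever called it. Here's a minimal sketch; swap_ints is a made-up name for this example.

```c
// Swaps the values of two ints that live somewhere else (e.g. in the caller's variables).
void swap_ints (int* a, int* b) {
	int temporary = *a; // Read the value at a's destination.
	*a = *b;            // Overwrite a's destination with the value at b's destination.
	*b = temporary;     // Overwrite b's destination with the saved value.
}
```

If you have int first = 1; and int second = 2; then calling swap_ints(&first, &second); leaves first as 2 and second as 1.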

Arrays

int fiddle[20] = {0};

Imagine defining 20 separate integer variables back-to-back, and then defining a pointer variable that points to the first one. This is that pointer. You can't change this pointer's value, it will always only point to the first integer.

{0} sets all the integers in the array to 0, you should always do this because otherwise they start out as random values. You can set the initial values in the array by putting them in the brackets:

int fiddle[20] = {1, 2, 3, 4, 5}; // Sets the first 5 integers as described, and all other integers to 0.
int fiddle[] = {1, 2, 3, 4, 5}; // If you don't set the array size, it will be based on how you initialize it. This array will have 5 integers.
fiddle[3] = 9001;

Dereferencing a pointer with square brackets allows you to access other values next to the first one. This sets the 4th integer to 9001.

WARNING: You cannot return an array like this from a function. Find out why in the "Memory management" section.

Pointer arithmetic

Pointer arithmetic means changing a pointer's value so that it points to different things in memory.

int fiddle[20] = {0};
int* pointy = fiddle; // Note: fiddle is already a pointer so we don't need to use '&' to get the address.

*pointy = 500; // Modifies first value in the array.
pointy += 1; // Move the pointer forward (the pointer type is 'int' so this moves by 4 bytes).
*pointy = 9001; // Modifies the second value in the array.

While we can't move the pointer fiddle, we can move the pointer pointy. Be careful with pointer arithmetic, if you accidentally access data that's outside of an array, your program's memory will get corrupted and probably crash your program.
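
Here's a sketch of walking through an array with pointer arithmetic. sum_ints is a made-up name, and it uses a while loop, which is explained later in this guide. Note that the function needs to be told the number of integers, because the pointer alone doesn't know where the array ends.

```c
// Adds up 'count' integers, starting from where 'values' points.
int sum_ints (int* values, int count) {
	int total = 0;
	int* end = values + count; // Points just past the last integer. We never dereference this.
	while (values < end) {
		total += *values; // Read the integer that the pointer currently points to.
		values += 1;      // Move the pointer forward to the next integer.
	}
	return total;
}
```

With int fiddle[] = {1, 2, 3, 4, 5}; the call sum_ints(fiddle, 5) returns 15.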

Strings

Strings are arrays of chars. There's 4 ways to define them:

char fiddle1[] = {72, 101, 108, 108, 111, 0};
char fiddle2[] = {'H', 'e', 'l', 'l', 'o', '\0'};
char fiddle3[] = "Hello";
char* fiddle4 = "Hello";

All of these do the exact same thing. Strings in C are "null terminated", which means that the last character must have a value of 0. You can use \0 to insert 0 values into text. A double quoted " string will automatically have one 0 at the end.

fiddle4 is slightly different, it is a "constant string" which is stored differently by the program. The only practical difference is that you cannot modify this string, otherwise it's the same.

When you call a function like printf("Hello"), you're actually giving that function a pointer to where that text is stored, not the array data itself. Any function that receives the string should take in a char pointer:

#include <stdio.h>
#include <string.h>

void print_me (char* text) {
	int text_length = strlen(text);
	printf("Your string (%s) is %i characters long.", text, text_length);
}

strlen takes in a char pointer, finds the 0 value at the end of the data, and returns the distance (which is also the number of characters). You need the library string.h to use strlen.

If you need to use strlen many times for the same string, you should consider storing the string length in a variable so strlen doesn't have to calculate the length many times.

NOTE: when you call strlen, you get the number of characters without the 0 at the end. If you need the total amount of data used by the string, you need to add 1 to the length.

You cannot use strlen to get the size of an array. You need to keep track of the array length yourself and send both the array pointer and the length to other functions. Structs will help, we'll look at them later.
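
To make the null terminator less mysterious, here's a rough sketch of what a function like strlen does internally. count_chars is a made-up name, and the real strlen is likely more optimized than this.

```c
// Counts characters until it finds the 0 at the end of the text.
int count_chars (char* text) {
	int length = 0;
	while (text[length] != 0) length ++;
	return length;
}
```

count_chars("Hello") returns 5. If the 0 at the end was missing, the loop would just keep reading whatever comes next in memory, which is why null termination matters.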


Memory management

There's 2 different sections where your computer puts data into: the stack and the heap. The stack is like a tower, every time you want something (like when you declare a variable), you put it to the top. When you return from a function, you remove everything that was placed onto the tower during the function. This way you can just use variables whenever you want without caring about memory.

The problem with the stack is that if you create an array and return it from the function, you're only returning the pointer, the array data itself will be tossed off the tower and can't be used anymore.

The solution to problems like that is the heap. The heap doesn't do anything by itself, instead you can manually request your operating system to give you memory from the heap. The operating system will reserve the requested amount of space from somewhere and then give you a pointer to it, then you can do whatever you want with that memory.

int* fiddle = malloc(2000); // Get 2000 bytes of memory.

malloc is the primary way to allocate (request) memory. fiddle can now be used in exactly the same way as the arrays and pointers that we used before. However this data will never go away unless you release it yourself.

Allocating 2000 bytes isn't necessarily very helpful though, let's say we want to allocate space for 20 integers. We know that an int is 4 bytes so we could just allocate 20*4 bytes, but there's better ways:

int* fiddle1 = malloc(20 * sizeof(int));
int* fiddle2 = calloc(20, sizeof(int));

sizeof takes in a type and returns how big that type is (in bytes), so if you give it an int, it returns 4.

calloc is exactly like malloc, except it takes the number of items and the size of the type separately.

Note that the memory that malloc reserves for you is uninitialized, so if we read integers from it, their values may be completely random. calloc however sets all the data to 0, using calloc instead of malloc may help with preventing bugs.

free(fiddle);

free takes in any value that you originally got from malloc or calloc, and releases it so the OS can use the memory for other things. If you give it a wrong value or you already free'd the memory, your program will probably crash or corrupt its own memory.

All 3 functions come from stdlib.h. sizeof isn't actually a function, it's an operator that's built into the language, so it can always be used without any libraries.
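
Here's a sketch of returning an array from a function by putting it on the heap instead of the stack. make_squares is a made-up name. The NULL check is there because malloc and calloc return NULL when they can't get the memory.

```c
#include <stdlib.h>

// Returns a heap array holding 0*0, 1*1, 2*2 and so on, or NULL if the allocation failed.
// Whoever calls this must free() the result when they're done with it.
int* make_squares (int count) {
	int* squares = calloc(count, sizeof(int));
	if (squares == NULL) return NULL;
	for (int i = 0; i < count; i++) {
		squares[i] = i * i;
	}
	return squares;
}
```

The array survives after the function returns, so you can do int* squares = make_squares(4); and read squares[3] (which is 9), and then call free(squares); once you no longer need it.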


if / else / switch

if, else

In order to program actually useful things, you need to do more than just direct variable math.

int x = 5;

if (x < 10) {
	printf("x is smaller than 10.\n");
}
else if (x > 100) {
	printf("x is bigger than 100!!!\n");
}
else {
	printf("x is somewhere between 10 and 100.\n");
}

if will check for a condition inside the parentheses first, and if the check fails then it goes to the else.

int x = 5;
if (x) printf("x is not 0.\n");
if (!x) printf("x is 0.\n");

The secret of if is that it only cares about whether the condition is 0 or not. As long as the condition is 0 it fails, if it's anything else then it succeeds. You can use ! to check the opposite of a condition. Also note that an if block does not need the { } brackets, but this way there can only be a single statement after the if.

int x = 1 < 2; // The value of x becomes 1.
int y = 20 < 1000; // The value of y becomes 1.
int z = x + y; // The value of z becomes 2.

Here's another secret: comparisons are like functions that return 1 for success or 0 for fail.
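
Since comparisons are just values, you can return them or add them together. The sketch below uses made-up function names, and the && operator, which means "and": the result is 1 only if both sides are not 0.

```c
// Returns 1 if x is between low and high (including both ends), otherwise 0.
int is_between (int x, int low, int high) {
	return x >= low && x <= high;
}

// Counts how many of the three values are negative by adding up comparison results.
int count_negatives (int a, int b, int c) {
	return (a < 0) + (b < 0) + (c < 0);
}
```

For example is_between(5, 1, 10) returns 1, and count_negatives(-1, 5, -7) returns 2.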

Comparison operators

switch

int x = 2;

switch (x) {
	case 0:
		printf("x is 0.");
		break;
	case 1:
		printf("x is 1.");  // Note the lack of 'break' after this.
	case 2:
		printf("x is 2.");
		break;
	default:
		printf("who knows wtf x is.");
}

If you need to check many different states for a variable, you can use switch, which is just a different kind of if/else chain. Note that if you hit any case, it will only stop when it hits a break. Therefore if x is 1, it will "fall through" into the next case and print x is 1.x is 2.. If none of the cases are activated, then the default case is activated.
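
Falling through isn't only a trap, it can also be used on purpose to give several cases the same code. Here's a sketch with a made-up function days_in_month; it uses return instead of break, which also stops the switch because it leaves the whole function.

```c
// Returns the number of days in a month (1 = January), ignoring leap years.
// Returns 0 if the month number is invalid.
int days_in_month (int month) {
	switch (month) {
		case 4:
		case 6:
		case 9:
		case 11:
			return 30; // All four cases above fall through to here.
		case 2:
			return 28;
		case 1: case 3: case 5: case 7: case 8: case 10: case 12:
			return 31;
		default:
			return 0;
	}
}
```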


Structs & Unions

Structs

struct Position {
	int x;
	int y;
};

A struct is just multiple variables grouped together. When you create a struct, you're creating a new type, and can then create variables and arrays with it, return it from functions, get its total size with sizeof, and so on.

struct Position hello = {0};
hello.x = 50;
hello.y = 920;

Like arrays, you should always initialize the data with {0} to prevent bugs. Struct types will not create a pointer like arrays do though, struct is more like a single big variable. A period . can be used to modify the variables inside of the struct. Note that if hello was a pointer to the struct, you would have to use an arrow instead: hello->x = 50;
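
Here's a sketch of the arrow in action. When you send a struct into a function, the function gets its own copy, so if you want the function to modify the original you need to send a pointer. move_by is a made-up name for this example.

```c
struct Position {
	int x;
	int y;
};

// Takes a pointer so that the caller's struct gets modified, not a copy of it.
void move_by (struct Position* position, int dx, int dy) {
	position->x += dx;
	position->y += dy;
}
```

With struct Position hello = {0}; you would call it as move_by(&hello, 5, -3); and afterwards hello.x is 5 and hello.y is -3.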

struct Position hello = {
	.x = 50
	// Any values that you leave out (like .y) will be initialized to 0.
};

You can also initialize a struct's values inside the curly braces.

typedef  struct Position  Position;

Typing "struct Position" is a little annoying, we can use typedef to create a new type. Now when we type "Position" it has the same meaning as "struct Position":

Position hello = {0};
hello.x = 50;
hello.y = 920;

You can also typedef a struct directly when you define it, it looks a little weird because typedef wants the name to come last:

typedef struct {
	int x;
	int y;
} Position;

Unions

union Something {
	int x;
	float y;
};

Unions are very similar to structs, except all the variables are overlapping in memory. If you modify x, it will also unpredictably modify y, because they're in the same position in memory. If you sizeof() a union, you will get the size of the biggest variable in it, in this case 4.

The point of a union is to let you store different kinds of things in the same space so you can avoid wasting memory. You will need to use some other way to determine which member of the union to use. Here's an example of how you could use a union:

typedef struct {
	int type;
	union { // Note: it is not necessary to give the union a name since we can access it through the struct.
		int seeds;
		float amount_of_cream;
	};
} Food;

void eat_food (Food food) {
	if (food.type == 1) {
		printf("Nom nom icecream with %f pounds of cream!\n", food.amount_of_cream);
	}
	else if (food.type == 2) {
		printf("Ate an apple with %i seeds.\n", food.seeds);
	}
}

int main () {
	Food icecream = { .type = 1,   .amount_of_cream = 15.5 };
	Food apple    = { .type = 2,   .seeds = 8              };

	eat_food(icecream);
	eat_food(apple);
}

Loops + arrays

int x = 0;

while (x < 10) {
	printf("x is %i\n", x);
	x ++;
}

while is just like an if, except that when it ends, it goes back to the start and checks the condition again. This loop will keep repeating until x is no longer smaller than 10.

for (int x=0; x<10; x++) {
	printf("x is %i\n", x);
}

for is a special loop where the main components are inside the parentheses. The first part can set up a variable, the second part is the condition, and the third part happens after every loop. This loop is identical to the while loop above.

int stuff[10] = {0};

for (int x=0; x<10; x++) {
	stuff[x] += 2;
	if (x > 5) continue;
	printf("Value at position %i is %i\n", x, stuff[x]);
}

This adds 2 into every value in the array and prints the value. However, when x is above 5, the loop is skipped early on with continue, and thus numbers 6 and above are not printed.

int stuff[10] = {1, 0, 3, 5, 4, 0, 2, 1, 0, 3};
int twoposition = -1;

for (int x=0; x<10; x++) {
	if (stuff[x] == 2) {
		twoposition = x;
		break;
	}
}

if (twoposition != -1) printf("Found 2 at array position %i.\n", twoposition);
else                   printf("Oh no, there was no number 2 in the array!\n");

break is similar to continue, except it stops the loop completely.
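
The same kind of search can be wrapped into a function, where return does the job of break. This is only a sketch and find_value is a made-up name.

```c
// Returns the position of the first 'wanted' in the array, or -1 if it isn't there.
int find_value (int* values, int count, int wanted) {
	for (int i = 0; i < count; i++) {
		if (values[i] == wanted) return i; // Stops the loop and the function.
	}
	return -1;
}
```

With the stuff array from above, find_value(stuff, 10, 2) returns 6 and find_value(stuff, 10, 9) returns -1.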


Macros

#define  hello  100
int x = hello;
 
int x = 100;

Macros are just text replacements. Now when you type hello, it gets replaced with 100.

#define  hello  int x = 100;
hello
hello
 
int x = 100;
int x = 100; // Error: x was already defined.

You can put anything you want into a macro and it will work. It will only become an error if the result after text replacement has an error.

#define  hello  int x = 100; \
                int y = 100;
hello
 
int x = 100;
int y = 100;

A macro ends at a new line, but you can make multi-line macros by inserting a backslash \ at the end of a line.

#define  hello(name, value)  int name = value;
hello(x, 100)
hello(y, 50)
 
int x = 100;
int y = 50;

Parentheses can be used to make the macro work similarly to a function.
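
Because macros are plain text replacement, they can surprise you when the result gets mixed with other math. The sketch below (all names made up) shows why macros that calculate something are usually written with extra parentheses.

```c
#define  DOUBLE_BAD(v)   v + v
#define  DOUBLE_GOOD(v)  ((v) + (v))

// Becomes  5 + 5 * 10  and the multiplication happens first, so the result is 55.
int bad_result (void) {
	return DOUBLE_BAD(5) * 10;
}

// Becomes  ((5) + (5)) * 10  so the result is 100.
int good_result (void) {
	return DOUBLE_GOOD(5) * 10;
}
```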


Miscellaneous

Casting

float x = 100;
int y = (int)x; // Convert float into int.

float* xp = malloc(10 * sizeof(float));
int* yp = (int*)xp; // No conversion necessary for pointers, but the compiler will think that this is a mistake unless we cast the pointer.

Casting is a way to convert values from one type to another, and a way to tell the compiler that what we're doing is intentional and not a mistake.
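
A common place where you need a cast is division. Dividing an int with an int throws away the fraction, so you have to turn one side into a float first. A small sketch with made-up function names:

```c
// Integer division: 7 / 2 gives 3, the fraction is thrown away.
int average_int (int total, int count) {
	return total / count;
}

// Casting 'total' to float first makes this a float division: 7 / 2 gives 3.5.
float average_float (int total, int count) {
	return (float)total / count;
}
```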

Enums

enum {
	BANANA,
	ICECREAM,
	APPLE,
	MOUNTAINDEW = 666,
};

An enum will create a bunch of named constants that each have a unique number. In this case BANANA becomes 0, ICECREAM becomes 1, APPLE becomes 2, and MOUNTAINDEW becomes 666. Remember the Food struct example above? You could use an enum to determine the food type, so you don't have to write and remember the type number yourself.

Void pointers

void* foo = malloc(1000);

A void pointer is just a normal pointer, except the compiler treats it as a pointer to unknown data. Since it has no type and therefore no known size, you cannot dereference it. Void pointers can be inserted into other pointers and vice versa without needing to cast it.

Function pointers

int (*state) (Person); // The function pointer. This syntax is awkward but that's just the way it is.

int idle_state (Person me) {
	if (me.saw_delicious_banana) state = moving_towards_banana_state;
	return 0;
}
int moving_towards_banana_state (Person me) {
	if (me.ate_banana) state = idle_state;
	if (me.exploded_from_eating_too_much) return 1;
	return 0;
}

int main () {
	state = idle_state; // State starts out as idle.
	Person person;
	while (1) {
		int result = state(person);
		if (result == 1) break; // If exploded from eating too many bananas, stop the loop.
	}
}

A function pointer is what it sounds like: a pointer that points into a function. You can use the pointer to call the function, and can change the pointer to a different function at any time. This can be used for example to make state machines, as demonstrated above. (Note: the code above doesn't work, it's just an example.)
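
For reference, here's a simplified variant of the same idea that should actually compile and run. Everything in it is made up for this sketch: the Person struct only counts bananas, the states take a pointer so they can modify the person, and run_person takes the place of the loop in main.

```c
typedef struct {
	int bananas_eaten;
} Person;

// The functions are declared first so they can refer to each other.
int idle_state (Person* me);
int eating_state (Person* me);

int (*state) (Person*) = idle_state; // The function pointer.

int idle_state (Person* me) {
	if (me->bananas_eaten < 3) state = eating_state; // Still hungry, go eat.
	return 0;
}
int eating_state (Person* me) {
	me->bananas_eaten ++;
	if (me->bananas_eaten >= 3) return 1; // Exploded from eating too much.
	state = idle_state;
	return 0;
}

// Calls the current state until one of them returns 1. Returns the number of calls made.
int run_person (Person* me) {
	int steps = 0;
	state = idle_state; // State starts out as idle.
	while (1) {
		steps ++;
		if (state(me) == 1) break;
	}
	return steps;
}
```

Starting from Person person = {0}; the call run_person(&person) alternates between the two states and stops after the third banana.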

Exact integer sizes

#include <stdint.h>

int32_t x = 1500; // A signed integer that's guaranteed to be 32 bits.
uint16_t y = 1500; // An unsigned integer that's guaranteed to be 16 bits.

If you want to be sure that an integer is a specific size, you can get special types from the stdint.h library.
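
Knowing the exact size matters when a value can go past the limit of its type. As a sketch (add_u8 is a made-up name), here's what happens when an 8 bit unsigned integer goes above 255:

```c
#include <stdint.h>

// Adds two 8 bit unsigned values. The result only has room for 8 bits,
// so anything above 255 wraps around and starts again from 0.
uint8_t add_u8 (uint8_t a, uint8_t b) {
	return (uint8_t)(a + b);
}
```

add_u8(250, 10) returns 4, because 260 doesn't fit and 260 - 256 is 4.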

Bitwise operators

int x = 5;
x = ~x; // invert the bits in x

Bitwise operators let you do advanced manipulation of bits. Here's a table of different bitwise operators:

symbol  name         explanation                                operation     result
~       not          Inverts bits.                              ~1101         0010
&       and          Bit becomes 1 if both sides have a 1 bit.  0110 & 1100   0100
|       or           Bit becomes 1 if either side has a 1 bit.  0110 | 1100   1110
^       xor          Bit becomes 1 if the sides are different.  0110 ^ 1100   1010
<<      left shift   Moves bits left.                           0001 << 2     0100
>>      right shift  Moves bits right.                          1000 >> 2     0010
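
A typical use for these operators is storing many on/off settings ("flags") in a single integer, one bit per flag. The flag and function names below are made up for this sketch.

```c
// Each flag uses a different bit.
#define  FLAG_VISIBLE  (1 << 0)  // 0001
#define  FLAG_SOLID    (1 << 1)  // 0010
#define  FLAG_BURNING  (1 << 2)  // 0100

// Turns the flag's bit on and leaves all other bits alone.
int set_flag (int flags, int flag) {
	return flags | flag;
}

// Turns the flag's bit off and leaves all other bits alone.
int clear_flag (int flags, int flag) {
	return flags & ~flag;
}

// Returns 1 if the flag's bit is on, otherwise 0.
int has_flag (int flags, int flag) {
	return (flags & flag) != 0;
}
```

For example set_flag(0, FLAG_SOLID) returns 2 (0010), and setting FLAG_BURNING on top of that gives 6 (0110).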

How do you open a window?

There's 2 ways: calling the operating system, and using a third party library like SDL. We will look at both methods next, but the rest of this guide will mostly use SDL.